CN111461302B - Data processing method, device and storage medium based on convolutional neural network - Google Patents


Info

Publication number
CN111461302B
Authority
CN
China
Prior art keywords
ratio
weight
scaling
weights
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010237669.8A
Other languages
Chinese (zh)
Other versions
CN111461302A (en)
Inventor
郭晖
张楠赓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canaan Bright Sight Co Ltd
Original Assignee
Canaan Bright Sight Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canaan Bright Sight Co Ltd
Priority to CN202010237669.8A
Publication of CN111461302A
Application granted
Publication of CN111461302B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a data processing method based on a convolutional neural network, relates to the technical field of neural networks, and can solve the technical problem that hardware limitations degrade data processing precision. The method comprises the following steps: acquiring the number of output channels, and splitting the initial weight according to the number of output channels to obtain a plurality of initial sub-weights corresponding to the number of output channels; scaling the plurality of initial sub-weights respectively to obtain a plurality of scaling sub-weights; merging the scaling sub-weights to obtain a scaling weight; and globally quantizing the scaling weight, so that the globally quantized scaling weight is applied as a convolution kernel to a convolution layer of the convolutional neural network for data processing.

Description

Data processing method, device and storage medium based on convolutional neural network
Technical Field
The present application relates to the field of neural networks, and in particular, to a data processing method, apparatus and storage medium based on a convolutional neural network.
Background
A convolutional neural network (CNN) is a feed-forward neural network that contains convolution calculations and has a deep structure; it is one of the representative algorithms of deep learning. The architecture of a convolutional neural network generally includes an input layer, hidden layers, and an output layer. The input layer can typically process one-dimensional or multi-dimensional data. The hidden layers generally comprise convolution layers, pooling layers, and fully connected layers, and perform operations such as convolution on the data output by the input layer. The output layer realizes the result output of the convolutional neural network; for example, for an image classification problem, the output layer can be designed to output the center coordinates, size, and class of an object, or to directly output the classification result corresponding to each pixel in the image.
The data processing of a convolutional neural network often requires very large input and output bandwidth. To reduce this bandwidth requirement, the floating-point operations involved in the data processing can be replaced with fixed-point operations. In the data processing corresponding to the convolution layers, the weights can be quantized separately according to the output channels before the corresponding convolution is carried out. However, limited by the performance of hardware such as the Kendryte K AI chip, the convolution kernels obtained by quantizing the weights separately per output channel must share the same offset number, which generally degrades the data processing precision of the convolutional neural network and thus reduces the accuracy of the corresponding output results in implementations such as image classification.
Disclosure of Invention
The application provides a data processing method, device and storage medium based on a convolutional neural network, which are used to solve the technical problem that hardware limitations degrade data processing precision.
In order to solve the problems, the technical scheme provided by the application is as follows:
In a first aspect, an embodiment of the present application provides a data processing method based on a convolutional neural network. The method comprises the following steps: acquiring the number of output channels, and splitting the initial weight according to the number of output channels to obtain a plurality of initial sub-weights corresponding to the number of output channels; scaling the plurality of initial sub-weights respectively to obtain a plurality of scaling sub-weights; merging the scaling sub-weights to obtain a scaling weight; and globally quantizing the scaling weight, and performing data processing by applying the globally quantized scaling weight as a convolution kernel to a convolution layer of the convolutional neural network.
In one implementation, scaling the plurality of initial sub-weights and obtaining a plurality of scaled sub-weights may be implemented as: obtaining a scaling factor of each initial sub-weight; and scaling each initial sub-weight according to a corresponding scaling factor to obtain scaling sub-weights corresponding to each initial sub-weight.
In one implementation, the scaling factor for each initial sub-weight is obtained, which may be implemented as: acquiring a first maximum value and a first minimum value in each element of the initial weight; the following steps are performed for each initial sub-weight to obtain a scaling factor for each initial sub-weight: acquiring a second maximum value and a second minimum value in each element of the initial sub-weight; determining the ratio of the first maximum value to the second maximum value as a first ratio, and determining the ratio of the first minimum value to the second minimum value as a second ratio; and determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
In one implementation, determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio may be implemented as: if at least one of the first ratio and the second ratio is negative, determining the scaling factor to be the maximum of the first ratio and the second ratio; and if neither the first ratio nor the second ratio is negative, determining the scaling factor to be the minimum of the first ratio and the second ratio.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In a second aspect, an embodiment of the present application provides a data processing apparatus based on a convolutional neural network. The device comprises:
An acquisition module, configured to acquire the number of output channels.
A processing module, configured to split the initial weight according to the number of output channels acquired by the acquisition module, to obtain a plurality of initial sub-weights corresponding to the number of output channels.
The processing module is further configured to scale the plurality of initial sub-weights respectively to obtain a plurality of scaling sub-weights.
The processing module is further configured to merge the plurality of scaling sub-weights to obtain a scaling weight.
The processing module is further configured to globally quantize the scaling weight, so that the globally quantized scaling weight is applied as a convolution kernel to a convolution layer of the convolutional neural network for data processing.
In one implementation, the processing module is further configured to obtain a scaling factor for each initial sub-weight; and scaling each initial sub-weight according to a corresponding scaling factor to obtain scaling sub-weights corresponding to each initial sub-weight.
In one implementation, the acquisition module is further configured to acquire a first maximum value and a first minimum value among the elements of the initial weight. The following steps are performed for each initial sub-weight to obtain the scaling factor of each initial sub-weight: the acquisition module is further configured to acquire a second maximum value and a second minimum value among the elements of the initial sub-weight. The processing module is further configured to determine the ratio of the first maximum value to the second maximum value as a first ratio, and to determine the ratio of the first minimum value to the second minimum value as a second ratio. The processing module is further configured to determine the scaling factor to be the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
The processing module is further configured to determine the scaling factor to be the maximum of the first ratio and the second ratio if at least one of them is negative, and to determine the scaling factor to be the minimum of the first ratio and the second ratio if neither of them is negative.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In a third aspect, the present application provides a data processing apparatus based on a convolutional neural network, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first aspect and any of its various possible implementations when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements the method of the first aspect and any of its various possible implementations.
Compared with the prior art, in which data processing precision suffers from hardware limitations, in the embodiments of the application the data processing of the convolution layer adopts a process that approximates quantizing the weights separately according to the output channels before carrying out the corresponding convolution. Specifically, the initial weight is split according to the output channels, each of the split initial sub-weights is processed, the scaling sub-weights obtained from the processing are merged, and the scaling weight obtained from the merging is globally quantized and applied in the data processing of the convolution layer. The originally limited hardware therefore does not need to be replaced, and the obtained result is close to the result of processing after quantizing the weights separately according to the output channels. Thus, data processing precision can be improved while fixed-point operations are used in place of floating-point operations.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a first flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
FIG. 2 is a second flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
FIG. 3 is a third flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
FIG. 4 is a fourth flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a data processing apparatus based on a convolutional neural network according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a data processing device based on a convolutional neural network according to an embodiment of the present application.
Detailed Description
In order to more clearly illustrate the general inventive concept, a detailed description is given below by way of example with reference to the accompanying drawings.
The embodiment of the application provides a data processing method based on a convolutional neural network. The method can be applied to scenarios involving convolutional neural networks, such as image classification, speech recognition, and document analysis. Adopting the technical solution provided by the embodiment of the application can effectively improve the accuracy of data processing.
For example, in an image classification scenario where floating-point operations are replaced by fixed-point operations, the technical solution provided by the embodiment of the application scales the convolution kernel according to the output channels and globally quantizes the scaled convolution kernel to obtain the convolution kernel finally used for the convolution operation; performing the convolution operation with this final convolution kernel improves the accuracy of data processing. That is, when an image classification model is trained with the technical solution provided by the embodiment of the application, more representative image features can be extracted and recognized with higher accuracy, so that the trained model can produce more accurate classification results in subsequent image processing.
It should be noted that, in actual application, if convolution operations are also involved when executing image classification, an implementation similar to the technical solution provided by the embodiment of the application may likewise be adopted to obtain the corresponding image classification result, whose processing accuracy is likewise improved by this adjustment of the operation process.
Similarly, for other scenarios involving convolutional neural networks, such as speech recognition and document analysis, effects similar to those obtained for the image classification scenario can be achieved; the practical effects in those implementations are not repeated here and can be understood with reference to the content described above.
The technical scheme provided by the embodiment of the application is further described below in connection with corresponding execution steps of the data processing method based on the convolutional neural network provided by the embodiment of the application. As shown in fig. 1, the method may include S101 to S104.
S101, acquiring the number of output channels, and segmenting the initial weights according to the number of the output channels to obtain a plurality of initial sub-weights corresponding to the number of the output channels.
In the embodiment of the application, the initial weight is divided into a plurality of initial sub-weights according to the number of output channels. Since the technical solution provided by the embodiment of the application scales the initial weight based on the output channels and performs global quantization after scaling, in one implementation of the embodiment of the application the initial weight can be split according to the number of output channels, i.e., the number of initial sub-weights after splitting is the same as the number of output channels. For example, if the number of output channels is 3, then 3 initial sub-weights are obtained after the initial weight is split.
Of course, in the implementation process, if at least some of the split initial sub-weights are identical, those identical initial sub-weights may also be scaled in the same manner. That is, for these initial sub-weights scaled in the same manner, a single scaling factor can be determined, thereby reducing the resources consumed in computing scaling factors.
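As an illustration of S101, the following is a minimal sketch in Python. It assumes the initial weight is a NumPy array in an (output channels, input channels, kernel height, kernel width) layout; both the layout and the helper name split_by_output_channel are illustrative assumptions, not details fixed by the application.

```python
import numpy as np

def split_by_output_channel(initial_weight: np.ndarray) -> list[np.ndarray]:
    """Split the initial weight into C initial sub-weights, one per output channel."""
    num_output_channels = initial_weight.shape[0]  # C, the number of output channels
    # Slicing with i:i+1 keeps the leading axis, so every initial sub-weight has
    # the same number of dimensions as the initial weight.
    return [initial_weight[i:i + 1] for i in range(num_output_channels)]
```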
S102, scaling the initial sub-weights respectively to obtain scaled sub-weights.
In the process of scaling the plurality of initial sub-weights, at least some of the initial sub-weights are scaled; that is, either all of the initial sub-weights or only a part of them may be scaled to obtain the corresponding scaling sub-weights. In the case where only a part of the initial sub-weights is scaled, the embodiment of the application does not limit which initial sub-weights are scaled and which are not; this can be adjusted according to factors such as the precision required of the data processing process.
S103, combining the scaling sub-weights to obtain the scaling weight.
In the case where only a part of the initial sub-weights is scaled, the merging in S103 includes merging the unscaled initial sub-weights with the scaling sub-weights obtained by scaling. The merging can be performed as the inverse of splitting the initial weight into the plurality of initial sub-weights. For the specific implementation of splitting the initial weight and merging to obtain the scaling weight, reference may be made to existing matrix splitting and merging methods, which are not described here.
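Under the same assumed layout, the merging in S103 can be sketched as the inverse of the split above: the (scaled or unscaled) sub-weights are simply concatenated back along the output-channel axis.

```python
import numpy as np

def merge_sub_weights(sub_weights: list[np.ndarray]) -> np.ndarray:
    """Merge the sub-weights back into a single weight, inverting the split."""
    return np.concatenate(sub_weights, axis=0)
```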
S104, globally quantizing the scaling weight, and performing data processing by applying the globally quantized scaling weight as a convolution kernel to a convolution layer of the convolutional neural network.
In addition, to ensure that a scaling weight matching the initial weight can be obtained from the initial weight, in one implementation of the embodiment of the present application the representations of the initial sub-weights and the scaling sub-weights are kept consistent, so that the initial weight and the scaling weight have the same representation. That is, the dimension of each initial sub-weight may be the same as the dimension of its corresponding scaling sub-weight, and the dimension of the initial weight may be the same as the dimension of the convolution kernel.
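The application does not fix a particular scheme for the global quantization in S104. The sketch below assumes an ordinary 8-bit affine quantization in which one scale and one zero point are shared by the entire scaling weight; this is one common way such a global quantization can be realized, not the method prescribed by the application.

```python
import numpy as np

def global_quantize(scaling_weight: np.ndarray, num_bits: int = 8):
    """Quantize the whole scaling weight with a single shared scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = float(scaling_weight.min()), float(scaling_weight.max())
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = int(round(qmin - lo / scale))
    quantized = np.clip(np.round(scaling_weight / scale) + zero_point, qmin, qmax)
    return quantized.astype(np.uint8), scale, zero_point
```

Because the scale and zero point are global, a single offset number suffices for the whole convolution kernel, which is consistent with the hardware constraint described in the background.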
Compared with the prior art, in which data processing precision suffers from hardware limitations, in the embodiments of the application the data processing of the convolution layer adopts a process that approximates quantizing the weights separately according to the output channels before carrying out the corresponding convolution. Specifically, the initial weight is split according to the output channels, each of the split initial sub-weights is processed, the scaling sub-weights obtained from the processing are merged, and the scaling weight obtained from the merging is globally quantized and applied in the data processing of the convolution layer. The originally limited hardware therefore does not need to be replaced, and the obtained result is close to the result of processing after quantizing the weights separately according to the output channels. Thus, data processing precision can be improved while fixed-point operations are used in place of floating-point operations.
In the process of scaling each initial sub-weight, a corresponding scaling factor can be obtained for each initial sub-weight, and the initial sub-weight is then scaled according to that scaling factor, so that the resulting scaling sub-weight is better targeted to the individual sub-weight. Thus, on the basis of the implementation shown in fig. 1, the implementation shown in fig. 2 is also possible, wherein S102, scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights, may be implemented as S201 and S202.
S201, obtaining a scaling factor of each initial sub-weight.
S202, scaling each initial sub-weight according to a corresponding scaling factor to obtain scaling sub-weights corresponding to each initial sub-weight.
The scaling factor serves as the parameter for scaling an initial sub-weight. In one implementation of the embodiment of the present application, the way the scaling factor is calculated may be determined according to the magnitude relationship between the elements of the initial weight and the elements of the initial sub-weight, among other factors. Thus, on the basis of the implementation shown in fig. 2, the implementation shown in fig. 3 is also possible, wherein S201, obtaining the scaling factor of each initial sub-weight, may be implemented as S301 together with multiple groups of S302 to S304; i.e., S302 to S304 are performed for each initial sub-weight to obtain its scaling factor.
S301, acquiring a first maximum value and a first minimum value in each element of the initial weight.
S302, obtaining a second maximum value and a second minimum value in each element of the initial sub-weights.
S303, determining the ratio of the first maximum value to the second maximum value as a first ratio, and determining the ratio of the first minimum value to the second minimum value as a second ratio.
S304, determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
The implementation shown in fig. 4 may also be built on the basis of the implementation shown in fig. 3. Wherein S304, determining the scaling factor to be the first ratio or the second ratio according to the type of the first ratio and/or the second ratio, may be implemented as S401 or S402.
S401, if at least one of the first ratio and the second ratio is negative, determining the scaling factor to be the maximum of the first ratio and the second ratio.
S402, if neither the first ratio nor the second ratio is negative, determining the scaling factor to be the minimum of the first ratio and the second ratio.
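The selection rule of S301 to S304 together with S401 and S402 can be summarized in a short sketch. It assumes both weights are NumPy arrays, that plain min/max is used to obtain the extrema (an algorithm such as KLD could be substituted, as noted below), and that the sub-weight extrema are non-zero; the function name scaling_factor is illustrative.

```python
import numpy as np

def scaling_factor(initial_weight: np.ndarray, sub_weight: np.ndarray) -> float:
    """Compute the scaling factor of one initial sub-weight (S301 to S304, S401/S402)."""
    first_max, first_min = initial_weight.max(), initial_weight.min()  # U and L
    second_max, second_min = sub_weight.max(), sub_weight.min()        # U1 and L1
    first_ratio = first_max / second_max     # S1 = U / U1
    second_ratio = first_min / second_min    # S2 = L / L1
    if first_ratio < 0 or second_ratio < 0:
        # S401: at least one ratio is negative, so take the maximum of the two.
        return float(max(first_ratio, second_ratio))
    # S402: neither ratio is negative, so take the minimum of the two.
    return float(min(first_ratio, second_ratio))
```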
The technical scheme provided by the embodiment of the application is explained below with reference to specific examples.
The number C of output channels of the initial weight W1 is acquired; the initial weight W1 is split into C initial sub-weights; each initial sub-weight W2_i is scaled according to the scaling factor corresponding to that initial sub-weight to obtain the corresponding scaling sub-weight W3_i; and the scaling sub-weights are merged to obtain the scaling weight W4 used for data processing. Here, i is an integer greater than or equal to 1 and less than or equal to C, and C is an integer greater than 1.
Taking the derivation of the scaling sub-weight W3_1 from the initial sub-weight W2_1 as an example: the minimum value L (i.e., the first minimum value) and the maximum value U (i.e., the first maximum value) among the elements of the initial weight are determined in advance, i.e., L = min(W1) and U = max(W1). The minimum value L1 (i.e., the second minimum value) and the maximum value U1 (i.e., the second maximum value) of the initial sub-weight W2_1 to be scaled are found, i.e., L1 = min(W2_1) and U1 = max(W2_1). The weight scaling parameters S1 (i.e., the first ratio) and S2 (i.e., the second ratio) are calculated, where S1 = U/U1 and S2 = L/L1. If at least one of S1 and S2 is negative, the weight scaling factor S is the larger of the two weight scaling parameters, i.e., S = max(S1, S2); otherwise, the weight scaling factor S is the smaller of the two, i.e., S = min(S1, S2). After the weight scaling factor is obtained, the scaling sub-weight W3_1 = S * W2_1 is computed. In the same manner, W3_1, W3_2, ..., W3_C are obtained respectively, and the scaling sub-weights are merged to obtain the scaling weight W4. In the above process, the multiplication factor M1 corresponding to the initial sub-weight W2_1 is the reciprocal of the weight scaling factor S, i.e., M1 = 1/S.
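Combining the sketches above gives a hypothetical end-to-end run over a small random weight. The shapes and values are invented for illustration, and split_by_output_channel, scaling_factor, merge_sub_weights and global_quantize are the assumed helpers sketched earlier, not functions defined by the application.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4, 3, 3)).astype(np.float32)  # initial weight, C = 3

sub_weights = split_by_output_channel(W1)         # C initial sub-weights W2_i
scaled, multiplication_factors = [], []
for W2_i in sub_weights:
    S = scaling_factor(W1, W2_i)                  # weight scaling factor for W2_i
    scaled.append(S * W2_i)                       # scaling sub-weight W3_i = S * W2_i
    multiplication_factors.append(1.0 / S)        # multiplication factor M_i = 1 / S

W4 = merge_sub_weights(scaled)                    # scaling weight W4
kernel, scale, zero_point = global_quantize(W4)   # one global quantization of W4
```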
It should be noted that, in actual data processing, the min and max functions may include, but are not limited to, the conventional way of taking the minimum and maximum values; a corresponding algorithm, such as KLD, may also be used to obtain these parameters, and the specific implementation is not limited here.
The embodiment of the application provides data processing equipment based on a convolutional neural network. As shown in fig. 5, the convolutional neural network-based data processing device 50 may include:
An acquisition module 51, configured to acquire the number of output channels.
A processing module 52, configured to split the initial weight according to the number of output channels acquired by the acquisition module 51, to obtain a plurality of initial sub-weights corresponding to the number of output channels.
The processing module 52 is further configured to scale the plurality of initial sub-weights respectively to obtain a plurality of scaling sub-weights.
The processing module 52 is further configured to merge the plurality of scaling sub-weights to obtain a scaling weight.
The processing module 52 is further configured to globally quantize the scaling weight, so that the globally quantized scaling weight is applied as a convolution kernel to a convolution layer of the convolutional neural network for data processing.
In one implementation, the processing module 52 is further configured to obtain a scaling factor for each initial sub-weight; and scaling each initial sub-weight according to a corresponding scaling factor to obtain scaling sub-weights corresponding to each initial sub-weight.
In one implementation, the acquisition module 51 is further configured to acquire a first maximum value and a first minimum value among the elements of the initial weight.
The following steps are performed for each initial sub-weight to obtain a scaling factor for each initial sub-weight:
The acquisition module 51 is further configured to acquire a second maximum value and a second minimum value among the elements of the initial sub-weight.
The processing module 52 is further configured to determine a ratio of the first maximum value to the second maximum value as a first ratio, and determine a ratio of the first minimum value to the second minimum value as a second ratio. The processing module 52 is further configured to determine the scaling factor to be the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
The processing module 52 is further configured to determine the scaling factor to be the maximum of the first ratio and the second ratio if at least one of them is negative, and to determine the scaling factor to be the minimum of the first ratio and the second ratio if neither of them is negative.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In one implementation, the convolutional neural network-based data processing device 50 may further include at least one of a communication module 53, a storage module 54, and a display module 55.
The communication module 53 may be configured to implement data interaction between the above modules and/or to support data interaction between the convolutional neural network-based data processing device 50 and devices such as a server or other processing device. The storage module 54 is configured to store the content required by the above modules to realize their respective functions. The display module 55 may be used to display the progress of data processing, the operating status of the convolutional neural network-based data processing device 50, and the like. The embodiment of the present application does not limit the content, format, etc. stored in the storage module.
In the embodiment of the present application, the acquisition module 51 and the communication module 53 may be implemented as communication interfaces, the processing module 52 may be implemented as a processor and/or a controller, the storage module 54 may be implemented as a memory, and the display module 55 may be implemented as a display.
Fig. 6 is a schematic structural diagram of another data processing device based on a convolutional neural network according to an embodiment of the present application. The convolutional neural network-based data processing device 60 may include a communication interface 61 and a processor 62. In one implementation, the device 60 may also include one or more of a memory 63 and a display 64. The communication interface 61, processor 62, memory 63, and display 64 may be connected via a bus 65. For the functions of the foregoing components, refer to the descriptions of the functions of the foregoing modules, which are not repeated here.
It should be noted that, referring to fig. 5 and fig. 6, the data processing apparatus based on the convolutional neural network according to the embodiment of the present application may include more or less modules and components than those shown in the drawings, which are not limited herein.
The application provides a data processing device based on a convolutional neural network, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the computer program to implement the method of any one of the above possible implementations.
The application provides a computer readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements the method of any of the various possible implementations described above.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the device embodiment is described relatively simply because it is substantially similar to the method embodiment; for the relevant parts, refer to the description of the method embodiment.
Those skilled in the art will further appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the illustrative units and steps have been described above generally in terms of their functions. Whether such functions are implemented as hardware or software depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be placed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (5)

1. A data processing method based on a convolutional neural network, the method comprising:
acquiring the number of output channels of a pre-trained image classification model, and segmenting the initial weights according to the number of the output channels to obtain a plurality of initial sub-weights corresponding to the number of the output channels;
for each initial sub-weight, taking the ratio of the maximum value in each element of the initial weight to the maximum value in each element of the initial sub-weight as a first ratio, and taking the ratio of the minimum value in each element of the initial weight to the minimum value in each element of the initial sub-weight as a second ratio; if the first ratio and/or the second ratio are negative numbers, determining that the scaling factor is the maximum value of the first ratio and the second ratio; if the first ratio and the second ratio are not negative numbers, determining that the scaling factor is the minimum value of the first ratio and the second ratio;
scaling each initial sub-weight according to a corresponding scaling factor to obtain a plurality of scaling sub-weights; combining the scaling sub-weights to obtain scaling weights;
and carrying out global quantization on the scaling weight, and carrying out data processing by using the globally quantized scaling weight as a convolution kernel to be applied to a convolution layer of the image classification model.
2. The method of claim 1, wherein the dimension of an initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, the dimension of the initial weight being the same as the dimension of the convolution kernel.
3. A convolutional neural network-based data processing device, the device comprising:
The acquisition module is used for acquiring the number of output channels of the pre-trained image classification model;
The processing module is used for segmenting the initial weights according to the number of the output channels acquired by the acquisition module to obtain a plurality of initial sub-weights corresponding to the number of the output channels;
The processing module is further configured to, for each initial sub-weight, use a ratio of a maximum value in each element of the initial weight to a maximum value in each element of the initial sub-weight as a first ratio, and use a ratio of a minimum value in each element of the initial weight to a minimum value in each element of the initial sub-weight as a second ratio; if the first ratio and/or the second ratio are negative numbers, determining that the scaling factor is the maximum value of the first ratio and the second ratio; if the first ratio and the second ratio are not negative numbers, determining that the scaling factor is the minimum value of the first ratio and the second ratio;
the processing module is further configured to scale each initial sub-weight according to a corresponding scaling factor to obtain a plurality of scaled sub-weights; combining the scaling sub-weights to obtain scaling weights;
The processing module is further configured to globally quantize the scaling weight, so that the globally quantized scaling weight is used as a convolution kernel to be applied to a convolution layer of the image classification model, and data processing is performed.
4. A device according to claim 3, characterized in that the dimension of the initial sub-weight is the same as the dimension of its corresponding scaling sub-weight, said dimension of the initial weight being the same as the dimension of the convolution kernel.
5. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of claim 1 or 2.
CN202010237669.8A 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network Active CN111461302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010237669.8A CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010237669.8A CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111461302A CN111461302A (en) 2020-07-28
CN111461302B true CN111461302B (en) 2024-04-19

Family

ID=71681615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010237669.8A Active CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111461302B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381205A (en) * 2020-09-29 2021-02-19 北京清微智能科技有限公司 Neural network low bit quantization method
CN113780523B (en) * 2021-08-27 2024-03-29 深圳云天励飞技术股份有限公司 Image processing method, device, terminal equipment and storage medium
CN114298280A (en) * 2021-12-29 2022-04-08 杭州海康威视数字技术股份有限公司 Data processing method, network training method, electronic device and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108053028A (en) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 Data fixed point processing method, device, electronic equipment and computer storage media
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
CN109902803A (en) * 2019-01-31 2019-06-18 东软睿驰汽车技术(沈阳)有限公司 A kind of method and system of neural network parameter quantization
CN110222821A (en) * 2019-05-30 2019-09-10 浙江大学 Convolutional neural networks low-bit width quantization method based on weight distribution
CN110322008A (en) * 2019-07-10 2019-10-11 杭州嘉楠耘智信息科技有限公司 Residual convolution neural network-based quantization processing method and device
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
KR20190130443A (en) * 2018-05-14 2019-11-22 삼성전자주식회사 Method and apparatus for quantization of neural network
CN110826685A (en) * 2018-08-08 2020-02-21 华为技术有限公司 Method and device for convolution calculation of neural network

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150372805A1 (en) * 2014-06-23 2015-12-24 Qualcomm Incorporated Asynchronous pulse modulation for threshold-based signal coding
US10878273B2 (en) * 2017-07-06 2020-12-29 Texas Instruments Incorporated Dynamic quantization for deep neural network inference system and method
US11755901B2 (en) * 2017-12-28 2023-09-12 Intel Corporation Dynamic quantization of neural networks
US10678508B2 (en) * 2018-03-23 2020-06-09 Amazon Technologies, Inc. Accelerated quantized multiply-and-add operations

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108053028A (en) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 Data fixed point processing method, device, electronic equipment and computer storage media
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
KR20190130443A (en) * 2018-05-14 2019-11-22 삼성전자주식회사 Method and apparatus for quantization of neural network
CN110826685A (en) * 2018-08-08 2020-02-21 华为技术有限公司 Method and device for convolution calculation of neural network
CN109902803A (en) * 2019-01-31 2019-06-18 东软睿驰汽车技术(沈阳)有限公司 A kind of method and system of neural network parameter quantization
CN110222821A (en) * 2019-05-30 2019-09-10 浙江大学 Convolutional neural networks low-bit width quantization method based on weight distribution
CN110322008A (en) * 2019-07-10 2019-10-11 杭州嘉楠耘智信息科技有限公司 Residual convolution neural network-based quantization processing method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
An improved BP neural network algorithm and its application; Zhang Yueqin et al.; Computer Technology and Development; 2012-08-10; Vol. 22, No. 08; pp. 163-166 *
Research on acceleration and compression of deep neural networks based on bit quantization; Mou Shuai; China Masters' Theses Full-text Database (Information Science and Technology); 2018-06-15; No. 06; pp. I138-1290 *
Super-resolution reconstruction of vehicle-mounted images based on weight quantization and information compression; Xu Dezhi et al.; Journal of Computer Applications; 2019-08-30; Vol. 39, No. 12; pp. 3644-3649 *

Also Published As

Publication number Publication date
CN111461302A (en) 2020-07-28

Similar Documents

Publication Publication Date Title
CN111461302B (en) Data processing method, device and storage medium based on convolutional neural network
US20240104378A1 (en) Dynamic quantization of neural networks
CN110929865B (en) Network quantification method, service processing method and related product
CN109002889B (en) Adaptive iterative convolution neural network model compression method
WO2019238029A1 (en) Convolutional neural network system, and method for quantifying convolutional neural network
CN110969251B (en) Neural network model quantification method and device based on label-free data
KR20180073118A (en) Convolutional neural network processing method and apparatus
TW202119293A (en) Method and system of quantizing artificial neural network and arti ficial neural network apparatus
US10863206B2 (en) Content-weighted deep residual learning for video in-loop filtering
TWI744724B (en) Method of processing convolution neural network
CN109978144B (en) Model compression method and system
WO2019001323A1 (en) Signal processing system and method
CN110781686A (en) Statement similarity calculation method and device and computer equipment
JP2016218513A (en) Neural network and computer program therefor
US20220004849A1 (en) Image processing neural networks with dynamic filter activation
JP2022512211A (en) Image processing methods, equipment, in-vehicle computing platforms, electronic devices and systems
CN114781618A (en) Neural network quantization processing method, device, equipment and readable storage medium
CN114861907A (en) Data calculation method, device, storage medium and equipment
CN112966592A (en) Hand key point detection method, device, equipment and medium
CN112418388A (en) Method and device for realizing deep convolutional neural network processing
CN113673532B (en) Target detection method and device based on quantitative model
CN111614358B (en) Feature extraction method, system, equipment and storage medium based on multichannel quantization
CN113313253A (en) Neural network compression method, data processing device and computer equipment
CN114139678A (en) Convolutional neural network quantization method and device, electronic equipment and storage medium
CN113159297A (en) Neural network compression method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201211

Address after: Room 206, 2 / F, building C, phase I, Zhongguancun Software Park, No. 8, Dongbei Wangxi Road, Haidian District, Beijing 100094

Applicant after: Canaan Bright Sight Co.,Ltd.

Address before: 310000 Room 1203, 12/F, Building 4, No. 9, Jiuhuan Road, Jianggan District, Hangzhou City, Zhejiang Province

Applicant before: Hangzhou Canaan Creative Information Technology Ltd.

GR01 Patent grant