CN112183711B - Calculation method and system of convolutional neural network using pixel channel scrambling - Google Patents

Calculation method and system of convolutional neural network using pixel channel scrambling

Info

Publication number
CN112183711B
CN112183711B
Authority
CN
China
Prior art keywords
values
value
convolution
scrambling
pixel
Prior art date
Legal status
Active
Application number
CN201910586166.9A
Other languages
Chinese (zh)
Other versions
CN112183711A (en
Inventor
吴俊樟
陈世泽
Current Assignee
Realtek Semiconductor Corp
Original Assignee
Realtek Semiconductor Corp
Priority date
Filing date
Publication date
Application filed by Realtek Semiconductor Corp filed Critical Realtek Semiconductor Corp
Priority to CN201910586166.9A priority Critical patent/CN112183711B/en
Publication of CN112183711A publication Critical patent/CN112183711A/en
Application granted granted Critical
Publication of CN112183711B publication Critical patent/CN112183711B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

In the calculation method, a computing system receives original input values. Before the convolution operation is carried out, pixel scrambling is performed on the original input values to separate them into a plurality of groups of values, reducing the dimensionality of each group. Channel scrambling is then performed on the groups of values to select the values that will participate in the convolution operation, forming a plurality of groups of new input values; discarding the unselected values effectively reduces the dimensionality of the input values. Convolution kernels are then set, and a multiply-accumulator performs the convolution operation on the convolution kernels and the groups of new input values to form a plurality of groups of output values.

Description

Calculation method and system of convolutional neural network using pixel channel scrambling
Technical Field
The present application relates to a data processing technology using a convolutional neural network, and more particularly, to a calculation method and system for a convolutional neural network that reduce the amount of computation and the storage space through the pre-operations of pixel scrambling and channel scrambling while maintaining recognition accuracy.
Background
In the field of artificial intelligence (AI), machine learning techniques are widely applied. Within machine learning, the convolutional neural network (CNN) is a feed-forward neural network that is particularly suited to image processing, including image recognition, object detection, and image segmentation.
Convolutional neural network models and algorithms have advanced considerably in recent years. However, although convolutional neural networks achieve high accuracy in image feature extraction and recognition, their large amount of computation and layer-by-layer operation make them difficult to implement in hardware.
In recent years, various studies have proposed neural networks suited to hardware computation, such as the depth-wise separable convolution of MobileNet and the shift convolution, with the common goal of reducing the amount of computation and the storage space of a model while maintaining the original accuracy.
Because the amount of computation of a model based on a convolutional neural network is very large, the prior art typically performs the operation on a cloud server or a host computer; for example, when applied to an artificial-intelligence Internet-of-Things (AIoT) product, the image data can be transmitted to a cloud server for computation so as to cope with the large amount of computation.
To maintain accuracy while reducing the number of model parameters and the amount of computation, prior art such as SqueezeNet (2016) keeps the convolution operation unchanged but decomposes the original larger convolution kernel into a plurality of modules to reduce parameter storage. Prior art such as MobileNet v1 (2017) and MobileNet v2 (2018) approximates the original k×k convolution with a depth-wise separable convolution module, which is a depth-wise convolution followed by a point-wise convolution. Further, prior art such as ShiftNet (2018) replaces the depth-wise convolution with a shift convolution, reducing the parameter storage and the amount of convolution computation even further.
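To make the scale of these savings concrete, the following is a minimal sketch, not taken from the cited works, that counts multiply-accumulate operations for a standard k×k convolution versus a depth-wise separable convolution (a depth-wise convolution followed by a point-wise convolution); all sizes are hypothetical and chosen only for illustration.

```python
# Rough multiply-accumulate (MAC) comparison: standard k x k convolution vs.
# depth-wise separable convolution.  All sizes are hypothetical.
H, W = 56, 56                 # feature-map height and width (assumed)
C_in, C_out, k = 64, 128, 3   # channel counts and kernel size (assumed)

standard_macs  = H * W * C_out * C_in * k * k
depthwise_macs = H * W * C_in * k * k        # one k x k filter per input channel
pointwise_macs = H * W * C_out * C_in        # 1 x 1 convolution across channels
separable_macs = depthwise_macs + pointwise_macs

print(standard_macs / separable_macs)        # roughly a k*k-fold reduction
```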
Disclosure of Invention
The present application discloses a calculation method and system of a convolutional neural network using pixel channel scrambling. Before the convolution operation is executed, pre-operations of pixel scrambling and channel scrambling are performed on the input values, so that the height, width, and depth dimensions of the input values can be reduced, thereby lowering the amount of computation and the memory usage of the system while the parameter amounts remain the same.
According to an embodiment, in the method for calculating a convolutional neural network using pixel channel scrambling, a computing system receives an original input value, which may be image data having a height, a width, and a first number of depth values. A processor of the computing system performs pixel scrambling on the original input value, separating it into a plurality of sets of values to reduce the dimensions of each set of values. Channel scrambling is then performed on the sets of values: the values participating in the convolution operation are selected from each set to form a plurality of sets of new input values, which are temporarily stored in a memory.
Next, convolution kernels corresponding to the sets of new input values are set; according to an embodiment, a second number of convolution kernels may be included, each convolution kernel implementing a filter. Then, a convolution operation is performed on the second number of convolution kernels and the sets of new input values by a multiply-accumulator in the processor to form a plurality of sets of output values having the second number.
When the original input value has values of a first number of depths, the pixel scrambling and channel scrambling form a plurality of sets of new input values whose depth is smaller than the first number.
Preferably, the original input values are image data, and after the computing system performs the convolution operation to extract image features, a plurality of sets of feature maps with the second number of depths are formed. An image feature map with the second number of depths can then be formed by performing an inverse pixel scrambling operation on the plurality of sets of output values having the second number.
Preferably, the generated image feature map is used to identify the original input value.
Preferably, the height, width, and depth of each convolution kernel performing the convolution operation may each be any positive integer.
According to an embodiment, a system for performing the calculation method of the convolutional neural network using pixel channel scrambling comprises a processor, and a communication circuit and a memory electrically connected to the processor, by which the calculation method is executed.
Further, the computing system may form a cloud system that provides a service for performing image recognition using the calculation method of the convolutional neural network with pixel channel scrambling.
Furthermore, the calculation method can also be implemented as a stand-alone circuit system, suitable for a specific system to execute image recognition using the convolutional neural network with pixel channel scrambling.
For a further understanding of the nature and the technical aspects of the present application, reference should be made to the following detailed description of the application and the accompanying drawings, which are provided for purposes of reference only and are not intended to limit the application.
Drawings
FIGS. 1(A)-1(C) are schematic diagrams of a point-wise convolution operation;
FIG. 2 is a schematic diagram of a convolution operation between a filter and one position in the input values;
FIG. 3 is a schematic diagram of an exemplary calculation of a convolutional neural network using pixel channel scrambling;
FIG. 4 is a flowchart of an embodiment of the calculation method of a convolutional neural network using pixel channel scrambling;
FIG. 5 is a schematic diagram of an embodiment of a computing system implementing the calculation method of a convolutional neural network using pixel channel scrambling;
FIGS. 6(A)-6(C) are schematic diagrams of an embodiment of performing pixel scrambling in the calculation method of a convolutional neural network using pixel channel scrambling;
FIGS. 7(A)-7(C) are schematic diagrams of an embodiment of performing channel scrambling in the calculation method of a convolutional neural network using pixel channel scrambling;
FIGS. 8(A)-8(C) are schematic diagrams of an embodiment of performing the convolution operation in the calculation method of a convolutional neural network using pixel channel scrambling;
FIGS. 9(A) and 9(B) are schematic diagrams of an embodiment of performing inverse pixel scrambling in the calculation method of a convolutional neural network using pixel channel scrambling.
Symbol description
First number C1
Second number C2
Third number C1'
Filter 20
Input value 22
Output value 24
Filter 30
Filter numbers 301-316
Input values a, b, c, d
Values a1 to a4, b1 to b4, c1 to c4, d1 to d4
Computing system 50
Processor 501
Communication circuit 505
Memory 503
Network 52
Terminal 511,512,513
First group of input values I_A
Second group of input values I_B
Third group of input values I_C
Fourth group of input values I_D
First group of new input values I_A'
Second group of new input values I_B'
Third group of new input values I_C'
Fourth group of new input values I_D'
First group of output values O_A
Second set of output values O_B
Third set of output values O_C
Fourth set of output values O_D
Calculation flow of convolutional neural network using pixel channel scrambling in steps S401-S413
S401 obtaining an original input value (H × W × C1)
S403 performing pixel scrambling (H/2 × W/2 × C1)
S405 performing channel scrambling (I_A, I_B, I_C, I_D)
S407 discarding unselected values to form a plurality of sets of new input values (I_A', I_B', I_C', I_D')
S409 setting convolution kernels (C2 filters)
S411 performing the convolution operation (H/2 × W/2 × C2)
S413 performing inverse pixel scrambling (H × W × C2)
Detailed Description
The following embodiments of the present application are described in terms of specific examples, and those skilled in the art will appreciate the advantages and effects of the present application from the disclosure herein. The application is capable of other and different embodiments and its several details are capable of modifications and various other uses and applications, all of which are obvious from the description, without departing from the spirit of the application. It is to be noted that the drawings of the present application are merely schematic illustrations, and are not drawn to actual dimensions. The following embodiments will further illustrate the related art content of the present application in detail, but the disclosure is not intended to limit the scope of the present application.
It will be understood that, although the terms "first," "second," "third," etc. may be used herein to describe various elements or signals, these elements or signals should not be limited by these terms. These terms are used primarily to distinguish one element from another element or signal from another signal. In addition, the term "or" as used herein shall include any one or combination of more of the associated listed items as the case may be.
The convolutional neural network (CNN) has achieved great success in image recognition applications, and image processing methods based on convolutional neural networks have been developed one after another. However, in a fully-connected neural network, every neuron in one layer is connected to every neuron in the adjacent layer; when the feature dimension of the input layer becomes very high, the number of parameters the network must train becomes very large and so does the amount of computation. The development of convolutional neural networks has therefore proceeded along two directions: further improving accuracy, and compressing and accelerating the operation of the network model.
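As a rough, purely illustrative comparison (the sizes below are assumptions, not values from this application), the parameter count of a fully-connected layer grows with the whole input resolution, while a bank of convolution kernels does not:

```python
# Parameter counts: fully-connected layer vs. a small bank of 3 x 3 kernels.
# Sizes are hypothetical and used only to show the order-of-magnitude gap.
H, W, C = 224, 224, 3
hidden = 1000
fc_params   = H * W * C * hidden    # every input value connected to every neuron
conv_params = 3 * 3 * C * 64        # 64 kernels of size 3 x 3 x C
print(fc_params, conv_params)       # ~150 million vs. ~1.7 thousand parameters
```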
Because the amount of computation of a model based on a convolutional neural network is very large, the disclosed calculation method and system using a pixel-channel shuffle convolutional neural network aim to maintain accuracy while reducing the number of model parameters and the amount of computation. The method keeps the depth-wise convolution of the depth-wise separable convolution module and replaces the existing point-wise convolution with the disclosed convolution with pixel channel scrambling, so that the amount of computation can be reduced. For example, in experiments under a specific environment, the amount of computation and the memory usage can be reduced to one quarter of those of the conventional point-wise convolution.
Taking image recognition and detection as an example, the embodiment of the calculation method using the convolutional neural network with pixel channel scrambling can reduce the channel computation of the feature maps in the convolutional neural network (CNN), combining pixel scrambling and channel scrambling to reduce hardware computation, including the size of the memory.
A point-wise convolution operation is described with reference to FIGS. 1(A)-1(C).
FIG. 1(A) shows the input layer of the point-wise convolution operation, drawn as a cube of height (H), width (W), and depth (a first number C1). The first-layer input values at the positions labeled a, b, c, and d are indicated, and the depth (C1) represents the number of channels (the first number C1) of the input layer.
FIG. 1(B) schematically shows 1×1 filters implemented by convolution kernels; in this example a second number C2 of filters is shown. In the convolution operation, each filter scans over the input layer of FIG. 1(A) according to a stride setting, multiplying and accumulating at each position to obtain the output values shown in FIG. 1(C).
The output layer shown in FIG. 1(C) is a cube of height (H), width (W), and depth (C2); the depth (C2) equals the number of filters (the second number C2), i.e. the number of feature maps generated, so H×W×C2 represents the size of the output values.
The convolution kernel realizes a filtering mechanism. As shown in FIG. 1(B), each parameter in the convolution kernel is equivalent to a weight in the neural network and is connected to a corresponding local pixel. The moving-window scan multiplies each parameter of the convolution kernel by the corresponding local pixel value one by one and sums the products to obtain a result on the convolution layer. Convolution kernels can thus extract features from the image and perform feature mapping.
For example, when the input value (input data) is convolved with one filter as shown in FIG. 1(B), the filter size is 1×1 and its depth is 16 (the first number C1); after the input value is multiplied by the filter (1×1×16), the output feature map is an output value of size H×W×1. Similarly, when C2 filters are provided (FIG. 1(B)), C2 feature maps are generated, and the feature maps are combined into the cube shown in FIG. 1(C). That is, the input value is convolved with the filters to form the output layer of FIG. 1(C), whose combined size is H×W×C2, i.e. the size of the output value (output data).
In this convolution operation there is a second number C2 of filters (convolution kernels, FIG. 1(B)), each having a first number C1 of values (e.g. 16). At every position in the input value, each filter is multiplied with the same first number C1 of values (FIG. 1(A)) and the products are summed; the second number C2 of filters thus forms a second number C2 of feature maps through the convolution operation, which are combined into the output value of size H×W×C2 shown in FIG. 1(C).
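The point-wise convolution of FIGS. 1(A)-1(C) can be sketched as follows; the sizes are illustrative assumptions and the weights are random placeholders, the point being only the H×W×C1 to H×W×C2 mapping:

```python
import numpy as np

# Point-wise (1 x 1) convolution as in FIGS. 1(A)-1(C): every spatial position
# of the H x W x C1 input is multiplied by each of the C2 filters (each of
# depth C1) and summed along the depth axis.
H, W, C1, C2 = 4, 4, 16, 8            # illustrative sizes only

x = np.random.randn(H, W, C1)         # input value (FIG. 1(A))
filters = np.random.randn(C2, C1)     # C2 filters of size 1 x 1 x C1 (FIG. 1(B))

# output[h, w, n] = sum over c of x[h, w, c] * filters[n, c]
output = np.einsum('hwc,nc->hwn', x, filters)
assert output.shape == (H, W, C2)     # H x W x C2 output value (FIG. 1(C))
```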
FIG. 2 shows a schematic diagram of the convolution operation between a filter and one position of the input values. The filter 20 has a first number C1 of values, which may be any number as required and is 16 in this example, and is convolved with the input value 22 (the position a in FIG. 1(A)), which also has a first number C1 (16) of values. According to an embodiment of the calculation method using the convolutional neural network with pixel channel scrambling, the concept is that each position in the input value 22 is not multiplied by all of the first number C1 of values in the filter 20, but only according to a specific rule, so that the output value 24 is generated by multiplying different geometric positions in the input value 22 with different values of the filter 20. As a result, the amount of computation can be reduced.
With continued reference to FIG. 3, an example of the calculation using the convolutional neural network with pixel channel scrambling is shown for 2×2 input values, each having a first number C1 of values (16 in this example). Each block is numbered a, b, c, and d in order and may be regarded as a pixel of the input image to be processed by the system. A filter 30 is provided for the convolution operation, with filter numbers 301 to 316 denoting its components; to reduce the amount of computation, the filter 30 sets a multiplication-and-summation rule according to that requirement.
In this example the input values form a 2×2 block of pixels, and 4 consecutive pixels are treated as a group, so the filter 30 can be divided into 4 groups at intervals: filter numbers 301, 305, 309, and 313 form a first group of filters; numbers 302, 306, 310, and 314 a second group; numbers 303, 307, 311, and 315 a third group; and numbers 304, 308, 312, and 316 a fourth group. The groups are convolved in sequence with the input values a, b, c, and d respectively, instead of with all of the input values, so the amount of computation can be reduced. The grouping rule of the filters is stored in a memory of the system.
For example, the system extracts, from the first number C1 (16 in this example) of values of the input value a, the values a1, a2, a3, and a4 according to a rule (e.g. every 4 values) to form a first set of input values (I_A); the selected values are registered in the memory of the system and convolved with the first group of filters (filter numbers 301, 305, 309, and 313). The unselected values of the input value a are discarded, which effectively reduces the amount of computation, here to one quarter of the original. For the first set of input values (I_A), the convolution operation multiplies and accumulates with the filters at the corresponding positions (the first group of filters): the values a1 to a4 are multiplied by the corresponding weights of the number 301 filter and summed to give a first output value; by the number 305 filter to give a second output value; by the number 309 filter to give a third output value; and by the number 313 filter to give a fourth output value. The first, second, third, and fourth output values obtained by the convolution of the input value a form a first group of output values (O_A).
Similarly, the system takes, from the first number C1 (16 in this example) of values of the input value b, the values b1, b2, b3, and b4 according to the rule (e.g. every 4 values) to form a second set of input values (I_B), temporarily stores the selected values in the memory of the system, and convolves them with the second group of filters (filter numbers 302, 306, 310, and 314); the unselected values of the input value b are likewise discarded. The values b1 to b4 multiplied by the corresponding weights of the number 302, 306, 310, and 314 filters and summed give the first, second, third, and fourth output values respectively, which form a second group of output values (O_B).
Likewise, the values c1, c2, c3, and c4 are taken from the input value c to form a third set of input values (I_C), the unselected values are discarded, and the convolution with the third group of filters (filter numbers 303, 307, 311, and 315) yields first to fourth output values forming a third group of output values (O_C).
Finally, the values d1, d2, d3, and d4 are taken from the input value d to form a fourth set of input values (I_D), the unselected values are discarded, and the convolution with the fourth group of filters (filter numbers 304, 308, 312, and 316) yields first to fourth output values forming a fourth group of output values (O_D).
The example of FIG. 3 thus shows that the input values (a, b, c, d) selected according to the specific rule form the first set of input values (I_A), the second set of input values (I_B), the third set of input values (I_C), and the fourth set of input values (I_D); the selected input values are then convolved to form the first group of output values (O_A), the second group of output values (O_B), the third group of output values (O_C), and the fourth group of output values (O_D).
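The selection-and-grouping rule of FIG. 3 can be sketched numerically as follows. This is only one possible reading of the figure, in which filter numbers 301, 305, 309, and 313 are treated as the first group of filters applied to pixel a, numbers 302, 306, 310, and 314 as the second group applied to pixel b, and so on; all weights are random placeholders and the offsets are assumptions.

```python
import numpy as np

# Sketch of the FIG. 3 rule: each pixel a, b, c, d keeps every fourth of its
# 16 channel values (with a different offset per pixel) and is convolved only
# with its own group of four filters.
C1 = 16
pixels = {name: np.random.randn(C1) for name in 'abcd'}   # 16 channel values per pixel
filters = np.random.randn(C1, 4)     # 16 filters (nos. 301-316), 4 weights each

outputs = {}
for g, name in enumerate('abcd'):                 # group g = 0, 1, 2, 3
    selected = pixels[name][g::4]                 # e.g. a1, a2, a3, a4 for pixel a
    group_filters = filters[g::4]                 # e.g. filters 301, 305, 309, 313
    outputs['O_' + name.upper()] = group_filters @ selected   # four output values

print({k: v.shape for k, v in outputs.items()})   # each group yields 4 values
```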
According to the above example, when the convolution operation is performed, not all of the input values are operated on; the values other than those selected by the specific rule to participate in the convolution operation are discarded, so the amount of computation can be effectively reduced.
Following the concept of the above example, the calculation method of a convolutional neural network using pixel channel scrambling of the present application differs from the conventional point-wise convolution operation: it decomposes the point-wise convolution into a sequence of operations including pixel scrambling, channel scrambling, a point-wise convolution operation, and inverse pixel scrambling (inverse pixel shuffle).
As described in the steps of FIG. 4 and the computing system of FIG. 5, the system for executing the calculation method using the convolutional neural network with pixel channel scrambling may be a computing system 50 for image processing. The computing system 50 includes a processor 501, a communication circuit 505, and a memory 503, which are electrically connected. The pixel scrambling, channel scrambling, convolution, and subsequent inverse pixel scrambling steps of the method are performed by the processor 501; in particular, the convolution operation is performed by a multiply-accumulator in the processor 501 that carries out the multiply-add operations. Because the calculation method effectively reduces the amount of computation, the corresponding hardware requirements, such as the multiply-accumulator and the memory, are also effectively reduced.
It should be noted that the computing system 50 may be a general computer system or a cloud system configured to receive image data transmitted by terminals 511, 512, and 513 through a network 52 and to provide a service for performing image recognition using the calculation method of the convolutional neural network with pixel channel scrambling. In another embodiment, the computing system 50 may also be implemented as a stand-alone circuit system, such as an integrated circuit (IC), suitable for a specific system to perform image recognition using the convolutional neural network with pixel channel scrambling.
According to one embodiment, the computing system 50 processes an input image for image recognition, and the convolutional neural network obtains features of the image from its pixels, where the features cover not only each pixel but also the correlation between pixels. The flow of the method is described with reference to the examples shown in FIGS. 6(A)-9(B) and the flowchart of FIG. 4; in particular, these examples explain why the calculation using the convolutional neural network with pixel channel scrambling can reduce the amount of computation while approaching the result of the originally much larger convolution computation.
FIGS. 6(A)-6(C) show an embodiment of performing pixel scrambling in the calculation method using the convolutional neural network with pixel channel scrambling.
FIG. 6(A) shows the original input value represented by a cube of height (H), width (W), and depth (a first number C1, e.g. 16); 4 sets of input values, each having the first number C1 of values, are represented by the input values a, b, c, and d. In step S401 of FIG. 4, the computing system receives an original input value of size H×W×C1, which may be image data having a height, a width, and a depth.
The computing system performs a scrambling operation, pixel scrambling, on the original input values. In step S403 of FIG. 4, the processor of the computing system separates the original input values into a plurality of sets of values as required, so that each set of values has a reduced dimension relative to the original input values: the height and width are reduced, while the depth may remain unchanged. As shown in FIG. 6(B), the original input values are separated into 4 groups, forming 4 cubes of height (H/2), width (W/2), and depth (the first number C1), each with half the original height and width. They can then be represented as in FIG. 6(C), which shows a first set of input values (I_A), a second set of input values (I_B), a third set of input values (I_C), and a fourth set of input values (I_D), each with the first number C1 of feature maps. The data generated in this procedure can be temporarily stored in the memory until the next step fetches it.
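A minimal sketch of the pixel scrambling of step S403 follows, assuming the four groups a, b, c, d correspond to the four positions of each 2×2 pixel block (the exact assignment and the sizes are assumptions for illustration):

```python
import numpy as np

# Pixel scrambling (step S403): split the H x W x C1 original input into four
# H/2 x W/2 x C1 groups, one per position of each 2 x 2 pixel block.
H, W, C1 = 8, 8, 16                  # illustrative sizes only
x = np.random.randn(H, W, C1)        # original input value

I_A = x[0::2, 0::2, :]               # pixels "a": even rows, even columns
I_B = x[0::2, 1::2, :]               # pixels "b": even rows, odd columns
I_C = x[1::2, 0::2, :]               # pixels "c": odd rows, even columns
I_D = x[1::2, 1::2, :]               # pixels "d": odd rows, odd columns

for group in (I_A, I_B, I_C, I_D):
    assert group.shape == (H // 2, W // 2, C1)   # height and width halved
```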
Next, as shown in step S405 of FIG. 4, the computing system performs channel scrambling on the sets of values formed by the pixel scrambling; this part of the calculation using the convolutional neural network with pixel channel scrambling is shown in FIGS. 7(A)-7(C).
FIG. 7(A) shows the input values whose height and width have been halved by the pixel scrambling procedure. The values participating in the convolution operation are then selected from each of the sets of values according to a rule to form a plurality of new input values, which can be temporarily stored in the memory of the system. The figure shows that, in the first set of input values (I_A), one value is selected out of every 4 according to the design of the filters, i.e. the feature maps numbered 4k+1 are taken, where k is 0, 1, 2, 3. Similarly, one value out of every 4 is selected from the second set of input values (I_B), i.e. the feature maps numbered 4k+2; one out of every 4 from the third set of input values (I_C), i.e. feature maps 4k+3; and one out of every 4 from the fourth set of input values (I_D), i.e. feature maps 4k+4, where k is 0, 1, 2, 3.
FIG. 7(B) shows that the values selected from the first set of input values (I_A), the second set of input values (I_B), the third set of input values (I_C), and the fourth set of input values (I_D) are rearranged into the front layers of each set.
FIG. 7(C) shows that the unselected values in the first, second, third, and fourth sets of input values are discarded; each set of input values originally having 16 values (feature maps) is reduced to 4 values, i.e. the number of feature maps is reduced to one quarter after the channel scrambling procedure, forming new first, second, third, and fourth sets of input values (I_A', I_B', I_C', I_D'). In step S407 of FIG. 4, the computing system performs the channel scrambling and discards the unselected values, reducing the depth dimension to a third number C1'. After the pixel scrambling and channel scrambling, sets of new input values whose depth (the third number C1') is smaller than the first number C1 are formed, shown here as one quarter (e.g. 4) of the original depth, to produce the sets of input values (I_A', I_B', I_C', I_D') participating in the convolution operation.
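Continuing the sketch above (the arrays I_A to I_D and the sizes are carried over), the channel scrambling of steps S405 and S407 keeps every fourth feature map per group, with a different starting offset per group, and discards the rest; the 4k+1 to 4k+4 offsets follow FIG. 7:

```python
# Channel scrambling (steps S405/S407): keep every fourth channel per group,
# with a different offset per group, and discard the unselected channels.
I_A_new = I_A[:, :, 0::4]    # feature maps 4k+1 (1-based): 1, 5, 9, 13
I_B_new = I_B[:, :, 1::4]    # feature maps 4k+2:           2, 6, 10, 14
I_C_new = I_C[:, :, 2::4]    # feature maps 4k+3:           3, 7, 11, 15
I_D_new = I_D[:, :, 3::4]    # feature maps 4k+4:           4, 8, 12, 16

C1_prime = C1 // 4           # third number C1' = 4 when C1 = 16
for group in (I_A_new, I_B_new, I_C_new, I_D_new):
    assert group.shape == (H // 2, W // 2, C1_prime)
```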
FIGS. 8(A)-8(C) next show an embodiment of performing the convolution operation in the calculation using the convolutional neural network with pixel channel scrambling.
FIG. 8(A) shows the first set of input values (I_A'), the second set of input values (I_B'), the third set of input values (I_C'), and the fourth set of input values (I_D'), whose depth has been reduced to the third number C1' by the pixel scrambling and channel scrambling. Each set is convolved with the filters (of depth corresponding to the third number C1') implemented by the corresponding set of convolution kernels shown in FIG. 8(B): a convolution kernel (filter) is set for each group of new input values, and a second number (C2) of filters (1×1×4 in this example) are set as required. Each group of filters is derived from the original filters according to the rule, so the convolution kernel depth is only one quarter of the original (the third number C1'). As shown in step S409 of FIG. 4, the corresponding convolution kernels, whose depth is likewise reduced, are set according to the sets of input values (I_A', I_B', I_C', I_D') participating in the convolution operation, realizing the second number C2 of filters.
In step S411 of FIG. 4, the convolution operation is performed by the multiply-accumulator of the processor in the computing system with the convolution kernels set in step S409 and the sets of new input values. As shown in FIG. 8(C), four sets of output values are generated after the convolution operation: the first set of output values (O_A), the second set of output values (O_B), the third set of output values (O_C), and the fourth set of output values (O_D). The height and width of each set of output values are H/2 and W/2 respectively, and the depth is the second number C2 of convolution kernels. The sets of output values are features extracted from the original input values (e.g. image data), i.e. a plurality of feature maps having the second number C2 of depths. These feature maps may likewise be temporarily stored in the memory.
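Continuing the same sketch, steps S409 and S411 can be illustrated as follows: each group of new input values is convolved with its own set of C2 point-wise filters of depth C1'. The filter weights are random placeholders and C2 is an assumed size.

```python
# Convolution (steps S409/S411): each group I_X' is convolved with its own set
# of C2 point-wise filters of depth C1', giving (H/2) x (W/2) x C2 outputs.
C2 = 8                                                   # assumed filter count
new_inputs = {'A': I_A_new, 'B': I_B_new, 'C': I_C_new, 'D': I_D_new}
kernels = {name: np.random.randn(C2, C1_prime) for name in new_inputs}

outputs = {}
for name, group in new_inputs.items():
    # outputs[name][h, w, n] = sum over c of group[h, w, c] * kernels[name][n, c]
    outputs[name] = np.einsum('hwc,nc->hwn', group, kernels[name])
    assert outputs[name].shape == (H // 2, W // 2, C2)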
After the system completes the convolution operation and generates the sets of output values, the next step, shown in FIGS. 9(A) and 9(B), performs inverse pixel scrambling (inverse pixel shuffle), as in step S413 of FIG. 4. FIG. 9(A) shows the first set of output values (O_A), the second set of output values (O_B), the third set of output values (O_C), and the fourth set of output values (O_D) obtained by the system from the memory; each set of output values covers the components of the corresponding position (a, b, c, d) of the original input values, and the sets of output values can be recombined, in the originally designed order a, b, c, d of the input values, into a final output value of height H, width W, and depth equal to the second number C2. That is, an image feature map of the original input image data is obtained by the computing system through the calculation method of the convolutional neural network with pixel channel scrambling, as shown in FIG. 9(B). It should be noted that this image feature map is the image feature extracted from the original input value (image data), i.e. the finally generated image feature map, and can be provided to a specific system for identifying the original image.
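Continuing the sketch, the inverse pixel scrambling of step S413 interleaves the four output groups back into a single H×W×C2 feature map, mirroring the 2×2 positions assumed during pixel scrambling:

```python
# Inverse pixel scrambling (step S413): interleave the four (H/2) x (W/2) x C2
# output groups back into one H x W x C2 image feature map.
feature_map = np.empty((H, W, C2))
feature_map[0::2, 0::2, :] = outputs['A']
feature_map[0::2, 1::2, :] = outputs['B']
feature_map[1::2, 0::2, :] = outputs['C']
feature_map[1::2, 1::2, :] = outputs['D']
assert feature_map.shape == (H, W, C2)
```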
It should be noted that, according to an embodiment of the calculation method using the convolutional neural network with pixel channel scrambling, the input values to which the pixel scrambling is applied can be adjusted as required, and the convolution kernels performing the convolution operation can also be changed as required; their height, width, and depth may each be any positive integer. The sizes of the final output value and the initial input value are the same and the parameter amounts are the same, but the number of multiply-accumulate operations required by the system to perform the multiply-add operations is lower.
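The reduction in multiply-accumulate operations can be checked with a short count, assuming a 4-way split as in the examples above (so that C1' = C1/4 and the height and width are halved):

```python
# MAC count: conventional point-wise convolution vs. the pixel/channel-
# scrambled variant with a 4-way split.  Sizes are the illustrative ones above.
H, W, C1, C2 = 8, 8, 16, 8
pointwise_macs = H * W * C1 * C2                        # conventional 1 x 1 conv
scrambled_macs = 4 * (H // 2) * (W // 2) * (C1 // 4) * C2
print(pointwise_macs // scrambled_macs)                 # 4 -> one quarter of the MACs
```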
In summary, according to the embodiments of the calculation method and system of a convolutional neural network using pixel channel scrambling, in data processing technology using a convolutional neural network, the amount of computation and the storage space are reduced by the pre-operations of pixel scrambling and channel scrambling, while the accuracy of the convolution calculation is still maintained.
The above disclosure is only a preferred embodiment of the present application and is not intended to limit the claims of the present application, so that all equivalent technical changes made by the application of the specification and the drawings of the present application are included in the claims of the present application.

Claims (10)

1. A method of computing a convolutional neural network using pixel channel scrambling, comprising:
receiving an original input value by a computing system, wherein the original input value is a value having a height, a width, and a depth;
performing, by a processor of the computing system, a pixel scrambling on the original input value, and separating the original input value into a plurality of groups of values to reduce dimensions of the groups of values;
executing a channel scrambling on the multiple groups of values by the processor, and respectively selecting the values participating in a convolution operation from the multiple groups of values to form multiple groups of new input values;
setting convolution kernels corresponding to the multiple groups of new input values, wherein the convolution kernels comprise a second number of convolution kernels, and each convolution kernel realizes a filter; and
the convolution operation is performed with the second number of convolution kernels and the plurality of sets of new input values by a multiply-accumulator in the processor to form a plurality of sets of output values having the second number.
2. The method of claim 1, wherein the original input value is image data, and the computing system performs the convolution operation to extract image features so as to form a plurality of sets of feature maps having a second number of depths.
3. The method of claim 2, wherein the plurality of sets of output values having the second number are processed by an inverse pixel scrambling operation to form an image feature map having the second number of depths.
4. The method of claim 1, wherein the new input values are formed by the channel scrambling, and values of the sets of values not selected to participate in the convolution operation are further discarded.
5. The method of claim 1, wherein the original input values have values of a first number of depths, and the pixel scrambling and the channel scrambling form the plurality of new input values of less than the first number of depths.
6. The method of claim 5, wherein the original input value is image data, and the computing system performs the convolution operation to extract image features so as to form a plurality of sets of feature maps having a second number of depths.
7. The method of claim 6, wherein the plurality of sets of output values having the second number are processed by an inverse pixel scrambling operation to form an image feature map having the second number of depths.
8. The method of claim 7, wherein the image feature map is used to identify the original input values.
9. The method of computing a convolutional neural network using pixel channel scrambling of any of claims 1-8, wherein the height, width, and depth of each convolution kernel performing the convolution operation are each any positive integer.
10. A computing system using a convolutional neural network with pixel channel scrambling, comprising:
a processor, and a communication circuit and a memory electrically connected with the processor;
wherein the calculation method of the convolutional neural network using pixel channel scrambling is executed by the processor, and comprises:
receiving an original input value, wherein the original input value is a value with a height, a width and a depth;
performing a pixel scrambling on the original input value, separating the original input value into a plurality of sets of values to reduce the dimension of each set of values;
performing a channel scrambling on the plurality of sets of values, and respectively selecting values participating in a convolution operation from the plurality of sets of values to form a plurality of sets of new input values;
setting convolution kernels corresponding to the multiple groups of new input values, wherein the convolution kernels comprise a second number of convolution kernels, and each convolution kernel realizes a filter; and
the convolution operation is performed with the second number of convolution kernels and the plurality of sets of new input values by a multiply-accumulator in the processor to form a plurality of sets of output values having the second number.
CN201910586166.9A 2019-07-01 2019-07-01 Calculation method and system of convolutional neural network using pixel channel scrambling Active CN112183711B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910586166.9A CN112183711B (en) 2019-07-01 2019-07-01 Calculation method and system of convolutional neural network using pixel channel scrambling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910586166.9A CN112183711B (en) 2019-07-01 2019-07-01 Calculation method and system of convolutional neural network using pixel channel scrambling

Publications (2)

Publication Number Publication Date
CN112183711A CN112183711A (en) 2021-01-05
CN112183711B true CN112183711B (en) 2023-09-12

Family

ID=73915660

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910586166.9A Active CN112183711B (en) 2019-07-01 2019-07-01 Calculation method and system of convolutional neural network using pixel channel scrambling

Country Status (1)

Country Link
CN (1) CN112183711B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809426A (en) * 2014-01-27 2015-07-29 日本电气株式会社 Convolutional neural network training method and target identification method and device
WO2018052586A1 (en) * 2016-09-14 2018-03-22 Konica Minolta Laboratory U.S.A., Inc. Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
CN108460742A (en) * 2018-03-14 2018-08-28 日照职业技术学院 A kind of image recovery method based on BP neural network
CN109344883A (en) * 2018-09-13 2019-02-15 西京学院 Fruit tree diseases and pests recognition methods under a kind of complex background based on empty convolution
CN109360192A (en) * 2018-09-25 2019-02-19 郑州大学西亚斯国际学院 A kind of Internet of Things field crop leaf diseases detection method based on full convolutional network
EP3499428A1 (en) * 2017-12-18 2019-06-19 Nanjing Horizon Robotics Technology Co., Ltd. Method and electronic device for convolution calculation in neutral network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2017338783B2 (en) * 2016-10-04 2022-02-10 Magic Leap, Inc. Efficient data layouts for convolutional neural networks

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809426A (en) * 2014-01-27 2015-07-29 日本电气株式会社 Convolutional neural network training method and target identification method and device
WO2018052586A1 (en) * 2016-09-14 2018-03-22 Konica Minolta Laboratory U.S.A., Inc. Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
EP3499428A1 (en) * 2017-12-18 2019-06-19 Nanjing Horizon Robotics Technology Co., Ltd. Method and electronic device for convolution calculation in neutral network
CN108460742A (en) * 2018-03-14 2018-08-28 日照职业技术学院 A kind of image recovery method based on BP neural network
CN109344883A (en) * 2018-09-13 2019-02-15 西京学院 Fruit tree diseases and pests recognition methods under a kind of complex background based on empty convolution
CN109360192A (en) * 2018-09-25 2019-02-19 郑州大学西亚斯国际学院 A kind of Internet of Things field crop leaf diseases detection method based on full convolutional network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on convolutional neural networks based on sparse convolution kernels and their applications; 叶会娟; 刘向阳; Information Technology (10); full text *

Also Published As

Publication number Publication date
CN112183711A (en) 2021-01-05

Similar Documents

Publication Publication Date Title
Graham et al. Levit: a vision transformer in convnet's clothing for faster inference
CN109522874B (en) Human body action recognition method and device, terminal equipment and storage medium
CN110188239B (en) Double-current video classification method and device based on cross-mode attention mechanism
CN110050267B (en) System and method for data management
US11907826B2 (en) Electronic apparatus for operating machine learning and method for operating machine learning
TWI719512B (en) Method and system for algorithm using pixel-channel shuffle convolution neural network
WO2020186703A1 (en) Convolutional neural network-based image processing method and image processing apparatus
WO2018052586A1 (en) Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
CN110033003A (en) Image partition method and image processing apparatus
CN109784372B (en) Target classification method based on convolutional neural network
WO2023146523A1 (en) Event-based extraction of features in a convolutional spiking neural network
CN109964250A (en) For analyzing the method and system of the image in convolutional neural networks
Chang et al. An efficient implementation of 2D convolution in CNN
CN112561027A (en) Neural network architecture searching method, image processing method, device and storage medium
CN112613581A (en) Image recognition method, system, computer equipment and storage medium
CN110059815B (en) Artificial intelligence reasoning computing equipment
CN109996023A (en) Image processing method and device
CN111062854B (en) Method, device, terminal and storage medium for detecting watermark
CN113159232A (en) Three-dimensional target classification and segmentation method
CN114821058A (en) Image semantic segmentation method and device, electronic equipment and storage medium
Jasitha et al. Venation based plant leaves classification using GoogLeNet and VGG
CN112749576B (en) Image recognition method and device, computing equipment and computer storage medium
CN111145196A (en) Image segmentation method and device and server
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium
CN111414823B (en) Human body characteristic point detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant