CN112766474B - Method, device, medium and electronic equipment for realizing convolution operation


Info

Publication number
CN112766474B
CN112766474B
Authority
CN
China
Prior art keywords
weight matrix
convolution
operated
feature
data processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911066229.4A
Other languages
Chinese (zh)
Other versions
CN112766474A (en)
Inventor
王振江
李德林
张祎男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201911066229.4A
Publication of CN112766474A
Application granted
Publication of CN112766474B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)

Abstract

Disclosed are a method, an apparatus, a medium, and an electronic device for implementing a convolution operation, wherein the method includes: acquiring an input feature of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer, wherein the spatial resolution of the input feature is n1×n11, the weight matrix is an m1×m1 weight matrix, and m1 is a non-zero even number; obtaining a feature to be operated and a weight matrix to be operated after row-column expansion according to the input feature and the weight matrix, wherein the spatial resolution of the feature to be operated is n2×n22, the weight matrix to be operated is m2×m2, and m2 is an odd number greater than m1; performing a convolution operation on the weight matrix to be operated and the feature to be operated through a data processor to obtain a convolution operation result; and obtaining an output feature of the convolution layer according to the convolution operation result. The present disclosure can use a data processor to implement multiple types of convolution operations, thereby enriching the ways in which convolution operations can be implemented.

Description

Method, device, medium and electronic equipment for realizing convolution operation
Technical Field
The present disclosure relates to computer vision technology, and more particularly, to a method for implementing a convolution operation, an apparatus for implementing a convolution operation, a storage medium, and an electronic device.
Background
Computer vision techniques can rarely do without convolution operations. For example, neural networks used in computer vision technology, such as a CNN (Convolutional Neural Network), an RPN (Region Proposal Network), or an RNN (Recurrent Neural Network), generally include at least one convolution layer. Each convolution layer corresponds to a convolution kernel of a preset size, and each convolution layer may perform a convolution operation on its input feature based on the corresponding convolution kernel to form a new feature, which serves as the output feature of that convolution layer; this implements the convolution operation of the convolution layer.
At present, how to make a hardware unit support multiple types of convolution operations is a technical problem worth focusing on.
Disclosure of Invention
The present disclosure has been made in order to solve the above technical problems. Embodiments of the present disclosure provide a method, apparatus, storage medium, and electronic device for implementing convolution operations.
According to an aspect of an embodiment of the present disclosure, there is provided a method for implementing a convolution operation, the method including: acquiring an input feature of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer, wherein the spatial resolution of the input feature is n1×n11, the weight matrix is an m1×m1 weight matrix, n1 and n11 are positive integers, and m1 is a non-zero even number; obtaining a feature to be operated and a weight matrix to be operated after row-column expansion according to the input feature and the weight matrix, wherein the spatial resolution of the feature to be operated is n2×n22, the weight matrix to be operated is an m2×m2 weight matrix, n2 is an integer greater than n1, n22 is an integer greater than n11, m2 is an odd number greater than m1, and m2×m2 is a weight matrix size supported by a data processor; performing a convolution operation on the weight matrix to be operated and the feature to be operated through the data processor to obtain a convolution operation result; and obtaining an output feature of the convolution layer according to the convolution operation result.
According to another aspect of an embodiment of the present disclosure, there is provided a method for implementing a convolution operation, including: acquiring an input feature of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer, wherein the weight matrix is an m3×m3 weight matrix and m3 is an odd number; in the case that a data processor does not support the feature filling requirement of the convolution layer, obtaining a weight matrix to be operated after row-column expansion according to the weight matrix, wherein the weight matrix to be operated is an m4×m4 weight matrix and m4×m4 is a weight matrix size supported by the data processor; performing, through the data processor, a convolution operation based on a feature filling supported by the data processor on the weight matrix to be operated and the input feature, to obtain a convolution operation result; and obtaining an output feature of the convolution layer according to the convolution operation result.
According to yet another aspect of an embodiment of the present disclosure, there is provided an apparatus for implementing a convolution operation, the apparatus including: a first acquisition module, configured to acquire an input feature of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer, wherein the spatial resolution of the input feature is n1×n11, the weight matrix is an m1×m1 weight matrix, n1 and n11 are positive integers, and m1 is a non-zero even number; a second acquisition module, configured to obtain a feature to be operated and a weight matrix to be operated after row-column expansion according to the input feature and the weight matrix acquired by the first acquisition module, wherein the spatial resolution of the feature to be operated is n2×n22, the weight matrix to be operated is an m2×m2 weight matrix, n2 is an integer greater than n1, n22 is an integer greater than n11, m2 is an odd number greater than m1, and m2×m2 is a weight matrix size supported by the data processor; a first operation result acquisition module, configured to perform, through the data processor, a convolution operation on the weight matrix to be operated and the feature to be operated obtained by the second acquisition module, to obtain a convolution operation result; and a first output feature acquisition module, configured to obtain an output feature of the convolution layer according to the convolution operation result obtained by the first operation result acquisition module.
According to still another aspect of an embodiment of the present disclosure, there is provided an apparatus for implementing a convolution operation, the apparatus including: a third acquisition module, configured to acquire an input feature of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer, wherein the weight matrix is an m3×m3 weight matrix and m3 is an odd number; a fourth acquisition module, configured to obtain, in the case that the data processor does not support the feature filling requirement of the convolution layer, a weight matrix to be operated after row-column expansion according to the weight matrix acquired by the third acquisition module, wherein the weight matrix to be operated is an m4×m4 weight matrix and m4×m4 is a weight matrix size supported by the data processor; a second operation result acquisition module, configured to perform, through the data processor, a convolution operation based on a feature filling supported by the data processor on the m4×m4 weight matrix obtained by the fourth acquisition module and the input feature acquired by the third acquisition module, to obtain a convolution operation result; and a second output feature acquisition module, configured to obtain an output feature of the convolution layer according to the convolution operation result obtained by the second operation result acquisition module.
According to still another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for implementing the above method.
According to still another aspect of the embodiments of the present disclosure, there is provided an electronic device including: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method described above.
Based on the method and apparatus for implementing a convolution operation provided by the above embodiments of the present disclosure, for a weight matrix with an even number of rows and columns, the present disclosure performs row-column expansion on the weight matrix of the convolution kernel so that it becomes a weight matrix with an odd number of rows and columns, allowing a data processor that supports odd weight matrices to perform the convolution operation on the input feature; by performing row-column expansion on the input feature of the convolution layer as well, the output feature of the convolution layer may be obtained from the result of the convolution operation performed by the data processor. The technical solution provided by the present disclosure can therefore complete a convolution operation based on an even convolution kernel using a data processor that supports odd convolution kernels, which helps implement various convolution operations with such a data processor and thus enriches the ways in which convolution operations can be implemented.
Based on the method and apparatus for implementing a convolution operation provided by the above embodiments of the present disclosure, in the case that the data processor does not support the feature filling requirement of the convolution layer, the present disclosure performs row-column expansion on the weight matrix of the convolution kernel and has the data processor perform the convolution operation on the input feature of the convolution layer with the expanded weight matrix, based on a feature filling that the data processor does support, so that the output feature of the convolution layer can be obtained from the result of the convolution operation performed by the data processor. The technical solution provided by the present disclosure can therefore realize the convolution operation of a convolution layer with a feature filling requirement using a data processor that does not support that requirement, which helps implement various convolution operations with the data processor and enriches the ways in which convolution operations can be implemented.
The technical scheme of the present disclosure is described in further detail below through the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing embodiments thereof in more detail with reference to the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, not to limit the disclosure. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is a schematic illustration of a scenario in which the present disclosure is applicable;
FIG. 2 is a flow chart of one embodiment of a method of the present disclosure for implementing convolution operations;
FIG. 3 is a schematic diagram of a first example of a convolution operation implementing a convolution layer of the present disclosure;
FIG. 4 is a schematic diagram of a second example of a convolution operation implementing a convolution layer of the present disclosure;
FIG. 5 is a schematic diagram of a third example of a convolution operation implementing a convolution layer of the present disclosure;
FIG. 6 is a schematic diagram of a fourth example of a convolution operation implementing a convolution layer of the present disclosure;
FIG. 7 is a schematic diagram of a fifth example of a convolution operation implementing a convolution layer of the present disclosure;
FIG. 8 is a flow chart of another embodiment of a method of the present disclosure for performing convolution operations;
FIG. 9 is a schematic diagram of an example of a convolution operation implementing a convolution layer of the present disclosure;
FIG. 10 is a schematic diagram of an embodiment of an apparatus for performing convolution operations of the present disclosure;
FIG. 11 is a schematic diagram of another embodiment of an apparatus for performing convolution operations of the present disclosure;
FIG. 12 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present disclosure and not all of the embodiments of the present disclosure, and that the present disclosure is not limited by the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
It will be appreciated by those of skill in the art that the terms "first," "second," etc. in embodiments of the present disclosure are used merely to distinguish between different steps, devices or modules, etc., and do not represent any particular technical meaning nor necessarily logical order between them.
It should also be understood that in embodiments of the present disclosure, "plurality" may refer to two or more, and "at least one" may refer to one, two or more.
It should also be appreciated that any component, data, or structure referred to in the presently disclosed embodiments may be generally understood as one or more without explicit limitation or the contrary in the context.
In addition, the term "and/or" in this disclosure merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. The character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and that the same or similar features may be referred to each other, and for brevity, will not be described in detail.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Embodiments of the present disclosure are applicable to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with the terminal device, computer system, or server, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc., that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computing system storage media including memory storage devices.
Summary of the disclosure
In the process of implementing the present disclosure, the inventors found that, in consideration of factors such as keeping the size unchanged before and after convolution and the placement of the convolution anchor point, the size of a convolution kernel in the width and height directions is usually odd×odd, for example, 1×1, 3×3, 5×5, 7×7, or 11×11. Convolution kernels whose dimensions in the width and height directions are odd×odd may be referred to as odd convolution kernels. Due to factors such as power consumption and chip area, some hardware units support convolution operations with odd convolution kernels but not with even convolution kernels. For example, when the size of the convolution kernel in the width and height directions is 4×4, some hardware units cannot perform the corresponding convolution operation for the convolution layer. If the convolution operation of an even convolution kernel can be realized by a hardware unit that supports only convolution operations of odd convolution kernels, the ways in which convolution operations can be implemented are enriched. The hardware unit may include, but is not limited to: a CPU (Central Processing Unit), a BPU (Brain Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field Programmable Gate Array), etc.
In addition, in order to avoid problems such as a reduction in the spatial resolution of the output features of a convolution layer after the convolution operation, or a loss of edge feature information of the input features of the convolution layer, the input features may be subjected to a filling process before the convolution operation. Due to factors such as power consumption and chip area, the feature filling supported by some hardware units is limited. Typically, the feature filling supported by a hardware unit is related to the size of the convolution kernel. For example, when the size of the convolution kernel corresponding to the convolution layer is 5×5, some hardware units support feature filling only for the two cases 0 and 2; for another example, when the size of the convolution kernel corresponding to the convolution layer is 7×7, some hardware units support feature filling only for the two cases 0 and 3. Here, 0 indicates no filling. If convolution operations based on feature filling for more cases (e.g., 0, 1, and 2, or 0, 1, 2, and 3) can be implemented with hardware units that support only two cases of feature filling, the ways in which convolution operations can be implemented are enriched. Likewise, the hardware units may include, but are not limited to: a CPU, a BPU, a GPU, an FPGA, etc.
Exemplary overview
The technical scheme for realizing convolution operation provided by the disclosure can be suitable for various scenes. An example is shown in fig. 1.
In fig. 1, in the process of implementing a convolution operation using the data processor 100, the data processor 100 is triggered to execute the instructions in a corresponding instruction string; for example, after the data processor 100 receives a predetermined interrupt signal, it executes the instructions one by one in the order in which they appear in the instruction string corresponding to that interrupt signal.
An example of the data processor 100 sequentially executing instructions may be: the data processor 100 reads the weight matrix of the convolution kernel corresponding to the current convolution layer from the DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory) 101 according to a first preset address, reads the input feature of the current convolution layer from the SRAM (Static Random-Access Memory) 102 according to a second preset address, then performs a multiply-add operation on the weight matrix and input feature just read, and stores the result of the multiply-add operation in the SRAM 102 according to a third preset address. The output feature of the current convolution layer can be formed from the results of the multiply-add operations, thereby implementing the convolution operation of the current convolution layer.
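As a rough illustration only, the following Python sketch models this instruction sequence; the dictionary "memories" and the address keys (addr_1, addr_2, addr_3) are hypothetical stand-ins for the DDR SDRAM 101, the SRAM 102, and the preset addresses, not part of the disclosure.

```python
import numpy as np

# Hypothetical "memories": plain dicts keyed by illustrative addresses.
ddr = {"addr_1": np.random.rand(5, 5).astype(np.float32)}   # weight matrix of the current layer
sram = {"addr_2": np.random.rand(8, 8).astype(np.float32)}  # input feature of the current layer

def run_convolution_instructions():
    weights = ddr["addr_1"]    # read the weight matrix from DDR SDRAM (first preset address)
    feature = sram["addr_2"]   # read the input feature from SRAM (second preset address)
    kh, kw = weights.shape
    out_h = feature.shape[0] - kh + 1
    out_w = feature.shape[1] - kw + 1
    result = np.empty((out_h, out_w), dtype=np.float32)
    for i in range(out_h):     # stride-1 multiply-add over each window
        for j in range(out_w):
            result[i, j] = np.sum(feature[i:i+kh, j:j+kw] * weights)
    sram["addr_3"] = result    # store the result back to SRAM (third preset address)

run_convolution_instructions()
print(sram["addr_3"].shape)    # (4, 4)
```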
Exemplary method
FIG. 2 is a flow chart of one embodiment of a method of the present disclosure for implementing convolution operations. The method as shown in fig. 2 includes: s200, S201, S202, and S203. The steps are described separately below.
S200, acquiring input features of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer.
The convolution layer in this disclosure is typically one layer in a neural network. The input features of the convolution layer may include, but are not limited to: a feature map of an image, a feature vector of audio, and the like. A convolution layer performs a convolution operation on its input features to further extract features from them; the extracted features form its output features. Each convolution layer in this disclosure corresponds to a convolution kernel, and the size of that convolution kernel is the size of the weight matrix of the convolution kernel. For example, if the size of the convolution kernel is 4×4×C, the size of the weight matrix of the convolution kernel is 4×4×C, where C represents the number of channels. Each element in the weight matrix of the convolution kernel corresponding to a convolution layer may refer to a weight linking a neuron in the convolution layer with the corresponding neuron in the next layer of the neural network.
The spatial resolution of the input feature in the present disclosure is, for example, n1×n11, where n1 and n11 are positive integers, i.e., integers greater than zero. The weight matrix of the convolution kernel in the present disclosure is an m1×m1 weight matrix, that is, the size of the weight matrix is m1×m1, where m1 is a non-zero even number; for example, m1 may be 2, 4, 6, or the like. The notation n1×n11 is used schematically: the spatial resolution of the input feature may be any size in the width and height directions, and n1 may be equal to n11. In the following embodiments, the description mainly takes n1×n11 to be n1×n1 as an example. Here, m1×m1 refers to the width and height of the weight matrix. Since every element of the weight matrix of the convolution kernel has the same number of channels across the width and height directions, the description of the number of channels of the weight matrix is omitted in this disclosure. Likewise, since every element within the spatial resolution (W×H, i.e., the width and height directions) of the input feature of the convolution layer has the same number of channels, and the same holds for the output feature, the descriptions of the numbers of channels of the input and output features of the convolution layer are also omitted.
S201, obtaining the feature to be operated and the weight matrix to be operated after the row and column expansion according to the input feature and the weight matrix.
The feature to be operated after row-column expansion in the present disclosure refers to a new input feature obtained by performing row-column expansion on the input feature of the convolution layer. For example, at least one row and at least one column are added to the input feature of the convolution layer to form the feature to be operated. The number of rows added to the input feature is typically the same as the number of columns added. That is, if the spatial resolution of the feature to be operated is n2×n22, then n2 is an integer greater than n1 and n22 is an integer greater than n11; n2 may be equal to n22. In the following embodiments, the description mainly takes n2×n22 to be n2×n2 as an example. In addition, the present disclosure typically adds the rows and columns at the outermost sides of the input feature, such as to the left of the leftmost column, to the right of the rightmost column, above the uppermost row, or below the lowermost row. The number of channels of each element in the added rows and columns of the feature to be operated is the same as that of the elements of the input feature.
The weight matrix to be operated after row-column expansion in the present disclosure refers to a new weight matrix obtained by performing row-column expansion on the weight matrix of the convolution kernel corresponding to the convolution layer. For example, at least one row and at least one column are added to the weight matrix of the convolution kernel to form the weight matrix to be operated. The number of rows added to the weight matrix is typically the same as the number of columns added. That is, if the size of the weight matrix to be operated is m2×m2, then m2 is an odd integer greater than m1, and m2×m2 is a convolution kernel size supported by the data processor. In addition, the present disclosure typically adds the rows and columns at the outermost sides of the weight matrix, such as to the left of the leftmost column, to the right of the rightmost column, above the uppermost row, or below the lowermost row. The values of the elements in these added rows and columns are typically a predetermined value that does not change the result of the convolution operation of the input feature and the weight matrix. The number of channels of each element in the added rows and columns of the weight matrix to be operated is the same as that of the elements of the weight matrix.
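As a minimal sketch of these two expansions (assuming, for illustration, that the expansion row and column are added on the bottom and right, and using NumPy, which the disclosure itself does not prescribe):

```python
import numpy as np

def expand_row_column(feature, weights):
    # Append one expansion row (bottom) and one expansion column (right)
    # to both operands. The kernel's new entries must be zero so the
    # multiply-add result is unchanged; the feature's new entries may be
    # arbitrary (zeros are used here only for convenience).
    weights_expanded = np.pad(weights, ((0, 1), (0, 1)), constant_values=0.0)
    feature_expanded = np.pad(feature, ((0, 1), (0, 1)), constant_values=0.0)
    return feature_expanded, weights_expanded

feature = np.ones((7, 7))   # n1 x n11 = 7 x 7
weights = np.ones((4, 4))   # m1 x m1 = 4 x 4 (even kernel)
f2, w2 = expand_row_column(feature, weights)
print(f2.shape, w2.shape)   # (8, 8) (5, 5): n2 x n22 and an odd m2 x m2
```

Adding the row and column on any other pair of sides works the same way, as the examples of figs. 3 to 6 below show.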
S202, performing convolution operation on the weight matrix to be operated and the feature to be operated through a data processor to obtain a convolution operation result.
A data processor in the present disclosure may refer to a hardware unit having data computing capabilities, for example, the data processor may include, but is not limited to: CPU, BPU, GPU or FPGA, etc.
The convolution operation in the present disclosure may refer to an operation performed to implement feature extraction of a feature to be operated on, for example, the data processor performs a multiply-add operation or the like on the weight matrix to be operated on and the feature to be operated on. In addition, the step size of the convolution operation in the present disclosure may be 1 or other value.
S203, obtaining the output characteristics of the convolution layer according to the convolution operation result.
The convolution operation result in the present disclosure may refer to the result of a multiply-add operation or the like performed on the weight matrix to be operated and the feature to be operated. The output feature of the convolution layer in the present disclosure refers to the result of a convolution operation between the weight matrix of the convolution kernel corresponding to the convolution layer and the input feature of the convolution layer. Although the weight matrix to be operated and the feature to be operated in this disclosure are the results of row-column expansion, the convolution operation result based on them may contain redundant content (for example, when the convolution layer has a feature filling requirement), or may contain none (for example, when the convolution layer has no feature filling requirement). When the convolution operation result contains no redundant content, it can be used directly as the output feature of the convolution layer. When it does contain redundant content, the present disclosure may remove the redundant content from the convolution operation result based on the weight matrix to be operated and the feature to be operated, thereby obtaining the output feature of the convolution layer, i.e., the result of the convolution operation based on the weight matrix of the convolution kernel corresponding to the convolution layer and the input feature.
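The S202/S203 logic can be summarized in a small reference sketch; conv2d below is a plain stride-1 multiply-add convolution standing in for the data processor, and the crop argument of output_feature is an illustrative device of this sketch, not terminology from the disclosure.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d(feature, weights):
    # Stride-1, no-padding multiply-add convolution (reference only).
    windows = sliding_window_view(feature, weights.shape)
    return np.einsum("ijkl,kl->ij", windows, weights)

def output_feature(conv_result, crop=None):
    # S203: with no feature filling requirement, the result is the output
    # feature as-is; otherwise the redundant rows/columns are cut away.
    return conv_result if crop is None else conv_result[crop]
```

A convolution layer without a feature filling requirement would return the result unchanged; the fig. 7 case below corresponds to crop=(slice(1, 7), slice(1, 7)).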
For a weight matrix with an even number of rows and columns (i.e., an even convolution kernel), the weight matrix of the convolution kernel is turned into a weight matrix with an odd number of rows and columns (i.e., an odd convolution kernel) by row-column expansion, so that a data processor supporting odd weight matrices can perform the convolution operation on the input feature; by performing row-column expansion on the input feature of the convolution layer, the output feature of the convolution layer may be obtained from the result of the convolution operation performed by the data processor. The technical solution provided by the present disclosure can therefore complete a convolution operation based on an even convolution kernel using a data processor that supports odd convolution kernels, which helps implement various convolution operations with the data processor and thus enriches the ways in which convolution operations can be implemented.
In an alternative example, the manner in which the present disclosure obtains the row-column-expanded feature to be operated and weight matrix to be operated is typically: adding at least one expansion row and at least one expansion column at the same positions of the input feature and the weight matrix. Here, the same position refers to the same side of the input feature and of the weight matrix: the uppermost side of both, the leftmost side of both, the lowermost side of both, or the rightmost side of both.
By adding the expansion rows and expansion columns at the same positions of the input feature and the weight matrix, the present disclosure keeps the convolution operation result of the input feature and the weight matrix identical to the convolution operation result of the weight matrix to be operated and the feature to be operated, so that the output feature of the convolution layer can be obtained conveniently.
In one optional example, the present disclosure adds the same number of expansion rows and the same number of expansion columns at the same positions of the input feature and the weight matrix. The number of rows of the added expansion rows and the number of columns of the added expansion columns may also be equal to each other. For example, the present disclosure adds a rows of expansion rows at the same position of the input feature and the weight matrix, and b columns of expansion columns at the same position of the two, where a and b may take the same value. In addition, a and b should be as small as possible, so as to keep the data processor from performing unnecessary convolution operations and thereby reduce its computation load.
By adding the same number of expansion rows and the same number of expansion columns at the same positions of the input feature and the weight matrix, the convolution operation result of the input feature and the weight matrix remains identical to that of the weight matrix to be operated and the feature to be operated, so that the output feature of the convolution layer can be obtained conveniently.
In an alternative example, in the case that the data processor does not support the feature filling requirement of the convolution layer, the present disclosure should obtain the feature to be operated and the weight matrix to be operated after row-column expansion by adding the same number of expansion rows (such as one expansion row) above the uppermost row and the same number of expansion columns (such as one expansion column) to the left of the leftmost column of the input feature and the weight matrix, respectively. The feature filling requirement of a convolution layer in this disclosure refers to padding in the neural network. Padding is usually applied to the input feature of the convolution layer; for example, a rings of predetermined feature values (e.g., 0) are filled around the outermost side of the input feature, so that the spatial resolution of the input feature changes from n1×n1 to (n1+2a)×(n1+2a).
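In NumPy terms (a reference sketch only, assuming zero filling), a feature filling of a rings on every side is:

```python
import numpy as np

x = np.arange(9.0).reshape(3, 3)  # n1 x n1 = 3 x 3
a = 1
x_padded = np.pad(x, a)           # zero filling: (n1 + 2a) x (n1 + 2a)
print(x_padded.shape)             # (5, 5)
```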
Alternatively, assume that the size of the convolution kernel corresponding to the convolution layer is m1×m1 and that the feature filling supported by the data processor is limited, i.e., the data processor supports feature filling of 0 or (m1-1)/2 only. If the feature filling requirement of a convolution layer is an integer strictly between 0 and (m1-1)/2, then the data processor does not support the feature filling requirement of that convolution layer. Under this assumption, the present disclosure should obtain the feature to be operated and the weight matrix to be operated after row-column expansion by adding the same number of expansion rows above the uppermost row and the same number of expansion columns to the left of the leftmost column of the input feature and the weight matrix, respectively. The data processor may then perform the convolution operation on the weight matrix to be operated and the feature to be operated based on a feature filling it supports (e.g., (m1-1)/2).
In this way, when the data processor does not support the feature filling requirement of the convolution layer, adding the same number of expansion rows above the uppermost row and the same number of expansion columns to the left of the leftmost column of the input feature and the weight matrix ensures that the convolution operation result of the input feature and the weight matrix under the convolution layer's feature filling requirement appears, in full, in a predetermined region of the convolution operation result of the weight matrix to be operated and the feature to be operated, which makes it convenient to obtain the output feature that meets the feature filling requirement of the convolution layer.
In an alternative example, because some rows and columns in the convolution operation result of the weight matrix to be operated and the feature to be operated are produced by the data processor's feature filling of the feature to be operated, that feature-filling-based convolution operation result contains redundant rows and columns. In other words, only a subset of its rows and columns constitutes the convolution operation result of the weight matrix and the input feature under the required feature filling. The present disclosure can therefore cut this subset of rows and columns out of the feature-filling-based convolution operation result of the weight matrix to be operated and the feature to be operated, and take the cut-out rows and columns as the output feature of the convolution layer.
By taking the rows and columns cut out of the feature-filling-based convolution operation result of the weight matrix to be operated and the feature to be operated as the output feature of the convolution layer, the data processor can be used both to realize the convolution operation of an even convolution kernel and to realize a convolution operation with a feature filling requirement that the data processor does not support, which helps further enrich the ways in which convolution operations can be implemented.
In an optional example, the values of each element in each extension row and each extension column in the weight matrix to be operated after the row and column extension obtained in the present disclosure may be zero, and the values of each element in each extension row and each extension column in the feature to be operated after the row and column extension may be arbitrary values. According to the method and the device, the values of the elements in the extension rows and the extension columns in the weight matrix to be operated are set to be zero, and the values of the elements in the extension rows and the extension columns in the feature to be operated are set to be any values (including zero), so that the convolution operation results of the input feature and the weight matrix can be completely displayed in the convolution operation results of the weight matrix to be operated and the feature to be operated, and the output feature of the convolution layer can be conveniently and rapidly obtained.
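Written out (with notation introduced here for illustration, not taken from the disclosure): let w' and x' denote the weight matrix to be operated and the feature to be operated, expanded on the bottom and right. For every window anchor (u, v) of the original convolution,

$$
\sum_{i=0}^{m_2-1}\sum_{j=0}^{m_2-1} w'_{i,j}\, x'_{u+i,\,v+j}
\;=\; \sum_{i=0}^{m_1-1}\sum_{j=0}^{m_1-1} w_{i,j}\, x_{u+i,\,v+j},
$$

because every extra term (with i >= m1 or j >= m1) carries a factor w'_{i,j} = 0; the arbitrary expansion values of x' are therefore annihilated.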
In one optional example, the number of rows of the expansion rows in the present disclosure may be the difference between the number of rows of the weight matrix of the first convolution kernel supported by the data processor and the number of rows of the weight matrix of the convolution kernel corresponding to the convolution layer; the number of columns of the expansion columns may be the corresponding difference in the number of columns. Here, the weight matrix of the first convolution kernel is: among the weight matrices supported by the data processor, the one that is larger than the weight matrix of the convolution kernel corresponding to the convolution layer and closest to it in size.
The values of a and b in the foregoing example may be determined from the convolution kernel sizes supported by the data processor and the size of the convolution kernel corresponding to the convolution layer. For example, assume the convolution kernel sizes supported by the data processor are a1×a1, a2×a2, and a3×a3, where a1, a2, and a3 are all odd, and assume the size of the convolution kernel corresponding to the convolution layer is a4×a4. Then the values of a and b in the present disclosure may be the smallest positive value among a1-a4, a2-a4, and a3-a4. In general, a and b may both take the value 1.
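Under these assumptions, a and b can be computed in one line; the helper name and argument layout are illustrative:

```python
def expansion_amount(m1, supported_sizes):
    # a = b: distance from the layer's kernel size m1 to the smallest
    # strictly larger kernel size the data processor supports.
    return min(s for s in supported_sizes if s > m1) - m1

print(expansion_amount(4, (1, 3, 5, 7)))  # 1: a 4x4 kernel is expanded to 5x5
```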
Determining the number of expansion rows and columns from the number of rows and columns of the weight matrix of the first convolution kernel supported by the data processor helps keep the data processor from executing unnecessary convolution operations and from storing unnecessary convolution operation results, thereby saving the computing and cache resources of the data processor.
An example in which the present disclosure implements the convolution operation of a convolution layer using the row-column-expanded feature to be operated and weight matrix to be operated is shown in fig. 3.
In fig. 3, it is assumed that the size of the convolution kernel supported by the data processor includes 5×5. Assume that the spatial resolution of the input feature 300 of a convolution layer of the present disclosure is 7×7, and the size of the weight matrix 301 of the convolution kernel corresponding to the convolution layer is 4×4. It is further assumed that the convolutional layer has no feature fill requirement.
Under the above assumptions, the present disclosure may add one expansion column at the rightmost side of the input feature 300 in fig. 3, where the elements of the expansion column may take arbitrary values, and add one expansion row at the lowermost side of the input feature 300, where the elements of the expansion row may likewise take arbitrary values; the present disclosure thus obtains the feature to be operated 302 with a spatial resolution of 8×8. Likewise, the present disclosure may add one expansion column, with all elements zero, at the rightmost side of the weight matrix 301 in fig. 3, and one expansion row, with all elements zero, at the lowermost side of the weight matrix 301; the present disclosure thus obtains the 5×5 weight matrix to be operated 303. It can be seen that the convolution operation of the input feature 300 and the weight matrix 301 (304 in fig. 3 denotes the multiply-add operation) is changed into the convolution operation of the feature to be operated 302 and the weight matrix to be operated 303.
The convolution operation result 306 of the feature to be operated 302 and the weight matrix to be operated 303 (a convolution with a step size of 1) is identical to the convolution operation result 305 of the input feature 300 and the weight matrix 301 (also a convolution with a step size of 1). Thus, the present disclosure may directly take the convolution operation result 306 as the output feature of the convolution layer.
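This equality is easy to check numerically. The sketch below reproduces the fig. 3 configuration (expansion on the bottom and right) with random values and asserts that result 306 matches result 305; conv2d is the same reference multiply-add used earlier, not the data processor itself.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d(feature, weights):
    windows = sliding_window_view(feature, weights.shape)
    return np.einsum("ijkl,kl->ij", windows, weights)

rng = np.random.default_rng(0)
feature = rng.random((7, 7))            # input feature 300
weights = rng.random((4, 4))            # weight matrix 301
f2 = np.pad(feature, ((0, 1), (0, 1)))  # feature to be operated 302 (8 x 8)
w2 = np.pad(weights, ((0, 1), (0, 1)))  # weight matrix to be operated 303 (5 x 5)
assert np.allclose(conv2d(feature, weights), conv2d(f2, w2))  # 305 == 306, both 4 x 4
```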
Another example in which the present disclosure implements the convolution operation of a convolution layer using the row-column-expanded feature to be operated and weight matrix to be operated is shown in fig. 4.
In fig. 4, it is assumed that the size of the convolution kernel supported by the data processor includes 5×5. Assume that the spatial resolution of the input feature 400 of a convolution layer of the present disclosure is 7×7, and the size of the weight matrix 401 of the convolution kernel corresponding to the convolution layer is 4×4. It is further assumed that the convolutional layer has no feature fill requirement.
Under the above assumptions, the present disclosure may add one expansion column at the rightmost side of the input feature 400 in fig. 4, where the elements of the expansion column may take arbitrary values, and add one expansion row at the uppermost side of the input feature 400, where the elements of the expansion row may likewise take arbitrary values; the present disclosure thus obtains the feature to be operated 402 with a spatial resolution of 8×8. Likewise, the present disclosure may add one expansion column, with all elements zero, at the rightmost side of the weight matrix 401 in fig. 4, and one expansion row, with all elements zero, at the uppermost side of the weight matrix 401; the present disclosure thus obtains the 5×5 weight matrix to be operated 403. It can be seen that the convolution operation of the input feature 400 and the weight matrix 401 (404 in fig. 4 denotes the multiply-add operation) is changed into the convolution operation of the feature to be operated 402 and the weight matrix to be operated 403.
The convolution result 406 of the feature to be operated 402 and the weight matrix to be operated 403 (i.e., the convolution result with a step size of 1) is identical to the convolution result 405 of the input feature 400 and the weight matrix 401 (also the convolution result with a step size of 1). Thus, the present disclosure may directly take the convolution operation result 406 as an output feature of the convolution layer.
Yet another example in which the present disclosure implements the convolution operation of a convolution layer using the row-column-expanded feature to be operated and weight matrix to be operated is shown in fig. 5.
In fig. 5, it is assumed that the size of the convolution kernel supported by the data processor includes 5×5. Assume that the spatial resolution of the input feature 500 of a convolution layer of the present disclosure is 7×7, and the size of the weight matrix 501 of the convolution kernel corresponding to the convolution layer is 4×4. It is further assumed that the convolutional layer has no feature fill requirement.
Under the above assumptions, the present disclosure may add one expansion column at the leftmost side of the input feature 500 in fig. 5, where the elements of the expansion column may take arbitrary values, and add one expansion row at the uppermost side of the input feature 500, where the elements of the expansion row may likewise take arbitrary values; the present disclosure thus obtains the feature to be operated 502 with a spatial resolution of 8×8. Likewise, the present disclosure may add one expansion column, with all elements zero, at the leftmost side of the weight matrix 501 in fig. 5, and one expansion row, with all elements zero, at the uppermost side of the weight matrix 501; the present disclosure thus obtains the 5×5 weight matrix to be operated 503. It can be seen that the convolution operation of the input feature 500 and the weight matrix 501 (504 in fig. 5 denotes the multiply-add operation) is changed into the convolution operation of the feature to be operated 502 and the weight matrix to be operated 503.
The convolution result 506 of the feature 502 to be operated and the weight matrix 503 to be operated (i.e., the convolution result with a step size of 1) is identical to the convolution result 505 of the input feature 500 and the weight matrix 501 (also the convolution result with a step size of 1). Thus, the present disclosure may directly take the convolution operation result 506 as an output feature of the convolution layer.
A further example in which the present disclosure implements the convolution operation of a convolution layer using the row-column-expanded feature to be operated and weight matrix to be operated is shown in fig. 6.
In fig. 6, it is assumed that the size of the convolution kernel supported by the data processor includes 5×5. Assume that the spatial resolution of the input feature 600 of a convolution layer of the present disclosure is 7×7, and the size of the weight matrix 601 of the convolution kernel corresponding to the convolution layer is 4×4. It is further assumed that the convolutional layer has no feature fill requirement.
Under the above assumptions, the present disclosure may add one expansion column at the leftmost side of the input feature 600 in fig. 6, where the elements of the expansion column may take arbitrary values, and add one expansion row at the lowermost side of the input feature 600, where the elements of the expansion row may likewise take arbitrary values; the present disclosure thus obtains the feature to be operated 602 with a spatial resolution of 8×8. Likewise, the present disclosure may add one expansion column, with all elements zero, at the leftmost side of the weight matrix 601 in fig. 6, and one expansion row, with all elements zero, at the lowermost side of the weight matrix 601; the present disclosure thus obtains the 5×5 weight matrix to be operated 603. It can be seen that the convolution operation of the input feature 600 and the weight matrix 601 (604 in fig. 6 denotes the multiply-add operation) is changed into the convolution operation of the feature to be operated 602 and the weight matrix to be operated 603.
The convolution result 606 of the feature to be operated 602 and the weight matrix to be operated 603 (i.e., the convolution result with a step size of 1) is identical to the convolution result 605 of the input feature 600 and the weight matrix 601 (also the convolution result with a step size of 1). Thus, the present disclosure may directly take the convolution operation result 606 as an output feature of the convolution layer.
Still another example in which the present disclosure implements the convolution operation of a convolution layer using the row-column-expanded feature to be operated and weight matrix to be operated is shown in fig. 7.
In fig. 7, it is assumed that the convolution kernel sizes supported by the data processor include 5×5, that the spatial resolution of the input feature 700 of a convolution layer of the present disclosure is 7×7, and that the size of the weight matrix 701 of the convolution kernel corresponding to the convolution layer is 4×4. Assume further that the feature filling requirement of the convolution layer is 1 and that the feature filling supported by the data processor is limited, i.e., the data processor supports feature filling of 0 or 2 only.
Under the above assumptions, the present disclosure may add one extension column at the leftmost side of the input feature 700 in fig. 7, where the values of the elements in the extension column may be arbitrary values, and add one extension row at the uppermost side of the input feature 700, where the values of the elements in the extension row may likewise be arbitrary values; the present disclosure thus obtains a feature to be operated 702 with a spatial resolution of 8×8. Likewise, the present disclosure may add one extension column at the leftmost side of the weight matrix 701 in fig. 7, where the values of the elements in the extension column are all zero, and add one extension row at the uppermost side of the weight matrix 701, where the values of the elements in the extension row are all zero; the present disclosure thus obtains a weight matrix to be operated 703 of size 5×5. As can be seen, the convolution operation of the input feature 700 and the weight matrix 701 (704 in fig. 7 denotes a multiply-add operation) is thereby changed into the convolution operation of the feature to be operated 702 and the weight matrix to be operated 703.
The convolution operation result 706 of the feature to be operated 702 and the weight matrix to be operated 703 (i.e., the convolution result with a step size of 1) differs from the convolution operation result 705 of the input feature 700 and the weight matrix 701 (also a convolution result with a step size of 1; the outermost ring of the convolution operation result 705 exists because of the feature filling requirement of the convolution layer). Specifically, the spatial resolution of the convolution operation result 706 is 8×8, while the spatial resolution of the convolution operation result 705 is 6×6. The present disclosure may intercept rows 2 through 7 and columns 2 through 7 from the convolution operation result 706 and take the intercepted content as the output feature of the convolution layer. That is, the present disclosure may obtain the output feature of the convolution layer after removing the outermost ring of content of the convolution operation result 706.
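The cropping step can likewise be checked numerically. The sketch below is an assumption-laden illustration, not the disclosed hardware: it fills the feature extension with zeros for simplicity and applies the data processor's supported feature filling of 2; dropping the outermost ring of the 8×8 result recovers the required 6×6 filling-1 result.

```python
# Illustrative check of the fig. 7 scenario: the layer needs feature filling
# of 1, but the (hypothetical) data processor only supports filling of 0 or 2.
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(1)
feature = rng.standard_normal((7, 7))  # input feature 700
kernel = rng.standard_normal((4, 4))   # weight matrix 701

# Required result 705: filling of 1 around the input, then a 4x4 convolution.
required = correlate2d(np.pad(feature, 1), kernel, mode="valid")  # 6x6

# What the processor can run: extend the feature to 8x8 and the kernel to 5x5
# (zeros are used for the extension elements here) under its filling of 2.
feat_ext = np.pad(feature, ((1, 0), (1, 0)))  # feature to be operated 702
kern_ext = np.pad(kernel, ((1, 0), (1, 0)))   # weight matrix to be operated 703
result = correlate2d(np.pad(feat_ext, 2), kern_ext, mode="valid")  # 706, 8x8

# Intercept rows 2-7 and columns 2-7 (1-indexed): drop the outermost ring.
assert np.allclose(required, result[1:7, 1:7])
print("cropped 6x6 result equals the required filling-1 convolution")
```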
Fig. 8 is a flowchart of yet another embodiment of a method of the present disclosure for implementing convolution operations. The method shown in fig. 8 includes steps S800, S801, S802, and S803, which are described separately below.
S800, acquiring input characteristics of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer.
The convolution layer in the present disclosure is typically one layer in a neural network. The input feature of the convolution layer may include, but is not limited to, a feature map of an image, a feature vector of audio, and the like. The convolution layer in the present disclosure performs a convolution operation on its input feature to further extract features from the input feature, and the extracted features form its output feature. The convolution layer in the present disclosure has a convolution kernel, and the size of the convolution kernel corresponding to the convolution layer is the size of the weight matrix of the convolution kernel. For example, if the size of the convolution kernel is 5×5, the size of the weight matrix of the convolution kernel is 5×5. Each element in the weight matrix of the convolution kernel corresponding to a convolution layer may represent the weight by which a neuron in the convolution layer is linked to a corresponding neuron in the next layer of the neural network.
The weight matrix of the convolution kernel in the present disclosure is a weight matrix of m3×m3, i.e., the size of the weight matrix is m3×m3, where m3 is an odd number greater than zero. For example, m3 may be 3, 5, 7, or the like.
S801, under the condition that the data processor does not support the characteristic filling requirement of the convolution layer, obtaining a weight matrix to be operated after row-column expansion according to the weight matrix.
The feature filling requirement in the present disclosure is directed to the input feature of the convolution layer: it means that features are filled around the input feature of the convolution layer so as to increase the spatial resolution of the input feature.
The weight matrix to be operated after row-column expansion in the present disclosure refers to a new weight matrix obtained by performing row-column expansion on the weight matrix of the convolution kernel corresponding to the convolution layer. For example, at least one row and at least one column are added to the weight matrix of the convolution kernel, thereby forming the weight matrix to be operated. The number of rows added to the weight matrix of the convolution kernel is typically the same as the number of columns added to it. That is, if the size of the weight matrix to be operated is m4×m4, then m4 is an integer greater than m3; m4 may be even in the case where the data processor supports even-sized convolution kernels, and should be odd in the case where it does not. Here m4×m4 is a convolution kernel size supported by the data processor. In addition, the present disclosure typically adds the rows and columns at the outermost side of the weight matrix of the convolution kernel, and the values of the elements in these added rows and columns are typically a predetermined value chosen so that the convolution operation result of the input feature and the weight matrix is changed as little as possible.
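As a minimal sketch of such a row-column expansion (the sizes are assumptions chosen to match the fig. 9 example, and the code is illustrative rather than the disclosed implementation):

```python
# Minimal sketch of row-column expansion of a weight matrix: a 5x5 kernel is
# grown to a (hypothetically supported) 7x7 size by prepending all-zero rows
# and all-zero columns at the outermost side.
import numpy as np

m3, m4 = 5, 7  # original size and processor-supported size (assumed)
weights = np.arange(m3 * m3, dtype=float).reshape(m3, m3)
extra = m4 - m3  # number of extension rows/columns to add: 2 here
weights_ext = np.pad(weights, ((extra, 0), (extra, 0)), constant_values=0.0)
assert weights_ext.shape == (m4, m4)
assert np.all(weights_ext[:extra, :] == 0) and np.all(weights_ext[:, :extra] == 0)
```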
S802, performing, by the data processor, a convolution operation based on the feature filling supported by the data processor on the weight matrix to be operated and the input feature, to obtain a convolution operation result.
A data processor in the present disclosure may refer to a hardware unit having data computing capabilities, for example, the data processor may include, but is not limited to: CPU, BPU, GPU or FPGA, etc.
The convolution operation in the present disclosure may refer to an operation that implements feature extraction on the basis of feature filling. For example, the data processor first performs feature filling processing on the input feature of the convolution layer to obtain the feature to be operated; then, the data processor performs multiply-add operations on the feature to be operated and the weight matrix to be operated, thereby obtaining the convolution operation result. The step size of the convolution operation in the present disclosure may be 1 or another value.
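In software terms, S802 might look like the following sketch. The function name, the explicit loops, and the stride handling are illustrative assumptions; the disclosure targets a hardware unit, and this merely shows the filling-then-multiply-add structure.

```python
# Software sketch of S802 (names and loop structure are assumptions for
# illustration): the data processor first applies its supported feature
# filling, then runs the multiply-add over each sliding window with the
# weight matrix to be operated, at the given step size.
import numpy as np

def processor_convolution(feature, weights_ext, supported_fill, stride=1):
    padded = np.pad(feature, supported_fill)  # feature filling processing
    kh, kw = weights_ext.shape
    out_h = (padded.shape[0] - kh) // stride + 1
    out_w = (padded.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):        # multiply-add over each sliding window
        for j in range(out_w):
            window = padded[i * stride:i * stride + kh,
                            j * stride:j * stride + kw]
            out[i, j] = float(np.sum(window * weights_ext))
    return out
```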
S803, according to the convolution operation result, the output characteristics of the convolution layer are obtained.
The output feature of the convolution layer in the present disclosure refers to the convolution operation result that would be obtained by performing, on the weight matrix of the convolution kernel corresponding to the convolution layer and the input feature, a convolution operation based on the feature filling requirement.
Because the feature filling processing that the data processor performs on the input feature of the convolution layer does not meet the feature filling requirement of the convolution layer, the convolution operation result obtained in the present disclosure based on the weight matrix to be operated and the input feature may include redundant content. The present disclosure may remove the redundant content from that convolution operation result (for example, remove the two rightmost columns and the two bottommost rows of the convolution operation result), thereby obtaining the output feature of the convolution layer. The redundant content in the convolution operation result may be determined according to the actual situation, and the present disclosure is not limited in this respect.
In the case where the data processor does not support the feature filling requirement of the convolution layer, the weight matrix of the convolution kernel is subjected to row-column expansion, so that the data processor can use the expanded weight matrix to perform, based on the feature filling it supports, a convolution operation on the input feature of the convolution layer, and the output feature of the convolution layer can then be obtained from the result of that convolution operation. Therefore, the technical solution provided by the present disclosure can use a data processor that does not support the feature filling requirement of a convolution layer to implement the convolution operation of that convolution layer, which is beneficial to implementing various convolution operations with the data processor and enriches the ways in which convolution operations can be implemented.
In an alternative example, the weight matrix to be operated after row-column expansion of the present disclosure is generally obtained by adding the same number of extension rows above the uppermost row of the weight matrix of the convolution kernel and extension columns to the left of its leftmost column. For example, where the weight matrix of the convolution kernel is an even-sized weight matrix and the data processor supports odd-sized weight matrices, the present disclosure may add one extension row above the uppermost row of the weight matrix and one extension column to the left of its leftmost column. For another example, where the weight matrix of the convolution kernel is an odd-sized weight matrix and the data processor supports a larger odd-sized weight matrix, the present disclosure may add two extension rows above the uppermost row of the weight matrix and two extension columns to the left of its leftmost column. The number of extension rows and extension columns added is generally determined according to the size of the convolution kernel corresponding to the convolution layer, the actual feature filling requirement, the sizes of the convolution kernels supported by the data processor, and the feature filling supported by the data processor. The present disclosure may set up a table in advance that maps the size of the convolution kernel corresponding to the convolution layer, the actual feature filling requirement, and the convolution kernel sizes and feature filling supported by the data processor to the number of extension rows and extension columns, so that the number of extension rows and extension columns can be determined by a table lookup; a sketch of such a table is given below.
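The following is a sketch of such a lookup table. The key layout and every entry are illustrative assumptions drawn from the examples of figs. 5-9, not an exhaustive specification of any real data processor.

```python
# Hedged sketch of the lookup table suggested above. A real table would cover
# all kernel sizes and feature fillings the data processor handles; these
# three entries merely mirror the worked examples of figs. 5-9.
EXPANSION_TABLE = {
    # (layer kernel size, required fill, supported kernel size, supported fill)
    #     -> (extension rows/columns to add, feature filling to request)
    (4, 0, 5, 0): (1, 0),  # figs. 5 and 6
    (4, 1, 5, 2): (1, 2),  # fig. 7
    (5, 1, 7, 3): (2, 3),  # fig. 9
}

def lookup_expansion(layer_kernel, required_fill, supported_kernel, supported_fill):
    # Returns how many extension rows/columns to add and which filling to use.
    return EXPANSION_TABLE[(layer_kernel, required_fill, supported_kernel, supported_fill)]
```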
By adding the extension rows above the uppermost row of the weight matrix and the extension columns to the left of its leftmost column, the present disclosure ensures that the convolution operation result of the input feature and the weight matrix based on the feature filling requirement appears in its entirety within a predetermined region of the convolution operation result of the input feature and the weight matrix to be operated based on the feature filling supported by the data processor, which makes it convenient to obtain the output feature of the convolution layer.
In an optional example, the values of the elements in each extension row and each extension column of the weight matrix to be operated after row-column expansion obtained by the present disclosure may all be zero. By setting these values to zero, the present disclosure ensures that the convolution operation result of the input feature and the weight matrix based on the feature filling requirement appears in its entirety within the convolution operation result of the input feature and the weight matrix to be operated based on the feature filling supported by the data processor, which makes it convenient to obtain the output feature of the convolution layer.
An example in which the present disclosure implements the convolution operation of a convolution layer using the weight matrix to be operated after row-column expansion is shown in fig. 9.
In fig. 9, it is assumed that the sizes of the convolution kernels supported by the data processor include 5×5 and 7×7. Assume that the spatial resolution of the input feature 900 of a convolution layer of the present disclosure is 7×7, and that the size of the weight matrix 901 of the convolution kernel corresponding to the convolution layer is 5×5. Assume further that the feature filling requirement of the convolution layer is 1 while the feature filling supported by the data processor is limited, e.g., to 0 and 2, or to 0 and 3.
Under the above assumptions, the present disclosure may add two extension columns at the leftmost side of the weight matrix 901 in fig. 9, where the values of the elements in the two extension columns are all zero, and add two extension rows at the uppermost side of the weight matrix 901, where the values of the elements in the two extension rows are all zero; the present disclosure thus obtains a weight matrix to be operated 903 of size 7×7. As can be seen, the convolution operation of the input feature 900 and the weight matrix 901 (904 in fig. 9 denotes a multiply-add operation) is thereby changed into the convolution operation of the input feature 900 and the weight matrix to be operated 903.
The convolution operation result 906 (with a step size of 1) of the input feature 900 and the weight matrix to be operated 903, based on the feature filling of 3 supported by the data processor, differs from the convolution operation result 905 (also with a step size of 1) of the input feature 900 and the weight matrix 901, based on the feature filling requirement of 1. Specifically, the spatial resolution of the convolution operation result 905 is 5×5, while the spatial resolution of the convolution operation result 906 is 7×7. The present disclosure may intercept rows 1 through 5 and columns 1 through 5 from the convolution operation result 906 and take the intercepted content as the output feature of the convolution layer. That is, after removing the two rightmost columns and the two bottommost rows of the convolution operation result 906, the present disclosure obtains the output feature of the convolution layer, namely the convolution operation result 905. The outermost ring of the convolution operation result 905 exists because of the feature filling requirement of the convolution layer.
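The fig. 9 scheme can also be checked numerically. The sketch below uses random data and assumes a processor-supported feature filling of 3; it is an illustration, not the disclosed hardware.

```python
# Illustrative check of the fig. 9 scheme: only the weight matrix is extended
# (5x5 -> 7x7, two zero rows on top, two zero columns on the left), and the
# (hypothetical) processor-supported feature filling of 3 replaces the
# required filling of 1.
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(2)
feature = rng.standard_normal((7, 7))  # input feature 900
kernel = rng.standard_normal((5, 5))   # weight matrix 901

required = correlate2d(np.pad(feature, 1), kernel, mode="valid")  # 905, 5x5

kern_ext = np.pad(kernel, ((2, 0), (2, 0)))  # weight matrix to be operated 903
result = correlate2d(np.pad(feature, 3), kern_ext, mode="valid")  # 906, 7x7

# Keep rows 1-5 and columns 1-5 (1-indexed): drop the two rightmost columns
# and the two bottommost rows of the 7x7 result.
assert np.allclose(required, result[:5, :5])
print("cropped 5x5 result equals the required filling-1 convolution")
```

Because only the kernel is extended here, the zero-filled ring that the data processor adds around the input feature lines up exactly with the required filling of 1, which is why no extension values need to be chosen for the feature at all.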
Exemplary apparatus
Fig. 10 is a schematic structural diagram of one embodiment of an apparatus for implementing convolution operations of the present disclosure. The apparatus of this embodiment may be used to implement the method embodiments of the present disclosure shown in figs. 3-7. The apparatus shown in fig. 10 includes: the first obtaining module 1000, the second obtaining module 1001, the first obtaining operation result module 1002, and the first obtaining output feature module 1003.
The first obtaining module 1000 is configured to obtain an input feature of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer. The spatial resolution of the input features is n1×n11, the weight matrix is a weight matrix of m1×m1, n1 and n11 are positive integers, and m1 is a non-zero even number.
The second obtaining module 1001 is configured to obtain the feature to be operated and the weight matrix to be operated after the row and column expansion according to the input feature and the weight matrix obtained by the first obtaining module 1000. The spatial resolution of the feature to be operated is n2 multiplied by n22, the weight matrix to be operated is a weight matrix of m2 multiplied by m2, n2 is an integer larger than n1, n22 is an integer larger than n11, m2 is an odd number larger than m1, and m2 multiplied by m2 is the size of the weight matrix supported by the data processor.
Optionally, the second obtaining module 1001 may obtain the input feature to be operated and the weight matrix to be operated after the row and column expansion by adding at least one expansion row and at least one expansion column in the same direction of the input feature and the weight matrix.
Alternatively, the second obtaining module 1001 may obtain the input feature to be operated and the weight matrix to be operated after the row and column expansion by adding the same number of expansion rows and the same number of expansion columns in the same direction of the input feature and the weight matrix.
Alternatively, the number of extension rows may be the difference between the number of rows of the weight matrix of a first convolution kernel supported by the data processor and the number of rows of the weight matrix of the convolution kernel corresponding to the convolution layer, and the number of extension columns may be the difference between the number of columns of the weight matrix of the first convolution kernel and the number of columns of the weight matrix of the convolution kernel corresponding to the convolution layer. Here, the weight matrix of the first convolution kernel is: among the weight matrices supported by the data processor that are larger than the weight matrix of the convolution kernel corresponding to the convolution layer, the one closest in size to the weight matrix of the convolution kernel corresponding to the convolution layer.
Optionally, in the case that the data processor does not support the feature filling requirement of the convolutional layer, the second obtaining module 1001 may obtain the input feature to be operated and the weight matrix to be operated after the row and column expansion by adding the same number of expansion rows and the same number of expansion columns on the upper side of the uppermost row and the left side of the leftmost column of the input feature and weight matrix, respectively.
Optionally, the second obtaining module 1001 may set the values of the elements in the extension rows and extension columns of the weight matrix to be operated to zero, and may set the values of the elements in the extension rows and extension columns of the feature to be operated to arbitrary values.
The first obtaining operation result module 1002 is configured to perform convolution operation on the weight matrix to be operated and the feature to be operated obtained by the second obtaining module 1001 by using a data processor, so as to obtain a convolution operation result.
Optionally, under the condition that the data processor does not support the feature filling requirement of the convolution layer, the first operation result obtaining module 1002 is configured to perform, by using the data processor, convolution operation on the weight matrix to be operated and the feature to be operated based on feature filling supported by the data processor, to obtain a convolution operation result.
The first obtaining output characteristic module 1003 is configured to obtain an output characteristic of the convolution layer according to the convolution operation result obtained by the first obtaining operation result module 1002.
Alternatively, in the case that the data processor does not support the feature filling requirement of the convolution layer, the first obtaining output feature module 1003 may obtain the output feature of the convolution layer according to a part of rows and a part of columns in the convolution operation result.
Fig. 11 is a schematic structural diagram of another embodiment of an apparatus for implementing convolution operations of the present disclosure. The apparatus of this embodiment may be used to implement the method embodiments of the present disclosure shown in figs. 8-9. The apparatus shown in fig. 11 includes: the third obtaining module 1100, the fourth obtaining module 1101, the second obtaining operation result module 1102, and the second obtaining output feature module 1103.
The third obtaining module 1100 is configured to obtain the input feature of the convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer. Wherein the weight matrix is a weight matrix of m3 x m3, and m3 is an odd number.
The fourth obtaining module 1101 is configured to obtain a weight matrix to be operated after the rank expansion according to the weight matrix obtained by the third obtaining module 1100 when the data processor does not support the feature filling requirement of the convolutional layer. The weight matrix to be operated is a weight matrix of m4 xm 4, wherein m4 xm 4 is the size of the weight matrix supported by the data processor.
Alternatively, the fourth obtaining module 1101 may obtain the weight matrix to be operated by adding the same number of extension rows and extension columns to the upper side of the uppermost row and the left side of the leftmost column of the weight matrix.
Optionally, the fourth obtaining module 1101 may set the values of the elements in the extension row and the extension column in the weight matrix to be operated to zero.
The second obtaining operation result module 1102 is configured to perform, by using a data processor, a convolution operation based on feature filling supported by the data processor on the weight matrix to be operated obtained by the fourth obtaining module 1101 and the input feature obtained by the third obtaining module 1100, so as to obtain a convolution operation result.
The second obtaining output feature module 1103 is configured to obtain the output feature of the convolution layer according to the convolution operation result obtained by the second obtaining operation result module 1102.
Optionally, the second obtaining output feature module 1103 may obtain the output feature of the convolution layer according to a part of rows and a part of columns in the convolution result obtained by the second obtaining operation result module 1102.
Exemplary electronic device
An electronic device according to an embodiment of the present disclosure is described below with reference to fig. 12. Fig. 12 shows a block diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 12, the electronic device 121 includes one or more processors 1211 and memory 1212.
The processor 1211 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 121 to perform the desired functions.
Memory 1212 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example: random Access Memory (RAM) and/or cache, etc. The nonvolatile memory may include, for example: read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer readable storage medium that can be executed by the processor 1211 to implement the methods for performing convolution operations and/or other desired functions of the various embodiments of the present disclosure described above. Various contents such as an input signal, a signal component, and a noise component may also be stored in the computer-readable storage medium.
In one example, the electronic device 121 may further include an input device 1213, an output device 1214, and the like, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown). The input device 1213 may include, for example, a keyboard, a mouse, and the like. The output device 1214 can output various information to the outside and may include, for example, a display, speakers, a printer, as well as a communication network and the remote output devices connected thereto.
Of course, only some of the components of the electronic device 121 relevant to the present disclosure are shown in fig. 12, components such as buses, input/output interfaces, and the like are omitted for simplicity. In addition, the electronic device 121 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage medium
In addition to the methods and apparatus described above, embodiments of the present disclosure may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform steps in a method for implementing convolution operations according to various embodiments of the present disclosure described in the above "exemplary methods" section of this specification.
The computer program product may carry program code for performing the operations of embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium, having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a method for implementing convolution operations according to various embodiments of the present disclosure described in the above "exemplary methods" section of the present disclosure.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium may include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present disclosure have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present disclosure are merely examples and not limiting, and these advantages, benefits, effects, etc. are not to be considered as necessarily possessed by the various embodiments of the present disclosure. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, since the disclosure is not necessarily limited to practice with the specific details described.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different manner from other embodiments, so that the same or similar parts between the embodiments are mutually referred to. For system embodiments, the description is relatively simple as it essentially corresponds to method embodiments, and reference should be made to the description of method embodiments for relevant points.
The block diagrams of the devices, apparatuses, equipment, and systems referred to in the present disclosure are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, equipment, and systems may be connected, arranged, or configured in any manner. Words such as "including", "comprising", and "having" are open-ended words that mean "including but not limited to" and may be used interchangeably therewith. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or", unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the apparatus, devices and methods of the present disclosure, components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered equivalent to the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects, and the like, will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (14)

1. A method for implementing a convolution operation, comprising:
acquiring an input characteristic of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer, wherein the spatial resolution of the input characteristic is n1 multiplied by n11, the weight matrix is a weight matrix of m1 multiplied by m1, n1 and n11 are positive integers, m1 is a non-zero even number, and m1 multiplied by m1 is the size of the weight matrix which is not supported by a data processor;
obtaining a feature to be operated and a weight matrix to be operated after row-column expansion according to the input feature and the weight matrix, wherein the spatial resolution of the feature to be operated is n2 multiplied by n22, the weight matrix to be operated is a weight matrix of m2 multiplied by m2, n2 is an integer larger than n1, n22 is an integer larger than n11, m2 is an odd number larger than m1, and m2 multiplied by m2 is the size of the weight matrix supported by a data processor;
Performing convolution operation on the weight matrix to be operated and the characteristics to be operated through the data processor to obtain a convolution operation result;
and obtaining the output characteristics of the convolution layer according to the convolution operation result.
2. The method of claim 1, wherein the obtaining the row-column expanded feature-to-be-operated and weight-to-be-operated matrix according to the input feature and weight matrix comprises:
and obtaining the input feature to be operated and the weight matrix to be operated after the row and column expansion by adopting a mode of adding at least one expansion row and at least one expansion column in the same direction of the input feature and the weight matrix.
3. The method according to claim 2, wherein the obtaining the input feature to be operated and the weight matrix to be operated after the row-column expansion by adding at least one expansion row and at least one expansion column in the same direction of the input feature and the weight matrix respectively includes:
and the input characteristics to be operated and the weight matrix to be operated after the row and column expansion are obtained by adding the same number of expansion rows and the same number of expansion columns in the same direction of the input characteristics and the weight matrix respectively.
4. A method according to claim 3, wherein the obtaining the input feature to be operated and the weight matrix to be operated after the row-column expansion by adding the same number of expansion rows and the same number of expansion columns in the same direction of the input feature and the weight matrix respectively includes:
under the condition that the data processor does not support the feature filling requirement of the convolution layer, the input features to be operated and the weight matrix to be operated after the row and column expansion are obtained by adding the same number of expansion rows and the same number of expansion columns on the upper side of the uppermost row and the left side of the leftmost column of the input features and weight matrix respectively;
and the performing, by the data processor, a convolution operation on the weight matrix to be operated and the feature to be operated, including:
and executing convolution operation on the weight matrix to be operated and the features to be operated through the data processor based on feature filling supported by the data processor.
5. The method of claim 4, wherein the obtaining the output characteristic of the convolutional layer according to the convolution operation result comprises:
and obtaining the output characteristics of the convolution layer according to part of rows and part of columns in the convolution operation result.
6. The method of any one of claims 2 to 5, wherein:
the values of the elements in the extension rows and the extension columns in the weight matrix to be operated are zero;
the values of the elements in the extension rows and the extension columns in the feature to be operated are all arbitrary values.
7. The method of any one of claims 2 to 5, wherein:
the number of the extended rows is the difference value between the number of the rows of the weight matrix of the first convolution kernel supported by the data processor and the number of the rows of the weight matrix of the convolution kernel corresponding to the convolution layer;
the number of the extended columns is the difference value between the number of the columns of the weight matrix of the first convolution kernel supported by the data processor and the number of the columns of the weight matrix of the convolution kernel corresponding to the convolution layer;
wherein the weight matrix of the first convolution kernel is: among the weight matrices supported by the data processor that are larger than the weight matrix of the convolution kernel corresponding to the convolution layer, the weight matrix closest in size to the weight matrix of the convolution kernel corresponding to the convolution layer.
8. A method for implementing a convolution operation, comprising:
acquiring input characteristics of a convolution layer and a weight matrix of a convolution kernel corresponding to the convolution layer, wherein the weight matrix is a weight matrix of m3 multiplied by m3, and m3 is an odd number;
Under the condition that a data processor does not support the characteristic filling requirement of the convolution layer, obtaining a weight matrix to be operated after row-column expansion according to the weight matrix, wherein the weight matrix to be operated is a weight matrix of m4 multiplied by m4, and the m4 multiplied by m4 is the size of the weight matrix supported by the data processor;
performing convolution operation based on feature filling supported by a data processor on the weight matrix to be operated and the input features through the data processor to obtain a convolution operation result;
and obtaining the output characteristics of the convolution layer according to the convolution operation result.
9. The method of claim 8, wherein the obtaining the matrix of weights to be operated after row-column expansion according to the matrix of weights comprises:
and the weight matrix to be operated is obtained by adding the same number of extension rows and extension columns on the upper side of the uppermost row and the left side of the leftmost column of the weight matrix.
10. The method according to claim 8 or 9, wherein:
and the values of the elements in the extension rows and the extension columns in the weight matrix to be operated are zero.
11. An apparatus for performing convolution operations, comprising:
The first acquisition module is used for acquiring the input characteristics of the convolution layer and the weight matrix of the convolution kernel corresponding to the convolution layer, wherein the spatial resolution of the input characteristics is n1 multiplied by n11, the weight matrix is a weight matrix of m1 multiplied by m1, n1 and n11 are positive integers, m1 is a non-zero even number, and m1 multiplied by m1 is the size of the weight matrix which is not supported by the data processor;
the second acquisition module is used for acquiring the characteristics to be operated and the weight matrix to be operated after the row and column expansion according to the input characteristics and the weight matrix acquired by the first acquisition module, wherein the spatial resolution of the characteristics to be operated is n2 multiplied by n22, the weight matrix to be operated is a weight matrix of m2 multiplied by m2, n2 is an integer larger than n1, n22 is an integer larger than n11, m2 is an odd number larger than m1, and m2 multiplied by m2 is the size of the weight matrix supported by the data processor;
the first acquisition operation result module is used for performing convolution operation on the weight matrix to be operated and the feature to be operated, which are obtained by the second acquisition module, through the data processor to obtain a convolution operation result;
the first acquisition output characteristic module is used for acquiring the output characteristics of the convolution layer according to the convolution operation result acquired by the first acquisition operation result module.
12. An apparatus for performing convolution operations, comprising:
the third acquisition module is used for acquiring the input characteristics of the convolution layer and the weight matrix of the convolution kernel corresponding to the convolution layer, wherein the weight matrix is a weight matrix of m3 multiplied by m3, and m3 is an odd number;
a fourth obtaining module, configured to obtain a weight matrix to be operated after row-column expansion according to the weight matrix obtained by the third obtaining module, where the weight matrix to be operated is a weight matrix of m4×m4, and m4×m4 is a size of the weight matrix supported by the data processor, where the data processor does not support a feature filling requirement of the convolutional layer;
the second acquisition operation result module is used for performing convolution operation based on feature filling supported by the data processor on the weight matrix to be operated obtained by the fourth acquisition module and the input features obtained by the third acquisition module through the data processor to obtain a convolution operation result;
and the second acquisition output characteristic module is used for acquiring the output characteristics of the convolution layer according to the convolution operation result acquired by the second acquisition operation result module.
13. A computer readable storage medium storing a computer program for performing the method of any one of the preceding claims 1-10.
14. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor being configured to read the executable instructions from the memory and execute the instructions to implement the method of any of the preceding claims 1-10.
CN201911066229.4A 2019-11-04 2019-11-04 Method, device, medium and electronic equipment for realizing convolution operation Active CN112766474B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911066229.4A CN112766474B (en) 2019-11-04 2019-11-04 Method, device, medium and electronic equipment for realizing convolution operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911066229.4A CN112766474B (en) 2019-11-04 2019-11-04 Method, device, medium and electronic equipment for realizing convolution operation

Publications (2)

Publication Number Publication Date
CN112766474A CN112766474A (en) 2021-05-07
CN112766474B true CN112766474B (en) 2024-03-22

Family

ID=75692286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911066229.4A Active CN112766474B (en) 2019-11-04 2019-11-04 Method, device, medium and electronic equipment for realizing convolution operation

Country Status (1)

Country Link
CN (1) CN112766474B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109542512A (en) * 2018-11-06 2019-03-29 腾讯科技(深圳)有限公司 A kind of data processing method, device and storage medium
WO2019119301A1 (en) * 2017-12-20 2019-06-27 华为技术有限公司 Method and device for determining feature image in convolutional neural network model
CN110135556A (en) * 2019-04-04 2019-08-16 平安科技(深圳)有限公司 Neural network accelerated method, device, computer equipment and storage medium based on systolic arrays
KR102038390B1 (en) * 2018-07-02 2019-10-31 한양대학교 산학협력단 Artificial neural network module and scheduling method thereof for highly effective parallel processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Shuang Wu et al. Convolution with Even-Sized Kernels and Symmetric Padding. Computer Vision and Pattern Recognition. 2019, pp. 2-8. *

Also Published As

Publication number Publication date
CN112766474A (en) 2021-05-07

Similar Documents

Publication Publication Date Title
CN107844828B (en) Convolution calculation method in neural network and electronic device
EP4357979A2 (en) Superpixel methods for convolutional neural networks
EP3637281A1 (en) Operational accelerator
US20230026006A1 (en) Convolution computation engine, artificial intelligence chip, and data processing method
EP3467679B1 (en) Data processing method and device
KR102129895B1 (en) Method and apparatus for performing convolution operation on folded feature date
WO2018077295A1 (en) Data processing method and apparatus for convolutional neural network
KR102247907B1 (en) Video   Compression   Detection   Reconfiguration   Method, system, electronic device and storage medium
US20220253672A1 (en) Sparse attention neural networks
US20230068450A1 (en) Method and apparatus for processing sparse data
US11397791B2 (en) Method, circuit, and SOC for performing matrix multiplication operation
US9268875B2 (en) Extensible content focus mode
CN106528490B (en) FPGA heterogeneous acceleration computing device and system
CN111028136B (en) Method and equipment for processing two-dimensional complex matrix by artificial intelligence processor
CN115481732A (en) Method and apparatus for processing feature maps via an artificial intelligence accelerator
CN111125628A (en) Method and apparatus for processing two-dimensional data matrix by artificial intelligence processor
CN112766474B (en) Method, device, medium and electronic equipment for realizing convolution operation
US20220405349A1 (en) Data processing method and apparatus, and related product
CN107977923B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
US20230083565A1 (en) Image data processing method and apparatus, storage medium, and electronic device
CN115035384B (en) Data processing method, device and chip
WO2020146098A1 (en) Dynamic minibatch sizes
WO2014158195A1 (en) Interactive slide deck
CN114239803B (en) Compiling method and device of neural network model, electronic equipment and storage medium
Hishinuma et al. AVX acceleration of DD arithmetic between a sparse matrix and vector

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant