WO2019119301A1 - Method and apparatus for determining feature images in a convolutional neural network model - Google Patents

Method and apparatus for determining feature images in a convolutional neural network model

Info

Publication number
WO2019119301A1
Authority
WO
WIPO (PCT)
Prior art keywords
convolution kernels
convolution
feature image
neural network
convolutional neural
Prior art date
Application number
PCT/CN2017/117503
Other languages
English (en)
French (fr)
Inventor
胡慧
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2017/117503 priority Critical patent/WO2019119301A1/zh
Priority to CN201780096076.0A priority patent/CN111247527B/zh
Publication of WO2019119301A1 publication Critical patent/WO2019119301A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Definitions

  • the present disclosure relates to the field of model training techniques, and more particularly to a method and apparatus for determining feature images in a convolutional neural network model.
  • A convolutional neural network consists of convolutional layers, fully connected layers, activation functions, etc., and the output of a single convolutional layer includes a plurality of feature images.
  • In the process of training a convolutional neural network model, a large number of samples need to be calculated.
  • The computation generated in the convolutional layers accounts for 90% of the total computation in the whole training process.
  • For any convolutional layer, the number of convolution kernels can be determined according to the number of input images and the number of output feature images, and a corresponding number of convolution kernels can be generated.
  • Each convolution kernel can be a small matrix, such as a 3×3 matrix, and each input image can be regarded as a large matrix.
  • The processing of the convolutional layer can be as follows: convolution calculation is performed on an input image and a convolution kernel. Specifically, all matrices of the same size as the convolution kernel are extracted from the input image, each extracted matrix is element-wise multiplied with the convolution kernel and the products are summed to obtain a value, and all the obtained values form an intermediate matrix. Convolving each input image with a convolution kernel yields an intermediate matrix, and adding these intermediate matrices yields a feature image.
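The per-layer processing described above can be sketched in NumPy. This is an illustrative reconstruction of plain convolution as the background describes it (extract every kernel-sized patch, multiply element-wise, sum, then add the per-input intermediate matrices), not the patent's optimized scheme; all names are illustrative.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image; at each position, multiply the
    extracted patch element-wise with the kernel and sum, producing one
    value of the intermediate matrix ("valid" convolution, stride 1)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def feature_image(inputs, kernels):
    """One feature image: convolve each input image with its own kernel
    and sum the resulting intermediate matrices."""
    return sum(conv2d_valid(img, k) for img, k in zip(inputs, kernels))

# toy check: two 4x4 inputs, two 3x3 kernels -> one 2x2 feature image
rng = np.random.default_rng(0)
inputs = [rng.standard_normal((4, 4)) for _ in range(2)]
kernels = [rng.standard_normal((3, 3)) for _ in range(2)]
print(feature_image(inputs, kernels).shape)  # prints (2, 2)
```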
  • Since a convolutional neural network contains many convolutional layers, each convolutional layer needs to output many feature images, and the number of convolution kernels corresponding to each feature image is also large.
  • The computation corresponding to each convolution kernel is already considerable, and the total computation over the whole training process grows exponentially. Therefore, the computation generated in the convolutional layers is huge and occupies a large amount of processing resources.
  • a method of determining a feature image in a convolutional neural network model comprising:
  • The method provided in this embodiment acquires a plurality of input images and generates at least one set of convolution kernels, where different convolution kernels in the same set contain the same elements in different orders; based on each convolution kernel in the at least one set, convolution calculation is performed on different input images to obtain a plurality of intermediate matrices, and the intermediate matrices are summed to obtain a feature image.
  • Because different convolution kernels in a set contain the same elements in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, and both the computation generated when determining feature images in the convolutional layer and the system resources consumed during the calculation are reduced.
  • In a possible implementation, summing the multiple intermediate matrices to obtain a feature image includes: adding the polynomials of the elements at the same position in the multiple intermediate matrices to obtain the polynomial corresponding to each element of the feature image; combining like terms in each of these polynomials; and evaluating each polynomial after combining like terms to obtain the feature image.
  • For a polynomial whose like terms have not been combined, the total number of multiply-add operations required is far greater than for a polynomial whose like terms have been combined. It can be seen that as the number of convolution kernels in a set increases, and over the whole calculation process of determining the feature image, the places where computation can be reduced increase greatly, which ultimately speeds up the determination of the feature image.
  • In a possible implementation, before acquiring the at least one set of convolution kernels of the target processing layer, the method further includes: randomly generating N convolution kernels, where N is a preset number of sets; and for each of the N convolution kernels, performing element displacement in units of rows and/or in units of columns to obtain M-1 different convolution kernels.
  • The M-1 convolution kernels together with the convolution kernel before element displacement constitute a set of convolution kernels of the target processing layer, where M is the preset number of convolution kernels in a set.
  • Because different convolution kernels in a set contain the same elements in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, and the computation generated when determining feature images in the convolutional layer is reduced.
  • In a possible implementation, the number of convolution kernels in each set is greater than two and less than the product of the number of rows and the number of columns of a convolution kernel.
  • In a possible implementation, after summing the multiple intermediate matrices to obtain the feature image, the method further includes:
  • when the output result of the convolutional neural network model is obtained, determining an adjustment value for each element in each convolution kernel of the at least one set of convolution kernels according to the output result and a preset output result;
  • determining the sum of the adjustment values of the same element contained in different convolution kernels of the same set as the corrected adjustment value corresponding to the adjustment value of that element;
  • adjusting each convolution kernel based on the corrected adjustment value of each element.
  • In implementation, there are multiple convolutional layers in the convolutional neural network model.
  • The first convolutional layer through the (Z-1)-th convolutional layer output feature images, while the last layer, the Z-th convolutional layer, outputs the final result.
  • When the output result of the convolutional neural network model is obtained, since the model is still being trained, there is generally an error between the output result and the preset output result.
  • Based on the error produced by the whole convolutional neural network model, the adjustment value of each element in each convolution kernel of the sets of convolution kernels can be determined.
  • Then, the sum of the adjustment values of the same element contained in different convolution kernels of the same set is determined as the corrected adjustment value corresponding to the adjustment value of that element.
  • In a second aspect, an apparatus for determining a feature image in a convolutional neural network model is provided, the apparatus comprising at least one module configured to implement the method of determining a feature image in a convolutional neural network model provided by the first aspect above.
  • In a third aspect, a terminal is provided, comprising a processor and a memory, the processor being configured to execute instructions stored in the memory; by executing the instructions, the processor implements the method of determining a feature image in a convolutional neural network model provided by the first aspect.
  • In a fourth aspect, a computer-readable storage medium comprising instructions is provided; when the computer-readable storage medium runs on a source server, the instructions cause the source server to perform the method of determining a feature image in a convolutional neural network model provided by the first aspect above.
  • In a fifth aspect, a computer program product comprising instructions is provided; when the computer program product runs on a source server, the instructions cause the source server to perform the method of determining a feature image in a convolutional neural network model provided by the first aspect above.
  • The method provided in this embodiment acquires a plurality of input images and generates a plurality of sets of convolution kernels, where different convolution kernels in the same set contain the same elements in different orders, and determines at least one feature image corresponding to the plurality of input images.
  • Because different convolution kernels in a set contain the same elements in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, and both the computation generated when determining feature images in the convolutional layer and the system resources consumed during the calculation are reduced.
  • FIG. 1 is a schematic structural diagram of a terminal according to an exemplary embodiment
  • FIG. 2 is a flow chart showing a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 3 is a schematic flowchart diagram of a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 4 is a schematic flowchart diagram of a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 5 is a schematic flowchart diagram of a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 6 is a schematic flowchart diagram of a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 7 is a schematic flowchart diagram of a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 8 is a flow chart showing a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 9 is a flow chart showing a method for determining a feature image in a convolutional neural network model, according to an exemplary embodiment
  • FIG. 10 is a schematic structural diagram of an apparatus for determining a feature image in a convolutional neural network model, according to an exemplary embodiment.
  • the embodiment of the invention provides a method for determining a feature image in a convolutional neural network model, and the execution body of the method is a terminal.
  • The terminal can include a processor 110 and a memory 120, and the processor 110 can be coupled to the memory 120, as shown in FIG. 1.
  • The processor 110 may include one or more processing units; the processor 110 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), and so on.
  • the program can include program code, the program code including computer operating instructions.
  • the terminal may also include a memory 120 that may be used to store software programs and modules, and the processor 110 performs tasks by reading software code stored in the memory 120 and modules.
  • The terminal may further include a receiver 130 and a transmitter 140, which may be respectively connected to the processor 110; the receiver 130 and the transmitter 140 may be collectively referred to as a transceiver.
  • the transmitter 140 can be used to transmit messages or data.
  • The transmitter 140 can include, but is not limited to, at least one amplifier, a tuner, one or more oscillators, a coupler, an LNA (Low Noise Amplifier), a duplexer, and so on.
  • An exemplary embodiment of the present disclosure provides a method for determining a feature image in a convolutional neural network model. As shown in FIG. 2, the process flow of the method may include the following steps:
  • Step S210 Acquire a plurality of input images of the target processing layer in the convolutional neural network model.
  • Before training, the structure of the convolutional neural network model can be preset, such as the number of convolutional layers contained in the model, and the number of input images, convolution kernels, and output feature images in each layer, and so on.
  • At this time, the element values of the convolution kernels used to convolve the input images to obtain the output images are random.
  • The convolution kernel may be a matrix, and an element of the convolution kernel is the value at any position in the matrix (the position determined by its row and column). For a convolution kernel of size 3×3, there are 9 values in 3 rows and 3 columns.
  • The target processing layer is a convolutional layer in the convolutional neural network model.
  • The feature images output by the previous convolutional layer pass through other layers such as the pooling layer and the ReLU (activation function) layer, and the resulting output images are the input images of this convolutional layer.
  • Each feature image in each convolutional layer in the convolutional neural network model can be determined using the method provided in this embodiment.
  • Step S220 acquiring at least one set of convolution kernels of the target processing layer.
  • The multi-dimensional tensor composed of all convolution kernels in each set (a matrix of three or more dimensions is a tensor) may be a tensor with a special structure arranged according to a certain rule.
  • The purpose is to make the elements in each set of convolution kernels repeat, so that when calculating with the elements of the convolution kernels, the amount of calculation can be reduced by combining like terms.
  • The size of a convolution kernel is generally 3×3 or 5×5, and the height and width of a convolution kernel are generally equal. Different convolution kernels in the same set contain the same elements in different orders.
  • Multiple sets of convolution kernels may be generated in units of sets. For example, as shown in FIG. 3, in a certain convolutional layer there are six input images in total, and correspondingly six convolution kernels, one for each input image. These six convolution kernels can be grouped; for example, convolution kernels 1-3 form one set and convolution kernels 4-6 form another.
  • Before acquiring the at least one set of convolution kernels, the method provided in this embodiment may further include: randomly generating N convolution kernels, where N is a preset number of sets; and for each of the N convolution kernels, performing element displacement in units of rows and/or in units of columns to obtain M-1 different convolution kernels. The M-1 convolution kernels together with the convolution kernel before element displacement constitute a set of convolution kernels of the target processing layer, where M is the preset number of convolution kernels in a set.
  • For example, convolution kernels 1-3 form one set and convolution kernels 4-6 form another.
  • First, two convolution kernels, namely convolution kernel 1 and convolution kernel 4, are randomly generated.
  • Then convolution kernel 1 is subjected to element displacement. If the size of convolution kernel 1 is 3×3, as shown in FIG. 4, element displacement is performed on convolution kernel 1 in units of columns to obtain convolution kernel 2 and convolution kernel 3.
  • W0-W8 are elements in the convolution kernel.
  • the convolution kernel 4-6 is generated in the same way.
  • Optionally, the number M of convolution kernels in each set is greater than 2 and less than the product of the number of rows and the number of columns of a convolution kernel.
  • The maximum M is no more than 9, because once M exceeds 9, for a convolution kernel of size 3×3, all possible modes of element displacement in units of rows and in units of columns have already been used, and the 10th convolution kernel must repeat one of the first nine. That is, to ensure that different convolution kernels in the same set contain the same elements in different orders, the maximum M must not exceed the product of the number of rows and the number of columns of the convolution kernel.
  • Further, convolution kernel 1 can be subjected to element displacement in units of columns to obtain convolution kernel 2 and convolution kernel 3, and then convolution kernel 1 can be subjected to element displacement in units of rows to obtain convolution kernel 4 and convolution kernel 5.
  • Alternatively, convolution kernel 2 may be subjected to element displacement in units of rows to obtain convolution kernel 4 and convolution kernel 5, and so on.
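The row/column element-displacement procedure can be sketched as follows, assuming cyclic (wrap-around) shifts so that every kernel in a set keeps exactly the same elements. The function name and the way shift pairs are enumerated are illustrative, not from the patent.

```python
import numpy as np

def kernel_group(base, m):
    """Derive a set of m convolution kernels from one randomly generated
    base kernel by cyclic element displacement: the i-th kernel is the
    base rolled by (i // cols) rows and (i % cols) columns. All kernels
    in the set therefore contain the same elements in different orders."""
    rows, cols = base.shape
    # beyond rows*cols distinct (row, column) shift pairs, kernels repeat
    assert 2 < m <= rows * cols
    group = []
    for i in range(m):
        r_shift, c_shift = divmod(i, cols)
        group.append(np.roll(np.roll(base, c_shift, axis=1), r_shift, axis=0))
    return group

base = np.arange(9).reshape(3, 3)      # stand-in for a random 3x3 kernel
group = kernel_group(base, 3)          # e.g. kernels 1-3 of one set
print([k[0].tolist() for k in group])  # prints [[0, 1, 2], [2, 0, 1], [1, 2, 0]]
```

Only the base kernel of each set needs to be stored and read; the other M-1 kernels are index permutations of it, which is where the storage and read savings come from.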
  • Step S230 performing convolution calculation on different input images based on each convolution kernel in at least one set of convolution kernels to obtain a plurality of intermediate matrices, and summing the plurality of intermediate matrices to obtain a feature image.
  • Each element of an intermediate matrix is a polynomial obtained by element-wise multiplying the corresponding convolution kernel with the input image and adding the products during the convolution calculation.
  • the convolution layer is required to output two feature images, that is, the feature image 1 and the feature image 2.
  • The four input images are convolved with the four convolution kernels 1-4 to obtain intermediate matrices 1-4, and feature image 1 can be obtained based on intermediate matrices 1-4.
  • The four input images are also convolved with the four convolution kernels 5-8 to obtain intermediate matrices 5-8, and feature image 2 can be obtained based on intermediate matrices 5-8.
  • the convolution kernels 1-4 can be divided into multiple sets of convolution kernels, and the convolution kernels 5-8 can also be divided into multiple sets of convolution kernels.
  • the number of convolution kernels corresponding to one feature image is large, and the convolution kernel corresponding to one feature image can be divided into multiple sets of convolution kernels.
  • The different convolution kernels in the same set contain the same elements, and the order in which the elements are arranged differs.
  • The 3×3 adjacent elements in input image 1 and the elements of the convolution kernel at corresponding positions are multiplied and the products are added to obtain a polynomial. The convolution kernel is then shifted across input image 1 by a preset number of rows or columns, and the multiply-add polynomial operation is repeated on the 3×3 adjacent elements at the new position, until the convolution kernel has traversed all 3×3 adjacent elements of the input image, resulting in intermediate matrix 1.
  • The step of summing the plurality of intermediate matrices to obtain the feature image may include: adding the polynomials of the elements at the same position in the plurality of intermediate matrices to obtain the polynomial corresponding to each element of the feature image; combining like terms in the polynomial corresponding to each element; and evaluating each polynomial after combining like terms to obtain the feature image.
  • the polynomials of the plurality of intermediate matrices may be simultaneously determined by multiple channels, and the polynomials of the same position are added.
  • Convolution calculation is performed on the 3×3 adjacent elements in the upper left corners of input image 1, input image 2, and input image 3 using convolution kernel 1, convolution kernel 2, and convolution kernel 3, respectively.
  • This yields the element in the first row and first column of intermediate matrix 1, intermediate matrix 2, and intermediate matrix 3, respectively.
  • The polynomial corresponding to the element in the first row and first column of intermediate matrix 1 is: W0·a0 + W1·a1 + W2·a2 + W3·a3 + W4·a4 + W5·a5 + W6·a6 + W7·a7 + W8·a8.
  • The polynomial corresponding to the element in the first row and first column of intermediate matrix 2 is: W2·b0 + W0·b1 + W1·b2 + W5·b3 + W3·b4 + W4·b5 + W8·b6 + W6·b7 + W7·b8.
  • The polynomial corresponding to the element in the first row and first column of intermediate matrix 3 is: W1·c0 + W2·c1 + W0·c2 + W4·c3 + W5·c4 + W3·c5 + W7·c6 + W8·c7 + W6·c8.
  • Specifically, the polynomials of the elements at the same position in all the intermediate matrices corresponding to the feature image are added, which includes adding the polynomials of the elements at the same position of intermediate matrix 1, intermediate matrix 2, and intermediate matrix 3.
  • In this way, the amount of computation can be reduced by 18 multiplications for this small part of the operations in determining the feature image.
  • As the calculation proceeds, the places where the amount of computation can be reduced increase greatly, which ultimately speeds up the determination of the feature image.
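The saving from combining like terms can be checked numerically. The sketch below assumes a set of kernels related by a cyclic shift of the flattened weights (a simplification of the figure's row/column displacement); with 3 kernels sharing 9 weights, 27 multiplications collapse to 9, a saving of 18, matching the count above. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
patches = rng.standard_normal((3, 3, 3))  # 3x3 patches from inputs 1-3
w = rng.standard_normal(9)                # shared weights W0..W8

# kernel k holds the same weights, cyclically shifted k positions
kernels = np.stack([np.roll(w, k).reshape(3, 3) for k in range(3)])

# naive evaluation: 3 kernels x 9 weights = 27 multiplications
naive = sum(np.sum(kernels[k] * patches[k]) for k in range(3))

# like terms combined: for each weight W_i, first ADD the input elements
# it multiplies across the three patches, then multiply once
# -> 9 multiplications instead of 27
flat = patches.reshape(3, 9)
coeff = np.zeros(9)
for k in range(3):
    for i in range(9):
        # in kernel k, W_i sits at flattened position (i + k) % 9
        coeff[i] += flat[k, (i + k) % 9]
combined = np.dot(w, coeff)

print(np.isclose(naive, combined))  # prints True
```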
  • After summing the multiple intermediate matrices to obtain the feature image, the method provided in this embodiment further includes: when the output result of the convolutional neural network model is obtained, determining an adjustment value for each element in each convolution kernel of the at least one set of convolution kernels according to the output result of the model and a preset output result; determining the sum of the adjustment values of the same element contained in different convolution kernels of the same set as the corrected adjustment value corresponding to the adjustment value of that element; and adjusting each convolution kernel based on the corrected adjustment value of each element.
  • In implementation, there are multiple convolutional layers in the convolutional neural network model.
  • The first convolutional layer through the (Z-1)-th convolutional layer output feature images, while the last layer, the Z-th convolutional layer, outputs the final result.
  • When the output result of the convolutional neural network model is obtained, since the model is still being trained, there is generally an error between the output result and the preset output result. Based on the error produced by the whole convolutional neural network model, the adjustment value of each element in each convolution kernel of the sets of convolution kernels can be determined.
  • Then, the sum of the adjustment values of the same element contained in different convolution kernels of the same set is determined as the corrected adjustment value corresponding to the adjustment value of that element.
  • For example, convolution kernel 1, convolution kernel 2, and convolution kernel 3 respectively convolve, through 3 channels, the 3×3 adjacent elements in input image 1, the 3×3 adjacent elements in input image 2, and the 3×3 adjacent elements in input image 3. As shown in FIG. 9, the corrected adjustment value corresponding to the adjustment value of the same element is calculated by the following formula:
  • Δw is the corrected adjustment value corresponding to the adjustment value of the same element.
  • WH is the product of the width and the height of the feature image.
  • δRk is the sensitivity; the R in δRk denotes the R-th feature image of the target processing layer.
  • w_size² is the product of the width of the convolution kernel and the height of the convolution kernel.
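A minimal sketch of the corrected-adjustment idea: because each shared weight appears once in every kernel of a set (at a shifted position), its corrected adjustment value is the sum of the per-kernel adjustment values it receives. The flattened-shift layout, the learning rate, and all names are assumptions for illustration; the patent's exact formula (with WH, δRk, and w_size²) is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
# hypothetical per-kernel adjustment values (gradients) for one set of
# three 3x3 kernels that share the weights W0..W8 at shifted positions
grads = rng.standard_normal((3, 3, 3)).reshape(3, 9)

# kernel k holds W_i at flattened position (i + k) % 9, so the corrected
# adjustment value of W_i sums the adjustments at those positions
corrected = np.array([sum(grads[k, (i + k) % 9] for k in range(3))
                      for i in range(9)])

w = rng.standard_normal(9)   # the shared weights of the set
lr = 0.01                    # hypothetical learning rate
w -= lr * corrected          # one update of the nine shared elements
```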
  • The method provided in this example was tested. Specifically, the Cifar10 data set was used for image recognition training, and the convolutional neural network model was designed as a 3-layer model with a convolution kernel size of 5×5 in each layer. The test results are shown in the following table:
  • The method provided in this example was also tested in the image super-resolution field. Specifically, the convolutional neural network model was trained to generate a new image that enlarges the original image to 3 times its size.
  • The convolutional neural network model was designed as a 3-layer model with a convolution kernel size of 5×5. The test results are shown in the following table:
  • PSNR is a commonly used measure in image super-resolution applications.
  • BaseHisrcnn is a convolutional neural network structure applied to image super-resolution.
  • The method provided in this embodiment acquires a plurality of input images and generates a plurality of sets of convolution kernels, where different convolution kernels in the same set contain the same elements in different orders, and determines at least one feature image corresponding to the plurality of input images.
  • Because different convolution kernels in a set contain the same elements in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, and both the computation generated when determining feature images in the convolutional layer and the system resources consumed during the calculation are reduced.
  • Yet another exemplary embodiment of the present disclosure provides an apparatus for determining a feature image in a convolutional neural network model, as shown in FIG. 10, the apparatus comprising:
  • the obtaining module 1010 is configured to acquire a plurality of input images of the target processing layer in the convolutional neural network model; and acquire at least one set of convolution kernels of the target processing layer.
  • the different convolution kernels in the same group contain the same elements and the order of the elements is different.
  • With the obtaining module 1010, the obtaining functions in steps S210 and S220 above, as well as other implicit steps, can be implemented.
  • The determining module 1020 is configured to perform convolution calculation on different input images based on each convolution kernel in the at least one set of convolution kernels to obtain a plurality of intermediate matrices, and to sum the plurality of intermediate matrices to obtain the feature image, wherein each element of an intermediate matrix is a polynomial obtained by element-wise multiplying the corresponding convolution kernel with the input image and adding the products during the convolution calculation.
  • With the determining module 1020, the function in step S230 above, as well as other implicit steps, can be implemented.
  • the determining module 1020 is configured to add polynomials of elements of the same position in the plurality of intermediate matrices to obtain a polynomial corresponding to each element of the feature image; and each element of the feature image Corresponding polynomials are respectively processed by combining similar items; each polynomial processed by combining the same items is separately evaluated to obtain the feature image.
  • the device further includes:
  • a generating module configured to randomly generate N convolution kernels, where the N is a preset number of groups
  • a displacement module configured to perform element displacement on each of the N convolution kernels in units of rows, and/or element displacement in column units to obtain M-1 different convolution kernels
  • the M-1 convolution kernels and the convolution kernels before the element displacement constitute a set of convolution kernels of the target processing layer, wherein the M is the number of convolution kernels in the preset group.
  • the number of convolution kernels in each group is greater than two and less than the product of the number of rows of convolution kernels and the number of columns.
  • the determining module 1020 is further configured to: when the output result of the convolutional neural network model is obtained, determine the at least one group according to an output result of the convolutional neural network model and a preset output result.
  • the adjustment value of each element in each convolution kernel in the convolution kernel; the sum of the adjustment values of the same elements included in the different convolution kernels in the same group is determined as the corrected value corresponding to the adjustment value of the same element Adjustment value
  • the device also includes an adjustment module:
  • the adjustment module is configured to adjust each convolution kernel based on the corrected adjustment value of each element.
  • It should be noted that the obtaining module 1010 and the determining module 1020 described above may be implemented by a processor, or by a processor in cooperation with a memory, where the processor executes program instructions stored in the memory.
  • Because different convolution kernels in a set contain the same elements in different orders, the resources occupied by storing the convolution kernels are reduced, the number of times the convolution kernels are read is reduced, and the computation generated when determining feature images in the convolutional layer is reduced.
  • The apparatus for determining a feature image in a convolutional neural network model provided by the above embodiment is illustrated only by the division of the functional modules described above. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the terminal may be divided into different functional modules to complete all or part of the functions described above.
  • The apparatus for determining a feature image in a convolutional neural network model provided by the above embodiment belongs to the same concept as the corresponding method embodiment; for the specific implementation process, refer to the method embodiment, and details are not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A method and apparatus for determining feature images in a convolutional neural network model, belonging to the field of model training techniques. The method includes: acquiring a plurality of input images of a target processing layer in a convolutional neural network model (S210); acquiring at least one set of convolution kernels of the target processing layer (S220), where different convolution kernels in the same set contain the same elements in different orders; and, based on each convolution kernel in the at least one set of convolution kernels, performing convolution calculation on different input images to obtain a plurality of intermediate matrices, and summing the plurality of intermediate matrices to obtain a feature image (S230). By exploiting the property that different convolution kernels contain the same elements in different orders, the method reduces the resources occupied by storing the convolution kernels, the number of times the convolution kernels are read, the computation generated when determining feature images in the convolutional layer, and the system resources consumed during the calculation.

Description

在卷积神经网络模型中确定特征图像的方法和装置 技术领域
本公开是关于模型训练技术领域,尤其是关于一种在卷积神经网络模型中确定特征图像的方法和装置。
背景技术
卷积神经网络由卷积层、全连接层、激活函数等组成,单个卷积层的输出包括多个特征图像。在对卷积神经网络模型进行训练的过程中,需要对大量的样本进行计算。其中,在卷积层产生的计算量就占整个训练过程中总计算量的90%。
对于任意一个卷积层,可以根据输入图像的数量和输出的特征图像的数量,确定卷积核的数量,并生成相应数量的卷积核,每个卷积核可以是一个小矩阵,如3×3矩阵,每个输入图像可以认为是一个大矩阵。该卷积层的处理可以如下:将一个输入图像和一个卷积核进行卷积计算,具体地,在输入图像中提取所有与卷积核大小相同的矩阵,将提取的矩阵与卷积核进行对位元素相乘再相加,得到一个数值,将得到的所有数值组成一个中间矩阵,每个输入图像与一个卷积核进行卷积计算都可以得到一个中间矩阵,这些中间矩阵相加可以得到一个特征图像。
In the process of implementing the present disclosure, the inventors found at least the following problems:
Because a convolutional neural network contains many convolutional layers, each layer needs to output many feature images, and each feature image corresponds to many convolution kernels. The computation for each individual kernel is already large, and the total computation over the whole training process grows exponentially. The convolutional layers therefore generate a huge amount of computation and occupy substantial processing resources.
Summary
To overcome the problems existing in the related art, the present disclosure provides the following technical solutions:
According to a first aspect, a method for determining a feature image in a convolutional neural network model is provided, the method including:
obtaining multiple input images of a target processing layer in a convolutional neural network model;
obtaining at least one group of convolution kernels of the target processing layer, where different convolution kernels in the same group contain the same elements arranged in different orders;
performing convolution computation on different input images with each convolution kernel in the at least one group of convolution kernels to obtain multiple intermediate matrices, and summing the multiple intermediate matrices to obtain a feature image, where each element of an intermediate matrix is a polynomial obtained by multiplying aligned elements of the corresponding convolution kernel and input image and adding the products during the convolution computation.
In the method provided by this embodiment, multiple input images are obtained; at least one group of convolution kernels is generated, where different kernels in the same group contain the same elements in different orders; convolution is performed on different input images with each kernel in the at least one group to obtain multiple intermediate matrices, which are summed into a feature image. The property that different kernels in a group contain the same elements in different orders can be used to reduce the resources occupied by storing convolution kernels, the number of times the kernels are read, the amount of computation generated when determining feature images at the convolutional layer, and the system resources consumed during computation.
In a possible implementation, summing the multiple intermediate matrices to obtain the feature image includes:
adding the polynomials of elements at the same position in the multiple intermediate matrices to obtain the polynomial corresponding to each element of the feature image;
merging like terms in the polynomial corresponding to each element of the feature image;
evaluating each polynomial after like-term merging, to obtain the feature image.
A polynomial whose like terms have not been merged requires far more multiplications and additions than one whose like terms have been merged. As the number of kernels in a group grows and the whole computation for determining a feature image proceeds, the opportunities to cut down the amount of computation increase greatly, which ultimately speeds up the determination of feature images.
In a possible implementation, before obtaining the at least one group of convolution kernels of the target processing layer, the method further includes:
randomly generating N convolution kernels, where N is a preset number of groups;
for each of the N convolution kernels, shifting elements row-wise and/or column-wise to obtain M-1 different convolution kernels, the M-1 kernels together with the kernel before element shifting forming one group of convolution kernels of the target processing layer, where M is a preset number of kernels per group.
The property that different kernels in a group contain the same elements in different orders can be used to reduce the resources occupied by storing convolution kernels, the number of times the kernels are read, the amount of computation generated when determining feature images at the convolutional layer, and the system resources consumed during computation.
In a possible implementation, the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and the number of columns of a kernel.
In a possible implementation, after summing the multiple intermediate matrices to obtain the feature image, the method further includes:
when an output result of the convolutional neural network model is obtained, determining an adjustment value for each element of each convolution kernel in the at least one group according to the output result of the model and a preset output result;
determining the sum of the adjustment values of the same element contained in different convolution kernels of the same group as the corrected adjustment value for that element;
adjusting each convolution kernel based on the corrected adjustment value of each element.
In implementation, a convolutional neural network model contains multiple convolutional layers: layers 1 through Z-1 output feature images, and the last layer, layer Z, outputs the final output result. When the output result of the model is obtained, because the model is still being trained, there is generally an error between the output result and the preset output result. Based on the error produced by the whole model, the adjustment value of each element of each kernel in the groups can be determined. Then, the sum of the adjustment values of the same element contained in different kernels of the same group is determined as the corrected adjustment value for that element.
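The correction rule described above can be sketched as follows (an illustrative Python sketch; the per-kernel adjustment values and the element orderings are hypothetical stand-ins for what backpropagation would produce, with orders consistent with cyclic column shifts):

```python
import numpy as np

# Hypothetical per-kernel adjustment values for a group of three 3x3
# kernels sharing elements W0..W8. Each row of `perms` gives the element
# order inside one kernel of the group.
rng = np.random.default_rng(1)
grads = rng.standard_normal((3, 9))      # raw adjustment per kernel position
perms = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8],
                  [2, 0, 1, 5, 3, 4, 8, 6, 7],
                  [1, 2, 0, 4, 5, 3, 7, 8, 6]])

# Corrected adjustment of element Wk: the sum of the raw adjustments
# computed at Wk's position in every kernel of the group.
corrected = np.zeros(9)
for kernel_grads, perm in zip(grads, perms):
    for pos, elem in enumerate(perm):
        corrected[elem] += kernel_grads[pos]

# Every kernel is then updated with the same corrected values, which
# preserves the shared-element structure of the group.
```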
According to a second aspect, an apparatus for determining a feature image in a convolutional neural network model is provided, the apparatus including at least one module configured to implement the method for determining a feature image in a convolutional neural network model provided in the first aspect.
According to a third aspect, a terminal is provided, the terminal including a processor and a memory, the processor being configured to execute instructions stored in the memory; by executing the instructions, the processor implements the method for determining a feature image in a convolutional neural network model provided in the first aspect.
According to a fourth aspect, a computer-readable storage medium is provided, including instructions that, when the computer-readable storage medium runs on a source server, cause the source server to perform the method for determining a feature image in a convolutional neural network model provided in the first aspect.
According to a fifth aspect, a computer program product containing instructions is provided that, when the computer program product runs on a source server, causes the source server to perform the method for determining a feature image in a convolutional neural network model provided in the first aspect.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:
In the method provided by this embodiment, multiple input images are obtained; multiple groups of convolution kernels are generated, where different kernels in the same group contain the same elements in different orders; and at least one feature image corresponding to the multiple input images is determined. The property that different kernels in a group contain the same elements in different orders can be used to reduce the resources occupied by storing convolution kernels, the number of times the kernels are read, the amount of computation generated when determining feature images at the convolutional layer, and the system resources consumed during computation.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a schematic structural diagram of a terminal according to an exemplary embodiment;
FIG. 2 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 3 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 4 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 5 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 6 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 7 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 8 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 9 is a schematic flowchart of a method for determining a feature image in a convolutional neural network model according to an exemplary embodiment;
FIG. 10 is a schematic structural diagram of an apparatus for determining a feature image in a convolutional neural network model according to an exemplary embodiment.
Detailed Description
Exemplary embodiments are described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the disclosure as detailed in the appended claims.
An embodiment of the present invention provides a method for determining a feature image in a convolutional neural network model, performed by a terminal.
The terminal may include a processor 110 and a memory 120, with the processor 110 connected to the memory 120, as shown in FIG. 1. The processor 110 may include one or more processing units; the processor 110 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device.
Specifically, a program may include program code, and the program code includes computer operation instructions. The terminal may further include the memory 120, which may be used to store software programs and modules; the processor 110 performs tasks by reading the software code and modules stored in the memory 120.
In addition, the terminal may further include a receiver 130 and a transmitter 140, each of which may be connected to the processor 110; the transmitter 140 and the receiver 130 may be collectively referred to as a transceiver. The transmitter 140 may be used to send messages or data, and may include, but is not limited to, at least one amplifier, a tuner, one or more oscillators, a coupler, an LNA (low noise amplifier), a duplexer, and so on.
An exemplary embodiment of the present disclosure provides a method for determining a feature image in a convolutional neural network model. As shown in FIG. 2, the processing flow of the method may include the following steps:
Step S210: obtain multiple input images of a target processing layer in a convolutional neural network model.
In implementation, when training a convolutional neural network model, the structure of the model is designed first, such as the number of convolutional layers in the model and, for each layer, the number of input images, convolution kernels, and output feature images. In the first round of training, the values of the elements of the kernels used to convolve the input images into output images are random. A convolution kernel may be a matrix, and its elements are the values at each position (determined by row and column) in that matrix. A 3×3 kernel has 3 rows and 3 columns, 9 values in total, and these 9 values are its elements; likewise, a 5×5 kernel has 5 rows and 5 columns, 25 values in total, which are its elements. Kernels of other sizes are similar and are not enumerated here. From the second round of training to the Nth round, the difference between the result output by the model and the correct result in the samples is continually fed back to optimize the element values of the kernels, so that after the Nth round this difference is as small as possible. For a target processing layer (a certain convolutional layer in the model), the multiple output images obtained by passing the feature images output by the previous convolutional layer through other layers, such as a pooling layer and a ReLU (activation function) layer, are the multiple input images of this convolutional layer.
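How the previous layer's feature images become this layer's input images can be sketched as follows (an illustrative Python sketch; the 4×4 feature images, 2×2 max pooling, and the pooling-then-activation order are illustrative assumptions, since the text does not fix them):

```python
import numpy as np

def relu(x):
    """ReLU activation: negative values become zero."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """2x2 max pooling with stride 2 (one common choice of pooling layer)."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Feature images output by one convolutional layer pass through the
# pooling and activation layers and become the input images of the next.
features = [np.arange(16.0).reshape(4, 4) - 8.0 for _ in range(3)]
next_inputs = [relu(max_pool2x2(f)) for f in features]
```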
Every feature image of every convolutional layer in the convolutional neural network model can be determined using the method provided by this embodiment.
Step S220: obtain at least one group of convolution kernels of the target processing layer.
The multi-dimensional tensor formed by all the kernels of a group (a matrix of three or more dimensions is a tensor) may be a specially structured tensor arranged according to some rule, the purpose being to let the elements within each group repeat, so that when computing with the kernel elements, the amount of computation can be reduced by merging like terms. A kernel is typically 3×3 or 5×5, with its height and width generally equal. Different convolution kernels in the same group contain the same elements arranged in different orders.
In implementation, during initialization of the convolutional neural network model, multiple groups of kernels can be generated group by group. For example, as shown in FIG. 3, a certain convolutional layer has 6 input images, and there are also 6 kernels, one corresponding to each input image. These 6 kernels can be divided into groups, for example, kernels 1-3 into one group and kernels 4-6 into another.
Optionally, before obtaining the at least one group of convolution kernels of the target processing layer, the method provided by this embodiment may further include: randomly generating N kernels, where N is a preset number of groups; and, for each of the N kernels, shifting elements row-wise and/or column-wise to obtain M-1 different kernels, the M-1 kernels together with the kernel before element shifting forming one group of kernels of the target processing layer, where M is a preset number of kernels per group.
In implementation, continuing the example above during initialization of the model, kernels 1-3 form one group and kernels 4-6 another. First, 2 kernels, kernel 1 and kernel 4, are randomly generated. Then element shifting is applied to kernel 1: if kernel 1 is 3×3, then as shown in FIG. 4, kernels 2 and 3 are obtained from kernel 1 by shifting its elements column by column, where W0 to W8 are the elements of the kernel. Kernels 4-6 are generated in the same way.
Optionally, the number M of kernels in each group is greater than 2 and does not exceed the product of the kernel's row count and column count; for a 3×3 kernel, M may not exceed 9. Once M exceeded 9, the 3×3 kernel would already have been shifted row-wise and column-wise through every possible arrangement, so a 10th kernel would necessarily repeat one of the first 9. That is, to guarantee that different kernels in the same group contain the same elements in different orders, M must not exceed the product of the kernel's row and column counts.
In implementation, for example, one group contains 5 kernels, kernels 1-5. As shown in FIG. 5, kernels 2 and 3 can first be obtained from kernel 1 by column-wise element shifts, and then kernels 4 and 5 from kernel 1 by row-wise element shifts; alternatively, kernels 4 and 5 may be obtained from kernel 2 by row-wise shifts, and so on.
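The group-generation rule described above can be sketched with cyclic shifts (an illustrative Python sketch using np.roll; this minimal version uses only pure row or pure column shifts, which yields at most rows+cols-1 distinct kernels, whereas combining row and column shifts, as the text notes, extends the group up to rows×cols):

```python
import numpy as np

def make_kernel_group(base, m):
    """Derive M-1 extra kernels from `base` by cyclically shifting its
    columns and then its rows, so every kernel in the group holds the
    same elements in a different order."""
    rows, cols = base.shape
    # Pure row/column shifts alone support at most rows + cols - 1 kernels.
    assert 2 < m <= rows + cols - 1
    group = [base]
    for s in range(1, m):
        if s < cols:                       # column-wise shifts first
            group.append(np.roll(base, s, axis=1))
        else:                              # then row-wise shifts
            group.append(np.roll(base, s - cols + 1, axis=0))
    return group

group = make_kernel_group(np.arange(9.0).reshape(3, 3), 5)
# All five kernels contain the same nine elements, in different orders.
assert all(sorted(k.ravel()) == list(range(9)) for k in group)
```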
Step S230: perform convolution computation on different input images with each convolution kernel in the at least one group, to obtain multiple intermediate matrices, and sum the multiple intermediate matrices to obtain a feature image.
Each element of an intermediate matrix is a polynomial obtained by multiplying the aligned elements of the corresponding kernel and input image and adding the products during the convolution computation.
In implementation, as shown in FIG. 6, a convolutional layer has 4 input images, and the layer must output 2 feature images, feature image 1 and feature image 2. Convolving these 4 input images with kernels 1-4 yields intermediate matrices 1-4, from which feature image 1 is obtained. Convolving the same 4 input images again with kernels 5-8 yields intermediate matrices 5-8, from which feature image 2 is obtained.
In FIG. 6, kernels 1-4 can be divided into multiple groups, and so can kernels 5-8. This is only an example; in practice, the number of kernels corresponding to one feature image is large, and the kernels corresponding to one feature image can be divided into multiple groups. Within each group, different kernels contain the same elements arranged in different orders.
As shown in FIG. 7, taking the determination of intermediate matrix 1 as an example: each time, 3×3 adjacent elements are taken from input image 1, the 3×3 adjacent elements of input image 1 are multiplied with the kernel elements at the corresponding positions, and the products are added to obtain one polynomial. The kernel is shifted over input image 1 by a preset number of rows or columns each time, and the multiply-and-add operation on the resulting 3×3 adjacent elements is repeated until the kernel has traversed all 3×3 adjacent elements of the input image, yielding intermediate matrix 1.
Optionally, summing the multiple intermediate matrices to obtain the feature image may include: adding the polynomials of elements at the same position in the multiple intermediate matrices to obtain the polynomial corresponding to each element of the feature image; merging like terms in each of these polynomials; and evaluating each merged polynomial to obtain the feature image.
In implementation, with the method provided by this embodiment, the polynomials of multiple intermediate matrices can be determined simultaneously through multiple channels, and the polynomials at the same position can then be added together.
As shown in FIG. 8, take as an example convolving the 3×3 adjacent elements in the top-left corner of input image 1, input image 2, and input image 3 with kernel 1, kernel 2, and kernel 3 respectively, to obtain the element in the first row and first column of intermediate matrix 1, intermediate matrix 2, and intermediate matrix 3. The polynomial for the first-row, first-column element of intermediate matrix 1 is: W0×a0+W1×a1+W2×a2+W3×a3+W4×a4+W5×a5+W6×a6+W7×a7+W8×a8. The polynomial for the first-row, first-column element of intermediate matrix 2 is: W2×b0+W0×b1+W1×b2+W5×b3+W3×b4+W4×b5+W8×b6+W6×b7+W7×b8. The polynomial for the first-row, first-column element of intermediate matrix 3 is: W1×c0+W2×c1+W0×c2+W4×c3+W5×c4+W3×c5+W7×c6+W8×c7+W6×c8.
When determining the feature image, the polynomials of the elements at the same positions of all intermediate matrices corresponding to the feature image are added, which includes adding the same-position polynomials of intermediate matrices 1, 2, and 3. In particular, adding the polynomials for the first-row, first-column elements of intermediate matrices 1, 2, and 3 gives: W0×a0+W1×a1+W2×a2+W3×a3+W4×a4+W5×a5+W6×a6+W7×a7+W8×a8+W2×b0+W0×b1+W1×b2+W5×b3+W3×b4+W4×b5+W8×b6+W6×b7+W7×b8+W1×c0+W2×c1+W0×c2+W4×c3+W5×c4+W3×c5+W7×c6+W8×c7+W6×c8.
Like terms in the above expression can be merged, giving: W0×(a0+b1+c2)+W1×(a1+b2+c0)+W2×(a2+b0+c1)+W3×(a3+b4+c5)+W4×(a4+b5+c3)+W5×(a5+b3+c4)+W6×(a6+b7+c8)+W7×(a7+b8+c6)+W8×(a8+b6+c7). It can be seen that the unmerged polynomial requires 27 multiplications and 26 additions to obtain the desired result, whereas the polynomial with like terms merged requires only 9 multiplications and 26 additions.
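The 27-multiplication versus 9-multiplication comparison above can be checked numerically (an illustrative Python sketch; the permutation arrays encode the element orders of kernels 2 and 3 in the example, and the patch values are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(9)             # shared elements W0..W8
a, b, c = rng.standard_normal((3, 9))  # one 3x3 patch per input image, flattened

# Kernels 2 and 3 hold the same nine elements in a different order; perm_b[i]
# is the index of the W element sitting at position i of kernel 2 (likewise
# perm_c for kernel 3), matching the polynomials above.
perm_b = [2, 0, 1, 5, 3, 4, 8, 6, 7]
perm_c = [1, 2, 0, 4, 5, 3, 7, 8, 6]

naive = w @ a + w[perm_b] @ b + w[perm_c] @ c   # 27 multiplications

# Merge like terms: each W element multiplies the sum of the pixels it meets.
inv_b = np.argsort(perm_b)  # position of each W element inside kernel 2
inv_c = np.argsort(perm_c)
merged = w @ (a + b[inv_b] + c[inv_c])          # 9 multiplications

assert np.isclose(naive, merged)
```

By distributivity, the two forms are equal for any inputs; only the number of multiplications differs.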
It can be seen that even this small part of determining the feature image saves 18 multiplications. As the number of kernels in a group grows and the whole computation for determining the feature image proceeds, the opportunities to reduce the amount of computation increase greatly, ultimately speeding up the determination of the feature image.
Optionally, after summing the multiple intermediate matrices to obtain the feature image, the method provided by this embodiment further includes: when the output result of the convolutional neural network model is obtained, determining the adjustment value of each element of each kernel in the at least one group according to the model's output result and a preset output result; determining the sum of the adjustment values of the same element contained in different kernels of the same group as the corrected adjustment value for that element; and adjusting each kernel based on the corrected adjustment value of each element.
In implementation, the convolutional neural network model contains multiple convolutional layers: layers 1 to Z-1 output feature images, and the last layer, layer Z, outputs the final output result. When the output result of the model is obtained, since the model is still in training, there is generally an error between the output result and the preset output result. Based on the error produced by the whole model, the adjustment value of each element of each kernel in the groups can be determined; the sum of the adjustment values of the same element contained in different kernels of the same group is then determined as the corrected adjustment value for that element. For example, for FIG. 8, if kernels 1, 2, and 3 convolve the 3×3 adjacent elements of input image 1, input image 2, and input image 3 through 3 channels as shown in FIG. 9, the corrected adjustment value corresponding to the adjustment values of the same element is computed with the following formulas:
Figure PCTCN2017117503-appb-000001
Figure PCTCN2017117503-appb-000002
Figure PCTCN2017117503-appb-000003
Figure PCTCN2017117503-appb-000004
Figure PCTCN2017117503-appb-000005
Figure PCTCN2017117503-appb-000006
Figure PCTCN2017117503-appb-000007
Figure PCTCN2017117503-appb-000008
Figure PCTCN2017117503-appb-000009
Here, Δw is the corrected adjustment value corresponding to the adjustment values of the same element; WH is the product of the width and the height of the feature image; δ_Rk is the sensitivity, where R in δ_Rk denotes the R-th feature image of the target processing layer; and w_size² is the product of the kernel's width and the kernel's height.
Finally, experiments were conducted with the method provided by this embodiment. Specifically, image recognition training was performed on the Cifar10 dataset, with the convolutional neural network model designed as a 3-layer model whose kernels at each layer are 5×5. The experimental results are shown in the table below:
Table 1
Figure PCTCN2017117503-appb-000010
A further experiment was conducted with the method provided by this embodiment. Specifically, the convolutional neural network model was trained for the field of image super-resolution, set to enlarge the original image into a new image 3 times its size. The model was designed as a 3-layer model with 5×5 kernels. The experimental results are shown in the table below:
Table 2
Figure PCTCN2017117503-appb-000011
Here, PSNR is a metric commonly used in image super-resolution applications; the larger the PSNR, the better the super-resolution result. BaseHisrcnn is a convolutional neural network structure applied to image super-resolution.
In the method provided by this embodiment, multiple input images are obtained; multiple groups of convolution kernels are generated, where different kernels in the same group contain the same elements in different orders; and at least one feature image corresponding to the multiple input images is determined. The property that different kernels in a group contain the same elements in different orders can be used to reduce the resources occupied by storing convolution kernels, the number of times the kernels are read, the amount of computation generated when determining feature images at the convolutional layer, and the system resources consumed during computation.
Yet another exemplary embodiment of the present disclosure provides an apparatus for determining a feature image in a convolutional neural network model. As shown in FIG. 10, the apparatus includes:
an obtaining module 1010, configured to obtain multiple input images of a target processing layer in a convolutional neural network model, and obtain at least one group of convolution kernels of the target processing layer, where different convolution kernels in the same group contain the same elements arranged in different orders; it may specifically implement the obtaining functions in steps S210 and S220 above, as well as other implicit steps; and
a determining module 1020, configured to perform convolution computation on different input images with each convolution kernel in the at least one group to obtain multiple intermediate matrices, and sum the multiple intermediate matrices to obtain a feature image, where each element of an intermediate matrix is a polynomial obtained by multiplying aligned elements of the corresponding kernel and input image and adding the products during the convolution computation; it may specifically implement the determining function in step S230 above, as well as other implicit steps.
Optionally, the determining module 1020 is configured to add the polynomials of the elements at the same position in the multiple intermediate matrices to obtain the polynomial corresponding to each element of the feature image; merge like terms in the polynomial corresponding to each element of the feature image; and evaluate each polynomial after like-term merging, to obtain the feature image.
Optionally, the apparatus further includes:
a generating module, configured to randomly generate N convolution kernels, where N is a preset number of groups; and
a shifting module, configured to, for each of the N kernels, shift elements row-wise and/or column-wise to obtain M-1 different kernels, the M-1 kernels together with the kernel before element shifting forming one group of kernels of the target processing layer, where M is a preset number of kernels per group.
Optionally, the number of kernels in each group is greater than 2 and less than the product of the kernel's row count and column count.
Optionally, the determining module 1020 is further configured to, when the output result of the convolutional neural network model is obtained, determine the adjustment value of each element of each kernel in the at least one group according to the model's output result and a preset output result, and determine the sum of the adjustment values of the same element contained in different kernels of the same group as the corrected adjustment value for that element;
the apparatus further includes an adjusting module:
the adjusting module is configured to adjust each kernel based on the corrected adjustment value of each element.
It should be noted that the obtaining module 1010 and the determining module 1020 may be implemented by a processor, by a processor in cooperation with a memory, or by a processor executing program instructions in a memory.
Regarding the apparatus in the above embodiment, the specific manner in which each module performs operations has been described in detail in the embodiments of the related method and is not elaborated here.
The property that different kernels in a group contain the same elements in different orders can be used to reduce the resources occupied by storing convolution kernels, the number of times the kernels are read, the amount of computation generated when determining feature images at the convolutional layer, and the system resources consumed during computation.
It should be noted that when the apparatus for determining a feature image in a convolutional neural network model provided by the above embodiment determines a feature image, the division into the above functional modules is used only as an example; in practical applications, the above functions may be assigned to different functional modules as needed, that is, the internal structure of the terminal may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiment and the method embodiment for determining a feature image in a convolutional neural network model belong to the same concept; for the specific implementation process, refer to the method embodiment, which is not repeated here.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the disclosure indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (17)

  1. A method for determining a feature image in a convolutional neural network model, wherein the method comprises:
    obtaining multiple input images of a target processing layer in a convolutional neural network model;
    obtaining at least one group of convolution kernels of the target processing layer, wherein different convolution kernels in the same group contain the same elements arranged in different orders;
    performing convolution computation on different input images with each convolution kernel in the at least one group of convolution kernels to obtain multiple intermediate matrices, and summing the multiple intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained by multiplying aligned elements of the corresponding convolution kernel and input image and adding the products during the convolution computation.
  2. The method according to claim 1, wherein summing the multiple intermediate matrices to obtain the feature image comprises:
    adding the polynomials of the elements at the same position in the multiple intermediate matrices to obtain the polynomial corresponding to each element of the feature image;
    merging like terms in the polynomial corresponding to each element of the feature image;
    evaluating each polynomial after like-term merging, to obtain the feature image.
  3. The method according to claim 1, wherein before obtaining the at least one group of convolution kernels of the target processing layer, the method further comprises:
    randomly generating N convolution kernels, wherein N is a preset number of groups;
    for each of the N convolution kernels, shifting elements row-wise and/or column-wise to obtain M-1 different convolution kernels, the M-1 convolution kernels and the convolution kernel before element shifting forming one group of convolution kernels of the target processing layer, wherein M is a preset number of convolution kernels per group.
  4. The method according to claim 1, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and the number of columns of a convolution kernel.
  5. The method according to claim 1, wherein after summing the multiple intermediate matrices to obtain the feature image, the method further comprises:
    when an output result of the convolutional neural network model is obtained, determining an adjustment value of each element of each convolution kernel in the at least one group of convolution kernels according to the output result of the convolutional neural network model and a preset output result;
    determining the sum of the adjustment values of the same element contained in different convolution kernels of the same group as the corrected adjustment value corresponding to the adjustment values of the same element;
    adjusting each convolution kernel based on the corrected adjustment value of each element.
  6. An apparatus for determining a feature image in a convolutional neural network model, wherein the apparatus comprises:
    an obtaining module, configured to obtain multiple input images of a target processing layer in a convolutional neural network model, and obtain at least one group of convolution kernels of the target processing layer, wherein different convolution kernels in the same group contain the same elements arranged in different orders;
    a determining module, configured to perform convolution computation on different input images with each convolution kernel in the at least one group of convolution kernels to obtain multiple intermediate matrices, and sum the multiple intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained by multiplying aligned elements of the corresponding convolution kernel and input image and adding the products during the convolution computation.
  7. The apparatus according to claim 6, wherein the determining module is configured to add the polynomials of the elements at the same position in the multiple intermediate matrices to obtain the polynomial corresponding to each element of the feature image; merge like terms in the polynomial corresponding to each element of the feature image; and evaluate each polynomial after like-term merging, to obtain the feature image.
  8. The apparatus according to claim 6, wherein the apparatus further comprises:
    a generating module, configured to randomly generate N convolution kernels, wherein N is a preset number of groups;
    a shifting module, configured to, for each of the N convolution kernels, shift elements row-wise and/or column-wise to obtain M-1 different convolution kernels, the M-1 convolution kernels and the convolution kernel before element shifting forming one group of convolution kernels of the target processing layer, wherein M is a preset number of convolution kernels per group.
  9. The apparatus according to claim 6, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and the number of columns of a convolution kernel.
  10. The apparatus according to claim 6, wherein the determining module is further configured to: when an output result of the convolutional neural network model is obtained, determine an adjustment value of each element of each convolution kernel in the at least one group of convolution kernels according to the output result of the convolutional neural network model and a preset output result; and determine the sum of the adjustment values of the same element contained in different convolution kernels of the same group as the corrected adjustment value corresponding to the adjustment values of the same element;
    the apparatus further comprises an adjusting module:
    the adjusting module is configured to adjust each convolution kernel based on the corrected adjustment value of each element.
  11. A terminal, wherein the terminal comprises a processor and a memory, wherein:
    the processor is configured to obtain multiple input images of a target processing layer in a convolutional neural network model stored in the memory; obtain at least one group of convolution kernels of the target processing layer stored in the memory, wherein different convolution kernels in the same group contain the same elements arranged in different orders; and perform convolution computation on different input images with each convolution kernel in the at least one group of convolution kernels to obtain multiple intermediate matrices, and sum the multiple intermediate matrices to obtain a feature image, wherein each element of an intermediate matrix is a polynomial obtained by multiplying aligned elements of the corresponding convolution kernel and input image and adding the products during the convolution computation.
  12. The terminal according to claim 11, wherein the processor is configured to add the polynomials of the elements at the same position in the multiple intermediate matrices to obtain the polynomial corresponding to each element of the feature image; merge like terms in the polynomial corresponding to each element of the feature image; and evaluate each polynomial after like-term merging, to obtain the feature image.
  13. The terminal according to claim 11, wherein the processor is further configured to randomly generate N convolution kernels, wherein N is a preset number of groups; and, for each of the N convolution kernels, shift elements row-wise and/or column-wise to obtain M-1 different convolution kernels, the M-1 convolution kernels and the convolution kernel before element shifting forming one group of convolution kernels of the target processing layer, wherein M is a preset number of convolution kernels per group.
  14. The terminal according to claim 11, wherein the number of convolution kernels in each group is greater than 2 and less than the product of the number of rows and the number of columns of a convolution kernel.
  15. The terminal according to claim 11, wherein the processor is further configured to: when an output result of the convolutional neural network model is obtained, determine an adjustment value of each element of each convolution kernel in the at least one group of convolution kernels according to the output result of the convolutional neural network model and a preset output result; determine the sum of the adjustment values of the same element contained in different convolution kernels of the same group as the corrected adjustment value corresponding to the adjustment values of the same element; and adjust each convolution kernel based on the corrected adjustment value of each element.
  16. A computer-readable storage medium comprising instructions that, when the computer-readable storage medium runs on a terminal, cause the terminal to perform the method according to any one of claims 1-5.
  17. A computer program product containing instructions that, when the computer program product runs on a terminal, causes the terminal to perform the method according to any one of claims 1-5.
PCT/CN2017/117503 2017-12-20 2017-12-20 Method and apparatus for determining a feature image in a convolutional neural network model WO2019119301A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2017/117503 WO2019119301A1 (zh) 2017-12-20 2017-12-20 Method and apparatus for determining a feature image in a convolutional neural network model
CN201780096076.0A CN111247527B (zh) 2017-12-20 2017-12-20 Method and apparatus for determining a feature image in a convolutional neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/117503 WO2019119301A1 (zh) 2017-12-20 2017-12-20 Method and apparatus for determining a feature image in a convolutional neural network model

Publications (1)

Publication Number Publication Date
WO2019119301A1 true WO2019119301A1 (zh) 2019-06-27

Family

ID=66992897

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/117503 WO2019119301A1 (zh) 2017-12-20 2017-12-20 Method and apparatus for determining a feature image in a convolutional neural network model

Country Status (2)

Country Link
CN (1) CN111247527B (zh)
WO (1) WO2019119301A1 (zh)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807170A (zh) * 2019-10-21 2020-02-18 中国人民解放军国防科技大学 Vectorized implementation method for Same convolution in multi-sample multi-channel convolutional neural networks
CN110929623A (zh) * 2019-11-15 2020-03-27 北京达佳互联信息技术有限公司 Multimedia file recognition method, apparatus, server, and storage medium
CN111241993A (zh) * 2020-01-08 2020-06-05 咪咕文化科技有限公司 Method and apparatus for determining the number of seats, electronic device, and storage medium
CN111414995A (zh) * 2020-03-16 2020-07-14 北京君立康生物科技有限公司 Detection and processing method and apparatus for small target colonies, electronic device, and medium
CN112016740A (zh) * 2020-08-18 2020-12-01 北京海益同展信息科技有限公司 Data processing method and apparatus
CN112132279A (zh) * 2020-09-23 2020-12-25 平安科技(深圳)有限公司 Convolutional neural network model compression method, apparatus, device, and storage medium
CN112541565A (zh) * 2019-09-20 2021-03-23 腾讯科技(深圳)有限公司 Convolution computation data flow mapping method and apparatus
CN112733585A (zh) * 2019-10-29 2021-04-30 杭州海康威视数字技术股份有限公司 Image recognition method
CN112766474A (zh) * 2019-11-04 2021-05-07 北京地平线机器人技术研发有限公司 Method, apparatus, medium, and electronic device for implementing convolution operations
CN113052756A (zh) * 2019-12-27 2021-06-29 武汉Tcl集团工业研究院有限公司 Image processing method, intelligent terminal, and storage medium
CN113919405A (zh) * 2020-07-07 2022-01-11 华为技术有限公司 Data processing method, apparatus, and related device
CN114090470A (zh) * 2020-07-29 2022-02-25 中国科学院深圳先进技术研究院 Data preloading apparatus, preloading method thereof, storage medium, and computer device
US11681915B2 (en) 2019-12-26 2023-06-20 Samsung Electronics Co., Ltd. Neural network method and apparatus
CN116295188A (zh) * 2023-05-15 2023-06-23 山东慧点智能技术有限公司 Measurement apparatus and method based on displacement sensors

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767928B (zh) * 2020-06-28 2023-08-08 中国矿业大学 Method and apparatus for extracting image feature information based on a convolutional neural network
CN112149694B (zh) * 2020-08-28 2024-04-05 特斯联科技集团有限公司 Image processing method, system, storage medium, and terminal based on a convolutional neural network pooling module
CN115861043B (zh) * 2023-02-16 2023-05-16 深圳市旗云智能科技有限公司 Image data processing method and system based on artificial intelligence

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160110499A1 (en) * 2014-10-21 2016-04-21 Life Technologies Corporation Methods, systems, and computer-readable media for blind deconvolution dephasing of nucleic acid sequencing data
CN106156781A (zh) * 2016-07-12 2016-11-23 北京航空航天大学 Ranking convolutional neural network construction method and image processing method and apparatus thereof
CN106447030A (zh) * 2016-08-30 2017-02-22 深圳市诺比邻科技有限公司 Computing resource optimization method and system for convolutional neural networks
CN106778584A (zh) * 2016-12-08 2017-05-31 南京邮电大学 Face age estimation method based on fusion of deep and shallow features
CN106971174A (zh) * 2017-04-24 2017-07-21 华南理工大学 CNN model, CNN training method, and CNN-based vein recognition method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077233B (zh) * 2014-06-18 2017-04-05 百度在线网络技术(北京)有限公司 Multi-channel convolutional layer processing method and apparatus
CN104915322B (zh) * 2015-06-09 2018-05-01 中国人民解放军国防科学技术大学 Hardware acceleration method for convolutional neural networks
CN106326985A (zh) * 2016-08-18 2017-01-11 北京旷视科技有限公司 Neural network training method and apparatus, and data processing method and apparatus
CN106682736A (zh) * 2017-01-18 2017-05-17 北京小米移动软件有限公司 Image recognition method and apparatus
CN107491787A (zh) * 2017-08-21 2017-12-19 珠海习悦信息技术有限公司 Processing method and apparatus for locally binarized CNN, storage medium, and processor


Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541565B (zh) * 2019-09-20 2023-08-29 腾讯科技(深圳)有限公司 Convolution computation data flow mapping method and apparatus
CN112541565A (zh) * 2019-09-20 2021-03-23 腾讯科技(深圳)有限公司 Convolution computation data flow mapping method and apparatus
CN110807170A (zh) * 2019-10-21 2020-02-18 中国人民解放军国防科技大学 Vectorized implementation method for Same convolution in multi-sample multi-channel convolutional neural networks
CN110807170B (zh) * 2019-10-21 2023-06-27 中国人民解放军国防科技大学 Vectorized implementation method for Same convolution in multi-sample multi-channel convolutional neural networks
CN112733585B (zh) * 2019-10-29 2023-09-05 杭州海康威视数字技术股份有限公司 Image recognition method
CN112733585A (zh) * 2019-10-29 2021-04-30 杭州海康威视数字技术股份有限公司 Image recognition method
CN112766474B (zh) * 2019-11-04 2024-03-22 北京地平线机器人技术研发有限公司 Method, apparatus, medium, and electronic device for implementing convolution operations
CN112766474A (zh) * 2019-11-04 2021-05-07 北京地平线机器人技术研发有限公司 Method, apparatus, medium, and electronic device for implementing convolution operations
CN110929623A (zh) * 2019-11-15 2020-03-27 北京达佳互联信息技术有限公司 Multimedia file recognition method, apparatus, server, and storage medium
US11681915B2 (en) 2019-12-26 2023-06-20 Samsung Electronics Co., Ltd. Neural network method and apparatus
CN113052756A (zh) * 2019-12-27 2021-06-29 武汉Tcl集团工业研究院有限公司 Image processing method, intelligent terminal, and storage medium
CN111241993B (zh) * 2020-01-08 2023-10-20 咪咕文化科技有限公司 Method and apparatus for determining the number of seats, electronic device, and storage medium
CN111241993A (zh) * 2020-01-08 2020-06-05 咪咕文化科技有限公司 Method and apparatus for determining the number of seats, electronic device, and storage medium
CN111414995B (zh) * 2020-03-16 2023-05-19 北京君立康生物科技有限公司 Detection and processing method and apparatus for small target colonies, electronic device, and medium
CN111414995A (zh) * 2020-03-16 2020-07-14 北京君立康生物科技有限公司 Detection and processing method and apparatus for small target colonies, electronic device, and medium
CN113919405A (zh) * 2020-07-07 2022-01-11 华为技术有限公司 Data processing method, apparatus, and related device
CN113919405B (zh) * 2020-07-07 2024-01-19 华为技术有限公司 Data processing method, apparatus, and related device
CN114090470A (zh) * 2020-07-29 2022-02-25 中国科学院深圳先进技术研究院 Data preloading apparatus, preloading method thereof, storage medium, and computer device
CN112016740A (zh) * 2020-08-18 2020-12-01 北京海益同展信息科技有限公司 Data processing method and apparatus
CN112132279B (zh) * 2020-09-23 2023-09-15 平安科技(深圳)有限公司 Convolutional neural network model compression method, apparatus, device, and storage medium
CN112132279A (zh) * 2020-09-23 2020-12-25 平安科技(深圳)有限公司 Convolutional neural network model compression method, apparatus, device, and storage medium
CN116295188B (zh) * 2023-05-15 2023-08-11 山东慧点智能技术有限公司 Measurement apparatus and method based on displacement sensors
CN116295188A (zh) * 2023-05-15 2023-06-23 山东慧点智能技术有限公司 Measurement apparatus and method based on displacement sensors

Also Published As

Publication number Publication date
CN111247527A (zh) 2020-06-05
CN111247527B (zh) 2023-08-22

Similar Documents

Publication Publication Date Title
WO2019119301A1 (zh) Method and apparatus for determining a feature image in a convolutional neural network model
US11816532B2 (en) Performing kernel striding in hardware
JP7279226B2 (ja) 代替ループ限界値
US10534607B2 (en) Accessing data in multi-dimensional tensors using adders
EP3373210B1 (en) Transposing neural network matrices in hardware
US11645529B2 (en) Sparsifying neural network models
CN109324827B (zh) Apparatus, method, and system for processing instructions for accessing data
US20190065958A1 (en) Apparatus and Methods for Training in Fully Connected Layers of Convolutional Networks
US9946539B1 (en) Accessing data in multi-dimensional tensors using adders
CN111476360A (zh) Apparatus and method for Winograd transform convolution operations for neural networks
JP2019537139A5 (zh)
US11244028B2 (en) Neural network processor and convolution operation method thereof
CN109313663B (zh) Artificial intelligence computing auxiliary processing apparatus and method, storage medium, and terminal
EP3093757B1 (en) Multi-dimensional sliding window operation for a vector processor
CN109255438A (zh) Method and apparatus for adjusting tensor data
JP2023541350A (ja) Tabular convolution and acceleration
WO2020093669A1 (en) Convolution block array for implementing neural network application and method using the same, and convolution block circuit
US10509996B2 (en) Reduction of parameters in fully connected layers of neural networks
US20210303987A1 (en) Power reduction for machine learning accelerator background
GB2567038B (en) Accessing prologue and epilogue data
TWI834729B (zh) Neural network processor and convolution operation method thereof
US20230056869A1 (en) Method of generating deep learning model and computing device performing the same
CN116868205A (zh) Neural network pruning method and system via layer-wise analysis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17935587

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17935587

Country of ref document: EP

Kind code of ref document: A1