WO2020164271A1 - Pooling method and device for convolutional neural network, storage medium and computer device - Google Patents

Pooling method and device for convolutional neural network, storage medium and computer device

Info

Publication number
WO2020164271A1
WO2020164271A1 PCT/CN2019/117863 CN2019117863W WO2020164271A1 WO 2020164271 A1 WO2020164271 A1 WO 2020164271A1 CN 2019117863 W CN2019117863 W CN 2019117863W WO 2020164271 A1 WO2020164271 A1 WO 2020164271A1
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
data
sample data
pooling
neural network
Prior art date
Application number
PCT/CN2019/117863
Other languages
French (fr)
Chinese (zh)
Inventor
房树明
王健宗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2020164271A1 publication Critical patent/WO2020164271A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Definitions

  • This application relates to the technical field of convolutional neural network models. Specifically, this application relates to a pooling method, device, storage medium, and computer equipment of a convolutional neural network.
  • Convolutional Neural Network is a feedforward neural network.
  • Convolutional neural networks use a large number of artificial neurons, which can respond to surrounding units in a part of the coverage area, and are often used for large-scale image processing.
  • Convolutional neural networks include convolutional layer and pooling layer.
  • The pooling functions in general use today are max pooling (max-pooling) and average pooling (mean-pooling).
  • The disadvantage of max pooling is that every pooling operation loses all information other than the value at the maximum position; the disadvantage of average pooling is that it cannot reflect important information that lies far from the average. The pooling methods currently adopted by convolutional neural networks therefore reduce the input information of the network, which lowers the accuracy of the model's output.
  • This application proposes a convolutional neural network pooling method, device, storage medium, and computer equipment to expand the pooled data samples and enrich the input information of the convolutional neural network.
  • a convolutional neural network pooling method, including: obtaining training samples of the convolutional neural network and inputting the training samples into the convolutional neural network model; inputting the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, and dividing the sample data matrix into multiple data sub-matrices in the pooling layer; randomly extracting one piece of sample data from each data sub-matrix; generating a pooling matrix from the extracted sample data, and using the pooling matrix as the output of the pooling layer.
  • a pooling device for a convolutional neural network, comprising: an acquisition module for acquiring training samples of the convolutional neural network and inputting the training samples into a convolutional neural network model; a division module for inputting the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, where the sample data matrix is divided into multiple data sub-matrices; an extraction module for randomly extracting one piece of sample data from each data sub-matrix; and a generation module for generating a pooling matrix from the extracted sample data and using the pooling matrix as the output of the pooling layer.
  • a computer nonvolatile storage medium on which a computer program is stored; the computer program is suitable for being loaded by a processor and executing the convolutional neural network pooling method described in any of the above embodiments.
  • a computer device includes: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the convolutional neural network pooling method according to any one of the foregoing embodiments.
  • in the pooling layer, sample data is randomly drawn from each data sub-matrix and then pooled.
  • an M*M sample data matrix can yield M*M*M*M, which, compared with traditional max pooling or average pooling, expands the pooled data samples and enriches the input information of the convolutional neural network.
  • FIG. 1 is a schematic diagram of the internal structure in an embodiment of a convolutional neural network model provided by this application;
  • FIG. 2 is a schematic diagram in an embodiment of the maximum pooling method provided by this application.
  • FIG. 3 is a schematic diagram in an embodiment of the average pooling method provided by this application.
  • FIG. 5 is a flowchart of a method in an embodiment of step S200 provided in this application.
  • FIG. 6 is a schematic diagram in an embodiment of the random pooling method provided by this application.
  • FIG. 7 is a structural block diagram of an embodiment of a convolutional neural network pooling device provided by this application.
  • FIG. 8 is a schematic structural diagram in an embodiment of a computer device provided by this application.
  • This application provides a pooling method of a convolutional neural network, which is applied to the pooling layer of a convolutional neural network.
  • the convolutional neural network includes two convolutional layers, two pooling layers, and a fully connected hidden layer between the input layer and the output layer.
  • the input of the pooling layer comes from the previous convolutional layer, which mainly provides strong robustness. While reducing the number of deep learning parameters, it can also prevent overfitting. The reduction in the number of parameters means that the training speed of the convolutional neural network will be faster, the model file after training will be smaller, and the inference speed will be faster when predicting samples. Therefore, the pooling function is of great significance to convolutional neural networks.
  • Pooling functions in convolutional neural networks generally use max-pooling and mean-pooling methods.
  • The max pooling method is shown in Figure 2: after a 4 by 4 matrix is pooled by a 2 by 2 max pooling function, it becomes a 2 by 2 matrix.
  • The pooling calculation is: split the 4 by 4 matrix into four 2 by 2 sub-matrices and take the maximum value of each sub-matrix to form a new matrix; this is max pooling.
  • The average pooling method is shown in Figure 3: after a 4 by 4 matrix is pooled by a 2 by 2 average pooling function, it becomes a 2 by 2 matrix.
  • The pooling calculation is: split the 4 by 4 matrix into four 2 by 2 sub-matrices and take the average value of each sub-matrix to form a new matrix; this is average pooling.
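The two conventional pooling calculations described above can be sketched in a few lines of plain Python. This is only an illustration: the function names (`split_into_blocks`, `max_pool`, `mean_pool`) and the values in `example` are assumptions made for this sketch and do not come from the application or its figures.

```python
def split_into_blocks(matrix, size):
    """Split a matrix into non-overlapping size x size blocks.

    Each block is returned as a flat list of its values, arranged in the
    same row/column layout as the pooled output. Assumes the matrix
    dimensions are multiples of `size`.
    """
    rows, cols = len(matrix), len(matrix[0])
    blocks = []
    for r in range(0, rows, size):
        row_of_blocks = []
        for c in range(0, cols, size):
            block = [matrix[r + i][c + j]
                     for i in range(size) for j in range(size)]
            row_of_blocks.append(block)
        blocks.append(row_of_blocks)
    return blocks


def max_pool(matrix, size=2):
    """Keep the maximum value of each block."""
    return [[max(b) for b in row] for row in split_into_blocks(matrix, size)]


def mean_pool(matrix, size=2):
    """Keep the average value of each block."""
    return [[sum(b) / len(b) for b in row]
            for row in split_into_blocks(matrix, size)]


# Hypothetical 4 by 4 input, not the values shown in Figures 2 and 3.
example = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [4, 6, 3, 5],
]

print(max_pool(example))   # [[7, 8], [9, 5]]
print(mean_pool(example))  # [[4.0, 5.0], [5.25, 2.25]]
```

In both cases a 4 by 4 input is reduced to a 2 by 2 output, and each output value is a deterministic function of its sub-matrix.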
  • With max pooling, all information other than the value at the maximum position is lost every time pooling is performed.
  • Average pooling cannot reflect important information that lies far from the average.
  • This application provides a pooling method for convolutional neural networks to expand pooled data samples and enrich the input information of the convolutional neural network.
  • the pooling method of the convolutional neural network includes the following steps:
  • before performing data operations on the convolutional neural network structure, the server first obtains training samples of characteristic data.
  • the training sample is the target feature calculated by the server through the convolutional neural network. Further, the training samples are input into the convolutional neural network model.
  • the convolutional neural network model includes convolutional layer, sampling layer, activation layer, pooling layer and fully connected layer.
  • the convolutional layer is used to extract the spatial features of the input data.
  • the convolutional layer may include multiple convolution kernels to extract multiple spatial features of the input data.
  • the activation layer can adopt a nonlinear activation function.
  • the pooling layer is used to avoid overfitting in the convolution process.
  • the fully connected layer connects adjacent neurons in the network, and its output can be passed through the softmax function to obtain different probability values.
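As an illustration of how a softmax function turns the outputs of a fully connected layer into probability values, here is a minimal sketch. The function name `softmax` and the input scores are assumptions for this example; the application does not specify an implementation.

```python
import math


def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores
    # and does not change the result.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest score receives the largest probability, so the predicted class can be read off as the index of the maximum value.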
  • Step S100 includes: acquiring the image sample data as a training sample of the convolutional neural network model.
  • the sample data for model training may be image sample data
  • the model output is the classification of image sample data
  • the acquiring the image sample data as the training sample of the convolutional neural network model includes: acquiring the image sample data in the image sample data as the training sample of the convolutional neural network model.
  • the sample data matrix output by the convolution layer of the convolutional neural network model is input into the pooling layer of the convolutional neural network model, and the sample data matrix is divided into a plurality of data sub-matrices in the pooling layer.
  • randomly extracting one piece of sample data from each data sub-matrix, generating a pooling matrix from the extracted sample data, and using the pooling matrix as the output of the pooling layer includes: after the image sample data matrix output by the first convolutional layer of the convolutional neural network model is activated, it enters the first pooling layer of the convolutional neural network model.
  • in the first pooling layer, the sample data matrix is divided into multiple data sub-matrices and one piece of picture sample data is randomly extracted from each data sub-matrix; the pooling matrix generated from the extracted picture sample data is the output of the first pooling layer. The data output by the first pooling layer is then input into the second convolutional layer for convolution, the data output by the second convolutional layer is randomly activated, the randomly activated data is input to the second pooling layer for pooling training, and the pooling training result is used as the output of the system pooling layer.
  • S200 Input the sample data matrix output by the convolution layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, and divide the sample data matrix into multiple data sub-matrices in the pooling layer.
  • when the server inputs training samples into the convolutional neural network model, the training samples are first input into the convolutional layer of the convolutional neural network model.
  • the convolution layer performs convolution training on the training samples, extracts different spatial features of the input training samples, and outputs a sample data matrix containing multiple features.
  • a sample data matrix of features with larger dimensions is usually obtained after the convolutional layer. The features are then cut into several regions, and the maximum or average value of each region is taken to obtain new features with smaller dimensions. This is the pooling operation of the convolutional neural network.
  • the sample data matrix output by the convolutional layer is divided into sub-matrices.
  • a convolution kernel of a preset size (smaller than the sample data matrix) is set.
  • the convolution kernel window is a window with the same length and width.
  • a convolution kernel window whose length and width differ is not excluded.
  • the sample data matrix can be divided into multiple sub-matrices through the convolution kernel.
  • the plurality of data sub-matrices include data sub-matrices with the same number of rows and columns.
  • dividing the sample data matrix into multiple data sub-matrices in the pooling layer includes:
  • S230 According to the number of rows and columns of the sample data matrix, divide the sample data matrix into a plurality of data sub-matrices with the same number of rows and columns in the pooling layer.
  • the convolution kernel in the convolutional neural network generally takes a small square matrix.
  • Mainstream deep learning frameworks generally support convolution kernels with the same length and width.
  • a 3 by 3 convolution kernel or a 2 by 2 convolution kernel can be used.
  • the number of rows M and the number of columns N of the sample data matrix are obtained, and the sample data matrix is divided into a plurality of data sub-matrices with the same number of rows and columns according to the number of rows M and the number of columns N.
  • the number of rows M and the number of columns N can be the same.
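A minimal sketch of the division step, assuming the sub-matrices are non-overlapping square windows of side `k` and that both M and N are multiples of `k`. The function name `divide_into_submatrices` and the parameter `k` are hypothetical; the application does not say how edge regions are handled when the dimensions do not divide evenly.

```python
def divide_into_submatrices(matrix, k):
    """Divide an M x N matrix into non-overlapping k x k sub-matrices.

    Sub-matrices are returned in row-major order (left to right, then
    top to bottom). Raises ValueError if M or N is not a multiple of k.
    """
    m, n = len(matrix), len(matrix[0])
    if m % k != 0 or n % k != 0:
        raise ValueError("matrix dimensions must be multiples of k")
    return [
        [row[c:c + k] for row in matrix[r:r + k]]
        for r in range(0, m, k)
        for c in range(0, n, k)
    ]


# Hypothetical 4 x 4 sample data matrix (M = N = 4) and 2 x 2 windows.
sample = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [4, 6, 3, 5],
]
subs = divide_into_submatrices(sample, 2)
print(len(subs))  # 4
print(subs[0])    # [[1, 3], [5, 7]]
```

The same function also accepts M different from N, for example a 4 x 6 matrix with `k = 2`, which gives six 2 x 2 sub-matrices.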
  • the multiple data sub-matrices include a first data sub-matrix with the same number of rows and columns, and a second data sub-matrix with different rows and columns. That is, when the sample data matrix is divided, multiple first data sub-matrices with the same number of rows and columns and multiple second data sub-matrices with different numbers of rows and columns can be simultaneously divided.
  • dividing the sample data matrix into multiple data sub-matrices in the pooling layer includes: obtaining the number of rows and columns of the sample data matrix; and, according to the number of rows and columns of the sample data matrix, dividing the sample data matrix in the pooling layer into a plurality of the first data sub-matrices and a plurality of the second data sub-matrices.
  • when the server performs the pooling operation in the convolutional neural network model, it randomly extracts one piece of sample data from each data sub-matrix, so that a new matrix can be formed from the data extracted from each data sub-matrix.
  • the data randomly extracted each time can be any data value in the data sub-matrix, and each data value can also be extracted repeatedly.
  • the data sub-matrix is a matrix with multiple rows and multiple columns.
  • Step S300 includes: randomly selecting any row of data from the multiple rows of each data sub-matrix, and randomly selecting from that row the data corresponding to any column as the sample data; or randomly selecting any column of data from the multiple columns of each data sub-matrix, and randomly selecting from that column the data corresponding to any row as the sample data.
  • the server first randomly selects any row of data from a data sub-matrix of multiple rows and multiple columns, and then selects any column from the row of data, and the data corresponding to the column is the sample data.
  • the server first randomly selects any column of data from the data sub-matrix of multiple rows and multiple columns, and then selects any row from the column of data, and the data corresponding to the row is the sample data.
  • the data sub-matrix is a matrix with multiple rows and multiple columns.
  • Step S300 includes: randomly selecting data corresponding to any row and any column from multiple rows and multiple columns of each data sub-matrix as the sample data.
  • the server randomly selects any row value and column value, and the data corresponding to the row value and column value is the sample data. If the server randomly selects the row value and the column value (4, 3), the data in the data sub-matrix corresponding to (4, 3) is the sample data.
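The three ways of drawing a value from a data sub-matrix described for step S300 can be sketched as follows. The function names are hypothetical, and the use of Python's `random.Random` with uniform choices is an assumption; the application does not state which random distribution is used.

```python
import random


def pick_row_then_column(sub, rng):
    """Pick a random row first, then a random entry within that row."""
    row = rng.choice(sub)
    return rng.choice(row)


def pick_column_then_row(sub, rng):
    """Pick a random column first, then a random entry within that column."""
    col_index = rng.randrange(len(sub[0]))
    column = [row[col_index] for row in sub]
    return rng.choice(column)


def pick_by_index_pair(sub, rng):
    """Pick a random (row, column) index pair and return the entry there."""
    r = rng.randrange(len(sub))
    c = rng.randrange(len(sub[0]))
    return sub[r][c]


# Hypothetical 2 x 2 data sub-matrix; a seeded generator makes runs repeatable.
sub_matrix = [[1, 2], [3, 4]]
rng = random.Random(0)
print(pick_row_then_column(sub_matrix, rng))
print(pick_column_then_row(sub_matrix, rng))
print(pick_by_index_pair(sub_matrix, rng))
```

With uniform choices all three variants give every entry of the sub-matrix the same chance of being selected, so they differ only in how the selection is organized.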
  • S400 Generate a pooling matrix according to each extracted sample data, and use the pooling matrix as an output of the pooling layer.
  • a new matrix, that is, a pooling matrix, can be generated from the sample data extracted from each data sub-matrix, and the pooling matrix is used as the output of the pooling layer and output to the fully connected hidden layer of the convolutional neural network model.
  • a 4-by-4 sample data matrix that is pooled by a random pooling function, which divides it into 2-by-2 data sub-matrices, can become any one of multiple 2-by-2 matrices.
  • the calculation steps of random pooling are: split the 4 by 4 matrix into four 2 by 2 sub-matrices, and randomly select one value in each sub-matrix to form a new matrix, which is the random pooling matrix.
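The random pooling calculation described above can be sketched as a single function. The name `random_pool`, the window size parameter `k`, the optional `rng` argument, and the example values are all assumptions made for this illustration; uniform selection within each sub-matrix is also assumed.

```python
import random


def random_pool(matrix, k=2, rng=None):
    """Randomly pool a matrix using non-overlapping k x k sub-matrices.

    One value is drawn uniformly at random from each sub-matrix, and the
    drawn values form the pooling matrix. Assumes the matrix dimensions
    are multiples of k.
    """
    rng = rng or random.Random()
    m, n = len(matrix), len(matrix[0])
    if m % k != 0 or n % k != 0:
        raise ValueError("matrix dimensions must be multiples of k")
    pooled = []
    for r in range(0, m, k):
        pooled_row = []
        for c in range(0, n, k):
            i = rng.randrange(k)  # random row offset inside the sub-matrix
            j = rng.randrange(k)  # random column offset inside the sub-matrix
            pooled_row.append(matrix[r + i][c + j])
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 by 4 sample data matrix, not the values shown in Figure 6.
sample = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [4, 6, 3, 5],
]
print(random_pool(sample, 2, random.Random(42)))
```

Unlike max or average pooling, repeated calls without a fixed seed can return different 2 by 2 pooling matrices for the same input, since each output entry may be any of the four values in its sub-matrix.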
  • the pooling method of the convolutional neural network inputs training samples into the convolutional neural network model; after convolution processing in the convolutional layer of the model, the sample data matrix output by the convolutional layer enters the pooling layer of the convolutional neural network model.
  • the sample data matrix is divided into multiple data sub-matrices, and a data sample is randomly extracted from each data sub-matrix to generate a pooling matrix.
  • an M*M sample data matrix can yield M*M*M*M, which, compared with the traditional max pooling or average pooling method, expands the pooled data samples and enriches the input information of the convolutional neural network.
  • the pooling method of the convolutional neural network described above can be used in image classification.
  • the image classification method using the pooling method of the convolutional neural network includes: acquiring image data of a target image; inputting the image data into a convolutional neural network model to obtain the result data of the target image to be classified;
  • the convolutional neural network model is used to perform image feature category analysis on the image data and output the result data to be classified; wherein the pooling layer in the convolutional neural network model is used to divide the image data output by the convolutional layer in the convolutional neural network model into multiple data sub-matrices, randomly extract one piece of image sample data from each data sub-matrix, generate a pooling matrix from the extracted image sample data, and use the pooling matrix as the pooling output; classify the target image according to the result data to be classified.
  • the convolutional neural network model further includes a first activation unit and a second activation unit;
  • the convolutional layer includes a first convolutional layer and a second convolutional layer;
  • the pooling layer includes a first pooling layer and a second pooling layer, both of which are used to divide the image data output by the convolutional layer in the convolutional neural network model into multiple data sub-matrices, randomly extract one piece of image sample data from each data sub-matrix, generate a pooling matrix from the extracted image sample data, and output the pooling matrix as the pooling output;
  • the first activation unit is used to apply non-random activation to the image data output by the first convolutional layer and input the resulting values to the first pooling layer;
  • the first pooling layer is used to pool the input image data and output it to the second convolutional layer.
  • the second convolutional layer is used to convolve the input image data and output the result to the second activation unit; the second activation unit is used to randomly activate the input image data and output the result to the fully connected layer of the convolutional neural network model; the fully connected layer is used to output the result data to be classified.
  • a typical convolutional neural network has a convolutional layer, an activation layer, a pooling layer, and a fully connected layer.
  • a three-layer convolutional neural network can be expressed as: input image -> convolution -> activation -> pooling -> convolution -> activation -> pooling -> convolution -> activation -> pooling -> fully connected layer -> output to be classified. If random pooling is applied to the convolutional neural network, the structure of the network becomes: input image -> convolution -> activation -> random pooling -> convolution -> random activation -> pooling -> convolution -> random activation -> pooling -> fully connected layer -> output to be classified.
  • the pooling device of the convolutional neural network includes an acquisition module 10, a division module 20, an extraction module 30 and a generation module 40.
  • the obtaining module 10 is used to obtain training samples of the convolutional neural network, and input the training samples into the convolutional neural network model.
  • before performing data operations on the convolutional neural network structure, the server first obtains training samples of characteristic data.
  • the training sample is the target feature calculated by the server through the convolutional neural network. Further, the training samples are input into the convolutional neural network model.
  • the convolutional neural network model includes convolutional layer, sampling layer, activation layer, pooling layer and fully connected layer.
  • the convolutional layer is used to extract the spatial features of the input data.
  • the convolutional layer may include multiple convolution kernels to extract multiple spatial features of the input data.
  • the activation layer can adopt a nonlinear activation function.
  • the pooling layer is used to avoid overfitting in the convolution process.
  • the fully connected layer connects adjacent neurons in the network, and its output can be passed through the softmax function to obtain different probability values.
  • the dividing module 20 is configured to input the sample data matrix output by the convolution layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, and divide the sample data matrix into a plurality of data sub-matrices in the pooling layer.
  • when the server inputs training samples into the convolutional neural network model, the training samples are first input into the convolutional layer of the convolutional neural network model.
  • the convolution layer performs convolution training on the training samples, extracts different spatial features of the input training samples, and outputs a sample data matrix containing multiple features.
  • the sample data matrix of features with larger dimensions is usually obtained after the convolutional layer.
  • the feature is cut into several regions, and the maximum or average value is selected to obtain a new feature with a smaller dimension. That is, the pooling operation of the convolutional neural network.
  • the sample data matrix output by the convolutional layer is divided into sub-matrices. Specifically, a convolution kernel of a preset size (less than the size of the sample data matrix) is set.
  • the convolution kernel window is a window with the same length and width. Of course, the convolution kernel of windows with different length and width values is not excluded.
  • the sample data matrix can be divided into multiple sub-matrices through the convolution kernel.
  • the extraction module 30 is used to randomly extract a sample data from each data sub-matrix.
  • when the server performs the pooling operation in the convolutional neural network model, it randomly extracts one piece of sample data from each data sub-matrix, so that a new matrix can be formed from the data extracted from each data sub-matrix.
  • the data randomly extracted each time can be any data value in the data sub-matrix, and each data value can also be extracted repeatedly.
  • the generating module 40 is configured to generate a pooling matrix according to each extracted sample data, and use the pooling matrix as the output of the pooling layer.
  • a new matrix, that is, a pooling matrix, can be generated from the sample data extracted from each data sub-matrix, and the pooling matrix is used as the output of the pooling layer and output to the fully connected hidden layer of the convolutional neural network model.
  • each module in the convolutional neural network pooling device provided in this application is also used to execute the operations corresponding to each step of the convolutional neural network pooling method described in this application, which are not described in detail again here.
  • the application also provides a storage medium.
  • the storage medium stores a computer program; when the computer program is executed by a processor, the convolutional neural network pooling method described in any of the above embodiments is implemented.
  • the storage medium may be a memory.
  • the internal memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
  • External storage can include hard disks, floppy disks, ZIP disks, U disks, tapes, etc.
  • the storage medium disclosed in this application includes but is not limited to these types of memories.
  • the memory disclosed in this application is only an example and not a limitation.
  • a computer device includes: one or more processors; memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the pooling method of the convolutional neural network described in any of the above embodiments.
  • FIG. 8 is a schematic structural diagram of a computer device in an embodiment of this application.
  • the computer device described in this embodiment may be a server, a personal computer, or a network device.
  • the device includes a processor 803, a memory 805, an input unit 807, a display unit 809 and other devices.
  • the memory 805 may be used to store an application program 801 and various functional modules, and the processor 803 runs the application program 801 stored in the memory 805 to execute various functional applications and data processing of the device.
  • the memory may be internal memory or external memory, or include both internal memory and external memory.
  • the internal memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
  • External storage can include hard disks, floppy disks, ZIP disks, U disks, tapes, etc.
  • the memory disclosed in this application includes but is not limited to these types of memory.
  • the memory disclosed in this application is only an example and not a limitation.
  • the input unit 807 is used to receive signal input and keywords input by the user.
  • the input unit 807 may include a touch panel and other input devices.
  • the touch panel can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel with any suitable object or accessory such as a finger or stylus) and drive the corresponding connection device according to a preset program; other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control buttons, switch buttons, etc.), trackball, mouse, and joystick.
  • the display unit 809 may be used to display information input by the user or information provided to the user and various menus of the computer device.
  • the display unit 809 may take the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the processor 803 is the control center of the computer device. It uses various interfaces and lines to connect the parts of the entire computer, and performs various functions and processes data by running or executing the software programs and/or modules stored in the memory 805 and calling the data stored in the memory.
  • the device includes one or more processors 803, one or more memories 805, and one or more application programs 801.
  • the one or more application programs 801 are stored in the memory 805 and configured to be executed by the one or more processors 803, and the one or more application programs 801 are configured to execute the pooling method of the convolutional neural network described in the above-mentioned embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer readable storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A pooling method and device for a convolutional neural network, a storage medium and a computer device, the method comprising: obtaining a convolutional neural network training sample, and inputting the training sample into a convolutional neural network model (S100); inputting a sample data matrix outputted by a convolutional layer of the convolutional neural network model into a pooling layer of the convolutional neural network model, and dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer (S200); randomly extracting sample data from each data submatrix (S300); and generating a pooling matrix according to said extracted sample data, and using the pooling matrix as the output of the pooling layer (S400). According to the present method, pooled data samples may be expanded, and the input information of the convolutional neural network is enriched.

Description

卷积神经网络的池化方法、装置及存储介质、计算机设备Convolutional neural network pooling method, device, storage medium, and computer equipment
本申请要求于2019年02月13日提交中国专利局、申请号为201910113187.9、申请名称为“卷积神经网络的池化方法、装置及存储介质、计算机设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on February 13, 2019, the application number is 201910113187.9, and the application name is "Convolutional Neural Network Pooling Method, Device, Storage Medium, and Computer Equipment". The entire content is incorporated into this application by reference.
Technical field
This application relates to the technical field of convolutional neural network models, and specifically to a pooling method and device for a convolutional neural network, a storage medium, and a computer device.
Background
A convolutional neural network (CNN) is a feedforward neural network. It uses a large number of artificial neurons, each of which responds to surrounding units within part of the coverage area, and it is often used for large-scale image processing. A convolutional neural network includes convolutional layers and pooling layers.
The pooling functions in common use today are max pooling (max-pooling) and average pooling (mean-pooling). The disadvantage of max pooling is that every pooling operation discards all information other than the value at the maximum position; the disadvantage of average pooling is that it cannot reflect information that is far from the average but still important. The pooling methods currently used in convolutional neural networks therefore reduce the input information of the network, which lowers the accuracy of the output of the convolutional neural network model.
Summary of the invention
This application proposes a pooling method and device for a convolutional neural network, a storage medium, and a computer device, so as to expand the pooled data samples and enrich the input information of the convolutional neural network.
This application provides the following solutions:
A pooling method for a convolutional neural network, including: obtaining training samples of the convolutional neural network and inputting the training samples into a convolutional neural network model; inputting the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, and dividing the sample data matrix into multiple data sub-matrices in the pooling layer; randomly extracting one piece of sample data from each data sub-matrix; and generating a pooling matrix from the extracted sample data and using the pooling matrix as the output of the pooling layer.
A pooling device for a convolutional neural network, including: an acquisition module, configured to obtain training samples of the convolutional neural network and input the training samples into a convolutional neural network model; a division module, configured to input the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, and to divide the sample data matrix into multiple data sub-matrices in the pooling layer; an extraction module, configured to randomly extract one piece of sample data from each data sub-matrix; and a generation module, configured to generate a pooling matrix from the extracted sample data and use the pooling matrix as the output of the pooling layer.
A non-volatile computer storage medium on which a computer program is stored; the computer program is adapted to be loaded by a processor and to execute the pooling method for a convolutional neural network described in any of the above embodiments.
A computer device, including: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the pooling method for a convolutional neural network according to any of the above embodiments.
In the pooling method for a convolutional neural network provided by the above embodiments, data is randomly sampled from each data sub-matrix of the sample data in the pooling layer and then pooled, so that an M*M sample data matrix can yield M*M*M*M different pooling results. Compared with traditional max pooling or average pooling, this expands the pooled data samples and enriches the input information of the convolutional neural network.
Additional aspects and advantages of this application will be given in part in the following description; they will become apparent from that description or be learned through practice of this application.
Description of the drawings
The above and/or additional aspects and advantages of this application will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the internal structure of a convolutional neural network model in an embodiment provided by this application;
FIG. 2 is a schematic diagram of the max pooling method in an embodiment provided by this application;
FIG. 3 is a schematic diagram of the average pooling method in an embodiment provided by this application;
FIG. 4 is a flowchart of a pooling method for a convolutional neural network in an embodiment provided by this application;
FIG. 5 is a flowchart of step S200 in an embodiment provided by this application;
FIG. 6 is a schematic diagram of the random pooling method in an embodiment provided by this application;
FIG. 7 is a structural block diagram of a pooling device for a convolutional neural network in an embodiment provided by this application;
FIG. 8 is a schematic structural diagram of a computer device in an embodiment provided by this application.
Detailed description
The embodiments of this application are described in detail below, and examples of the embodiments are shown in the accompanying drawings.
This application provides a pooling method for a convolutional neural network, which is applied in the pooling layer of the convolutional neural network. The background of the pooling method described in this application is first introduced below:
As shown in FIG. 1, between the input layer and the output layer, the convolutional neural network contains two convolutional layers, two pooling layers, and one fully connected hidden layer. In general, the input of a pooling layer comes from the preceding convolutional layer, and the pooling layer mainly provides strong robustness. It reduces the number of deep learning parameters and also helps prevent overfitting. Fewer parameters mean that the convolutional neural network trains faster, the trained model file is smaller, and inference is faster when predicting samples. The pooling function is therefore of great significance to convolutional neural networks.
The pooling function in a convolutional neural network generally uses the max pooling (max-pooling) or average pooling (mean-pooling) method. The max pooling method is shown in FIG. 2: after a 4-by-4 matrix is pooled by a 2-by-2 max pooling function, it becomes a 2-by-2 matrix. The pooling calculation is as follows: split the 4-by-4 matrix into four 2-by-2 sub-matrices and take the maximum value of each sub-matrix to form a new matrix; this is max pooling. The average pooling method is shown in FIG. 3: after a 4-by-4 matrix is pooled by a 2-by-2 average pooling function, it becomes a 2-by-2 matrix. The pooling calculation is as follows: split the 4-by-4 matrix into four 2-by-2 sub-matrices and take the average value of each sub-matrix to form a new matrix; this is average pooling. However, with max pooling, every pooling operation discards all information other than the value at the maximum position, and with average pooling, information that is far from the average but still important is not reflected.
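The two conventional pooling calculations described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `pool2d` and the sample values are assumptions chosen for the example, not values taken from FIG. 2 or FIG. 3.

```python
def pool2d(matrix, k, reduce_fn):
    """Pool a matrix with non-overlapping k-by-k windows.

    reduce_fn maps the list of values in one window to a single value
    (for example, the built-in max, or the mean helper below).
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows % k != 0 or cols % k != 0:
        raise ValueError("matrix dimensions must be multiples of k")
    pooled = []
    for r in range(0, rows, k):
        out_row = []
        for c in range(0, cols, k):
            # Collect the k*k values of the sub-matrix whose top-left corner is (r, c).
            window = [matrix[r + i][c + j] for i in range(k) for j in range(k)]
            out_row.append(reduce_fn(window))
        pooled.append(out_row)
    return pooled


def mean(values):
    return sum(values) / len(values)


# Illustrative 4-by-4 input (assumed values).
sample = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 11, 10, 12],
    [13, 15, 14, 16],
]

print(pool2d(sample, 2, max))   # [[7, 8], [15, 16]]
print(pool2d(sample, 2, mean))  # [[4.0, 5.0], [12.0, 13.0]]
```

In both cases each 2-by-2 sub-matrix is collapsed to one deterministic value, which is the property the random pooling method in this application is meant to relax.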
This application provides a pooling method for a convolutional neural network, so as to expand the pooled data samples and enrich the input information of the convolutional neural network. In an embodiment, as shown in FIG. 4, the pooling method for a convolutional neural network includes the following steps:
S100: Obtain training samples of the convolutional neural network, and input the training samples into the convolutional neural network model.
In this embodiment, before performing the data operations of the convolutional neural network structure, the server first obtains training samples of feature data. The training samples are the target features that the server computes on through the convolutional neural network. The training samples are then input into the convolutional neural network model. The convolutional neural network model includes a convolutional layer, a sampling layer, an activation layer, a pooling layer, and a fully connected layer. The convolutional layer is used to extract spatial features of the input data, and may include multiple convolution kernels so as to extract multiple spatial features. The activation layer may use a nonlinear activation function. The pooling layer is used to avoid overfitting in the convolution process. The fully connected layer connects adjacent neurons in the network, and its output can be passed through the softmax function to obtain different probability values.
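For readers unfamiliar with the softmax step mentioned for the fully connected layer, the following is a minimal sketch of how raw output scores can be turned into probability values. The function name and the example scores are assumptions for illustration; the application does not specify an implementation.

```python
import math


def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum score is a common numerical-stability trick;
    # it does not change the resulting probabilities.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three image classes.
probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
```

The largest score receives the largest probability, and the class with the highest probability would typically be taken as the classification result.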
In an embodiment, the training samples are image sample data. Step S100 includes: obtaining the image sample data as the training samples of the convolutional neural network model.
In this embodiment, the sample data used for model training may be image sample data, and the model output is the classification of the image sample data. Further, obtaining the image sample data as the training samples of the convolutional neural network model includes: obtaining picture sample data from the image sample data as the training samples of the convolutional neural network model.
Here, the steps of inputting the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer, dividing the sample data matrix into multiple data sub-matrices in the pooling layer, randomly extracting one piece of sample data from each data sub-matrix, generating a pooling matrix from the extracted sample data, and using the pooling matrix as the output of the pooling layer include: activating the picture sample data matrix output by the first convolutional layer of the convolutional neural network model and inputting it into the first pooling layer; in the first pooling layer, dividing the sample data matrix into multiple data sub-matrices, randomly extracting one piece of picture sample data from each data sub-matrix, and generating a pooling matrix from the extracted picture sample data as the output of the first pooling layer; inputting the data output by the first pooling layer into the second convolutional layer for convolution; randomly activating the data output by the second convolutional layer; inputting the randomly activated data into the second pooling layer for pooling training; and using the pooling training result as the output of the system's pooling layer.
S200: Input the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, and divide the sample data matrix into multiple data sub-matrices in the pooling layer.
In this embodiment, when the server inputs the training samples into the convolutional neural network model, the training samples are first input into the convolutional layer of the model. The convolutional layer performs convolution training on the training samples, extracts different spatial features of the input training samples, and outputs a sample data matrix containing multiple features. The sample data matrix obtained after the convolutional layer usually has features of large dimension. In conventional pooling, the features are cut into several regions and the maximum or average value of each region is taken, giving new features of smaller dimension; this is the pooling operation of a convolutional neural network. When performing the pooling layer operation, the sample data matrix output by the convolutional layer is divided into sub-matrices. Specifically, a convolution kernel of a preset size (smaller than the size of the sample data matrix) may be set. The convolution kernel window is a window whose length and width are equal, although convolution kernels whose window length and width differ are not excluded. The sample data matrix can be divided into multiple sub-matrices by means of the convolution kernel.
In an embodiment, the multiple data sub-matrices include data sub-matrices whose number of rows equals their number of columns. As shown in FIG. 5, in step S200, dividing the sample data matrix into multiple data sub-matrices in the pooling layer includes:
S210: Obtain the number of rows and the number of columns of the sample data matrix.
S230: According to the number of rows and the number of columns of the sample data matrix, divide the sample data matrix in the pooling layer into multiple data sub-matrices whose number of rows equals their number of columns.
The convolution kernel in a convolutional neural network is generally a small square matrix, and mainstream deep learning frameworks generally support convolution kernels of equal length and width. For example, a 3-by-3 or a 2-by-2 convolution kernel may be used. In this embodiment, the number of rows M and the number of columns N of the sample data matrix are obtained, and the sample data matrix is divided according to M and N into multiple data sub-matrices whose number of rows equals their number of columns. The number of rows M and the number of columns N may be the same.
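Steps S210 and S230 can be sketched as follows. This is a minimal illustration that assumes non-overlapping k-by-k windows and a matrix whose M rows and N columns are both multiples of k; the function name `split_into_square_blocks` is hypothetical.

```python
def split_into_square_blocks(matrix, k):
    """Divide a matrix into a grid of non-overlapping k-by-k sub-matrices."""
    # S210: obtain the number of rows (M) and columns (N) of the sample data matrix.
    m, n = len(matrix), len(matrix[0])
    if m % k != 0 or n % k != 0:
        raise ValueError("M and N must be multiples of k in this sketch")
    # S230: cut the matrix into square sub-matrices.
    # The result is indexed as blocks[block_row][block_col][row][col].
    return [
        [[row[c:c + k] for row in matrix[r:r + k]] for c in range(0, n, k)]
        for r in range(0, m, k)
    ]


sample = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 11, 10, 12],
    [13, 15, 14, 16],
]
blocks = split_into_square_blocks(sample, 2)
print(blocks[0][0])  # [[1, 3], [5, 7]]
print(blocks[1][1])  # [[10, 12], [14, 16]]
```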
In other embodiments, the multiple data sub-matrices include first data sub-matrices whose number of rows equals their number of columns and second data sub-matrices whose number of rows differs from their number of columns. That is, when the sample data matrix is divided, multiple first data sub-matrices with equal numbers of rows and columns and multiple second data sub-matrices with unequal numbers of rows and columns may be produced at the same time. Specifically, in step S200, dividing the sample data matrix into multiple data sub-matrices in the pooling layer includes: obtaining the number of rows and the number of columns of the sample data matrix; and, according to the number of rows and the number of columns of the sample data matrix, dividing the sample data matrix in the pooling layer into multiple first data sub-matrices and multiple second data sub-matrices.
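The application does not state how the mixed division is carried out. One plausible reading, sketched below purely as an assumption, is that a k-by-k window is applied to a matrix whose dimensions are not multiples of k, so that the windows at the edge are truncated and become non-square. The function name `split_with_remainder` is hypothetical.

```python
def split_with_remainder(matrix, k):
    """Divide a matrix with k-by-k windows, keeping truncated edge windows.

    Returns (square_blocks, non_square_blocks). The truncation rule is an
    assumption made for this sketch, not a rule stated in the application.
    """
    m, n = len(matrix), len(matrix[0])
    square, non_square = [], []
    for r in range(0, m, k):
        for c in range(0, n, k):
            # Slicing past the edge simply yields a smaller block.
            block = [row[c:c + k] for row in matrix[r:r + k]]
            if len(block) == len(block[0]):
                square.append(block)      # first data sub-matrix
            else:
                non_square.append(block)  # second data sub-matrix
    return square, non_square


# A 4-by-5 matrix divided with a 2-by-2 window: the last column is left over.
sample = [[r * 5 + c for c in range(5)] for r in range(4)]
square, non_square = split_with_remainder(sample, 2)
print(len(square), len(non_square))
print(non_square[0])
```

With this input, the four interior windows are 2-by-2 and the two windows covering the last column are 2-by-1.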
S300: Randomly extract one piece of sample data from each data sub-matrix.
In this embodiment, when the server performs the pooling operation in the convolutional neural network model, it randomly extracts one piece of sample data from each data sub-matrix, so that a new matrix can be formed from the data extracted from the data sub-matrices. Here, the data extracted at random each time may be any data value in the data sub-matrix, and each data value may be extracted repeatedly.
In an embodiment, the data sub-matrix is a matrix with multiple rows and multiple columns. Step S300 includes: randomly selecting any row of data from the multiple rows of each data sub-matrix, and randomly selecting from that row the data corresponding to any column as the sample data; or, randomly selecting any column of data from the multiple columns of each data sub-matrix, and randomly selecting from that column the data corresponding to any row as the sample data.
In this embodiment, the server first randomly selects any row of data from the multi-row, multi-column data sub-matrix and then selects any column from that row; the data corresponding to that column is the sample data. Alternatively, the server first randomly selects any column of data from the multi-row, multi-column data sub-matrix and then selects any row from that column; the data corresponding to that row is the sample data.
In an embodiment, the data sub-matrix is a matrix with multiple rows and multiple columns. Step S300 includes: randomly selecting, from the multiple rows and multiple columns of each data sub-matrix, the data corresponding to any row and any column as the sample data. In this embodiment, the server randomly selects any row value and column value, and the data corresponding to that row value and column value is the sample data. For example, if the server randomly selects the row value and column value (4, 3), the data at (4, 3) in the data sub-matrix is the sample data.
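The three selection strategies described for step S300 (row first then column, column first then row, or a random row and column index together) can be sketched as below. The function names are hypothetical, the random source is Python's standard `random.Random`, and indices are 0-based here, whereas the (4, 3) example in the text appears to count from 1.

```python
import random


def pick_row_then_column(block, rng):
    """Randomly choose a row, then randomly choose a value within that row."""
    row = rng.choice(block)
    return rng.choice(row)


def pick_column_then_row(block, rng):
    """Randomly choose a column, then randomly choose a value within that column."""
    col_index = rng.randrange(len(block[0]))
    column = [row[col_index] for row in block]
    return rng.choice(column)


def pick_index_pair(block, rng):
    """Randomly choose a (row, column) index pair and return the value there."""
    r = rng.randrange(len(block))
    c = rng.randrange(len(block[0]))
    return block[r][c]


rng = random.Random(0)  # seeded only to make the illustration repeatable
block = [[1, 3], [5, 7]]
print(pick_row_then_column(block, rng))
print(pick_column_then_row(block, rng))
print(pick_index_pair(block, rng))
```

For a rectangular sub-matrix all three strategies give every element the same probability of being chosen, so they differ only in how the selection is carried out.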
S400: Generate a pooling matrix from the extracted sample data, and use the pooling matrix as the output of the pooling layer.
In this embodiment, a new matrix, namely the pooling matrix, can be generated from the sample data extracted from each data sub-matrix, and the pooling matrix is used as the output of the pooling layer, to be output to the fully connected hidden layer of the convolutional neural network model. In a specific implementation, as shown in FIG. 6, a 4-by-4 sample data matrix that is divided into 2-by-2 data sub-matrices and pooled by the random pooling function can become any of multiple 2-by-2 matrices. Specifically, the random pooling calculation is as follows: split the 4-by-4 matrix into four 2-by-2 sub-matrices and randomly take one value from each sub-matrix to form a new matrix, which is the random pooling matrix. With random pooling, a 4-by-4 matrix produces 4*4*4*4 = 256 different pooling results (four sub-matrices with four candidate values each), which is equivalent to expanding the sample 256-fold and greatly enriches the input information of the convolutional neural network.
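The random pooling step and the count of possible pooling results can be checked with a short sketch. The helper names and sample values are assumptions for illustration; the count is obtained by enumerating one pick per sub-matrix, which equals the product of the sub-matrix sizes.

```python
import itertools
import random


def windows(matrix, k):
    """Flatten each non-overlapping k-by-k sub-matrix into a list (row-major order).

    Assumes the matrix dimensions are multiples of k.
    """
    rows, cols = len(matrix), len(matrix[0])
    return [
        [matrix[r + i][c + j] for i in range(k) for j in range(k)]
        for r in range(0, rows, k)
        for c in range(0, cols, k)
    ]


def random_pool(matrix, k, rng):
    """Pick one value uniformly at random from every sub-matrix."""
    out_cols = len(matrix[0]) // k
    picks = [rng.choice(w) for w in windows(matrix, k)]
    # Reshape the flat list of picks back into the pooled matrix.
    return [picks[i:i + out_cols] for i in range(0, len(picks), out_cols)]


def all_outcomes(matrix, k):
    """Enumerate every possible combination of one pick per sub-matrix."""
    return list(itertools.product(*windows(matrix, k)))


sample = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 11, 10, 12],
    [13, 15, 14, 16],
]
print(random_pool(sample, 2, random.Random(0)))  # one possible 2-by-2 pooling matrix
print(len(all_outcomes(sample, 2)))              # number of possible pooling results
```

Repeated calls to `random_pool` on the same input generally return different pooling matrices, which is how the method enlarges the set of pooled samples seen during training.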
In the pooling method for a convolutional neural network provided by the above embodiments, the training samples are input into the convolutional neural network model and, after convolution processing by the convolutional layer of the model, the sample data matrix output by the convolutional layer is input into the pooling layer of the model. In the pooling layer, the sample data matrix is divided into multiple data sub-matrices, and one data sample is randomly extracted from each data sub-matrix to generate the pooling matrix. By randomly sampling data from each data sub-matrix of the sample data in the pooling layer and then pooling, an M*M sample data matrix can yield M*M*M*M different pooling results. Compared with traditional max pooling or average pooling, this expands the pooled data samples and enriches the input information of the convolutional neural network.
In an embodiment, the above pooling method for a convolutional neural network can be used in image classification. Specifically, an image classification method applying this pooling method includes: obtaining image data of a target image; inputting the image data into a convolutional neural network model to obtain to-be-classified result data of the target image, where the convolutional neural network model is used to perform image feature category analysis on the image data and output the to-be-classified result data, and where the pooling layer in the convolutional neural network model is used to divide the image data output by the convolutional layer of the model into multiple data sub-matrices, randomly extract one piece of image sample data from each data sub-matrix, generate a pooling matrix from the extracted image sample data, and use the pooling matrix as the pooling output; and classifying the target image according to the to-be-classified result data.
Further, the convolutional neural network model also includes a first activation unit and a second activation unit; the convolutional layer includes a first convolutional layer and a second convolutional layer; and the pooling layer includes a first pooling layer and a second pooling layer. The first pooling layer and the second pooling layer are both used to divide the image data output by the convolutional layer of the convolutional neural network model into multiple data sub-matrices, randomly extract one piece of image sample data from each data sub-matrix, generate a pooling matrix from the extracted image sample data, and use the pooling matrix as the pooling output. The first activation unit is used to apply non-random activation to the image data output by the first convolutional layer and input the result into the first pooling layer; the first pooling layer is used to pool the input image data and output it to the second convolutional layer; the second convolutional layer is used to convolve the input image data and output it to the second activation unit; the second activation unit is used to apply random activation to the input image data and output it to the fully connected layer of the convolutional neural network model; and the fully connected layer is used to output the to-be-classified result data.
Specifically, a typical convolutional neural network has convolutional layers, activation layers, pooling layers, and a fully connected layer. A three-layer convolutional neural network can be expressed as: input picture -> convolution -> activation -> pooling -> convolution -> activation -> pooling -> convolution -> activation -> pooling -> fully connected layer -> to-be-classified output. If random pooling is applied to this convolutional neural network, the structure of the network becomes: input picture -> convolution -> activation -> random pooling -> convolution -> random activation -> pooling -> convolution -> random activation -> pooling -> fully connected layer -> to-be-classified output.
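The first stage of the modified structure (convolution -> activation -> random pooling) can be sketched as below. This is an assumption-laden illustration: ReLU is used as the non-random activation because the application names no specific function, the "random activation" of the later stages is not defined in the application and is therefore not sketched, and all function names are hypothetical.

```python
import random


def conv2d_valid(image, kernel):
    """2D convolution without padding, in the cross-correlation form
    commonly used by deep learning frameworks."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise ReLU, used here as an assumed non-random activation."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def random_pool(feature_map, k, rng):
    """Pick one random value from each non-overlapping k-by-k window.

    Assumes the feature map dimensions are multiples of k.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            rng.choice([feature_map[r + i][c + j] for i in range(k) for j in range(k)])
            for c in range(0, cols, k)
        ]
        for r in range(0, rows, k)
    ]


def first_stage(image, kernel, k, rng):
    """input picture -> convolution -> activation -> random pooling."""
    return random_pool(relu(conv2d_valid(image, kernel)), k, rng)


# A 5-by-5 input and a 2-by-2 kernel give a 4-by-4 feature map,
# which 2-by-2 random pooling reduces to 2-by-2.
image = [[float(r + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0], [0.0, 1.0]]
print(first_stage(image, kernel, 2, random.Random(0)))
```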
This application also provides a pooling device for a convolutional neural network. In an embodiment, as shown in FIG. 7, the pooling device includes an acquisition module 10, a division module 20, an extraction module 30, and a generation module 40.
The acquisition module 10 is used to obtain training samples of the convolutional neural network and input the training samples into the convolutional neural network model. In this embodiment, before performing the data operations of the convolutional neural network structure, the server first obtains training samples of feature data. The training samples are the target features that the server computes on through the convolutional neural network. The training samples are then input into the convolutional neural network model. The convolutional neural network model includes a convolutional layer, a sampling layer, an activation layer, a pooling layer, and a fully connected layer. The convolutional layer is used to extract spatial features of the input data, and may include multiple convolution kernels so as to extract multiple spatial features. The activation layer may use a nonlinear activation function. The pooling layer is used to avoid overfitting in the convolution process. The fully connected layer connects adjacent neurons in the network, and its output can be passed through the softmax function to obtain different probability values.
The division module 20 is used to input the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, and to divide the sample data matrix into multiple data sub-matrices in the pooling layer. In this embodiment, when the server inputs the training samples into the convolutional neural network model, the training samples are first input into the convolutional layer of the model. The convolutional layer performs convolution training on the training samples, extracts different spatial features of the input training samples, and outputs a sample data matrix containing multiple features. The sample data matrix obtained after the convolutional layer usually has features of large dimension. In conventional pooling, the features are cut into several regions and the maximum or average value of each region is taken, giving new features of smaller dimension; this is the pooling operation of a convolutional neural network. When performing the pooling layer operation, the sample data matrix output by the convolutional layer is divided into sub-matrices. Specifically, a convolution kernel of a preset size (smaller than the size of the sample data matrix) may be set. The convolution kernel window is a window whose length and width are equal, although convolution kernels whose window length and width differ are not excluded. The sample data matrix can be divided into multiple sub-matrices by means of the convolution kernel.
The extraction module 30 is used to randomly extract one piece of sample data from each data sub-matrix. In this embodiment, when the server performs the pooling operation in the convolutional neural network model, it randomly extracts one piece of sample data from each data sub-matrix, so that a new matrix can be formed from the data extracted from the data sub-matrices. Here, the data extracted at random each time may be any data value in the data sub-matrix, and each data value may be extracted repeatedly.
The generation module 40 is used to generate a pooling matrix from the extracted sample data and use the pooling matrix as the output of the pooling layer. In this embodiment, a new matrix, namely the pooling matrix, can be generated from the sample data extracted from each data sub-matrix, and the pooling matrix is used as the output of the pooling layer, to be output to the fully connected hidden layer of the convolutional neural network model.
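One way to picture how the four modules cooperate is a small class whose methods mirror modules 10 to 40. This is only a structural sketch under stated assumptions: the class and method names are hypothetical, the convolutional layer is passed in as an arbitrary callable rather than implemented, and the sub-matrices are assumed to be non-overlapping k-by-k windows.

```python
import random


class RandomPoolingDevice:
    """Illustrative arrangement of acquisition, division, extraction and generation."""

    def __init__(self, conv_layer, block_size, seed=None):
        self.conv_layer = conv_layer  # hypothetical callable: sample -> feature matrix
        self.k = block_size
        self.rng = random.Random(seed)

    def acquire(self, training_samples):
        """Acquisition module 10: feed the training samples into the model."""
        return [self.conv_layer(sample) for sample in training_samples]

    def divide(self, matrix):
        """Division module 20: split the sample data matrix into sub-matrices."""
        k = self.k
        rows, cols = len(matrix), len(matrix[0])
        return [
            [[row[c:c + k] for row in matrix[r:r + k]] for c in range(0, cols, k)]
            for r in range(0, rows, k)
        ]

    def extract(self, block_grid):
        """Extraction module 30: randomly take one value from each sub-matrix."""
        return [
            [self.rng.choice([v for row in block for v in row]) for block in block_row]
            for block_row in block_grid
        ]

    def generate(self, extracted):
        """Generation module 40: assemble the pooling matrix (the layer output)."""
        return [list(row) for row in extracted]

    def run(self, training_samples):
        return [
            self.generate(self.extract(self.divide(feature_matrix)))
            for feature_matrix in self.acquire(training_samples)
        ]


# Usage with an identity function standing in for the convolutional layer.
device = RandomPoolingDevice(conv_layer=lambda sample: sample, block_size=2, seed=0)
sample = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 11, 10, 12],
    [13, 15, 14, 16],
]
print(device.run([sample]))
```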
In other embodiments, the modules of the pooling device for a convolutional neural network provided in this application are further configured to perform the operations of the corresponding steps of the pooling method described in this application, which are not described again here.
This application also provides a storage medium. A computer program is stored on the storage medium; when the computer program is executed by a processor, it implements the pooling method for a convolutional neural network described in any of the above embodiments. The storage medium may be a memory, for example an internal memory, an external memory, or both. The internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. The external memory may include a hard disk, a floppy disk, a ZIP disk, a USB flash drive, a magnetic tape, and the like. The storage media disclosed in this application include, but are not limited to, these types of memory. The memory disclosed in this application is given only as an example and not as a limitation.
This application also provides a computer device. The computer device includes one or more processors, a memory, and one or more application programs. The one or more application programs are stored in the memory and configured to be executed by the one or more processors, and are configured to perform the pooling method for a convolutional neural network described in any of the above embodiments.
FIG. 8 is a schematic structural diagram of a computer device in an embodiment of this application. The computer device described in this embodiment may be a server, a personal computer, or a network device. As shown in FIG. 8, the device includes a processor 803, a memory 805, an input unit 807, a display unit 809, and other components. Those skilled in the art will understand that the device structure shown in FIG. 8 does not limit all devices, which may include more or fewer components than shown, or combine certain components. The memory 805 may be used to store an application program 801 and the functional modules, and the processor 803 runs the application program 801 stored in the memory 805 to perform the various functional applications and data processing of the device. The memory may be an internal memory, an external memory, or both. The internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. The external memory may include a hard disk, a floppy disk, a ZIP disk, a USB flash drive, a magnetic tape, and the like. The memory disclosed in this application includes, but is not limited to, these types of memory, and is given only as an example and not as a limitation.
The input unit 807 is used to receive signal input and keywords entered by the user. The input unit 807 may include a touch panel and other input devices. The touch panel collects the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connection device according to a preset program. The other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control keys and switch keys), a trackball, a mouse, and a joystick. The display unit 809 may be used to display information entered by the user, information provided to the user, and the various menus of the computer device. The display unit 809 may take the form of a liquid crystal display, an organic light-emitting diode display, or the like. The processor 803 is the control center of the computer device. It connects the parts of the whole computer through various interfaces and lines, and performs the device's functions and processes data by running or executing the software programs and/or modules stored in the memory 805 and calling the data stored in the memory.
In one embodiment, the device includes one or more processors 803, one or more memories 805, and one or more application programs 801. The one or more application programs 801 are stored in the memory 805 and configured to be executed by the one or more processors 803, and are configured to perform the pooling method for a convolutional neural network described in the above embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or may be integrated two or more to a module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include a memory, a magnetic disk, an optical disc, or the like.
It should be understood that the functional units in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or may be integrated two or more to a module. The integrated module may be implemented in hardware or as a software functional module. The above are only some of the embodiments of this application. It should be noted that those of ordinary skill in the art may make improvements and refinements without departing from the principles of this application, and such improvements and refinements shall also fall within the scope of protection of this application.

Claims (20)

  1. A pooling method for a convolutional neural network, characterized in that it comprises:
    obtaining training samples for the convolutional neural network, and inputting the training samples into a convolutional neural network model;
    inputting a sample data matrix output by a convolutional layer of the convolutional neural network model into a pooling layer of the convolutional neural network model, and dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer;
    randomly extracting one sample data value from each data sub-matrix;
    generating a pooling matrix from the extracted sample data, and using the pooling matrix as the output of the pooling layer.
  2. The method according to claim 1, characterized in that the plurality of data sub-matrices comprise data sub-matrices whose number of rows equals their number of columns; and dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer comprises:
    obtaining the number of rows and the number of columns of the sample data matrix;
    dividing, in the pooling layer and according to the number of rows and the number of columns of the sample data matrix, the sample data matrix into a plurality of data sub-matrices whose number of rows equals their number of columns.
  3. The method according to claim 1, characterized in that the plurality of data sub-matrices comprise first data sub-matrices whose number of rows equals their number of columns and second data sub-matrices whose number of rows differs from their number of columns; and dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer comprises:
    obtaining the number of rows and the number of columns of the sample data matrix;
    dividing, in the pooling layer and according to the number of rows and the number of columns of the sample data matrix, the sample data matrix into a plurality of the first data sub-matrices and a plurality of the second data sub-matrices.
  4. The method according to claim 1, characterized in that each data sub-matrix is a matrix with multiple rows and multiple columns; and randomly extracting one sample data value from each data sub-matrix comprises:
    randomly selecting any one row of data from the multiple rows of each data sub-matrix, and randomly selecting, from that row, the data corresponding to any one column as the sample data; or,
    randomly selecting any one column of data from the multiple columns of each data sub-matrix, and randomly selecting, from that column, the data corresponding to any one row as the sample data.
  5. The method according to claim 1, characterized in that each data sub-matrix is a matrix with multiple rows and multiple columns; and randomly extracting one sample data value from each data sub-matrix comprises:
    randomly selecting, from the multiple rows and multiple columns of each data sub-matrix, the data corresponding to any one row and any one column as the sample data.
  6. The method according to claim 1, characterized in that the training samples are image sample data; and obtaining training samples for the convolutional neural network and inputting the training samples into the convolutional neural network model comprises:
    obtaining the image sample data as the training samples of the convolutional neural network model.
  7. The method according to claim 6, characterized in that obtaining the image sample data as the training samples of the convolutional neural network model comprises: obtaining picture sample data from the image sample data as the training samples of the convolutional neural network model;
    and inputting the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer, randomly extracting one sample data value from each data sub-matrix, generating a pooling matrix from the extracted sample data, and using the pooling matrix as the output of the pooling layer comprises:
    activating the picture sample data matrix output by a first convolutional layer of the convolutional neural network model and inputting it into a first pooling layer of the convolutional neural network model; dividing the sample data matrix into a plurality of data sub-matrices in the first pooling layer; randomly extracting one picture sample data value from each data sub-matrix; generating a pooling matrix from the extracted picture sample data as the output of the first pooling layer; inputting the data output by the first pooling layer into a second convolutional layer for convolution; randomly activating the data output by the second convolutional layer; inputting the randomly activated data into a second pooling layer for pooling training; and using the pooling training result as the output of the system pooling layer.
  8. A pooling device for a convolutional neural network, characterized in that it comprises:
    an obtaining module, configured to obtain training samples for the convolutional neural network and input the training samples into a convolutional neural network model;
    a dividing module, configured to input a sample data matrix output by a convolutional layer of the convolutional neural network model into a pooling layer of the convolutional neural network model, and to divide the sample data matrix into a plurality of data sub-matrices in the pooling layer;
    an extraction module, configured to randomly extract one sample data value from each data sub-matrix;
    a generating module, configured to generate a pooling matrix from the extracted sample data and use the pooling matrix as the output of the pooling layer.
  9. The device according to claim 8, characterized in that the plurality of data sub-matrices comprise data sub-matrices whose number of rows equals their number of columns; and when dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer, the dividing module is specifically configured to:
    obtain the number of rows and the number of columns of the sample data matrix;
    divide, in the pooling layer and according to the number of rows and the number of columns of the sample data matrix, the sample data matrix into a plurality of data sub-matrices whose number of rows equals their number of columns.
  10. The device according to claim 8, characterized in that the plurality of data sub-matrices comprise first data sub-matrices whose number of rows equals their number of columns and second data sub-matrices whose number of rows differs from their number of columns; and when dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer, the dividing module is specifically configured to:
    obtain the number of rows and the number of columns of the sample data matrix;
    divide, in the pooling layer and according to the number of rows and the number of columns of the sample data matrix, the sample data matrix into a plurality of the first data sub-matrices and a plurality of the second data sub-matrices.
  11. The device according to claim 8, characterized in that each data sub-matrix is a matrix with multiple rows and multiple columns; and when randomly extracting one sample data value from each data sub-matrix, the extraction module is specifically configured to:
    randomly select any one row of data from the multiple rows of each data sub-matrix, and randomly select, from that row, the data corresponding to any one column as the sample data; or,
    randomly select any one column of data from the multiple columns of each data sub-matrix, and randomly select, from that column, the data corresponding to any one row as the sample data.
  12. The device according to claim 8, characterized in that each data sub-matrix is a matrix with multiple rows and multiple columns; and when randomly extracting one sample data value from each data sub-matrix, the extraction module is specifically configured to:
    randomly select, from the multiple rows and multiple columns of each data sub-matrix, the data corresponding to any one row and any one column as the sample data.
  13. The device according to claim 8, characterized in that the training samples are image sample data; and the obtaining module is specifically configured to:
    obtain the image sample data as the training samples of the convolutional neural network model.
  14. The device according to claim 13, characterized in that, when obtaining the image sample data as the training samples of the convolutional neural network model, the obtaining module is specifically configured to: obtain picture sample data from the image sample data as the training samples of the convolutional neural network model;
    when inputting the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model and dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer, the dividing module is specifically configured to: activate the picture sample data matrix output by a first convolutional layer of the convolutional neural network model, input it into a first pooling layer of the convolutional neural network model, and divide the sample data matrix into a plurality of data sub-matrices in the first pooling layer;
    when randomly extracting one sample data value from each data sub-matrix, the extraction module is specifically configured to: randomly extract one picture sample data value from each data sub-matrix;
    when generating a pooling matrix from the extracted sample data and using the pooling matrix as the output of the pooling layer, the generating module is specifically configured to: generate a pooling matrix from the extracted picture sample data as the output of the first pooling layer; input the data output by the first pooling layer into a second convolutional layer for convolution; randomly activate the data output by the second convolutional layer; input the randomly activated data into a second pooling layer for pooling training; and use the pooling training result as the output of the system pooling layer.
  15. A non-volatile computer storage medium, characterized in that a computer program is stored thereon; the computer program is adapted to be loaded by a processor to execute the pooling method for a convolutional neural network according to any one of claims 1 to 7.
  16. A computer device, characterized in that it comprises:
    one or more processors;
    a memory;
    one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to perform the following steps:
    obtaining training samples for the convolutional neural network, and inputting the training samples into a convolutional neural network model;
    inputting a sample data matrix output by a convolutional layer of the convolutional neural network model into a pooling layer of the convolutional neural network model, and dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer;
    randomly extracting one sample data value from each data sub-matrix;
    generating a pooling matrix from the extracted sample data, and using the pooling matrix as the output of the pooling layer.
  17. The computer device according to claim 16, characterized in that the plurality of data sub-matrices comprise data sub-matrices whose number of rows equals their number of columns; and when dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer, the one or more application programs are configured to perform the following steps:
    obtaining the number of rows and the number of columns of the sample data matrix;
    dividing, in the pooling layer and according to the number of rows and the number of columns of the sample data matrix, the sample data matrix into a plurality of data sub-matrices whose number of rows equals their number of columns.
  18. The computer device according to claim 16, characterized in that the plurality of data sub-matrices comprise first data sub-matrices whose number of rows equals their number of columns and second data sub-matrices whose number of rows differs from their number of columns; and when dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer, the one or more application programs are configured to perform the following steps:
    obtaining the number of rows and the number of columns of the sample data matrix;
    dividing, in the pooling layer and according to the number of rows and the number of columns of the sample data matrix, the sample data matrix into a plurality of the first data sub-matrices and a plurality of the second data sub-matrices.
  19. The computer device according to claim 16, characterized in that each data sub-matrix is a matrix with multiple rows and multiple columns; and when randomly extracting one sample data value from each data sub-matrix, the one or more application programs are configured to perform the following steps:
    randomly selecting any one row of data from the multiple rows of each data sub-matrix, and randomly selecting, from that row, the data corresponding to any one column as the sample data; or,
    randomly selecting any one column of data from the multiple columns of each data sub-matrix, and randomly selecting, from that column, the data corresponding to any one row as the sample data.
  20. The computer device according to claim 16, characterized in that the training samples are image sample data; and when obtaining training samples for the convolutional neural network and inputting the training samples into the convolutional neural network model, the one or more application programs are configured to perform the following step:
    obtaining picture sample data from the image sample data as the training samples of the convolutional neural network model;
    and when inputting the sample data matrix output by the convolutional layer of the convolutional neural network model into the pooling layer of the convolutional neural network model, dividing the sample data matrix into a plurality of data sub-matrices in the pooling layer, randomly extracting one sample data value from each data sub-matrix, generating a pooling matrix from the extracted sample data, and using the pooling matrix as the output of the pooling layer, the one or more application programs are configured to perform the following steps:
    activating the picture sample data matrix output by a first convolutional layer of the convolutional neural network model and inputting it into a first pooling layer of the convolutional neural network model; dividing the sample data matrix into a plurality of data sub-matrices in the first pooling layer; randomly extracting one picture sample data value from each data sub-matrix; generating a pooling matrix from the extracted picture sample data as the output of the first pooling layer; inputting the data output by the first pooling layer into a second convolutional layer for convolution; randomly activating the data output by the second convolutional layer; inputting the randomly activated data into a second pooling layer for pooling training; and using the pooling training result as the output of the system pooling layer.
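
The two-stage sequence recited in claims 7, 14 and 20 (convolution, activation, random pooling, a second convolution, random activation, a second pooling) can be sketched as below. This is an illustration only, and several choices are assumptions not stated in the claims: ReLU stands in for the unspecified activation, a dropout-style random mask stands in for "random activation", convolution is single-channel with stride 1 and no padding, and all function names are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, k: Matrix) -> Matrix:
    """Single-channel sliding-window product sum, stride 1, no padding."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def relu(x: Matrix) -> Matrix:
    """Assumed activation for the first stage."""
    return [[max(0.0, v) for v in row] for row in x]


def random_mask(x: Matrix, keep_prob: float, rng: random.Random) -> Matrix:
    """Assumed reading of 'random activation': keep each value with keep_prob."""
    return [[v if rng.random() < keep_prob else 0.0 for v in row] for row in x]


def random_pool(x: Matrix, win: int, rng: random.Random) -> Matrix:
    """One random value per non-overlapping win x win block (partial edges dropped)."""
    return [[x[r + rng.randrange(win)][c + rng.randrange(win)]
             for c in range(0, len(x[0]) - win + 1, win)]
            for r in range(0, len(x) - win + 1, win)]


def forward(image: Matrix, k1: Matrix, k2: Matrix, rng: random.Random) -> Matrix:
    a1 = relu(conv2d_valid(image, k1))   # first convolution + activation
    p1 = random_pool(a1, 2, rng)         # first pooling layer
    c2 = conv2d_valid(p1, k2)            # second convolution
    a2 = random_mask(c2, 0.5, rng)       # "random activation" (assumed)
    return random_pool(a2, 2, rng)       # second pooling layer


ones_image = [[1.0] * 10 for _ in range(10)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
result = forward(ones_image, ones_kernel, ones_kernel, random.Random(7))
```

With a 10x10 input and 3x3 kernels the shapes run 10x10, 8x8, 4x4, 2x2, 1x1, which makes the size reduction at each pooling stage easy to follow.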
PCT/CN2019/117863 2019-02-13 2019-11-13 Pooling method and device for convolutional neural network, storage medium and computer device WO2020164271A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910113187.9A CN109948651A (en) 2019-02-13 2019-02-13 Pooling method, apparatus, storage medium and computer device for convolutional neural networks
CN201910113187.9 2019-02-13

Publications (1)

Publication Number Publication Date
WO2020164271A1 true WO2020164271A1 (en) 2020-08-20

Family

ID=67007583

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117863 WO2020164271A1 (en) 2019-02-13 2019-11-13 Pooling method and device for convolutional neural network, storage medium and computer device

Country Status (2)

Country Link
CN (1) CN109948651A (en)
WO (1) WO2020164271A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948651A (en) * 2019-02-13 2019-06-28 平安科技(深圳)有限公司 Pooling method, apparatus, storage medium and computer device for convolutional neural networks
CN111882565B (en) * 2020-07-28 2023-07-07 深圳市雨滴科技有限公司 Image binarization method, device, equipment and storage medium
CN114124973B (en) * 2021-09-27 2023-06-09 烽火通信科技股份有限公司 Mirror image synchronization method and device for multi-cloud scene
CN115985465B (en) * 2023-03-21 2023-07-07 天津医科大学总医院 Myoelectric signal characteristic extraction method, device, equipment and storage medium based on time sequence
CN117251715B (en) * 2023-11-17 2024-03-19 华芯程(杭州)科技有限公司 Layout measurement area screening method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408435A (en) * 2014-12-05 2015-03-11 浙江大学 Face identification method based on random pooling convolutional neural network
US20170323202A1 (en) * 2016-05-06 2017-11-09 Fujitsu Limited Recognition apparatus based on deep neural network, training apparatus and methods thereof
CN107506722A (en) * 2017-08-18 2017-12-22 中国地质大学(武汉) One kind is based on depth sparse convolution neutral net face emotion identification method
CN107871136A (en) * 2017-03-22 2018-04-03 中山大学 The image-recognizing method of convolutional neural networks based on openness random pool
CN109948651A (en) * 2019-02-13 2019-06-28 平安科技(深圳)有限公司 Pond method, apparatus and storage medium, the computer equipment of convolutional neural networks

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052627A (en) * 2020-08-21 2020-12-08 海南星瞰信息咨询中心(有限合伙) Method, device, medium and equipment for estimating near-surface ozone space distribution
CN112099737A (en) * 2020-09-29 2020-12-18 北京百度网讯科技有限公司 Method, device and equipment for storing data and storage medium
CN112099737B (en) * 2020-09-29 2023-09-01 北京百度网讯科技有限公司 Method, device, equipment and storage medium for storing data
CN116167148A (en) * 2023-04-26 2023-05-26 青岛理工大学 Urban neighborhood form optimization method and system based on local microclimate

Also Published As

Publication number Publication date
CN109948651A (en) 2019-06-28

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19915474; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN EP: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 05.10.2021))
122 EP: PCT application non-entry in European phase (Ref document number: 19915474; Country of ref document: EP; Kind code of ref document: A1)