WO2020233010A1 - Image recognition method and apparatus based on segmentable convolutional network, and computer device - Google Patents
Image recognition method and apparatus based on segmentable convolutional network, and computer device
- Publication number
- WO2020233010A1 (application PCT/CN2019/117743)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- convolution
- result
- output matrix
- pooling
- image data
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
Definitions
- This application relates to the field of image recognition technology, and in particular to an image recognition method, device, computer equipment and storage medium based on a separable convolutional network.
- when a standard convolutional network is used, the input data is generally convolved and then input to the pooling layer for pooling. After one or more rounds of convolution and pooling, a dimensionality-reduced pooling result is obtained and used for subsequent calculations. However, a standard convolutional network requires a large amount of calculation and a long training time on the data set, so it can no longer meet the demand for better and faster model training and use.
- the embodiments of the application provide an image recognition method, device, computer equipment, and storage medium based on a separable convolutional network, aiming to solve the problem in the prior art that image recognition with a standard convolutional network requires a large amount of calculation and a long training time on the data set.
- an embodiment of the present application provides an image recognition method based on a separable convolutional network, which includes:
- the pooling result is input to the fully connected layer to obtain the recognition result corresponding to the original image data, and the recognition result is sent to the upload terminal corresponding to the original image data.
- an embodiment of the present application provides an image recognition device based on a separable convolutional network, which includes:
- Picture receiving unit for receiving original image data
- the shallow convolution unit is configured to input the pixel matrix corresponding to the original image data to the first convolution network constructed in the convolution layer for convolution to obtain the first output matrix;
- a deep convolution unit configured to input the first output matrix into a second convolution network constructed in the convolution layer for convolution to obtain a second output matrix
- a pooling unit for inputting the second output matrix to the pooling layer for pooling, and obtaining a pooling result
- the recognition result obtaining unit is configured to input the pooling result to the fully connected layer to obtain the recognition result corresponding to the original image data, and send the recognition result to the uploader corresponding to the original image data.
- an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor; when the processor executes the computer program, it implements the image recognition method based on the separable convolutional network described in the first aspect.
- the embodiments of the present application also provide a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the processor executes the image recognition method based on a separable convolutional network described in the first aspect.
- FIG. 1 is a schematic diagram of an application scenario of an image recognition method based on a segmentable convolutional network provided by an embodiment of the application;
- FIG. 2 is a schematic flowchart of an image recognition method based on a segmentable convolutional network provided by an embodiment of the application;
- FIG. 3 is a schematic diagram of a sub-process of an image recognition method based on a separable convolutional network provided by an embodiment of the application;
- FIG. 4 is a schematic diagram of another sub-process of the image recognition method based on a separable convolutional network provided by an embodiment of the application;
- FIG. 5 is a schematic diagram of another sub-flow of the image recognition method based on a separable convolutional network provided by an embodiment of the application;
- FIG. 6 is a schematic block diagram of an image recognition device based on a separable convolutional network provided by an embodiment of the application;
- FIG. 7 is a schematic block diagram of subunits of an image recognition device based on a separable convolutional network provided by an embodiment of the application;
- FIG. 8 is a schematic block diagram of another subunit of the image recognition device based on a separable convolutional network according to an embodiment of the application;
- FIG. 9 is a schematic block diagram of another sub-unit of the image recognition device based on a separable convolutional network provided by an embodiment of the application;
- FIG. 10 is a schematic block diagram of a computer device provided by an embodiment of the application.
- referring to FIG. 1 and FIG. 2, the image recognition method based on a segmentable convolutional network is applied to a server, and the method is executed by application software installed in the server.
- the method includes steps S110 to S150.
- when a user needs the image recognition result of a target image, the user operates the user terminal (i.e., the upload end) to upload the original image data through the user interaction interface provided by the server.
- the image recognition model in the server recognizes the original image data to obtain the recognition result.
- after the original image data is acquired, it needs to be converted into the corresponding pixel matrix, and subsequent processing is performed on the pixel matrix.
- in the prior art, the pixel matrix of the original image data is directly input to the convolutional layer for convolution, then input to the pooling layer for pooling, and finally the pooling result is input to the fully connected layer to get the recognition result.
- after a single such convolution, the degree of compression may not be sufficient. Therefore, this application adopts a separable convolutional network improved from the standard convolutional network, which is not limited to performing only one convolution.
- step S120 includes:
- S122 Perform normalization processing on each value included in the first convolution result to obtain a first normalization result
- the 3*3 deep convolution kernel is a depthwise convolution kernel (depthwise convolution is a basic building block for constructing a model and can effectively reduce the computational complexity of a deep neural network).
- the process of convolution can be understood as using a filter (convolution kernel) to filter each small area of the image, so as to obtain the feature value of these small areas.
- in this way the shallow convolution is realized, i.e., the convolution along the depth dimension of the pixel matrix.
- for each input channel, one D_k*D_k*1 convolution kernel is used for convolution. A total of M convolution kernels are used and M operations are performed, yielding M feature maps of size D_f*D_f*1 (the first output matrix can be regarded as a feature map). These feature maps are learned from different input channels and are independent of each other.
- step S121 includes:
- when the convolution is performed by the depthwise convolution kernel, for example, if the input picture is D_k*D_k*M (D_k is the picture size, M is the number of input channels), there are M depthwise convolution kernels of D_w*D_w, each of which is convolved with one of the M channels, and the output is D_f*D_f*M. That is, for each input channel, one D_k*D_k*1 convolution kernel is used for convolution, a total of M convolution kernels are used, and M operations are performed to obtain M D_f*D_f*1 feature maps.
- each channel is independent, so the sum subscript does not need M.
- M operations are expressed as a formula.
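The formula itself is not reproduced in this text. As a hedged reconstruction, one common way to write a per-channel (depthwise) convolution is shown below; the symbols F (input), K (kernel), G (output), the position indices x, y and the 1-based index convention are assumptions introduced here, not notation taken from the source.

```latex
G_{x,y,m} \;=\; \sum_{i=1}^{D_k} \sum_{j=1}^{D_k} K_{i,j,m} \, F_{x+i-1,\; y+j-1,\; m},
\qquad m = 1, \dots, M
```

The sum runs only over the kernel positions i and j. The channel index m appears on both sides but is never summed over, which matches the remark that the sum subscript does not need M.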
- the calculation amount of standard convolution is D_k*D_k*M*N*D_f*D_f, where N is the number of output channels. That is, for each of the N outputs, D_f*D_f values must be calculated, and calculating each value requires multiplying the values in the corresponding D_k*D_k sliding window and then adding the results over all M channels.
- for the depthwise convolution, D_f*D_f values must be calculated per channel, each costing D_k*D_k multiplications, and looping over the M channels gives a total of D_k*D_k*M*D_f*D_f.
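The per-channel convolution and its cost can be illustrated with a minimal pure-Python sketch. It assumes stride 1 and no padding, and (as is usual in neural networks) slides the kernel without flipping it; the function names and sample data are hypothetical and are not taken from the application.

```python
def depthwise_conv(image, kernels):
    """Filter each input channel with its own kernel (stride 1, no padding).

    image:   list of M channels, each an H x W list of lists
    kernels: list of M kernels, each a k x k list of lists
    returns: list of M feature maps, each (H-k+1) x (W-k+1)
    """
    assert len(image) == len(kernels), "one kernel per input channel"
    outputs = []
    for channel, kernel in zip(image, kernels):
        k = len(kernel)
        out_h = len(channel) - k + 1
        out_w = len(channel[0]) - k + 1
        fmap = [[0] * out_w for _ in range(out_h)]
        for r in range(out_h):
            for c in range(out_w):
                # Sum over the k x k window only; channels are never mixed.
                fmap[r][c] = sum(
                    kernel[i][j] * channel[r + i][c + j]
                    for i in range(k)
                    for j in range(k)
                )
        outputs.append(fmap)
    return outputs


def depthwise_cost(d_k, m, d_f):
    """Multiplications for the depthwise step: D_k*D_k*M*D_f*D_f."""
    return d_k * d_k * m * d_f * d_f


# Two 4x4 channels (M = 2), each filtered by its own 3x3 kernel.
image = [
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]],
    [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]],
]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]      # sums the whole window
center = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # picks the centre pixel
feature_maps = depthwise_conv(image, [box, center])
print(feature_maps)             # M = 2 maps of size 2x2
print(depthwise_cost(3, 2, 2))  # 3*3*2*2*2 multiplications
```

Each output map depends on exactly one input channel, which is why the M feature maps are independent of each other.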
- step S122 includes:
- normalization processing (Batch Normalization) is performed on the convolution result to counter the change in the data distribution of the intermediate layers during calculation, which prevents the gradient from vanishing or exploding and speeds up training.
- the first average value of all the values in the first convolution result is calculated first, then the first variance of all the values in the first convolution result is calculated, and finally each difference obtained by subtracting the first average value from each value in the first convolution result is divided by the first variance to obtain the first normalized result.
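A minimal sketch of this normalization step follows. It reads the step as: compute the mean and the variance of all values, centre each value on the mean, then divide by the variance. The small constant `eps` and the `divide_by_std` switch are assumptions added here; the switch shows the more common batch-normalization form, which divides by the square root of the variance instead.

```python
import math


def normalize(values, divide_by_std=False, eps=1e-5):
    """Centre values on their mean and rescale them.

    divide_by_std=False: divide by the variance, as the step is described.
    divide_by_std=True:  divide by sqrt(variance + eps), the usual
                         batch-normalization form (shown for comparison).
    eps is an assumed guard against division by zero.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    scale = math.sqrt(variance + eps) if divide_by_std else variance + eps
    return [(v - mean) / scale for v in values]


first_convolution_result = [2.0, 4.0, 6.0, 8.0]   # hypothetical values
as_described = normalize(first_convolution_result)
standard_bn = normalize(first_convolution_result, divide_by_std=True)
print(as_described)
print(standard_bn)
```

Both variants produce zero-mean output; only the square-root variant also gives (approximately) unit variance.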
- step S123 includes:
- the negative values in the first normalized result are set to zero by the first activation function to obtain the first output matrix.
- the first normalized result is activated by the first activation function to obtain the first output matrix, which increases the nonlinear relationship between the layers of the neural network. Without an activation function, there is only a simple linear relationship between layers: each layer is equivalent to a matrix multiplication, which cannot complete the complex tasks required of the neural network.
- the first activation function is the ReLU function (rectified linear unit), whose role is to increase the nonlinear relationship between the layers of the neural network.
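The activation step is small enough to sketch directly. The function name and the sample values below are illustrative only.

```python
def relu(matrix):
    """Set every negative entry to zero and leave the rest unchanged."""
    return [[max(0.0, v) for v in row] for row in matrix]


first_normalized_result = [[-0.6, -0.2], [0.2, 0.6]]   # hypothetical values
first_output_matrix = relu(first_normalized_result)
print(first_output_matrix)
```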
- the width dimension convolution can be performed through the second convolution network constructed in the convolution layer in advance, and this convolution process is regarded as deep convolution.
- step S130 includes:
- S133 Activate the second normalized result through a second activation function to obtain a second output matrix.
- the M feature maps obtained in step S120 are used as the input of M channels, and standard convolution is performed with N convolution kernels of size 1*1*M to obtain an output of size D_f*D_f*N.
- the calculation amount of this step is about 1*1*M*N*D_f*D_f. Relative to standard convolution, the total calculation amount of the two steps is the fraction 1/N + 1/D_k^2.
- the convolution kernel is generally 3*3, so the amount of calculation is reduced by a factor of close to 9 when N is large.
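The ratio can be checked by dividing the two-step cost by the standard cost, using the cost expressions given in the text. The layer sizes below are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def standard_cost(d_k, m, n, d_f):
    """Multiplications for standard convolution: D_k*D_k*M*N*D_f*D_f."""
    return d_k * d_k * m * n * d_f * d_f


def separable_cost(d_k, m, n, d_f):
    """Depthwise step plus 1*1 step."""
    depthwise = d_k * d_k * m * d_f * d_f
    pointwise = 1 * 1 * m * n * d_f * d_f
    return depthwise + pointwise


d_k, m, n, d_f = 3, 32, 64, 56   # hypothetical layer sizes
ratio = Fraction(separable_cost(d_k, m, n, d_f), standard_cost(d_k, m, n, d_f))
print(ratio, float(1 / ratio))   # exact ratio and the resulting speed-up factor
```

For these sizes the ratio equals 1/64 + 1/9, a reduction of roughly 7.9 times. The factor approaches 9 as N grows, because the 1/N term shrinks.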
- the normalization process in step S132 is the same as in step S122, and the activation function in step S133 is the same as in step S123.
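The 1*1 convolution of step S130 mixes the M input channels at each pixel position. A minimal sketch, with hypothetical names and data, assuming the weights are given as N rows of M values (one row per 1*1*M kernel):

```python
def pointwise_conv(feature_maps, weights):
    """1*1 convolution: combine M input channels into N output channels.

    feature_maps: M maps, each H x W
    weights:      N rows of M weights (each row is one 1*1*M kernel)
    returns:      N maps, each H x W
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            [
                # Weighted sum across channels at one pixel position.
                sum(wt * fmap[r][c] for wt, fmap in zip(kernel, feature_maps))
                for c in range(w)
            ]
            for r in range(h)
        ]
        for kernel in weights
    ]


maps = [[[54, 63], [90, 99]], [[1, 0], [0, 1]]]   # M = 2, hypothetical values
weights = [[1, 0], [0, 1], [1, 1]]                # N = 3 kernels
second_convolution_result = pointwise_conv(maps, weights)
print(second_convolution_result)
```

The spatial size is unchanged; only the number of channels goes from M to N.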
- inputting the second output matrix to the pooling layer for pooling is to further sample the second output matrix to reduce dimensionality.
- for example, if the original picture is 20*20 and is down-sampled with a 10*10 sampling window, it is finally down-sampled into a 2*2 feature map.
- pooling is performed because the image is still very large even after convolution (the convolution kernel is relatively small), so down-sampling is used to reduce the data dimension. Although pooling discards a lot of data, the statistical attributes of the features can still describe the image, and the reduced data dimension helps to avoid overfitting.
- pooling is divided into maximum down sampling (Max-Pooling) and average down sampling (Mean-Pooling) according to the down-sampling method. That is, the second output matrix is input to the pooling layer to perform pooling through maximum down sampling or average down sampling to obtain a pooling result.
- Max-Pooling (maximum down sampling): taking the above 20*20 original picture and a 10*10 sampling window as an example, the 20*20 picture is divided into four 10*10 areas (upper left, upper right, lower left, and lower right), and the maximum value of each 10*10 area is taken as the characteristic value of that area.
- Mean-Pooling (average down sampling): the average value of each 10*10 area is taken as the characteristic value of that area.
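The 20*20 example can be reproduced with a short sketch using non-overlapping windows. The function names and the synthetic picture are assumptions made for illustration.

```python
def pool(matrix, window, reducer):
    """Down-sample a matrix with non-overlapping window x window blocks."""
    rows, cols = len(matrix), len(matrix[0])
    result = []
    for r in range(0, rows - window + 1, window):
        out_row = []
        for c in range(0, cols - window + 1, window):
            block = [
                matrix[r + i][c + j]
                for i in range(window)
                for j in range(window)
            ]
            out_row.append(reducer(block))
        result.append(out_row)
    return result


def mean(values):
    return sum(values) / len(values)


# Synthetic 20*20 picture whose pixel at (r, c) has the value r * 20 + c.
picture = [[r * 20 + c for c in range(20)] for r in range(20)]
max_pooled = pool(picture, 10, max)     # Max-Pooling
mean_pooled = pool(picture, 10, mean)   # Mean-Pooling
print(max_pooled)
print(mean_pooled)
```

Both variants return a 2*2 result, one value per 10*10 area, in the order upper left, upper right, lower left, lower right.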
- fully connected layers function as a "classifier" in the entire convolutional neural network. If operations such as the convolutional layer, pooling layer, and activation function layer map the original data to the hidden layer feature space, the fully connected layer functions to map the learned "distributed feature representation" to the sample label space.
- the fully connected layer can be realized by a convolution operation: a fully connected layer whose previous layer is also fully connected can be converted into a convolution with a 1*1 convolution kernel, and a fully connected layer whose previous layer is a convolutional layer can be converted into a global convolution with an h*w convolution kernel, where h and w are the height and width of the previous convolution result.
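The equivalence with a global convolution can be checked on a tiny example: flattening a C*h*w input and applying a fully connected layer gives the same numbers as convolving with kernels that cover the whole h*w extent. All names, sizes, and weights below are hypothetical.

```python
def fully_connected(vector, weights):
    """weights: one row of len(vector) weights per output value."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def global_conv(feature_maps, kernels):
    """Convolve C maps of size h x w with kernels of the same C x h x w size.

    The kernel covers the whole input, so each kernel yields one value.
    """
    return [
        sum(
            kernel[ch][r][c] * fmap[r][c]
            for ch, fmap in enumerate(feature_maps)
            for r in range(len(fmap))
            for c in range(len(fmap[0]))
        )
        for kernel in kernels
    ]


pooled = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]                # C = 2, h = w = 2
flat = [v for fmap in pooled for row in fmap for v in row]   # length 8
fc_weights = [[1, 0, 0, 0, 0, 0, 0, 1], [1, 1, 1, 1, 1, 1, 1, 1]]

# Reshape each fully connected weight row into a C x h x w kernel.
kernels = [
    [[row[ch * 4 + r * 2: ch * 4 + r * 2 + 2] for r in range(2)] for ch in range(2)]
    for row in fc_weights
]

fc_out = fully_connected(flat, fc_weights)
conv_out = global_conv(pooled, kernels)
print(fc_out, conv_out)
```

The two outputs match because both compute the same weighted sum over every input value; only the layout of the weights differs.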
- This method adopts the image recognition of the segmentable convolutional network, which reduces the amount of calculation in the image recognition process.
- An embodiment of the present application also provides an image recognition device based on a separable convolutional network, and the image recognition device based on a separable convolutional network is used to execute any embodiment of the aforementioned image recognition method based on a separable convolutional network.
- FIG. 6 is a schematic block diagram of an image recognition apparatus based on a separable convolutional network provided by an embodiment of the present application.
- the image recognition device 100 based on a separable convolutional network can be configured in a server.
- the image recognition device 100 based on the separable convolutional network includes a picture receiving unit 110, a shallow convolution unit 120, a deep convolution unit 130, a pooling unit 140, and a recognition result obtaining unit 150.
- the picture receiving unit 110 is used to receive original image data.
- the shallow convolution unit 120 is configured to input the pixel matrix corresponding to the original image data to the first convolution network constructed in the convolution layer for convolution to obtain a first output matrix.
- the shallow convolution unit 120 includes:
- the first convolution unit 121 is configured to convolve the pixel matrix with a 3*3 deep convolution kernel to obtain a first convolution result
- the first normalization unit 122 is configured to perform normalization processing on each value included in the first convolution result to obtain a first normalization result
- the first activation unit 123 is configured to activate the first normalized result through a first activation function to obtain a first output matrix.
- the first convolution unit 121 is further configured to:
- the first normalization unit 122 includes:
- the average value obtaining unit 1221 is configured to obtain the first average value corresponding to all the values in the first convolution result
- the variance obtaining unit 1222 is configured to obtain the first variance corresponding to all the values in the first convolution result
- the normalization calculation unit 1223 is configured to divide each difference obtained by subtracting the first average value from each value in the first convolution result by the first variance to obtain the first normalization result.
- the first activation unit 123 is further configured to:
- the negative values in the first normalized result are set to zero by the first activation function to obtain the first output matrix.
- the deep convolution unit 130 is configured to input the first output matrix into a second convolutional network pre-built in the convolution layer for convolution to obtain a second output matrix.
- the deep convolution unit 130 includes:
- the second convolution unit 131 is configured to convolve the first output matrix with a 1*1 convolution kernel to obtain a second convolution result;
- the second normalization unit 132 is configured to normalize each value included in the second convolution result to obtain a second normalization result
- the second activation unit 133 is configured to activate the second normalized result through a second activation function to obtain a second output matrix.
- the pooling unit 140 is configured to input the second output matrix to the pooling layer for pooling, and obtain a pooling result.
- the recognition result obtaining unit 150 is configured to input the pooling result into the fully connected layer to obtain the recognition result corresponding to the original image data, and send the recognition result to the uploader corresponding to the original image data.
- the device adopts image recognition of a segmentable convolutional network, which reduces the amount of calculation in the image recognition process.
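The data flow between units 120 to 150 can be sketched as a small container of callables. This only illustrates how the units chain together; the class name, the field names, and the stand-in lambdas are all assumptions and do not reflect the actual implementation of the device.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ImageRecognitionDevice:
    """Units 120 to 150 modelled as callables (all names are illustrative)."""
    shallow_convolution: Callable[[Any], Any]   # unit 120
    deep_convolution: Callable[[Any], Any]      # unit 130
    pooling: Callable[[Any], Any]               # unit 140
    classify: Callable[[Any], Any]              # unit 150 (fully connected layer)

    def recognize(self, pixel_matrix):
        first_output = self.shallow_convolution(pixel_matrix)
        second_output = self.deep_convolution(first_output)
        pooled = self.pooling(second_output)
        return self.classify(pooled)


# Stand-in units so that the data flow can be traced end to end.
device = ImageRecognitionDevice(
    shallow_convolution=lambda x: [v * 2 for v in x],
    deep_convolution=lambda x: [v + 1 for v in x],
    pooling=max,
    classify=lambda v: "large" if v > 5 else "small",
)
result = device.recognize([1, 2, 3])
print(result)
```

Keeping each unit as a separate callable mirrors the unit structure of the device and lets any stage be swapped without touching the others.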
- the above-mentioned image recognition apparatus based on a separable convolutional network may be implemented in the form of a computer program, and the computer program may be run on a computer device as shown in FIG. 10.
- FIG. 10 is a schematic block diagram of a computer device according to an embodiment of the present application.
- the computer device 500 is a server, and the server may be an independent server or a server cluster composed of multiple servers.
- the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
- the non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032.
- the processor 502 can execute an image recognition method based on a separable convolutional network.
- the processor 502 is used to provide calculation and control capabilities, and support the operation of the entire computer device 500.
- the internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503.
- when executed, the computer program 5032 can cause the processor 502 to execute an image recognition method based on a separable convolutional network.
- the network interface 505 is used for network communication, such as providing data information transmission.
- the structure shown in FIG. 10 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied.
- the specific computer device 500 may include more or fewer components than shown in the figure, or combine certain components, or have a different component arrangement.
- the processor 502 is configured to run a computer program 5032 stored in a memory to implement the image recognition method based on a separable convolutional network in the embodiment of the present application.
- the embodiment of the computer device shown in FIG. 10 does not constitute a limitation on the specific configuration of the computer device.
- the computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
- the computer device may only include a memory and a processor. In such an embodiment, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 10, and will not be repeated here.
- the processor 502 may be a central processing unit (Central Processing Unit, CPU), and the processor 502 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSPs), Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
- a computer-readable storage medium may be a non-volatile computer-readable storage medium.
- the computer-readable storage medium stores a computer program, where the computer program is executed by a processor to implement the image recognition method based on a separable convolutional network in the embodiment of the present application.
- the storage medium is a physical, non-transitory storage medium, such as a U disk, a mobile hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk, an optical disk, or another medium that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
Disclosed are an image recognition method and apparatus based on a segmentable convolutional network, and a computer device and a storage medium. The method comprises: receiving original image data; inputting a pixel matrix corresponding to the original image data into a pre-constructed first convolutional network in a convolutional layer for convolution, so as to obtain a first output matrix; inputting the first output matrix into a pre-constructed second convolutional network in the convolutional layer for convolution, so as to obtain a second output matrix; inputting the second output matrix into a pooling layer for pooling, so as to obtain a pooling result; and inputting the pooling result into a fully connected layer to obtain a recognition result corresponding to the original image data, and sending the recognition result to an upload end corresponding to the original image data.
Description
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 23, 2019, with application number 201910433281.2 and titled "Image recognition method, device and computer equipment based on a separable convolutional network", the entire content of which is incorporated into this application by reference.
This application relates to the field of image recognition technology, and in particular to an image recognition method, device, computer equipment and storage medium based on a separable convolutional network.
At present, when a standard convolutional network is used for image recognition, the input data is generally convolved and then input to the pooling layer for pooling. After one or more rounds of convolution and pooling, a dimensionality-reduced pooling result is obtained and used for subsequent calculations. However, a standard convolutional network requires a large amount of calculation and a long training time on the data set, so it can no longer meet the demand for better and faster model training and use.
Summary of the invention
The embodiments of this application provide an image recognition method, device, computer equipment, and storage medium based on a separable convolutional network, aiming to solve the problem in the prior art that image recognition with a standard convolutional network requires a large amount of calculation and a long training time on the data set.
In the first aspect, an embodiment of the present application provides an image recognition method based on a separable convolutional network, which includes:
receiving original image data;
inputting the pixel matrix corresponding to the original image data to the first convolutional network pre-built in the convolutional layer for convolution to obtain a first output matrix;
inputting the first output matrix to the second convolutional network pre-built in the convolutional layer for convolution to obtain a second output matrix;
inputting the second output matrix to the pooling layer for pooling to obtain a pooling result; and
inputting the pooling result to the fully connected layer to obtain the recognition result corresponding to the original image data, and sending the recognition result to the upload end corresponding to the original image data.
In the second aspect, an embodiment of the present application provides an image recognition device based on a separable convolutional network, which includes:
a picture receiving unit, configured to receive original image data;
a shallow convolution unit, configured to input the pixel matrix corresponding to the original image data to the first convolutional network pre-built in the convolutional layer for convolution to obtain a first output matrix;
a deep convolution unit, configured to input the first output matrix to the second convolutional network pre-built in the convolutional layer for convolution to obtain a second output matrix;
a pooling unit, configured to input the second output matrix to the pooling layer for pooling to obtain a pooling result; and
a recognition result obtaining unit, configured to input the pooling result to the fully connected layer to obtain the recognition result corresponding to the original image data, and to send the recognition result to the upload end corresponding to the original image data.
In the third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor; when the processor executes the computer program, it implements the image recognition method based on a separable convolutional network described in the first aspect.
In the fourth aspect, the embodiments of the present application also provide a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, it causes the processor to execute the image recognition method based on a separable convolutional network described in the first aspect.
In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is a schematic diagram of an application scenario of an image recognition method based on a segmentable convolutional network provided by an embodiment of the application;
FIG. 2 is a schematic flowchart of an image recognition method based on a segmentable convolutional network provided by an embodiment of the application;
FIG. 3 is a schematic diagram of a sub-process of an image recognition method based on a separable convolutional network provided by an embodiment of the application;
FIG. 4 is a schematic diagram of another sub-process of the image recognition method based on a separable convolutional network provided by an embodiment of the application;
FIG. 5 is a schematic diagram of another sub-process of the image recognition method based on a separable convolutional network provided by an embodiment of the application;
FIG. 6 is a schematic block diagram of an image recognition device based on a separable convolutional network provided by an embodiment of the application;
FIG. 7 is a schematic block diagram of subunits of an image recognition device based on a separable convolutional network provided by an embodiment of the application;
FIG. 8 is a schematic block diagram of another subunit of the image recognition device based on a separable convolutional network provided by an embodiment of the application;
FIG. 9 is a schematic block diagram of another subunit of the image recognition device based on a separable convolutional network provided by an embodiment of the application;
FIG. 10 is a schematic block diagram of a computer device provided by an embodiment of the application.
The technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the existence of the described features, wholes, steps, operations, elements and/or components, but do not exclude the existence or addition of one or more other features, wholes, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terms used in the specification of this application are only for the purpose of describing specific embodiments and are not intended to limit the application. As used in the specification of this application and the appended claims, the singular forms "a", "an" and "the" are intended to include plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in the specification and appended claims of this application refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of an application scenario of the image recognition method based on a separable convolutional network provided by an embodiment of this application; FIG. 2 is a schematic flowchart of that method. The method is applied to a server and is executed by application software installed on the server.
As shown in FIG. 2, the method includes steps S110 to S150.
S110. Receive original image data.
In this embodiment, when a user needs the image recognition result for a target image, the user operates a user terminal (that is, the upload terminal) to upload the original image data through the user interaction interface exposed by the server. The image recognition model on the server then recognizes the original image data to obtain the recognition result.
S120. Input the pixel matrix corresponding to the original image data into a first convolutional network pre-built in the convolutional layer for convolution, to obtain a first output matrix.
In this embodiment, after the original image data is acquired, it is converted into a corresponding pixel matrix, and subsequent processing is performed on the pixel matrix. In the prior art, a convolutional neural network inputs the pixel matrix of the original image data directly into the convolutional layer for convolution, then into the pooling layer for pooling, and finally inputs the pooling result into the fully connected layer to obtain the recognition result. However, a single convolution of the pixel matrix may not compress the data sufficiently. This application therefore adopts a separable convolutional network, an improvement on the standard convolutional network, in which convolution is not limited to a single pass.
In an embodiment, as shown in FIG. 3, step S120 includes:
S121. Convolve the pixel matrix with a 3*3 depthwise convolution kernel to obtain a first convolution result;
S122. Normalize each value included in the first convolution result to obtain a first normalized result;
S123. Activate the first normalized result through a first activation function to obtain the first output matrix.
In this embodiment, the 3*3 depthwise convolution kernel performs depthwise convolution, a basic building block for model construction that can effectively reduce the computational complexity of a deep neural network. Convolution can be understood as using a filter (the convolution kernel) to filter each small region of the image and thereby obtain the feature values of those regions. After the first convolution result is normalized and passed through the activation function, a shallow convolution is achieved, that is, a convolution over the depth dimension of the pixel matrix.
Each input channel is convolved with one D_k*D_k*1 convolution kernel. In total, M convolution kernels are used and M operations are performed, yielding M feature maps of size D_f*D_f*1 (the first output matrix can be regarded as a set of feature maps). These feature maps are learned from different input channels and are independent of one another.
In an embodiment, step S121 includes:
Obtain the number of input channels in the pixel matrix, and traverse the pixel matrix for convolution with 3*3 depthwise convolution kernels equal in number to the input channels, to obtain the first convolution result.
In this embodiment, when convolution is performed with depthwise convolution kernels, suppose the input picture is D_f*D_f*M (D_f is the picture size, taken here to equal the output feature-map size, and M is the number of input channels). There are then M depthwise convolution kernels of size D_k*D_k, each convolved with one of the M channels, and the output is D_f*D_f*M. That is, each input channel is convolved with one D_k*D_k*1 convolution kernel; M convolution kernels are used and M operations are performed, yielding M feature maps of size D_f*D_f*1.
In the depthwise convolution computation, unlike standard convolution, the channels are independent of one another, so the summation does not run over the M channels; the M per-channel operations are expressed here as a single formula. The computation cost of standard convolution is D_k*D_k*M*N*D_f*D_f, where N is the number of output channels: D_f*D_f output values must be computed for each output channel, and each value requires multiplying the values in the corresponding sliding window by the kernel weights and then summing over all channels. In this embodiment, D_f*D_f values are computed per channel, each costing D_k*D_k, and this is repeated M times, for a total of D_k*D_k*M*D_f*D_f. Convolving over the depth dimension of the pixel matrix in this way makes the pixel matrix "thinner" and reduces the amount of subsequent computation.
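The depthwise step and the two cost formulas above can be illustrated with a minimal pure-Python sketch. It assumes stride 1 and no padding, neither of which is specified in the text, and the function names `depthwise_conv`, `depthwise_cost`, and `standard_cost` are illustrative only.

```python
def depthwise_conv(image, kernels):
    """Depthwise convolution: one k x k kernel per input channel.

    image:   list of M channels, each an H x W list of lists
    kernels: list of M kernels, each a k x k list of lists
    Assumes stride 1 and no padding (illustrative choices).
    """
    if len(image) != len(kernels):
        raise ValueError("need exactly one kernel per input channel")
    output = []
    for channel, kernel in zip(image, kernels):
        k = len(kernel)
        out_h = len(channel) - k + 1
        out_w = len(channel[0]) - k + 1
        # Each output value is a window sum over ONE channel only;
        # there is no summation across channels.
        feature_map = [
            [
                sum(
                    channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k)
                    for dj in range(k)
                )
                for j in range(out_w)
            ]
            for i in range(out_h)
        ]
        output.append(feature_map)
    return output


def depthwise_cost(d_k, m, d_f):
    """Multiplications for the depthwise step: D_k*D_k*M*D_f*D_f."""
    return d_k * d_k * m * d_f * d_f


def standard_cost(d_k, m, n, d_f):
    """Multiplications for a standard convolution: D_k*D_k*M*N*D_f*D_f."""
    return d_k * d_k * m * n * d_f * d_f


# Two 4x4 channels, one 3x3 all-ones kernel per channel.
image = [[[1] * 4 for _ in range(4)], [[2] * 4 for _ in range(4)]]
kernels = [[[1] * 3 for _ in range(3)] for _ in range(2)]
feature_maps = depthwise_conv(image, kernels)
```

With these inputs each channel yields its own 2*2 feature map, and the depthwise cost is smaller than the standard cost by exactly the factor N.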
In an embodiment, as shown in FIG. 4, step S122 includes:
S1221. Obtain a first average value of all values in the first convolution result;
S1222. Obtain a first variance of all values in the first convolution result;
S1223. Subtract the first variance from each value in the first convolution result, and divide each resulting difference by the first variance, to obtain the first normalized result.
In this embodiment, the convolution result is normalized (batch normalization) to address the shift in the data distribution of intermediate layers during computation, which helps prevent vanishing or exploding gradients and speeds up training. Specifically, the first average value of all values in the first convolution result is calculated first, then the first variance of all values in the first convolution result is calculated, and finally the first variance is subtracted from each value and each resulting difference is divided by the first variance to obtain the first normalized result.
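As a point of comparison, the following is a minimal sketch of normalization in its conventional batch-normalization form. The text describes subtracting and dividing by the variance, whereas the conventional form subtracts the mean and divides by the square root of the variance; the sketch follows the conventional form, which is an assumption about the intended computation rather than a statement of the described method. The small constant `eps` and the function name `normalize` are also assumptions, and the learnable scale and shift of full batch normalization are omitted.

```python
import math


def normalize(values, eps=1e-5):
    """Normalize a 2-D list to roughly zero mean and unit variance.

    Conventional form (assumed here): (x - mean) / sqrt(variance + eps).
    eps guards against division by zero and is an illustrative choice.
    """
    flat = [v for row in values for v in row]
    mean = sum(flat) / len(flat)
    variance = sum((v - mean) ** 2 for v in flat) / len(flat)
    std = math.sqrt(variance + eps)
    return [[(v - mean) / std for v in row] for row in values]


normalized = normalize([[1.0, 3.0], [5.0, 7.0]])
```

After this step the values are centered on zero, so roughly half of them are negative before the activation function is applied.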
In an embodiment, step S123 includes:
Set the negative values in the first normalized result to zero through the first activation function, to obtain the first output matrix.
In this embodiment, activating the first normalized result through the first activation function introduces nonlinearity between the layers of the neural network. Without an activation function, the relationship between layers would be purely linear, each layer amounting to a matrix multiplication, and the network could not accomplish the complex tasks required of it. In a specific implementation, the first activation function is the ReLU function (rectified linear unit). Its expression is f(x)=max(0,x): positive values in the first normalized result are retained and negative values are set to zero, and the result is the first output matrix.
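The activation f(x)=max(0,x) can be sketched in a few lines; the function name `relu` and the sample values are illustrative only.

```python
def relu(matrix):
    """Apply f(x) = max(0, x) element-wise to a 2-D list."""
    return [[max(0.0, v) for v in row] for row in matrix]


activated = relu([[-1.5, 0.0, 2.0], [3.0, -0.25, 0.5]])
```

Negative entries become zero while positive entries pass through unchanged.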
S130. Input the first output matrix into a second convolutional network pre-built in the convolutional layer for convolution, to obtain a second output matrix.
In this embodiment, after the shallow convolution is completed, a convolution over the width dimension can be performed through the second convolutional network pre-built in the convolutional layer. This convolution is regarded as the deep convolution.
In an embodiment, as shown in FIG. 5, step S130 includes:
S131. Convolve the first output matrix with a 1*1 convolution kernel to obtain a second convolution result;
S132. Normalize each value included in the second convolution result to obtain a second normalized result;
S133. Activate the second normalized result through a second activation function to obtain the second output matrix.
In this embodiment, the M feature maps obtained in step S120 serve as an M-channel input, and standard convolution is performed with N convolution kernels of size 1×1×M to obtain an output of D_f*D_f*N. The computation cost of this step is approximately 1*1*M*N*D_f*D_f. Relative to standard convolution, the combined cost of the two steps is reduced to a fraction 1/N + 1/D_k^2 of the original. The convolution kernel is typically 3*3, so the computation is reduced by a factor of about 9 (when N is large). The normalization in step S132 is performed in the same way as in step S122, and the activation in step S133 is performed in the same way as in step S123.
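The 1×1 step and the resulting cost ratio can be checked with a short sketch. The function names `pointwise_conv` and `cost_ratio` and the sample sizes are assumptions made for illustration; exact fractions are used so that the identity 1/N + 1/D_k^2 can be compared without rounding.

```python
from fractions import Fraction


def pointwise_conv(feature_maps, weights):
    """1x1 convolution: each output channel mixes the M input channels.

    feature_maps: list of M channels, each an H x W list of lists
    weights:      list of N kernels, each a list of M scalars (1 x 1 x M)
    """
    m = len(feature_maps)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    output = []
    for kernel in weights:
        if len(kernel) != m:
            raise ValueError("each 1x1 kernel needs one weight per channel")
        output.append(
            [
                [sum(kernel[c] * feature_maps[c][i][j] for c in range(m))
                 for j in range(w)]
                for i in range(h)
            ]
        )
    return output


def cost_ratio(d_k, m, n, d_f):
    """(depthwise + pointwise cost) / standard convolution cost."""
    depthwise = d_k * d_k * m * d_f * d_f
    pointwise = m * n * d_f * d_f
    standard = d_k * d_k * m * n * d_f * d_f
    return Fraction(depthwise + pointwise, standard)


mixed = pointwise_conv([[[1, 2]], [[3, 4]]], [[1, 1], [1, -1]])
ratio = cost_ratio(d_k=3, m=32, n=64, d_f=56)
```

For a 3*3 kernel and N = 64 the ratio is 73/576, roughly one eighth, which is consistent with the approximate ninefold saving stated above.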
S140. Input the second output matrix into the pooling layer for pooling, to obtain a pooling result.
In this embodiment, inputting the second output matrix into the pooling layer further downsamples the second output matrix to reduce its dimensionality.
For example, a 20*20 picture downsampled with a 10*10 sampling window becomes a 2*2 feature map.
Pooling is performed because the image is still large even after convolution (the convolution kernel is relatively small), so downsampling is used to reduce the data dimension. Although pooling discards much of the data, the statistical properties of the features can still describe the image, and the reduced dimensionality helps to avoid overfitting.
In practical applications, pooling is divided by downsampling method into max downsampling (max pooling) and average downsampling (mean pooling). That is, the second output matrix is input into the pooling layer and pooled by max downsampling or average downsampling to obtain the pooling result.
For example, when the 20*20 picture above is downsampled with a 10*10 sampling window, the 20*20 area is divided into four 10*10 regions (upper left, upper right, lower left, and lower right). Taking the maximum value in each 10*10 region as that region's feature value is max downsampling; taking the average value in each 10*10 region as that region's feature value is average downsampling. This processing retains the key features of the image while reducing its dimensionality.
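The 20*20 example can be reproduced with a minimal sketch of non-overlapping pooling. The function name `pool`, the `mode` argument, and the synthetic pixel values are illustrative assumptions.

```python
def pool(matrix, window, mode="max"):
    """Non-overlapping pooling over window x window regions of a 2-D list."""
    h, w = len(matrix), len(matrix[0])
    if h % window or w % window:
        raise ValueError("matrix size must be a multiple of the window size")
    result = []
    for i in range(0, h, window):
        row = []
        for j in range(0, w, window):
            block = [
                matrix[a][b]
                for a in range(i, i + window)
                for b in range(j, j + window)
            ]
            if mode == "max":
                row.append(max(block))               # max downsampling
            elif mode == "mean":
                row.append(sum(block) / len(block))  # average downsampling
            else:
                raise ValueError("mode must be 'max' or 'mean'")
        result.append(row)
    return result


# Synthetic 20x20 picture whose pixel value encodes its position.
picture = [[r * 20 + c for c in range(20)] for r in range(20)]
max_pooled = pool(picture, 10, "max")
mean_pooled = pool(picture, 10, "mean")
```

Both results are 2*2: one value for each of the upper-left, upper-right, lower-left, and lower-right regions.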
S150. Input the pooling result into the fully connected layer to obtain a recognition result corresponding to the original image data, and send the recognition result to the upload terminal corresponding to the original image data.
In this embodiment, the fully connected (FC) layer acts as the "classifier" of the convolutional neural network. If operations such as the convolutional layer, pooling layer, and activation function layer map the original data into a hidden-layer feature space, the fully connected layer maps the learned "distributed feature representation" into the sample label space. In practice, a fully connected layer can be implemented as a convolution: a fully connected layer whose preceding layer is also fully connected can be converted into a convolution with a 1*1 kernel, and a fully connected layer whose preceding layer is a convolutional layer can be converted into a global convolution with an h*w kernel, where h and w are the height and width of the preceding layer's convolution result. After the recognition result is obtained, it is sent to the upload terminal corresponding to the original image data to notify the user.
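The equivalence between a fully connected layer and a global h*w convolution can be seen in a small sketch. The names `fully_connected` and `global_conv`, the weights, and the two-class setup are hypothetical; integer values are used so that the two results can be compared exactly.

```python
def fully_connected(flat_input, weights, biases):
    """Dense layer: one weighted sum over the flattened input per class."""
    return [
        sum(w * x for w, x in zip(row, flat_input)) + b
        for row, b in zip(weights, biases)
    ]


def global_conv(feature_maps, kernels, biases):
    """Global convolution: each kernel covers the whole h x w map, so it
    produces a single score per class instead of a spatial map."""
    scores = []
    for kernel, bias in zip(kernels, biases):
        total = bias
        for c, channel in enumerate(feature_maps):
            for i, row in enumerate(channel):
                for j, value in enumerate(row):
                    total += kernel[c][i][j] * value
        scores.append(total)
    return scores


# One 2x2 pooled channel and two classes.
pooled = [[[1, 2], [3, 4]]]
kernels = [[[[1, 0], [0, 1]]], [[[1, 1], [1, 1]]]]
biases = [0, 1]

conv_scores = global_conv(pooled, kernels, biases)

# The same weights flattened row by row give the dense-layer view.
flat_input = [v for channel in pooled for row in channel for v in row]
flat_weights = [
    [w for channel in k for row in channel for w in row] for k in kernels
]
dense_scores = fully_connected(flat_input, flat_weights, biases)

# Index of the highest score, taken here as the recognized class.
predicted_class = max(range(len(conv_scores)), key=conv_scores.__getitem__)
```

Both views produce the same class scores, which is why the final layer can be expressed either way.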
By using a separable convolutional network for image recognition, the method reduces the amount of computation in the image recognition process.
An embodiment of this application also provides an image recognition device based on a separable convolutional network, which is used to execute any embodiment of the aforementioned image recognition method based on a separable convolutional network. Specifically, please refer to FIG. 6, which is a schematic block diagram of the image recognition device based on a separable convolutional network provided by an embodiment of this application. The image recognition device 100 based on a separable convolutional network can be configured in a server.
As shown in FIG. 6, the image recognition device 100 based on a separable convolutional network includes a picture receiving unit 110, a shallow convolution unit 120, a deep convolution unit 130, a pooling unit 140, and a recognition result obtaining unit 150.
The picture receiving unit 110 is configured to receive original image data.
The shallow convolution unit 120 is configured to input the pixel matrix corresponding to the original image data into the first convolutional network pre-built in the convolutional layer for convolution, to obtain a first output matrix.
In an embodiment, as shown in FIG. 7, the shallow convolution unit 120 includes:
a first convolution unit 121, configured to convolve the pixel matrix with a 3*3 depthwise convolution kernel to obtain a first convolution result;
a first normalization unit 122, configured to normalize each value included in the first convolution result to obtain a first normalized result;
a first activation unit 123, configured to activate the first normalized result through a first activation function to obtain the first output matrix.
In an embodiment, the first convolution unit 121 is further configured to:
obtain the number of input channels in the pixel matrix, and traverse the pixel matrix for convolution with 3*3 depthwise convolution kernels equal in number to the input channels, to obtain the first convolution result.
In an embodiment, as shown in FIG. 8, the first normalization unit 122 includes:
an average value obtaining unit 1221, configured to obtain a first average value of all values in the first convolution result;
a variance obtaining unit 1222, configured to obtain a first variance of all values in the first convolution result;
a normalization calculation unit 1223, configured to subtract the first variance from each value in the first convolution result and divide each resulting difference by the first variance, to obtain the first normalized result.
In an embodiment, the first activation unit 123 is further configured to:
set the negative values in the first normalized result to zero through the first activation function, to obtain the first output matrix.
The deep convolution unit 130 is configured to input the first output matrix into the second convolutional network pre-built in the convolutional layer for convolution, to obtain a second output matrix.
In an embodiment, as shown in FIG. 9, the deep convolution unit 130 includes:
a second convolution unit 131, configured to convolve the first output matrix with a 1*1 convolution kernel to obtain a second convolution result;
a second normalization unit 132, configured to normalize each value included in the second convolution result to obtain a second normalized result;
a second activation unit 133, configured to activate the second normalized result through a second activation function to obtain the second output matrix.
The pooling unit 140 is configured to input the second output matrix into the pooling layer for pooling, to obtain a pooling result.
The recognition result obtaining unit 150 is configured to input the pooling result into the fully connected layer to obtain a recognition result corresponding to the original image data, and to send the recognition result to the upload terminal corresponding to the original image data.
By using a separable convolutional network for image recognition, the device reduces the amount of computation in the image recognition process.
The above image recognition device based on a separable convolutional network may be implemented in the form of a computer program, and the computer program may run on a computer device as shown in FIG. 10.
Please refer to FIG. 10, which is a schematic block diagram of a computer device provided by an embodiment of this application. The computer device 500 is a server, which may be an independent server or a server cluster composed of multiple servers.
Referring to FIG. 10, the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. When executed, the computer program 5032 causes the processor 502 to execute the image recognition method based on a separable convolutional network.
The processor 502 provides computing and control capabilities and supports the operation of the entire computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, it causes the processor 502 to execute the image recognition method based on a separable convolutional network.
The network interface 505 is used for network communication, such as transmitting data. Those skilled in the art will understand that the structure shown in FIG. 10 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device 500 to which the solution is applied. A specific computer device 500 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the image recognition method based on a separable convolutional network in the embodiments of this application.
Those skilled in the art will understand that the embodiment of the computer device shown in FIG. 10 does not limit the specific configuration of the computer device. In other embodiments, the computer device may include more or fewer components than shown, combine certain components, or arrange the components differently. For example, in some embodiments the computer device may include only a memory and a processor; in such embodiments the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 10 and are not repeated here.
It should be understood that, in the embodiments of this application, the processor 502 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
Another embodiment of this application provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium. It stores a computer program which, when executed by a processor, implements the image recognition method based on a separable convolutional network in the embodiments of this application.
The storage medium is a physical, non-transitory storage medium, for example a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disc, or any other physical storage medium that can store program code.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the equipment, devices, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
The above are only specific implementations of this application, and the protection scope of this application is not limited to them. Anyone skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements shall fall within the protection scope of this application. The protection scope of this application shall therefore be subject to the protection scope of the claims.
Claims (20)
- An image recognition method based on a separable convolutional network, comprising: receiving original image data; inputting a pixel matrix corresponding to the original image data into a first convolutional network pre-built in a convolutional layer for convolution, to obtain a first output matrix; inputting the first output matrix into a second convolutional network pre-built in the convolutional layer for convolution, to obtain a second output matrix; inputting the second output matrix into a pooling layer for pooling, to obtain a pooling result; and inputting the pooling result into a fully connected layer to obtain a recognition result corresponding to the original image data, and sending the recognition result to an upload terminal corresponding to the original image data.
- The image recognition method based on a separable convolutional network according to claim 1, wherein the inputting a pixel matrix corresponding to the original image data into a first convolutional network pre-built in a convolutional layer for convolution, to obtain a first output matrix, comprises: convolving the pixel matrix with a 3*3 depthwise convolution kernel to obtain a first convolution result; normalizing each value included in the first convolution result to obtain a first normalized result; and activating the first normalized result through a first activation function to obtain the first output matrix.
- The image recognition method based on a separable convolutional network according to claim 1, wherein the inputting the first output matrix into a second convolutional network pre-built in the convolutional layer for convolution, to obtain a second output matrix, comprises: convolving the first output matrix with a 1*1 convolution kernel to obtain a second convolution result; normalizing each value included in the second convolution result to obtain a second normalized result; and activating the second normalized result through a second activation function to obtain the second output matrix.
- The image recognition method based on a separable convolutional network according to claim 2, wherein the convolving the pixel matrix with a 3*3 depthwise convolution kernel to obtain a first convolution result comprises: obtaining the number of input channels in the pixel matrix, and traversing the pixel matrix for convolution with 3*3 depthwise convolution kernels equal in number to the input channels, to obtain the first convolution result.
- The image recognition method based on a separable convolutional network according to claim 2, wherein the normalizing each value included in the first convolution result to obtain a first normalized result comprises: obtaining a first average value of all values in the first convolution result; obtaining a first variance of all values in the first convolution result; and subtracting the first variance from each value in the first convolution result and dividing each resulting difference by the first variance, to obtain the first normalized result.
- The image recognition method based on a separable convolutional network according to claim 2, wherein the activating the first normalized result through a first activation function to obtain the first output matrix comprises: setting the negative values in the first normalized result to zero through the first activation function, to obtain the first output matrix.
- The image recognition method based on a separable convolutional network according to claim 1, wherein the inputting the second output matrix into a pooling layer for pooling, to obtain a pooling result, comprises: inputting the second output matrix into the pooling layer and pooling it by max downsampling or average downsampling, to obtain the pooling result.
- The image recognition method based on a separable convolutional network according to claim 3, wherein convolving the first output matrix with a 1*1 convolution kernel to obtain the second convolution result comprises: obtaining the number of input channels in the first output matrix, and traversing the first output matrix to perform convolution with as many 1*1 convolution kernels as there are input channels in the first output matrix, to obtain the second convolution result.
- The image recognition method based on a separable convolutional network according to claim 1, wherein inputting the pooling result to the fully connected layer to obtain the recognition result corresponding to the original image data comprises: inputting the pooling result to the fully connected layer for global convolution, to obtain the recognition result corresponding to the original image data.
- An image recognition apparatus based on a separable convolutional network, comprising: an image receiving unit, configured to receive original image data; a shallow convolution unit, configured to input a pixel matrix corresponding to the original image data to a first convolutional network pre-built in a convolutional layer for convolution, to obtain a first output matrix; a deep convolution unit, configured to input the first output matrix to a second convolutional network pre-built in the convolutional layer for convolution, to obtain a second output matrix; a pooling unit, configured to input the second output matrix to a pooling layer for pooling, to obtain a pooling result; and a recognition result obtaining unit, configured to input the pooling result to a fully connected layer to obtain a recognition result corresponding to the original image data, and to send the recognition result to the upload terminal corresponding to the original image data.
- A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor performs the following steps: receiving original image data; inputting a pixel matrix corresponding to the original image data to a first convolutional network pre-built in a convolutional layer for convolution, to obtain a first output matrix; inputting the first output matrix to a second convolutional network pre-built in the convolutional layer for convolution, to obtain a second output matrix; inputting the second output matrix to a pooling layer for pooling, to obtain a pooling result; and inputting the pooling result to a fully connected layer to obtain a recognition result corresponding to the original image data, and sending the recognition result to the upload terminal corresponding to the original image data.
- The computer device according to claim 11, wherein inputting the pixel matrix corresponding to the original image data to the first convolutional network pre-built in the convolutional layer for convolution to obtain the first output matrix comprises: convolving the pixel matrix with a 3*3 depthwise convolution kernel to obtain a first convolution result; normalizing each value included in the first convolution result to obtain a first normalized result; and activating the first normalized result through a first activation function to obtain the first output matrix.
- The computer device according to claim 11, wherein inputting the first output matrix to the second convolutional network pre-built in the convolutional layer for convolution to obtain the second output matrix comprises: convolving the first output matrix with a 1*1 convolution kernel to obtain a second convolution result; normalizing each value included in the second convolution result to obtain a second normalized result; and activating the second normalized result through a second activation function to obtain the second output matrix.
- The computer device according to claim 12, wherein convolving the pixel matrix with a 3*3 depthwise convolution kernel to obtain the first convolution result comprises: obtaining the number of input channels in the pixel matrix, and traversing the pixel matrix to perform convolution with as many 3*3 depthwise convolution kernels as there are input channels, to obtain the first convolution result.
- The computer device according to claim 12, wherein normalizing each value included in the first convolution result to obtain the first normalized result comprises: obtaining a first average value of all values in the first convolution result; obtaining a first variance of all values in the first convolution result; and subtracting the first variance from each value in the first convolution result and dividing each resulting difference by the first variance, to obtain the first normalized result.
- The computer device according to claim 12, wherein activating the first normalized result through the first activation function to obtain the first output matrix comprises: setting the negative values in the first normalized result to zero by means of the first activation function, to obtain the first output matrix.
- The computer device according to claim 11, wherein inputting the second output matrix to the pooling layer for pooling to obtain the pooling result comprises: inputting the second output matrix to the pooling layer and pooling it by max downsampling or average downsampling, to obtain the pooling result.
- The computer device according to claim 13, wherein convolving the first output matrix with a 1*1 convolution kernel to obtain the second convolution result comprises: obtaining the number of input channels in the first output matrix, and traversing the first output matrix to perform convolution with as many 1*1 convolution kernels as there are input channels in the first output matrix, to obtain the second convolution result.
- The computer device according to claim 11, wherein inputting the pooling result to the fully connected layer to obtain the recognition result corresponding to the original image data comprises: inputting the pooling result to the fully connected layer for global convolution, to obtain the recognition result corresponding to the original image data.
- A computer-readable storage medium storing a computer program that, when executed by a processor, causes the processor to perform the following operations: receiving original image data; inputting a pixel matrix corresponding to the original image data to a first convolutional network pre-built in a convolutional layer for convolution, to obtain a first output matrix; inputting the first output matrix to a second convolutional network pre-built in the convolutional layer for convolution, to obtain a second output matrix; inputting the second output matrix to a pooling layer for pooling, to obtain a pooling result; and inputting the pooling result to a fully connected layer to obtain a recognition result corresponding to the original image data, and sending the recognition result to the upload terminal corresponding to the original image data.
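
The processing chain recited in the claims above (3*3 depthwise convolution, normalization, activation, 1*1 convolution, normalization, activation, pooling, fully connected layer) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the applicant's implementation: every function name, the toy 3-channel 6*6 input, the weights, stride 1 without padding, the 2*2 max-pooling window, and the two-class output are hypothetical. The normalization step also substitutes the conventional form, (value - mean) / sqrt(variance + eps), whereas the claim text as worded subtracts the variance and divides by the variance.

```python
# Illustrative sketch; all names, shapes and weights are assumptions.

def depthwise_conv3x3(x, kernels):
    """Convolve each input channel with its own 3*3 kernel (stride 1, no padding)."""
    out = []
    for channel, k in zip(x, kernels):
        h, w = len(channel), len(channel[0])
        out.append([
            [sum(channel[i + a][j + b] * k[a][b] for a in range(3) for b in range(3))
             for j in range(w - 2)]
            for i in range(h - 2)
        ])
    return out


def pointwise_conv1x1(x, kernels):
    """Mix channels at every pixel; each 1*1 kernel holds one weight per input channel."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wt * channel[i][j] for wt, channel in zip(k, x)) for j in range(w)]
         for i in range(h)]
        for k in kernels
    ]


def normalize(x, eps=1e-5):
    """Assumed conventional form: (value - mean) / sqrt(variance + eps) over all values."""
    values = [v for channel in x for row in channel for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = (variance + eps) ** 0.5
    return [[[(v - mean) / std for v in row] for row in channel] for channel in x]


def relu(x):
    """Activation that sets negative values to zero."""
    return [[[max(0.0, v) for v in row] for row in channel] for channel in x]


def max_pool2x2(x):
    """Maximum downsampling over non-overlapping 2*2 windows."""
    return [
        [[max(c[i][j], c[i][j + 1], c[i + 1][j], c[i + 1][j + 1])
          for j in range(0, len(c[0]) - 1, 2)]
         for i in range(0, len(c) - 1, 2)]
        for c in x
    ]


def fully_connected(x, weights, biases):
    """Flatten the pooling result and compute one score per class."""
    flat = [v for channel in x for row in channel for v in row]
    return [sum(w * v for w, v in zip(ws, flat)) + b for ws, b in zip(weights, biases)]


def recognize(pixels, dw_kernels, pw_kernels, fc_weights, fc_biases):
    """Run the whole chain and return the class scores."""
    first_output = relu(normalize(depthwise_conv3x3(pixels, dw_kernels)))
    second_output = relu(normalize(pointwise_conv1x1(first_output, pw_kernels)))
    pooled = max_pool2x2(second_output)
    return fully_connected(pooled, fc_weights, fc_biases)


# Toy data: a 3-channel 6*6 "image" and hand-picked weights (all hypothetical).
pixels = [[[float(((c + 1) * (i * 6 + j)) % 7) for j in range(6)]
           for i in range(6)] for c in range(3)]
dw_kernels = [[[1.0 / 9.0] * 3 for _ in range(3)] for _ in range(3)]  # one per channel
pw_kernels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]      # three 1*1 kernels
fc_weights = [[0.1] * 12, [-0.1] * 12]                                # 3*2*2 = 12 inputs
fc_biases = [0.0, 0.5]

scores = recognize(pixels, dw_kernels, pw_kernels, fc_weights, fc_biases)
predicted_class = scores.index(max(scores))
```

With these assumptions, the depthwise stage keeps the channel count (one 3*3 kernel per channel) and only the 1*1 stage mixes channels, which is where a separable network saves multiplications relative to a standard 3*3 convolution across all channels.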
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910433281.2A CN110298346A (en) | 2019-05-23 | 2019-05-23 | Image-recognizing method, device and computer equipment based on divisible convolutional network |
CN201910433281.2 | 2019-05-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020233010A1 (en) | 2020-11-26 |
Family
ID=68027095
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/117743 WO2020233010A1 (en) | 2019-05-23 | 2019-11-13 | Image recognition method and apparatus based on segmentable convolutional network, and computer device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110298346A (en) |
WO (1) | WO2020233010A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298346A (en) * | 2019-05-23 | 2019-10-01 | 平安科技(深圳)有限公司 | Image-recognizing method, device and computer equipment based on divisible convolutional network |
CN115049831A (en) * | 2019-09-20 | 2022-09-13 | 成都芯云微电子有限公司 | Bottleneck edge shrinkage model, neuron network and construction method thereof |
CN111008924B (en) * | 2019-12-02 | 2023-09-12 | 西安交通大学深圳研究院 | Image processing method and device, electronic equipment and storage medium |
CN111833360B (en) * | 2020-07-14 | 2024-03-26 | 腾讯科技(深圳)有限公司 | Image processing method, device, equipment and computer readable storage medium |
CN117492899B (en) * | 2024-01-02 | 2024-04-09 | 中移(苏州)软件技术有限公司 | Instant transmission and display method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106446937A (en) * | 2016-09-08 | 2017-02-22 | 天津大学 | Multi-convolution identifying system for AER image sensor |
CN106599900A (en) * | 2015-10-20 | 2017-04-26 | 华中科技大学 | Method and device for recognizing character string in image |
US9984325B1 (en) * | 2017-10-04 | 2018-05-29 | StradVision, Inc. | Learning method and learning device for improving performance of CNN by using feature upsampling networks, and testing method and testing device using the same |
CN109711422A (en) * | 2017-10-26 | 2019-05-03 | 北京邮电大学 | Image real time transfer, the method for building up of model, device, computer equipment and storage medium |
CN110298346A (en) * | 2019-05-23 | 2019-10-01 | 平安科技(深圳)有限公司 | Image-recognizing method, device and computer equipment based on divisible convolutional network |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108256544B (en) * | 2016-12-29 | 2019-07-23 | 杭州光启人工智能研究院 | Picture classification method and device, robot |
CN107909016B (en) * | 2017-11-03 | 2020-09-01 | 车智互联(北京)科技有限公司 | Convolutional neural network generation method and vehicle system identification method |
CN109033940B (en) * | 2018-06-04 | 2019-07-05 | 上海依图网络科技有限公司 | A kind of image-recognizing method, calculates equipment and storage medium at device |
2019
- 2019-05-23: CN application CN201910433281.2A, patent CN110298346A (en), status: active, Pending
- 2019-11-13: WO application PCT/CN2019/117743, patent WO2020233010A1 (en), status: active, Application Filing
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114580487A (en) * | 2020-11-30 | 2022-06-03 | 深圳市瑞图生物技术有限公司 | Chromosome recognition method, device, equipment and storage medium based on deep learning |
CN112686936B (en) * | 2020-12-18 | 2023-08-04 | 北京百度网讯科技有限公司 | Image depth completion method, apparatus, computer device, medium, and program product |
CN112819006A (en) * | 2020-12-31 | 2021-05-18 | 北京声智科技有限公司 | Image processing method and device and electronic equipment |
CN112819199A (en) * | 2020-12-31 | 2021-05-18 | 上海眼控科技股份有限公司 | Precipitation prediction method, device, equipment and storage medium |
CN112819006B (en) * | 2020-12-31 | 2023-12-22 | 北京声智科技有限公司 | Image processing method and device and electronic equipment |
CN113344092A (en) * | 2021-06-18 | 2021-09-03 | 中科迈航信息技术有限公司 | AI image recognition method and device |
CN113344092B (en) * | 2021-06-18 | 2022-10-11 | 中科迈航信息技术有限公司 | AI image recognition method and terminal device |
CN113591987A (en) * | 2021-07-30 | 2021-11-02 | 金地(集团)股份有限公司 | Image recognition method, image recognition device, electronic equipment and medium |
CN113591987B (en) * | 2021-07-30 | 2023-12-12 | 金地(集团)股份有限公司 | Image recognition method, device, electronic equipment and medium |
CN113989940A (en) * | 2021-11-17 | 2022-01-28 | 中国科学技术大学 | Method, system, equipment and storage medium for recognizing actions in video data |
CN113989940B (en) * | 2021-11-17 | 2024-03-29 | 中国科学技术大学 | Method, system, device and storage medium for identifying actions in video data |
CN114387504A (en) * | 2021-12-08 | 2022-04-22 | 深圳供电局有限公司 | Distribution network cable identification method and device based on semantic segmentation |
CN114170582A (en) * | 2021-12-10 | 2022-03-11 | 智道网联科技(北京)有限公司 | Guideboard angular point identification method, device, equipment and storage medium |
CN115151950A (en) * | 2022-06-02 | 2022-10-04 | 深圳市正浩创新科技股份有限公司 | Image recognition method, self-moving device and storage medium |
CN114758304B (en) * | 2022-06-13 | 2022-09-02 | 江苏中腾石英材料科技股份有限公司 | High-purity rounded quartz powder sieving equipment and sieving control method thereof |
CN114758304A (en) * | 2022-06-13 | 2022-07-15 | 江苏中腾石英材料科技股份有限公司 | High-purity rounded quartz powder sieving equipment and sieving control method thereof |
CN115984105A (en) * | 2022-12-07 | 2023-04-18 | 深圳大学 | Method and device for optimizing hole convolution, computer equipment and storage medium |
CN115987511A (en) * | 2023-03-07 | 2023-04-18 | 北京数牍科技有限公司 | Image reasoning method and device, electronic equipment and computer readable storage medium |
CN116433661A (en) * | 2023-06-12 | 2023-07-14 | 锋睿领创(珠海)科技有限公司 | Method, device, equipment and medium for detecting semiconductor wafer by multitasking |
CN116433661B (en) * | 2023-06-12 | 2023-08-18 | 锋睿领创(珠海)科技有限公司 | Method, device, equipment and medium for detecting semiconductor wafer by multitasking |
Also Published As
Publication number | Publication date |
---|---|
CN110298346A (en) | 2019-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020233010A1 (en) | Image recognition method and apparatus based on segmentable convolutional network, and computer device | |
WO2020228522A1 (en) | Target tracking method and apparatus, storage medium and electronic device | |
US20230401446A1 (en) | Convolutional neural network pruning processing method, data processing method, and device | |
US11443438B2 (en) | Network module and distribution method and apparatus, electronic device, and storage medium | |
WO2022105125A1 (en) | Image segmentation method and apparatus, computer device, and storage medium | |
CN111950723A (en) | Neural network model training method, image processing method, device and terminal equipment | |
WO2020228181A1 (en) | Palm image cropping method and apparatus, computer device and storage medium | |
CN109766925B (en) | Feature fusion method and device, electronic equipment and storage medium | |
CN107909016A (en) | A kind of convolutional neural networks generation method and the recognition methods of car system | |
CN113379627A (en) | Training method of image enhancement model and method for enhancing image | |
CN110211017B (en) | Image processing method and device and electronic equipment | |
CN112668588A (en) | Parking space information generation method, device, equipment and computer readable medium | |
US20240135698A1 (en) | Image classification method, model training method, device, storage medium, and computer program | |
CN110489955B (en) | Image processing, device, computing device and medium applied to electronic equipment | |
TW202240535A (en) | Method and system for image processing | |
CN113421267B (en) | Point cloud semantic and instance joint segmentation method and system based on improved PointConv | |
US10789678B2 (en) | Superpixel sampling networks | |
CN112967299B (en) | Image cropping method and device, electronic equipment and computer readable medium | |
WO2022116492A1 (en) | Image template selection method and apparatus, device and storage medium | |
CN113297973A (en) | Key point detection method, device, equipment and computer readable medium | |
JP7150896B2 (en) | Face recognition method and device, electronic device, and storage medium | |
CN113327194A (en) | Image style migration method, device, equipment and storage medium | |
CN112036316A (en) | Finger vein identification method and device, electronic equipment and readable storage medium | |
CN115830715A (en) | Unmanned vehicle control method, device and equipment based on gesture recognition | |
CN115546554A (en) | Sensitive image identification method, device, equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19929524; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 19929524; Country of ref document: EP; Kind code of ref document: A1 |