WO2022095584A1 - Image recognition method based on stream convolution - Google Patents
Image recognition method based on stream convolution
- Publication number
- WO2022095584A1 (application PCT/CN2021/117028)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- convolution
- stream
- image recognition
- stream convolution
- method based
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Definitions
- the invention relates to the field of artificial intelligence, in particular to an image recognition method based on stream convolution, which belongs to a feature extraction network of a convolutional neural network.
- CSPNet A New Backbone that can Enhance Learning Capability of CNN
- CSPNet: Cross Stage Partial Network
- the technical problem to be solved by the present invention is to provide an image recognition method based on stream convolution.
- the channel path aggregation method is used so that the channel feature information of the sub-groups can flow well between them, so as to improve the speed of image recognition and reduce the computational cost.
- the technical solution adopted in the present invention is: an image recognition method based on stream convolution, comprising the following steps:
- the input feature enters the image recognition network, and the input feature is X (h, w, c) , where h represents the height of the input feature, w represents the width of the input feature, and c represents the number of channels of the input feature;
- each sub-group performs convolution operation mapping, which is expressed as:
- $F_i$ represents the ordinary convolution operation mapping, $i \in [1, 2, \dots, g]$;
- the image recognition network outputs the image recognition result.
- the pairwise connection mode of the stream convolution is: the channel feature information flow mode of the first stream convolution is top-down, and the channel feature information flow mode of the second stream convolution is bottom-up.
- the pairwise connection mode of the stream convolution is: the channel feature information flow mode of the first stream convolution is bottom-up, and the channel feature information flow mode of the second stream convolution is top-down.
- the pairwise connection mode of the stream convolution is: the channel feature information flow modes of the two stream convolutions are both top-down.
- the pairwise connection mode of the stream convolution is: the channel feature information flow modes of the two stream convolutions are both bottom-up.
- the present invention discloses an image recognition method based on stream convolution.
- the stream convolution applies a channel path aggregation operation on top of group convolution: it concatenates the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups.
- Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.
- Figure 1 is a schematic diagram of a stream convolution connection
- Figure 2 is a schematic diagram of an image recognition network structure.
- the present embodiment discloses an image recognition method based on stream convolution, comprising the following steps:
- the input feature enters the image recognition network, and the input feature is X (h, w, c) , where h represents the height of the input feature, w represents the width of the input feature, and c represents the number of channels of the input feature;
- the image recognition network is the classic residual network ResNet50, and FIG. 2 is a schematic structural diagram of the ResNet50 network.
- each sub-group performs convolution operation mapping, which is expressed as:
- $F_i$ represents the ordinary convolution operation mapping, $i \in [1, 2, \dots, g]$;
- the image recognition network outputs the image recognition result.
- the method described in this embodiment replaces all convolutional layers of ResNet50 that have stride 1 with stream convolution; the downsampling layers (convolutional layers with stride 2), pooling layer, average pooling, fully connected layer and final output layer remain unchanged, so these operations are not described in detail in this embodiment.
- the pairwise connection mode of the stream convolution can be any one of those shown in Figure 1.
- there are four pairwise connection modes of stream convolution, as shown in Figure 1:
- Flow Type A: the channel feature information flow mode of the first stream convolution is top-down, and that of the second stream convolution is bottom-up.
- Flow Type B: the channel feature information flow mode of the first stream convolution is bottom-up, and that of the second stream convolution is top-down.
- Flow Type C: the channel feature information flow modes of the two stream convolutions are both top-down.
- Flow Type D: the channel feature information flow modes of the two stream convolutions are both bottom-up.
- the target prediction method described in this embodiment uses the channel path aggregation operation on top of group convolution, that is, it concatenates the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups.
- Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.
Abstract
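An image recognition method based on stream convolution. Building on group convolution, the stream convolution uses a channel path aggregation operation: it concatenates the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups. Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.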
Description
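The present invention relates to the field of artificial intelligence, and specifically to an image recognition method based on stream convolution, which belongs to the feature extraction networks of convolutional neural networks.
In recent years, convolutional neural networks have achieved major breakthroughs in a wide range of computer vision tasks, and their designs have become increasingly complex. In real application scenarios, however, devices have limited computing resources, so group convolution has attracted growing attention. Although group convolution effectively reduces the number of network parameters and the computational cost, separating the channel features in this way prevents feature information from flowing effectively between channel groups, which causes a marked drop in network performance.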
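The parameter saving from group convolution is simple arithmetic. As a rough sketch (the helper name `conv_params` and the layer sizes are illustrative assumptions, not values from this application), a k×k convolution split into g groups needs about 1/g of the weights of the ungrouped layer:

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of weights in a k x k convolution (bias terms ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    return groups * (c_in // groups) * (c_out // groups) * k * k


standard = conv_params(256, 256, 3)            # ordinary convolution
grouped = conv_params(256, 256, 3, groups=4)   # group convolution, g = 4
print(standard, grouped, standard // grouped)  # 589824 147456 4
```

The saving has the cost described above: each group sees only its own channels unless some extra path carries information between groups.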
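The paper "MobileDets: Searching for Object Detection Architectures for Mobile Accelerators" uses depthwise separable convolution to reduce network parameters and computational cost, but the overall performance of that method also drops noticeably.
The paper "CSPNet: A New Backbone that can Enhance Learning Capability of CNN" proposes the Cross Stage Partial Network (CSPNet) to alleviate the problem that earlier work requires heavy inference computation from the perspective of network architecture. However, that method still designs the network around conventional convolution operations, which makes it hard to reduce model parameters and computational cost any further.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an image recognition method based on stream convolution which, while reducing parameters and computational cost, uses channel path aggregation so that the channel feature information of the sub-groups can flow well between them, thereby increasing the speed of image recognition and lowering the computational cost.
To solve this technical problem, the present invention adopts the following technical solution: an image recognition method based on stream convolution, comprising the following steps: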
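S01) The input feature enters the image recognition network. The input feature is X(h, w, c), where h is the height of the input feature, w is its width, and c is its number of channels;
S02) The convolution operations of the convolutional layers with stride 1 in the image recognition network are replaced with stream convolution operations. The stream convolution operation is:
$c_1 + c_2 + \dots + c_g = c$;
S22) Each sub-group performs a convolution operation mapping, expressed by the formula:
where $F_i$ denotes the ordinary convolution operation mapping, $i \in [1, 2, \dots, g]$;
S23) The final output feature of the stream convolution is the concatenation of the sub-group output features, i.e. $Y = [Y_1, Y_2, \dots, Y_g]$;
S03) The image recognition network outputs the image recognition result.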
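The formula for S22 is not reproduced in the text above. One reading consistent with the description (the input feature of each sub-group is concatenated with the output feature of the previous sub-group) is sketched below for the top-down direction. The split $X = [X_1, \dots, X_g]$ and the recurrence for $Y_i$ are assumptions made for illustration, not the application's own equations:

```latex
\begin{aligned}
X   &= [X_1, X_2, \dots, X_g], \qquad c_1 + c_2 + \dots + c_g = c, \\
Y_1 &= F_1(X_1), \\
Y_i &= F_i\big([X_i,\; Y_{i-1}]\big), \qquad i = 2, \dots, g, \\
Y   &= [Y_1, Y_2, \dots, Y_g].
\end{aligned}
```

Under this reading, a bottom-up stream convolution would run the same recurrence from $i = g$ down to $i = 1$.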
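Further, the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows top-down, and that of the second stream convolution flows bottom-up.
Further, the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows bottom-up, and that of the second stream convolution flows top-down.
Further, the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows top-down.
Further, the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows bottom-up.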
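Beneficial effects of the present invention: the present invention discloses an image recognition method based on stream convolution. Building on group convolution, the stream convolution uses a channel path aggregation operation: it concatenates the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups. Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.
Figure 1 is a schematic diagram of the stream convolution connection modes;
Figure 2 is a schematic diagram of the image recognition network structure.
The present invention is further described below with reference to the drawings and a specific embodiment.
Embodiment 1
This embodiment discloses an image recognition method based on stream convolution, comprising the following steps: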
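S01) The input feature enters the image recognition network. The input feature is X(h, w, c), where h is the height of the input feature, w is its width, and c is its number of channels;
In this embodiment, the image recognition network is the classic residual network ResNet50, and Figure 2 is a schematic diagram of the ResNet50 network structure.
S02) The convolution operations of the convolutional layers with stride 1 in the image recognition network are replaced with stream convolution operations. The stream convolution operation is:
$c_1 + c_2 + \dots + c_g = c$;
S22) Each sub-group performs a convolution operation mapping, expressed by the formula:
where $F_i$ denotes the ordinary convolution operation mapping, $i \in [1, 2, \dots, g]$;
S23) The final output feature of the stream convolution is the concatenation of the sub-group output features, i.e. $Y = [Y_1, Y_2, \dots, Y_g]$;
S03) The image recognition network outputs the image recognition result.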
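The stream convolution step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the application's implementation: it works on the channel vector of a single pixel, uses a 1×1 convolution (a matrix-vector product) as a stand-in for each mapping $F_i$, and assumes that "top-down" means visiting the sub-groups from first to last. The function names `split_channels`, `pointwise_conv` and `stream_conv` are hypothetical.

```python
def split_channels(x, sizes):
    """Split channel vector x into sub-groups of the given sizes (c_1 + ... + c_g = c)."""
    if sum(sizes) != len(x):
        raise ValueError("sub-group sizes must sum to the channel count")
    groups, start = [], 0
    for size in sizes:
        groups.append(x[start:start + size])
        start += size
    return groups


def pointwise_conv(weight, v):
    """1x1 convolution at one pixel, i.e. a matrix-vector product (assumed stand-in for F_i)."""
    return [sum(w * a for w, a in zip(row, v)) for row in weight]


def stream_conv(x, sizes, weights, top_down=True):
    """Assumed stream convolution: each sub-group also sees the previous sub-group's output."""
    groups = split_channels(x, sizes)
    g = len(groups)
    order = range(g) if top_down else range(g - 1, -1, -1)
    outputs = [None] * g
    previous = []  # the first sub-group in the chain has no predecessor
    for i in order:
        joined = groups[i] + previous  # concatenate along the channel axis
        outputs[i] = pointwise_conv(weights[i], joined)
        previous = outputs[i]
    return [value for out in outputs for value in out]  # Y = [Y_1, ..., Y_g]


# Two sub-groups of one channel each, all weights set to 1 for easy checking.
x = [2.0, 3.0]
print(stream_conv(x, [1, 1], [[[1.0]], [[1.0, 1.0]]], top_down=True))   # [2.0, 5.0]
print(stream_conv(x, [1, 1], [[[1.0, 1.0]], [[1.0]]], top_down=False))  # [5.0, 3.0]
```

In the top-down call the second sub-group's output depends on both channels, while in the bottom-up call it is the first sub-group that receives the extra information. Plain group convolution gives neither sub-group access to the other.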
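The method of this embodiment replaces all convolutional layers of ResNet50 that have stride 1 with stream convolution. The downsampling layers (convolutional layers with stride 2), the pooling layer, average pooling, the fully connected layer and the final output layer all remain unchanged, so this embodiment does not describe those operations in detail. Any one of the pairwise connection modes of stream convolution shown in Figure 1 may be used.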
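The replacement rule (stride-1 convolutions become stream convolutions, everything else stays) can be written as a small selection pass over a layer list. The sketch below is illustrative only: `LayerSpec`, `plan_replacements` and the layer names are hypothetical, and the toy list is not the actual ResNet50 layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    name: str
    kind: str        # "conv", "maxpool", "avgpool" or "fc"
    stride: int = 1


def plan_replacements(layers):
    """Mark stride-1 convolutions for replacement; keep every other layer as it is."""
    plan = {}
    for layer in layers:
        if layer.kind == "conv" and layer.stride == 1:
            plan[layer.name] = "stream_conv"
        else:
            plan[layer.name] = layer.kind
    return plan


# A toy layer list with made-up names; it is not the real ResNet50 layout.
toy_network = [
    LayerSpec("stem_conv", "conv", stride=2),
    LayerSpec("stem_pool", "maxpool", stride=2),
    LayerSpec("block1_conv_a", "conv", stride=1),
    LayerSpec("block1_conv_b", "conv", stride=1),
    LayerSpec("block2_down", "conv", stride=2),
    LayerSpec("head_pool", "avgpool"),
    LayerSpec("head_fc", "fc"),
]
for name, op in plan_replacements(toy_network).items():
    print(f"{name}: {op}")
```

Only `block1_conv_a` and `block1_conv_b` are marked as `stream_conv` here; the stride-2 convolutions, pooling layers and fully connected layer keep their original kind.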
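In this embodiment there are four pairwise connection modes of stream convolution, as shown in Figure 1:
Flow Type A: the channel feature information of the first stream convolution flows top-down, and that of the second stream convolution flows bottom-up.
Flow Type B: the channel feature information of the first stream convolution flows bottom-up, and that of the second stream convolution flows top-down.
Flow Type C: the channel feature information of both stream convolutions flows top-down.
Flow Type D: the channel feature information of both stream convolutions flows bottom-up.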
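Each pairwise connection mode is a pair of directions, one per stream convolution. As a small sketch, the four combinations named in the summary and claims (top-down then bottom-up, bottom-up then top-down, both top-down, both bottom-up) can be tabulated as follows. The labels in `PAIRINGS`, the `Direction` enum and the helper names are hypothetical, and the code assumes that top-down means visiting sub-groups 1 through g in increasing order.

```python
from enum import Enum


class Direction(Enum):
    TOP_DOWN = "top-down"
    BOTTOM_UP = "bottom-up"


# Hypothetical labels for the four (first, second) direction pairings.
PAIRINGS = {
    "down_then_up": (Direction.TOP_DOWN, Direction.BOTTOM_UP),
    "up_then_down": (Direction.BOTTOM_UP, Direction.TOP_DOWN),
    "down_then_down": (Direction.TOP_DOWN, Direction.TOP_DOWN),
    "up_then_up": (Direction.BOTTOM_UP, Direction.BOTTOM_UP),
}


def subgroup_order(g, direction):
    """Order in which g sub-groups are visited (assumes top-down means 1 -> g)."""
    indices = list(range(1, g + 1))
    return indices if direction is Direction.TOP_DOWN else indices[::-1]


def pairing_orders(g, pairing):
    """Visiting orders for the first and second stream convolution of a pairing."""
    first, second = PAIRINGS[pairing]
    return subgroup_order(g, first), subgroup_order(g, second)


print(pairing_orders(4, "down_then_up"))  # ([1, 2, 3, 4], [4, 3, 2, 1])
```

With alternating directions, information that accumulated in the last sub-group of the first stream convolution is the starting point of the second one, so both ends of the channel range receive aggregated features after the pair.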
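Building on group convolution, the target prediction method of this embodiment uses a channel path aggregation operation: it concatenates the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups. Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.
The above describes only the basic principle and preferred embodiments of the present invention; improvements or substitutions made by those skilled in the art on the basis of the present invention fall within its scope of protection.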
Claims (6)
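- An image recognition method based on stream convolution, characterized by comprising the following steps: S01) the input feature enters the image recognition network, the input feature being X(h, w, c), where h is the height of the input feature, w is its width, and c is its number of channels; S02) the convolution operations of the convolutional layers with stride 1 in the image recognition network are replaced with stream convolution operations, the stream convolution operation being: $c_1 + c_2 + \dots + c_g = c$; S22) each sub-group performs a convolution operation mapping, expressed by the formula: where $F_i$ denotes the ordinary convolution operation mapping, $i \in [1, 2, \dots, g]$; S23) the final output feature of the stream convolution is the concatenation of the sub-group output features, i.e. $Y = [Y_1, Y_2, \dots, Y_g]$; S03) the image recognition network outputs the image recognition result.
- The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows top-down, and that of the second stream convolution flows bottom-up.
- The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows bottom-up, and that of the second stream convolution flows top-down.
- The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows top-down.
- The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows bottom-up.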
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011235520.2A CN112288028A (zh) | 2020-11-06 | 2020-11-06 | Image recognition method based on stream convolution |
CN202011235520.2 | 2020-11-06 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022095584A1 (zh) | 2022-05-12 |
Family
ID=74350767
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/117028 WO2022095584A1 (zh) | Image recognition method based on stream convolution | 2020-11-06 | 2021-09-07 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112288028A (zh) |
WO (1) | WO2022095584A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112288028A (zh) * | 2020-11-06 | 2021-01-29 | 神思电子技术股份有限公司 | Image recognition method based on stream convolution |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3065084A1 (en) * | 2015-03-06 | 2016-09-07 | Panasonic Intellectual Property Management Co., Ltd. | Image recognition method, image recognition device, and recording medium |
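CN108009594A (zh) * | 2017-12-25 | 2018-05-08 | 北京航空航天大学 | Image recognition method based on variable group convolution |
CN110991418A (zh) * | 2019-12-23 | 2020-04-10 | 中国科学院自动化研究所 | Synthetic aperture radar target image recognition method and system |
CN111652236A (zh) * | 2020-04-21 | 2020-09-11 | 东南大学 | Lightweight fine-grained image recognition method with cross-layer feature interaction in weakly supervised scenarios |
CN112288028A (zh) * | 2020-11-06 | 2021-01-29 | 神思电子技术股份有限公司 | Image recognition method based on stream convolution |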
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
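JP7391883B2 (ja) * | 2018-09-13 | 2023-12-05 | インテル コーポレイション | Compression-expansion depthwise convolutional neural network for face recognition |
CN110782001B (zh) * | 2019-09-11 | 2024-04-09 | 东南大学 | Improved method using shared convolution kernels based on a group convolutional neural network |
CN110728354B (zh) * | 2019-09-11 | 2024-04-09 | 东南大学 | Image processing method based on an improved sliding group convolutional neural network |
CN110647893B (zh) * | 2019-09-20 | 2022-04-05 | 北京地平线机器人技术研发有限公司 | Target object recognition method, apparatus, storage medium and device |
CN116416561A (zh) * | 2019-11-22 | 2023-07-11 | 迪爱斯信息技术股份有限公司 | Video image processing method and apparatus |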
- 2020-11-06: CN, CN202011235520.2A, patent/CN112288028A/zh, active, Pending
- 2021-09-07: WO, PCT/CN2021/117028, patent/WO2022095584A1/zh, active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN112288028A (zh) | 2021-01-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21888283; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: PCT application non-entry in European phase | Ref document number: 21888283; Country of ref document: EP; Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.11.2023) |