WO2022095584A1 - Image recognition method based on stream convolution - Google Patents

Image recognition method based on stream convolution

Info

Publication number
WO2022095584A1
Authority
WO
WIPO (PCT)
Prior art keywords
convolution
stream
image recognition
stream convolution
method based
Prior art date
Application number
PCT/CN2021/117028
Other languages
English (en)
French (fr)
Inventor
陈英鹏
许野平
井焜
刘辰飞
席道亮
高朋
张朝瑞
Original Assignee
神思电子技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 神思电子技术股份有限公司
Publication of WO2022095584A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Definitions

  • the invention relates to the field of artificial intelligence, in particular to an image recognition method based on stream convolution, which belongs to a feature extraction network of a convolutional neural network.
  • CSPNet: A New Backbone that can Enhance Learning Capability of CNN
  • CSPNet: Cross Stage Partial Network
  • the technical problem to be solved by the present invention is to provide an image recognition method based on stream convolution.
  • the channel path aggregation method is used so that feature information flows well between the sub-group channels, so as to improve the speed of image recognition and reduce the computational cost.
  • the technical solution adopted in the present invention is: an image recognition method based on stream convolution, comprising the following steps:
  • the input feature enters the image recognition network, and the input feature is X (h, w, c) , where h represents the height of the input feature, w represents the width of the input feature, and c represents the number of channels of the input feature;
  • each sub-group performs convolution operation mapping, which is expressed as:
  • F_i denotes an ordinary convolution mapping, i ∈ [1, 2, ..., g];
  • the image recognition network outputs the image recognition result.
  • the pairwise connection mode of the stream convolution is: the channel feature information flow mode of the first stream convolution is top-down, and the channel feature information flow mode of the second stream convolution is bottom-up.
  • the pairwise connection mode of the stream convolution is: the channel feature information flow mode of the first stream convolution is bottom-up, and the channel feature information flow mode of the second stream convolution is top-down.
  • the pairwise connection mode of the stream convolution is as follows: the channel feature information flow modes of the two stream convolutions are both top-down.
  • the pairwise connection mode of the stream convolution is as follows: the channel feature information flow modes of the two stream convolutions are both bottom-up.
  • the present invention discloses an image recognition method based on stream convolution.
  • building on grouped convolution, the stream convolution uses a channel path aggregation operation: it connects the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, strengthening the flow of feature information between sub-groups.
  • Target prediction based on stream convolution not only effectively reduces network parameters and computational costs, but also further improves image recognition accuracy.
  • Figure 1 is a schematic diagram of a stream convolution connection
  • Figure 2 is a schematic diagram of an image recognition network structure.
  • the present embodiment discloses an image recognition method based on stream convolution, comprising the following steps:
  • the input feature enters the image recognition network, and the input feature is X (h, w, c) , where h represents the height of the input feature, w represents the width of the input feature, and c represents the number of channels of the input feature;
  • the image recognition network is the classic residual network ResNet50, and FIG. 2 is a schematic structural diagram of the ResNet50 network.
  • ResNet50: classic residual network
  • each sub-group performs convolution operation mapping, which is expressed as:
  • F_i denotes an ordinary convolution mapping, i ∈ [1, 2, ..., g];
  • the image recognition network outputs the image recognition result.
  • the method described in this embodiment replaces every convolutional layer of ResNet50 with stride 1 with stream convolution; the downsampling layers (convolutional layers with stride 2), the pooling layer, average pooling, the fully connected layer, and the final output layer remain unchanged, so these operations are not described in detail in this embodiment.
  • the pairwise connection mode of the stream convolution can be any one of those shown in Figure 1.
  • in this embodiment there are four pairwise connection modes of stream convolution, as shown in Figure 1:
  • Flow Type A: the channel feature information of the first stream convolution flows top-down, and that of the second stream convolution flows bottom-up.
  • Flow Type B: the channel feature information of the first stream convolution flows bottom-up, and that of the second stream convolution flows top-down.
  • Flow Type C: the channel feature information of both stream convolutions flows top-down.
  • Flow Type D: the channel feature information of both stream convolutions flows bottom-up.
  • the target prediction method described in this embodiment builds on grouped convolution and uses the channel path aggregation operation, that is, it connects the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, strengthening the flow of feature information between sub-groups.
  • Target prediction based on stream convolution not only effectively reduces network parameters and computational costs, but also further improves image recognition accuracy.

Abstract

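An image recognition method based on stream convolution. Building on grouped convolution, the stream convolution uses a channel path aggregation operation, that is, it connects the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups. Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.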

Description

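Image recognition method based on stream convolution
Technical Field
The present invention relates to the field of artificial intelligence, and specifically to an image recognition method based on stream convolution, which belongs to the feature extraction networks of convolutional neural networks.
Background Art
In recent years, convolutional neural networks have achieved major breakthroughs in a wide range of computer vision tasks, and their designs have become increasingly complex. In real application scenarios, however, devices have limited computing resources, so grouped convolution has attracted growing attention. Although grouped convolution effectively reduces the number of network parameters and the computational cost, separating the channel features in this way prevents feature information from flowing effectively between channel groups, which causes a marked drop in network performance.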
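The parameter saving attributed to grouped convolution above can be checked with a short calculation. The sketch below is illustrative only: the function name `conv_params`, the 256-channel layer, and the 3 x 3 kernel are assumptions chosen for the example, not values from the patent, and bias terms are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a k x k convolution (bias omitted) with `groups` channel groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    return groups * (c_in // groups) * (c_out // groups) * k * k


standard = conv_params(256, 256, 3)            # ordinary convolution
grouped = conv_params(256, 256, 3, groups=4)   # grouped convolution with g = 4
print(standard, grouped, standard // grouped)  # 589824 147456 4
```

With g groups the weight count falls by a factor of g, which is the saving the background refers to. The cost is that each output channel only sees the input channels of its own group.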
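The paper "MobileDets: Searching for Object Detection Architectures for Mobile Accelerators" uses depthwise separable convolution to reduce network parameters and computational cost, but the overall performance of that method also drops markedly.
The paper "CSPNet: A New Backbone that can Enhance Learning Capability of CNN" proposes the Cross Stage Partial Network (CSPNet), which approaches the problem that earlier work requires heavy inference computation from the perspective of network architecture. However, that method still designs the network around conventional convolution operations, so it is hard to reduce model parameters and computational cost any further.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an image recognition method based on stream convolution that, while reducing parameters and computational cost, uses channel path aggregation so that feature information flows well between the sub-group channels, thereby improving the speed of image recognition and reducing computational cost.
To solve this technical problem, the technical solution adopted by the present invention is an image recognition method based on stream convolution, comprising the following steps:
S01) An input feature enters the image recognition network; the input feature is X (h, w, c), where h is the height of the input feature, w is its width, and c is its number of channels;
S02) The convolution operations of the convolutional layers with stride 1 in the image recognition network are replaced with stream convolution operations, where the stream convolution operation is:
S21) The convolutional layer that performs the stream convolution operation is split into g sub-groups, and the input feature corresponding to each sub-group is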
Figure PCTCN2021117028-appb-000001
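The width and height of each sub-group are the same as those of the input feature, and the channel counts of the sub-groups relate to the channel count of the input feature as follows:
c_1 + c_2 + ... + c_g = c;
S22) Each sub-group performs a convolution mapping, expressed by the formula: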
Figure PCTCN2021117028-appb-000002
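where F_i denotes an ordinary convolution mapping, i ∈ [1, 2, ..., g];
S23) The final output feature of the stream convolution is the concatenation of the sub-group output features, that is, Y = [Y_1, Y_2, ..., Y_g];
S03) The image recognition network outputs the image recognition result.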
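Steps S21 to S23 can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the patent's implementation: the formula for S22 is an image that is not reproduced in this text, so the code assumes that "connecting" means channel-wise concatenation of the current sub-group input with the previously computed sub-group output. It also assumes equal-width sub-groups, models each F_i as a 1 x 1 convolution (a matrix product over the channel axis), and takes "top-down" to mean visiting sub-groups 1 to g in order. The names `stream_conv` and `make_weights` are hypothetical.

```python
import numpy as np


def stream_conv(x, weights, top_down=True):
    """Sketch of one stream convolution over x of shape (h, w, c)."""
    g = len(weights)
    groups = np.split(x, g, axis=-1)              # S21: g equal-width sub-groups
    order = range(g) if top_down else range(g - 1, -1, -1)
    outputs = [None] * g
    prev = None
    for i in order:
        if prev is None:
            inp = groups[i]                       # first sub-group has no predecessor
        else:
            # Assumed aggregation: concatenate current input with previous output.
            inp = np.concatenate([groups[i], prev], axis=-1)
        outputs[i] = inp @ weights[i]             # S22: F_i modeled as a 1x1 convolution
        prev = outputs[i]
    return np.concatenate(outputs, axis=-1)       # S23: Y = [Y_1, ..., Y_g]


def make_weights(rng, c, g, top_down=True):
    """Random 1x1 kernels; the first visited sub-group sees c/g channels, the rest 2c/g."""
    c_i = c // g
    first = 0 if top_down else g - 1
    return [rng.standard_normal((c_i if i == first else 2 * c_i, c_i)) for i in range(g)]


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))
y = stream_conv(x, make_weights(rng, 16, 4), top_down=True)
print(y.shape)  # (8, 8, 16)
```

The output keeps the input's height, width, and channel count, so under these assumptions the operation is a drop-in replacement for a stride-1 convolution with equal input and output channels.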
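Further, the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows top-down, and that of the second stream convolution flows bottom-up.
Further, the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows bottom-up, and that of the second stream convolution flows top-down.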
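Further, the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows top-down.
Further, the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows bottom-up.
Further,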
Figure PCTCN2021117028-appb-000003
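Beneficial effects of the present invention: the present invention discloses an image recognition method based on stream convolution. Building on grouped convolution, the stream convolution uses a channel path aggregation operation, that is, it connects the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups. Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the stream convolution connection modes;
Figure 2 is a schematic diagram of the image recognition network structure.
Detailed Description
The present invention is further described below with reference to the drawings and a specific embodiment.
Embodiment 1
This embodiment discloses an image recognition method based on stream convolution, comprising the following steps:
S01) An input feature enters the image recognition network; the input feature is X (h, w, c), where h is the height of the input feature, w is its width, and c is its number of channels;
In this embodiment, the image recognition network is the classic residual network ResNet50; Figure 2 is a schematic diagram of the ResNet50 network structure.
S02) The convolution operations of the convolutional layers with stride 1 in the image recognition network are replaced with stream convolution operations, where the stream convolution operation is:
S21) The convolutional layer that performs the stream convolution operation is split into g sub-groups, and the input feature corresponding to each sub-group is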
Figure PCTCN2021117028-appb-000004
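The width and height of each sub-group are the same as those of the input feature, and the channel counts of the sub-groups relate to the channel count of the input feature as follows:
c_1 + c_2 + ... + c_g = c;
Stream convolution can be grouped in several ways, and the sub-groups may have equal or unequal channel counts. For ease of computation, this method gives every sub-group the same number of channels, that is: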
Figure PCTCN2021117028-appb-000005
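The grouping constraint can be illustrated with a small helper. It is a sketch only: `split_channels` is a hypothetical name, and the equal split c_i = c / g is derived from the two statements in the text (all sub-groups equal, channel counts summing to c) rather than read from the formula image, which is not reproduced here.

```python
def split_channels(c: int, g: int, sizes=None):
    """Return sub-group channel counts c_1..c_g whose sum equals c."""
    if sizes is None:
        # Equal split, the option chosen in this embodiment.
        if c % g:
            raise ValueError("an equal split needs c divisible by g")
        sizes = [c // g] * g
    if len(sizes) != g or sum(sizes) != c or min(sizes) <= 0:
        raise ValueError("sizes must be g positive integers summing to c")
    return list(sizes)


print(split_channels(64, 4))                # [16, 16, 16, 16]
print(split_channels(64, 3, [32, 16, 16]))  # [32, 16, 16]
```

The second call shows the unequal grouping that the text also permits.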
S22) Each sub-group performs a convolution mapping, expressed by the formula:
Figure PCTCN2021117028-appb-000006
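where F_i denotes an ordinary convolution mapping, i ∈ [1, 2, ..., g];
S23) The final output feature of the stream convolution is the concatenation of the sub-group output features, that is, Y = [Y_1, Y_2, ..., Y_g];
S03) The image recognition network outputs the image recognition result.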
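For orientation, one possible reading of steps S21 to S23 is written out below. It is an assumption, not the patent's own notation: the formulas for S21 and S22 appear only as images in this text, and the recurrence is inferred from the description of channel path aggregation (the current sub-group input is connected with the previous sub-group output). The top-down direction is shown; the bottom-up direction would run the same recurrence from i = g down to 1.

```latex
% Assumed reading; the patent's own formulas are images not reproduced in this text.
\begin{align}
X &= [X_1, X_2, \dots, X_g], \qquad
  X_i \in \mathbb{R}^{h \times w \times c_i}, \qquad
  \sum_{i=1}^{g} c_i = c \\
Y_1 &= F_1(X_1), \qquad
  Y_i = F_i\bigl([X_i,\, Y_{i-1}]\bigr), \quad i = 2, \dots, g \\
Y &= [Y_1, Y_2, \dots, Y_g]
\end{align}
```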
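The method of this embodiment replaces every convolutional layer of ResNet50 with stride 1 with stream convolution. The downsampling layers (convolutional layers with stride 2), the pooling layer, average pooling, the fully connected layer, and the final output layer all remain unchanged, so this embodiment does not describe these operations in detail. The pairwise connection mode of the stream convolutions may be any one of those shown in Figure 1.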
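The layer-selection rule described here (swap only the stride-1 convolutions, leave everything else alone) can be sketched over a toy layer list. The `Layer` record, the `"stream_conv"` label, and the six-layer list are all hypothetical and stand in for a real network definition; this is not the actual ResNet50 layout.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Layer:
    name: str
    kind: str        # "conv", "pool", "fc", ...
    stride: int = 1


def use_stream_conv(layers):
    """Relabel every stride-1 convolution as a (hypothetical) 'stream_conv' layer."""
    return [
        replace(layer, kind="stream_conv")
        if layer.kind == "conv" and layer.stride == 1
        else layer
        for layer in layers
    ]


# Toy layer list for illustration, not the real ResNet50 definition.
toy = [
    Layer("stem", "conv", stride=2),
    Layer("pool", "pool", stride=2),
    Layer("block1_conv1", "conv"),
    Layer("block1_conv2", "conv"),
    Layer("down", "conv", stride=2),
    Layer("fc", "fc"),
]
converted = use_stream_conv(toy)
print([layer.kind for layer in converted])
# ['conv', 'pool', 'stream_conv', 'stream_conv', 'conv', 'fc']
```

Stride-2 convolutions, pooling, and the fully connected layer pass through unchanged, matching the list of untouched layers in the text.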
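In this embodiment there are four pairwise connection modes of stream convolution, as shown in Figure 1:
Flow Type A: the channel feature information of the first stream convolution flows top-down, and that of the second stream convolution flows bottom-up.
Flow Type B: the channel feature information of the first stream convolution flows bottom-up, and that of the second stream convolution flows top-down.
Flow Type C: the channel feature information of both stream convolutions flows top-down.
Flow Type D: the channel feature information of both stream convolutions flows bottom-up.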
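The four direction pairings named in the claims (top-down then bottom-up, bottom-up then top-down, both top-down, both bottom-up) can be enumerated mechanically. The sketch below assumes that "top-down" means visiting sub-groups 1 to g in order and "bottom-up" the reverse; Figure 1 is not reproduced in this text, so that mapping and the helper name `group_order` are assumptions.

```python
from itertools import product

DIRECTIONS = ("top-down", "bottom-up")


def group_order(g: int, direction: str):
    """Order in which sub-groups 1..g are visited (assumed: top-down = 1 -> g)."""
    if direction not in DIRECTIONS:
        raise ValueError(direction)
    ids = list(range(1, g + 1))
    return ids if direction == "top-down" else ids[::-1]


# Every pairing of directions for two consecutive stream convolutions.
for first, second in product(DIRECTIONS, repeat=2):
    print(first, group_order(4, first), "->", second, group_order(4, second))
```

Pairing opposite directions lets information from the last sub-group of the first stream convolution reach the remaining sub-groups in the second one, which is one plausible motivation for offering mixed pairings.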
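The target prediction method of this embodiment builds on grouped convolution and uses a channel path aggregation operation, that is, it connects the input feature of the current sub-group with the output feature of the previous sub-group to obtain the output feature of the current sub-group, which strengthens the flow of feature information between sub-groups. Target prediction based on stream convolution not only effectively reduces network parameters and computational cost, but also further improves image recognition accuracy.
The above describes only the basic principle and a preferred embodiment of the present invention; improvements or substitutions made by those skilled in the art on the basis of the present invention fall within its scope of protection.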

Claims (6)

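  1. An image recognition method based on stream convolution, characterized by comprising the following steps:
    S01) An input feature enters the image recognition network; the input feature is X (h, w, c), where h is the height of the input feature, w is its width, and c is its number of channels;
    S02) The convolution operations of the convolutional layers with stride 1 in the image recognition network are replaced with stream convolution operations, where the stream convolution operation is:
    S21) The input feature on which the stream convolution operation is performed is split into g sub-groups, and the input feature corresponding to each sub-group is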
    Figure PCTCN2021117028-appb-100001
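    The width and height of each sub-group are the same as those of the input feature, and the channel counts of the sub-groups relate to the channel count of the input feature as follows:
    c_1 + c_2 + ... + c_g = c;
    S22) Each sub-group performs a convolution mapping, expressed by the formula: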
    Figure PCTCN2021117028-appb-100002
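    where F_i denotes an ordinary convolution mapping, i ∈ [1, 2, ..., g];
    S23) The final output feature of the stream convolution is the concatenation of the sub-group output features, that is, Y = [Y_1, Y_2, ..., Y_g];
    S03) The image recognition network outputs the image recognition result.
  2. The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows top-down, and that of the second stream convolution flows bottom-up.
  3. The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of the first stream convolution flows bottom-up, and that of the second stream convolution flows top-down.
  4. The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows top-down.
  5. The image recognition method based on stream convolution according to claim 1, characterized in that the pairwise connection mode of the stream convolutions is: the channel feature information of both stream convolutions flows bottom-up.
  6. The image recognition method based on stream convolution according to claim 1, characterized in that: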
    Figure PCTCN2021117028-appb-100003
PCT/CN2021/117028 2020-11-06 2021-09-07 Image recognition method based on stream convolution WO2022095584A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011235520.2A CN112288028A (zh) 2020-11-06 2020-11-06 Image recognition method based on stream convolution
CN202011235520.2 2020-11-06

Publications (1)

Publication Number Publication Date
WO2022095584A1 (zh) 2022-05-12

Family

ID=74350767

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/117028 WO2022095584A1 (zh) 2020-11-06 2021-09-07 Image recognition method based on stream convolution

Country Status (2)

Country Link
CN (1) CN112288028A (zh)
WO (1) WO2022095584A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288028A (zh) * 2020-11-06 2021-01-29 神思电子技术股份有限公司 Image recognition method based on stream convolution

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3065084A1 (en) * 2015-03-06 2016-09-07 Panasonic Intellectual Property Management Co., Ltd. Image recognition method, image recognition device, and recording medium
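CN108009594A (zh) * 2017-12-25 2018-05-08 北京航空航天大学 Image recognition method based on variable grouped convolution
CN110991418A (zh) * 2019-12-23 2020-04-10 中国科学院自动化研究所 Synthetic aperture radar target image recognition method and system
CN111652236A (zh) * 2020-04-21 2020-09-11 东南大学 Lightweight fine-grained image recognition method with cross-layer feature interaction in weakly supervised scenarios
CN112288028A (zh) * 2020-11-06 2021-01-29 神思电子技术股份有限公司 Image recognition method based on stream convolution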

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
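JP7391883B2 (ja) * 2018-09-13 2023-12-05 インテル コーポレイション Compression-expansion depthwise convolutional neural network for face recognition
CN110782001B (zh) * 2019-09-11 2024-04-09 东南大学 Improved method using shared convolution kernels based on a group convolutional neural network
CN110728354B (zh) * 2019-09-11 2024-04-09 东南大学 Image processing method based on an improved sliding grouped convolutional neural network
CN110647893B (zh) * 2019-09-20 2022-04-05 北京地平线机器人技术研发有限公司 Target object recognition method, apparatus, storage medium and device
CN116416561A (zh) * 2019-11-22 2023-07-11 迪爱斯信息技术股份有限公司 Video image processing method and device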

Also Published As

Publication number Publication date
CN112288028A (zh) 2021-01-29

Similar Documents

Publication Publication Date Title
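WO2022151535A1 (zh) Face feature point detection method based on deep learning
CN109214353B (zh) Training method and device for fast face image detection based on a pruned model
CN107239736A (zh) Face detection method and detection device based on a multi-task cascaded convolutional neural network
CN106709478A (zh) Pedestrian image feature classification method and system
WO2022095584A1 (zh) Image recognition method based on stream convolution
WO2022095583A1 (zh) Object detection method based on stream convolution
CN111402126A (zh) Block-based video super-resolution method and system
Cai et al. Softer pruning, incremental regularization
CN114419413A (zh) Method for constructing a receptive-field-adaptive neural network for substation insulator defect detection
CN112464954A (zh) Lightweight object detection network for embedded devices and training method
Lou et al. AR-C3D: Action recognition accelerator for human-computer interaction on FPGA
CN111931551B (zh) Face detection method based on a lightweight cascaded network
WO2022120988A1 (zh) Stereo matching method based on hybrid 2D convolution and pseudo-3D convolution
CN104348695A (zh) Virtual network mapping method based on an artificial immune system, and system thereof
CN112241959A (zh) Superpixel-based attention mechanism generation semantic segmentation method
CN114937153B (zh) Neural-network-based visual feature processing system and method for weak-texture environments
Chen et al. Reweighted dynamic group convolution
CN113723474A (zh) Cross-channel aggregation similarity network system
CN114202071B (zh) Deep convolutional neural network inference acceleration method based on data flow mode
Zhu et al. Semantic segmentation of urban street scene images based on improved U-Net network
Shuangyan et al. Lighter and Faster Face Mask Detection Method Based on YOLOv5
CN111507984B (zh) Scene segmentation method based on an alternately updated network with multiple receptive fields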
Tu et al. Lightweight object detection algorithm for automatic driving scenarios
Luo et al. DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural Networks for Edge Vision Applications
Cheng et al. Accelerate Multi-view Inference with End-edge Collaborative Computing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21888283

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21888283

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.11.2023)