CN116403064A - Picture processing method, model, basic block structure, device and medium - Google Patents

Picture processing method, model, basic block structure, device and medium

Info

Publication number
CN116403064A
Authority
CN
China
Prior art keywords
feature
channel
branch
features
weight value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310668217.9A
Other languages
Chinese (zh)
Other versions
CN116403064B (en)
Inventor
王立
范宝余
郭振华
李仁刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310668217.9A
Publication of CN116403064A
Application granted
Publication of CN116403064B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/771 Feature selection, e.g. selecting representative features from a multi-dimensional feature space
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract


The invention relates to the field of computer vision and discloses a picture processing method, model, basic block structure, device and medium. The method includes: obtaining n branch features extracted from a target image under n branches, where n is a positive integer; calculating a branch feature channel weight value vector corresponding to each branch, where each branch feature channel weight value vector characterizes the channel importance of each channel of the corresponding branch feature; using the branch feature channel weight value vectors to perform channel-level feature weighting on the corresponding branch features, obtaining n channel-weighted branch features; and fusing the n channel-weighted branch features to obtain the output feature of the target image. Based on the technical solution provided by the invention, the expressive power of channels in the network that can improve performance is strengthened while the expressive power of channels that have little influence on the final result is suppressed, thereby improving the image processing effect.


Description

Image processing method, model, basic block structure, device and medium

Technical Field

The invention relates to the field of computer vision, and in particular to an image processing method, model, basic block structure, device and medium.

Background Art

In recent years, deep learning has been widely applied to computer vision tasks such as image classification, image segmentation and object detection.

When a deep learning model is used to learn from images, the network is often deepened or widened to strengthen the model's learning capacity, but this inevitably introduces an excessive number of image features.

Faced with such a large number of image features, a solution is urgently needed for selecting the discriminative ones.

Summary of the Invention

In view of this, the present invention provides an image processing method, model, basic block structure, device and medium, and designs an attention mechanism that improves image processing performance by weighting the channels of image features.

In a first aspect, the present invention provides an image processing method, the method comprising:

obtaining n branch features extracted from a target image under n branches, where n is a positive integer;

calculating a branch feature channel weight value vector corresponding to each branch, where each branch feature channel weight value vector characterizes the channel importance of each channel of the corresponding branch feature;

using the branch feature channel weight value vectors to perform channel-level feature weighting on the corresponding branch features, obtaining n channel-weighted branch features;

fusing the n channel-weighted branch features to obtain the output feature of the target image.

In a second aspect, the present invention provides an attention model, the attention model comprising:

an input module, configured to obtain n branch features extracted from a target image under n branches, where n is a positive integer;

a channel weight value calculation module, configured to calculate a branch feature channel weight value vector corresponding to each branch, where each branch feature channel weight value vector characterizes the channel importance of each channel of the corresponding branch feature;

a feature weighting module, configured to use the branch feature channel weight value vectors to perform channel-level feature weighting on the corresponding branch features, obtaining n channel-weighted branch features;

an output module, configured to fuse the n channel-weighted branch features to obtain the output feature of the target image.

In a third aspect, the present invention provides a first basic block structure, the first basic block structure comprising the attention model described in the above aspect and an addition module;

the attention model is configured to output the output feature of a target image given the branch features extracted from the target image under n branches, where n is a positive integer;

the addition module is configured to add the output feature of the target image and the target image to obtain the output result of the first basic block structure for the target image.

In a fourth aspect, the present invention provides a second basic block structure, the second basic block structure comprising the attention model described in the above aspect and an addition module;

the attention model is configured to output the output feature of a target image given the branch features extracted from the target image under n branches, where n is a positive integer;

the addition module is configured to add the output feature of the target image and the batch-normalized result of the target image to obtain the output result of the second basic block structure for the target image.
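
For illustration only, the two basic block wirings can be sketched as PyTorch modules. The sketch assumes an attention module that maps a list of branch features to a single tensor with the same shape as the block input; the class names FirstBasicBlock, SecondBasicBlock and SumFuse are illustrative and not taken from the patent.

```python
import torch
import torch.nn as nn

class FirstBasicBlock(nn.Module):
    """Residual wiring: output = attention(branch features of x) + x."""
    def __init__(self, branches: nn.ModuleList, attention: nn.Module):
        super().__init__()
        self.branches = branches      # n branch extractors, outputs shaped like x
        self.attention = attention    # fuses the n branch features into one tensor

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        return self.attention(feats) + x

class SecondBasicBlock(nn.Module):
    """Residual wiring with batch-normalized skip: output = attention(...) + BN(x)."""
    def __init__(self, branches: nn.ModuleList, attention: nn.Module, channels: int):
        super().__init__()
        self.branches = branches
        self.attention = attention
        self.bn = nn.BatchNorm2d(channels)   # batch normalization on the skip path

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        return self.attention(feats) + self.bn(x)

class SumFuse(nn.Module):
    """Stand-in for the attention model: simply sums the branch features."""
    def forward(self, feats):
        return torch.stack(feats, dim=0).sum(dim=0)

branches = nn.ModuleList([nn.Conv2d(64, 64, 3, padding=1) for _ in range(2)])
block = FirstBasicBlock(branches, SumFuse())
y = block(torch.randn(1, 64, 32, 32))    # same shape as the input
```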

In a fifth aspect, the present invention provides a computer device, comprising a memory and a processor communicatively connected to each other, where the memory stores computer instructions, and the processor executes the computer instructions so as to carry out the image processing method of the above first aspect or any corresponding implementation thereof.

In a sixth aspect, the present invention provides a computer-readable storage medium storing computer instructions, where the computer instructions are used to cause a computer to carry out the image processing method of the above first aspect or any corresponding implementation thereof.

The technical solutions provided by the embodiments of the present invention can have at least the following beneficial effects:

When n branch features are extracted from a target image under n branches, a branch feature channel weight value vector is calculated for each branch; this vector characterizes the channel importance of each channel of the corresponding branch feature. Using the branch feature channel weight value vectors, channel-level feature weighting is applied to the branch features, giving n channel-weighted branch features, which are added to obtain the output feature of the target image. The attention mechanism thus weights the channels of the different branch features, strengthening the expressive power of channels that improve performance while suppressing the expressive power of channels that have little influence on the final result, and ultimately improving the quality of the output feature obtained from the channel-weighted branch features. When this output feature is applied to computer vision tasks such as classification, the processing effect of the task is guaranteed.

Brief Description of the Drawings

In order to explain the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of another image processing method according to an embodiment of the present invention;

Fig. 3 is a schematic flowchart of another image processing method according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an attention model according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of an image processing procedure according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a first basic block structure according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of a second basic block structure according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of a classification network according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of the hardware structure of a computer device according to an embodiment of the present invention.

Detailed Description of the Embodiments

In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the present invention.

First, the concept of the attention mechanism involved in the present invention is briefly introduced:

Deep learning has achieved great success in solving computer vision problems such as image classification, image segmentation and object detection, and many excellent deep learning models have emerged in recent years.

In recent years, the attention model (a type of deep learning model) has been widely used in natural language processing, image recognition, speech recognition and other deep learning tasks, and is one of the core techniques in deep learning most worth studying in depth. Deep learning models based on the attention mechanism have therefore received extensive attention in recent years and have become an important research direction.

In human vision, the attention mechanism manifests itself as the human visual attention mechanism, a brain signal processing mechanism unique to human vision. Human vision quickly scans the global image to find the target region that deserves focus, commonly called the focus of attention, and then devotes more attention resources to this region to obtain more detailed information about the target while suppressing other useless information. This is how humans use limited attention resources to quickly filter high-value information out of a large amount of information; it is a survival mechanism formed over long-term evolution, and it greatly improves the efficiency and accuracy of visual information processing.

For deep learning, designing an attention mechanism means, in plain terms, concentrating attention on the important points and ignoring other, unimportant factors. How importance is judged depends on the network structure and the application scenario.

With the continuous development of deep learning technology, deep learning models emerge one after another, but in order to further improve accuracy, researchers often design towards deeper or wider networks. Admittedly, as the network becomes deeper or wider, the learning capacity of the model keeps increasing, but the amount of computation and the number of parameters also grow rapidly, which is unfavorable for deployment in practical applications. At the same time, as the number of layers grows, a large amount of noise (that is, many useless features) is inevitably introduced; too many features usually do not improve the capability of the network model but instead confuse the classifier, thereby reducing the recognition ability of the network.

Therefore, good discrimination and the full capability of the model can only be achieved by selecting a limited number of discriminative features. The attention mechanism shows a great advantage in feature selection and has therefore been widely adopted.

Based on this, an embodiment of the present invention provides an image processing method that aims to design a better attention model structure, so as to strengthen the expressive power of channels in the network that can improve performance while suppressing the expressive power of channels that have little influence on the final result, thereby further improving the image processing effect.

According to an embodiment of the present invention, an image processing method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one shown here.

This embodiment provides an image processing method that can be used in a computer device. Fig. 1 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in Fig. 1, the flow includes the following steps:

Step S101: obtain n branch features extracted from a target image under n branches, where n is a positive integer.

The target image is an image whose features need to be discriminated so that output features can be selected. The embodiments of the present invention do not limit the specific content of the target image or the specific application scenario of the finally selected output features. For example, the output features may be applied to a person re-identification task, where the expression of pedestrian features and the screening of discriminative features directly determine whether the target pedestrian can be correctly identified and are therefore an important part of the task.

In this embodiment, feature extraction is performed on the target image under each of the n branches, giving n branch features under the n branches, where n is a positive integer.

Each branch is an independent computation module containing multiple convolutional layers; the branches usually differ in convolution kernel size, number of convolutional layers and so on, so that the branch features extracted under different branches can have different receptive fields.

Here n can further be restricted to a positive integer greater than 1, i.e. branch features are extracted under multiple branches and the attention-based image processing method proposed in the embodiments of the present invention is applied to a multi-branch scenario. It can be understood that the method can also be applied to a single-branch scenario, but the performance gain of the proposed technical solution is more significant in the multi-branch case. The following embodiments are mainly described for the multi-branch scenario.
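
The patent does not fix the branch architecture. As a minimal PyTorch sketch, two convolutional branches with different kernel sizes yield branch features with different receptive fields; the ConvBranch class and the 3/5 kernel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConvBranch(nn.Module):
    """One branch: a small stack of convolutions with a given kernel size."""
    def __init__(self, channels: int, kernel_size: int):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

# Two branches with different kernel sizes, hence different receptive fields.
channels = 64
branches = nn.ModuleList([ConvBranch(channels, 3), ConvBranch(channels, 5)])

x = torch.randn(1, channels, 32, 32)       # feature map of the target image
branch_feats = [b(x) for b in branches]    # n = 2 branch features, same shape as x
```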

Step S102: calculate a branch feature channel weight value vector corresponding to each branch, where each branch feature channel weight value vector characterizes the channel importance of each channel of the corresponding branch feature.

A channel is one dimension used to describe an image feature, and the branch feature under each branch can have multiple different channels.

In this embodiment, the n branch feature channel weight value vectors corresponding to the n branches are calculated separately, and each branch feature channel weight value vector characterizes the channel importance of each channel of the corresponding branch feature.

For example, with two branches, branch 1 and branch 2, two branch feature channel weight value vectors are calculated: branch feature channel weight value vector 1 and branch feature channel weight value vector 2. Vector 1 characterizes the channel importance of each channel of the branch feature under branch 1, and vector 2 characterizes the channel importance of each channel of the branch feature under branch 2.

Step S103: use the branch feature channel weight value vectors to perform channel-level feature weighting on the corresponding branch features, obtaining n channel-weighted branch features.

In this embodiment, after the n branch feature channel weight value vectors have been calculated, each branch feature channel weight value vector is used to perform channel-level feature weighting on the corresponding branch feature, giving n channel-weighted branch features.

For example, with two branches, branch 1 and branch 2: branch feature channel weight value vector 1 is used to perform channel-level weighting on branch feature 1 under branch 1, giving channel-weighted branch feature 1; branch feature channel weight value vector 2 is used to perform channel-level weighting on branch feature 2 under branch 2, giving channel-weighted branch feature 2.

Step S104: fuse the n channel-weighted branch features to obtain the output feature of the target image.

In this embodiment, all the channel-weighted branch features are fused, giving the final output feature of the target image.

One way of fusing the n channel-weighted branch features is to add them together.
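
Assuming each weight value vector holds one scalar per channel, steps S103 and S104 amount to a broadcast multiplication followed by a sum. The sketch below is illustrative; the tensor shapes and the helper name fuse_branches are assumptions, not taken from the patent.

```python
import torch

def fuse_branches(branch_feats, channel_weights):
    """
    branch_feats:    list of n tensors, each of shape (B, C, H, W)
    channel_weights: list of n tensors, each of shape (B, C), one weight per channel
    Returns the fused output feature of shape (B, C, H, W).
    """
    weighted = [
        f * w.view(w.size(0), w.size(1), 1, 1)    # channel-level weighting (step S103)
        for f, w in zip(branch_feats, channel_weights)
    ]
    return torch.stack(weighted, dim=0).sum(dim=0)    # fusion by addition (step S104)

feats = [torch.randn(1, 64, 32, 32) for _ in range(2)]
weights = [torch.rand(1, 64) for _ in range(2)]
out = fuse_branches(feats, weights)    # (1, 64, 32, 32)
```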

In summary, in the image processing method provided by this embodiment, when n branch features are extracted from the target image under n branches, a branch feature channel weight value vector is calculated for each branch, characterizing the channel importance of each channel of the corresponding branch feature. Channel-level feature weighting is applied to the branch features using these vectors, giving n channel-weighted branch features, which are added to obtain the output feature of the target image. The attention mechanism thus weights the channels of the different branch features, strengthening the expressive power of channels that improve performance while suppressing the expressive power of channels that have little influence on the result, and ultimately improving the quality of the output feature obtained from the channel-weighted branch features. When this output feature is applied to computer vision tasks such as classification, the processing effect of the task is guaranteed.

This embodiment provides an image processing method that can be used in a computer device. Fig. 2 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in Fig. 2, the flow includes the following steps:

Step S201: obtain the branch features extracted from a target image under n branches, where n is a positive integer.

For details, refer to step S101 of the embodiment shown in Fig. 1, which is not repeated here.

Step S202: perform feature compression on the n branch features to obtain a compressed feature.

The compressed feature is the feature obtained by compressing the n branch features. It can be understood that, in the subsequent processing steps, operating on the compressed feature requires less computation than operating on the n separate branch features.

Step S203: based on the compressed feature, calculate a compressed feature channel weight value vector, which characterizes the channel importance of each channel of the compressed feature.

In this embodiment, after the compressed feature corresponding to the n branch features has been obtained, the compressed feature channel weight value vector characterizing the channel importance of each channel of the compressed feature is calculated.

For example, if the compressed feature has three channels, channel 1, channel 2 and channel 3, a compressed feature channel weight value vector characterizing the channel importance of channel 1, channel 2 and channel 3 is calculated.

In an optional implementation, the computation of the compressed feature includes: adding the n branch features to obtain the fused feature, i.e. the compressed feature, corresponding to the n branch features; or concatenating the n branch features along the channel dimension to obtain the fused feature corresponding to the n branch features.

In this embodiment, the n branch features can be compressed by fusing them; specifically, the n branch features can be added directly, or they can be concatenated along the channel dimension, so that the n branch features are compressed effectively.
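
Both fusion options can be written in a few lines of PyTorch; the helper name compress_branches and the "add"/"concat" mode strings are illustrative.

```python
import torch

def compress_branches(branch_feats, mode="add"):
    """Fuse n branch features of shape (B, C, H, W) into one compressed feature."""
    if mode == "add":
        # element-wise addition: output keeps shape (B, C, H, W)
        return torch.stack(branch_feats, dim=0).sum(dim=0)
    if mode == "concat":
        # concatenation along the channel dimension: output is (B, n*C, H, W)
        return torch.cat(branch_feats, dim=1)
    raise ValueError(f"unknown mode: {mode}")

feats = [torch.randn(1, 64, 32, 32) for _ in range(2)]
u_add = compress_branches(feats, "add")      # (1, 64, 32, 32)
u_cat = compress_branches(feats, "concat")   # (1, 128, 32, 32)
```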

In an optional implementation, when the number of channels of the compressed feature is m, m being a positive integer, calculating the compressed feature channel weight value vector based on the compressed feature includes: splitting the compressed feature at the channel level to obtain m channel feature maps; calculating statistic information of each channel feature map under multiple statistics to obtain m channel statistic vectors; and calculating the compressed feature channel weight value vector from the m channel statistic vectors.

Statistic information under multiple statistics means that the channel feature map is measured from the perspective of several kinds of statistics, so that the resulting statistic information contains the corresponding value of each of these statistics.

In this embodiment, the compressed feature is first split along the channel dimension, and then the statistic information under multiple statistics is calculated for each channel feature map; the resulting channel statistic vector reflects the characteristics of the channel feature map. The channel statistic vectors are then used to calculate the compressed feature channel weight value vector, so that the importance of each channel is evaluated from multiple statistical measures within the branch and the compressed feature channel weight value vector is calculated accurately.

In an optional implementation, calculating the compressed feature channel weight value vector from the m channel statistic vectors includes: feeding the m channel statistic vectors into at least one fully connected layer and outputting the channel importance of each of the m channels; and arranging the channel importances of the m channels by channel to obtain the compressed feature channel weight value vector.

In this embodiment, the channel statistics under multiple statistics are learned through at least one fully connected layer, giving the channel importance of each channel; the channel importances of all channels, arranged by channel, form the compressed feature channel weight value vector. With this multi-statistic design, the channel importance can be determined accurately from the multiple statistics.

In an optional implementation, the statistics include at least one of the following: mean, variance, coefficient of variation, skewness, peak value, maximum, minimum, median, and quartiles.

In this embodiment, statistics taken from different perspectives comprehensively reflect the data distribution characteristics of each channel feature map. Compared with determining channel importance from a single statistic, statistic information under multiple statistics reflects the channel importance in finer detail.
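
A sketch of the per-channel multi-statistic computation followed by fully connected layers is given below. The particular subset of statistics, the hidden width, and the sigmoid at the end are assumptions made for illustration; the patent only requires that several statistics be computed per channel and mapped to a channel importance through at least one fully connected layer.

```python
import torch
import torch.nn as nn

def channel_statistics(u):
    """
    u: compressed feature of shape (B, m, H, W).
    Returns per-channel statistic vectors of shape (B, m, S), here with S = 7
    statistics: mean, variance, coefficient of variation, max, min, median, skewness.
    """
    x = u.flatten(2)                               # (B, m, H*W): one channel feature map per row
    mean = x.mean(dim=2)
    var = x.var(dim=2, unbiased=False)
    std = var.clamp_min(1e-6).sqrt()
    cv = std / (mean.abs() + 1e-6)                 # coefficient of variation
    mx = x.max(dim=2).values
    mn = x.min(dim=2).values
    med = x.median(dim=2).values
    skew = (((x - mean.unsqueeze(2)) / std.unsqueeze(2)) ** 3).mean(dim=2)
    return torch.stack([mean, var, cv, mx, mn, med, skew], dim=2)

class ChannelImportance(nn.Module):
    """Maps each channel's statistic vector to one importance value via FC layers."""
    def __init__(self, num_stats: int = 7, hidden: int = 16):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(num_stats, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, u):
        stats = channel_statistics(u)              # (B, m, S)
        w = self.fc(stats).squeeze(-1)             # (B, m): one importance value per channel
        return torch.sigmoid(w)                    # compressed feature channel weight value vector

u = torch.randn(2, 64, 32, 32)
weights = ChannelImportance()(u)                   # (2, 64)
```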

Step S204: based on the compressed feature channel weight value vector, calculate the branch feature channel weight value vector corresponding to each branch.

In this embodiment, after the compressed feature channel weight value vector has been calculated, the channel importance of each channel of the branch features is computed from this information, giving the n branch feature channel weight value vectors.

Step S205: use the branch feature channel weight value vectors to perform channel-level feature weighting on the corresponding branch features, obtaining n channel-weighted branch features.

For details, refer to step S103 of the embodiment shown in Fig. 1, which is not repeated here.

Step S206: fuse the n channel-weighted branch features to obtain the output feature of the target image.

For details, refer to step S104 of the embodiment shown in Fig. 1, which is not repeated here.

In summary, the attention-based image processing method provided by this embodiment first calculates the compressed feature channel weight value vector characterizing the channel importance of each channel of the compressed feature, and then calculates the branch feature channel weight value vector of each branch from it, so that the channel importance is evaluated accurately through fine-grained feature comparison on the fused feature.

This embodiment provides an image processing method that can be used in a computer device. Fig. 3 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in Fig. 3, step S204 of the above flow can instead be implemented as the following steps:

Step S301: based on the compressed feature channel weight value vector, perform channel-level feature weighting on the compressed feature to obtain the weighted compressed feature.

In this embodiment, channel-level feature weighting is performed according to the correspondence between the compressed feature channel weight value vector and the channels of the compressed feature, giving the weighted compressed feature.

In an optional implementation, obtaining the weighted compressed feature further includes the following weighting process: performing feature weighting over local spatial positions on the compressed feature that has undergone channel-level feature weighting, to obtain the weighted compressed feature.

In this embodiment, after the first, channel-level weighting of the compressed feature, a second weighting over local spatial positions is applied, and the doubly weighted compressed feature is used in the subsequent computation. Adding this second weighting over local spatial positions further improves the quality of the compressed feature.

In an optional implementation, performing feature weighting over local spatial positions on the compressed feature that has undergone channel-level feature weighting, to obtain the weighted compressed feature, includes:

(1) For any spatial position pixel of the compressed feature that has undergone channel-level feature weighting, calculate the local neighbor relationship vector corresponding to that spatial position pixel. The local neighbor relationship vector characterizes the correlation between the spatial position pixel and its neighborhood spatial position pixels, i.e. the spatial position pixels around it.

In this embodiment, after the first, channel-level weighting has been completed, the correlation between each spatial position pixel of the compressed feature and its surrounding spatial position pixels is calculated, giving the local neighbor relationship vector of each spatial position pixel.

In an optional implementation, for any spatial position pixel of the compressed feature that has undergone channel-level feature weighting, calculating the corresponding local neighbor relationship vector includes: for a target spatial position pixel, calculating the correlation between the target spatial position pixel and each of k neighborhood spatial position pixels, giving k local neighbor relationship scalars of the target spatial position pixel, where k is a positive integer; and concatenating the k local neighbor relationship scalars of the target spatial position pixel to obtain the local neighbor relationship vector corresponding to the target spatial position pixel.

The target spatial position pixel is any spatial position pixel of the compressed feature.

In this embodiment, the local neighbor relationship vector of the target spatial position pixel is calculated as follows: first, the k spatial position pixels around it are taken as its k neighborhood spatial position pixels; the correlation between the target spatial position pixel and each neighborhood spatial position pixel is then calculated, giving k local neighbor relationship scalars; and the k local neighbor relationship scalars are concatenated column-wise, giving the local neighbor relationship vector of the target spatial position pixel. In this way, the local neighbor relationship vector of each spatial position pixel is generated accurately.
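
A sketch of step (1) is given below, assuming that the correlation between two spatial position pixels is the dot product of their channel vectors and that the k neighbors form a 3x3 window around each pixel; both choices are illustrative, since the patent leaves the correlation measure and the neighborhood shape open.

```python
import torch
import torch.nn.functional as F

def local_neighbor_relations(u, k_size=3):
    """
    u: weighted compressed feature of shape (B, C, H, W).
    For every spatial position pixel, compute its relation to the k = k_size*k_size
    positions of its neighborhood. Returns relations of shape (B, k, H, W).
    """
    B, C, H, W = u.shape
    k = k_size * k_size
    pad = k_size // 2
    # For every position, gather the channel vectors of its k neighbors: (B, C, k, H*W)
    neighbors = F.unfold(u, kernel_size=k_size, padding=pad).view(B, C, k, H * W)
    center = u.view(B, C, 1, H * W)            # the position itself
    rel = (center * neighbors).sum(dim=1)      # dot products -> (B, k, H*W)
    return rel.view(B, k, H, W)

u = torch.randn(1, 64, 32, 32)
rel = local_neighbor_relations(u)              # (1, 9, 32, 32): k = 9 scalars per pixel
```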

In an optional implementation, before the local neighbor relationship vector corresponding to a spatial position pixel is calculated, the method further includes: selecting, for the spatial position pixel, one target local neighborhood relationship from multiple local neighborhood relationships through a local neighborhood selection network, where a local neighborhood relationship defines the relationship between a spatial position pixel and its neighborhood spatial position pixels; and taking the positions of the compressed feature that satisfy the selected target local neighborhood relationship with respect to the spatial position pixel as the neighborhood spatial position pixels of that spatial position pixel.

The local neighborhood selection network is a neural network capable of matching local neighborhood relationships.

In this embodiment, at least one local neighborhood relationship is predefined. The local neighborhood selection network first matches a target local neighborhood relationship to the spatial position pixel, and the neighborhood spatial position pixels of that spatial position pixel are then determined from the target local neighborhood relationship. In this way, a reasonable set of neighborhood spatial position pixels is selected for each spatial position pixel, and the selection is adaptive and variable.
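
The patent does not specify the form of the local neighborhood selection network. As one possible sketch, each spatial position could score a small set of predefined neighborhood patterns with a 1x1 convolution and keep the best-scoring one; the two patterns, the 1x1 scoring layer and the argmax selection below are all assumptions.

```python
import torch
import torch.nn as nn

class LocalNeighborhoodSelector(nn.Module):
    """
    Per-pixel choice among predefined local neighborhood relationships.
    Two patterns are predefined here as offset lists: a dense 3x3 window and a
    dilated 3x3 window (dilation 2). A 1x1 convolution scores the patterns per pixel.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.patterns = [
            [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)],   # dense 3x3 neighborhood
            [(dy, dx) for dy in (-2, 0, 2) for dx in (-2, 0, 2)],   # dilated 3x3 neighborhood
        ]
        self.score = nn.Conv2d(channels, len(self.patterns), kernel_size=1)

    def forward(self, u):
        # u: (B, C, H, W) -> index of the chosen pattern for every spatial position, (B, H, W)
        logits = self.score(u)
        return logits.argmax(dim=1)

u = torch.randn(1, 64, 32, 32)
choice = LocalNeighborhoodSelector(64)(u)   # choice[b, y, x] in {0, 1}
```

The chosen pattern index would then decide which offsets serve as the k neighborhood spatial position pixels when the local neighbor relationship vector of that pixel is computed.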

(2) Normalize the local neighbor relationship vector of each spatial position pixel, giving the normalized local neighbor relationship vector of each spatial position pixel.

In this embodiment, for each spatial position pixel of the compressed feature, the correlations with the different neighborhood spatial position pixels indicated by its local neighbor relationship vector are normalized, finally giving the normalized local neighbor relationship vector of each spatial position pixel.

The normalization can be implemented with the Softmax function.

(3) Use the normalized local neighbor relationship vectors to weight the features of the corresponding spatial position pixels, giving the weighted compressed feature.

In this embodiment, each spatial position pixel of the compressed feature is feature-weighted with its corresponding normalized local neighbor relationship vector; after the feature weighting of all spatial position pixels has been completed, the weighted compressed feature is obtained.

In an optional implementation, using the normalized local neighbor relationship vectors to weight the features of the corresponding spatial position pixels to obtain the weighted compressed feature includes: for a target spatial position pixel, extracting the channel-dimension vectors of its neighborhood spatial position pixels to form the local region feature of the target spatial position pixel; weighting the local region feature of the target spatial position pixel with its corresponding local neighbor relationship vector, giving the local-region weighted feature of the target spatial position pixel; and stitching together the local weighted features of all spatial position pixels to obtain the weighted compressed feature.

The target spatial position pixel is any spatial position pixel of the compressed feature.

In this embodiment, the feature weighting of the target spatial position pixel with its normalized local neighbor relationship vector is performed as follows: the channel-dimension vectors of the k neighborhood spatial position pixels around the target spatial position pixel are extracted to form the local region feature; the normalized local neighbor relationship vector of the target spatial position pixel is then multiplied, as a matrix product, with this local region feature, achieving an accurate weighted fusion.
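
Steps (2) and (3) can then be written as a softmax over the k relation scalars followed by a weighted sum of the k neighbors' channel vectors. The sketch continues the assumptions made above (3x3 neighborhood, dot-product relations); the helper name local_spatial_weighting is illustrative.

```python
import torch
import torch.nn.functional as F

def local_spatial_weighting(u, rel, k_size=3):
    """
    u:   channel-weighted compressed feature, (B, C, H, W).
    rel: local neighbor relationship vectors, (B, k, H, W) with k = k_size*k_size
         (e.g. from the local_neighbor_relations sketch above).
    Each position's output is the sum of its k neighbors' channel vectors, weighted
    by its softmax-normalized local neighbor relationship vector.
    """
    B, C, H, W = u.shape
    k = k_size * k_size
    pad = k_size // 2
    weights = rel.view(B, k, H * W).softmax(dim=1)              # step (2): normalization
    neighbors = F.unfold(u, kernel_size=k_size, padding=pad)    # (B, C*k, H*W)
    neighbors = neighbors.view(B, C, k, H * W)                  # local region features
    out = (neighbors * weights.unsqueeze(1)).sum(dim=2)         # step (3): weighted fusion
    return out.view(B, C, H, W)

u = torch.randn(1, 64, 32, 32)
rel = torch.randn(1, 9, 32, 32)
v = local_spatial_weighting(u, rel)    # doubly weighted compressed feature, (1, 64, 32, 32)
```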

Step S302: downsample the weighted compressed feature over the height and width dimensions, obtaining the downsampled compressed feature.

In this embodiment, the weighted compressed feature has dimensions height × width × channels; downsampling over height × width gives a downsampled compressed feature of dimension 1 × channels, which is convenient for the subsequent processing along the channel dimension.

Step S303: split the downsampled compressed feature into n groups of split features, each group of split features corresponding to one branch.

In this embodiment, the downsampled compressed feature is split evenly into n groups, and the split features in each group can be used to train the channel importance of the corresponding branch.

Step S304: perform feature restoration on the n groups of split features respectively, obtaining the n branch feature channel weight value vectors corresponding to the n branches.

In this embodiment, because the split features have been split, their scale can no longer characterize the channel importance of each channel of a branch feature. Therefore, feature restoration is performed on each of the n groups of split features, restoring them to a scale matching the branch feature channel weight value vector, so as to obtain the branch feature channel weight value vector of the corresponding branch.

In an optional implementation, performing feature restoration on the n groups of split features respectively to obtain the n branch feature channel weight value vectors corresponding to the n branches includes: feeding the n groups of split features into at least one fully connected layer respectively, outputting n preliminary branch feature channel weight value vectors; and performing inter-branch contrast weighting on the n preliminary branch feature channel weight value vectors to obtain the n branch feature channel weight value vectors.

In this embodiment, the split features are learned through at least one fully connected layer, and the learned information serves as the preliminary branch feature channel weight value vectors. The n preliminary branch feature channel weight value vectors are then further subjected to inter-branch contrast weighting, and the result is used as the final branch feature channel weight value vectors. This adds fine-grained feature comparison between the branch features and evaluates the channel importance accurately and comprehensively.
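
A sketch of steps S302 through the fully connected restoration is given below, assuming average pooling as the height/width downsampling, an even split of the channel vector, and one fully connected layer per branch; all three choices are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitAndRestore(nn.Module):
    """
    Steps S302-S304 (before inter-branch contrast weighting): pool height and width
    away, split the channel vector into n groups, and restore each group to a full
    preliminary branch feature channel weight value vector with a fully connected layer.
    """
    def __init__(self, channels: int, n_branches: int):
        super().__init__()
        assert channels % n_branches == 0
        self.n = n_branches
        self.fcs = nn.ModuleList(
            [nn.Linear(channels // n_branches, channels) for _ in range(n_branches)]
        )

    def forward(self, v):
        # v: doubly weighted compressed feature, (B, C, H, W)
        pooled = F.adaptive_avg_pool2d(v, 1).flatten(1)    # S302: (B, C)
        groups = pooled.chunk(self.n, dim=1)               # S303: n groups of shape (B, C/n)
        # S304: restore each group to a preliminary branch channel weight vector (B, C)
        return [fc(g) for fc, g in zip(self.fcs, groups)]

v = torch.randn(2, 64, 32, 32)
prelim = SplitAndRestore(channels=64, n_branches=2)(v)     # list of 2 tensors, each (2, 64)
```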

In an optional implementation, when the number of channels of the compressed feature is m, m being a positive integer, performing inter-branch contrast weighting on the n preliminary branch feature channel weight value vectors to obtain the n branch feature channel weight value vectors includes: extracting, from the n preliminary branch feature channel weight value vectors, the channel weight values on the same channel, giving m regrouped channel weight value vectors corresponding to the m channels; normalizing each regrouped channel weight value vector, giving m normalized regrouped channel weight value vectors; and using the m normalized regrouped channel weight value vectors to replace the corresponding entries of the n preliminary branch feature channel weight value vectors, giving the n branch feature channel weight value vectors.

In this embodiment, for all the preliminary branch feature channel weight value vectors, the elements corresponding to the same channel are extracted; after this extraction has been done for all m channels, m regrouped channel weight value vectors are obtained. Each regrouped channel weight value vector is normalized, and the m normalized regrouped channel weight value vectors are then put back into their original positions in the n preliminary branch feature channel weight value vectors. Through this horizontal comparison of the preliminary branch feature channel weight value vectors across branches, inter-branch contrast feature weighting is achieved.
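
Because regrouping by channel, normalizing, and writing back is equivalent to normalizing across the branch dimension, the whole step can be sketched with one softmax over the branch axis; the softmax is an assumption for the unspecified normalization, and the helper name inter_branch_contrast is illustrative.

```python
import torch

def inter_branch_contrast(prelim_weights):
    """
    prelim_weights: list of n preliminary branch feature channel weight value vectors,
                    each of shape (B, m).
    For every channel, the n weights (one per branch) are regrouped, normalized across
    branches, and written back, so that the branches compete channel by channel.
    """
    stacked = torch.stack(prelim_weights, dim=0)    # (n, B, m): regroups the weights by channel
    normalized = stacked.softmax(dim=0)             # normalizes each channel's n weights
    return list(normalized.unbind(dim=0))           # n branch feature channel weight value vectors

prelim = [torch.randn(2, 64), torch.randn(2, 64)]
branch_weights = inter_branch_contrast(prelim)      # two tensors of shape (2, 64)
# For every channel, the two branch weights now sum to 1 because of the softmax over branches.
```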

In summary, the attention-based image processing method provided by this embodiment performs channel-level feature weighting on the compressed feature based on the compressed feature channel weight value vector to obtain the weighted compressed feature, downsamples the weighted compressed feature over the height and width dimensions to obtain the downsampled compressed feature, splits the downsampled compressed feature into n groups of split features, and restores the n groups of split features into the n branch feature channel weight value vectors corresponding to the n branches. In this way, the channel importance of the channels in each branch is obtained through the compression and restoration of the features.

It can be understood that the above method embodiments may be implemented individually or in combination, which is not limited by the present invention.

This embodiment further provides an attention model. The attention model is used to implement the above embodiments and preferred implementations, and what has already been described is not repeated.

This embodiment provides an attention model. As shown in Fig. 4, the attention model includes:

an input module 401, configured to obtain n branch features extracted from a target image under n branches, where n is a positive integer;

a channel weight value calculation module 402, configured to calculate a branch feature channel weight value vector corresponding to each branch, where each branch feature channel weight value vector characterizes the channel importance of each channel of the corresponding branch feature;

a feature weighting module 403, configured to use the branch feature channel weight value vectors to perform channel-level feature weighting on the corresponding branch features, obtaining n channel-weighted branch features;

an output module 404, configured to fuse the n channel-weighted branch features to obtain the output feature of the target image.

In some optional implementations, the channel weight value calculation module 402 includes:

a feature compression unit, configured to perform feature compression on the n branch features to obtain a compressed feature;

a compressed feature channel weight value calculation unit, configured to calculate, based on the compressed feature, a compressed feature channel weight value vector that characterizes the channel importance of each channel of the compressed feature;

a branch feature channel weight value calculation unit, configured to calculate, based on the compressed feature channel weight value vector, the branch feature channel weight value vector corresponding to each branch.

In some optional implementations, the branch feature channel weight value calculation unit includes:

a weighting calculation subunit, configured to perform channel-level feature weighting on the compressed feature based on the compressed feature channel weight value vector, obtaining the weighted compressed feature;

a downsampling subunit, configured to downsample the weighted compressed feature over the height and width dimensions, obtaining the downsampled compressed feature;

a feature splitting subunit, configured to split the downsampled compressed feature into n groups of split features, each group of split features corresponding to one branch;

a feature restoration subunit, configured to perform feature restoration on the n groups of split features respectively, obtaining the n branch feature channel weight value vectors corresponding to the n branches.

In some optional implementations, the feature restoration subunit is configured to:

feed the n groups of split features into at least one fully connected layer respectively, outputting n preliminary branch feature channel weight value vectors;

perform inter-branch contrast weighting on the n preliminary branch feature channel weight value vectors, obtaining the n branch feature channel weight value vectors.

In some optional implementations, when the number of channels of the compressed feature is m, m being a positive integer, the feature restoration subunit is configured to:

extract, from the n preliminary branch feature channel weight value vectors, the channel weight values on the same channel, obtaining m regrouped channel weight value vectors corresponding to the m channels;

normalize each regrouped channel weight value vector, obtaining m normalized regrouped channel weight value vectors;

use the m normalized regrouped channel weight value vectors to replace the corresponding entries of the n preliminary branch feature channel weight value vectors, obtaining the n branch feature channel weight value vectors.

在一些可选的实施方式中,加权计算子单元,用于:In some optional implementation manners, the weighted calculation subunit is used for:

对完成通道级别的特征加权的压缩特征进行局部空间位置的特征加权,得到加权后的压缩特征。The feature weighting of the local spatial position is performed on the compressed features that have completed the channel-level feature weighting, and the weighted compressed features are obtained.

In some optional implementations, the weighting calculation subunit is configured to:

for any spatial position pixel in the channel-level weighted compressed feature, calculate the local neighbor relationship vector corresponding to that spatial position pixel, where the local neighbor relationship vector represents the correlation between the spatial position pixel and its neighborhood spatial position pixels, the neighborhood spatial position pixels being the spatial position pixels surrounding it;

normalize the local neighbor relationship vector corresponding to each spatial position pixel to obtain the normalized local neighbor relationship vector corresponding to each spatial position pixel;

use the normalized local neighbor relationship vectors to perform feature weighting on the corresponding spatial position pixels, obtaining the weighted compressed feature.

In some optional implementations, the weighting calculation subunit is configured to:

select, through a local neighborhood selection network, a target local neighborhood relationship from multiple local neighborhood relationships for the spatial position pixel, where a local neighborhood relationship defines the relationship between a spatial position pixel and its neighborhood spatial position pixels;

take the compressed features that satisfy the selected target local neighborhood relationship with the spatial position pixel as the neighborhood spatial position pixels of that spatial position pixel.

In some optional implementations, the weighting calculation subunit is configured to:

for a target spatial position pixel, calculate the correlation between the target spatial position pixel and each of its k neighborhood spatial position pixels, obtaining k local neighbor relationship scalars for the target spatial position pixel, k being a positive integer;

concatenate the k local neighbor relationship scalars of the target spatial position pixel to obtain the local neighbor relationship vector corresponding to the target spatial position pixel.

In some optional implementations, the weighting calculation subunit is configured to:

for a target spatial position pixel, extract the channel-dimension vectors of its neighborhood spatial position pixels to form the local region feature of the target spatial position pixel;

perform feature weighting between the local neighbor relationship vector corresponding to the target spatial position pixel and the local region feature of the target spatial position pixel, obtaining the local region weighted feature of the target spatial position pixel;

concatenate the local region weighted features of all spatial position pixels to obtain the weighted compressed feature.

In some optional implementations, when the number of channels of the compressed feature is m, m being a positive integer, the compressed feature channel weight value calculation unit includes:

a channel splitting subunit, configured to split the compressed feature at the channel level to obtain m channel feature maps;

a multi-statistic subunit, configured to calculate the statistic information of each channel feature map under multiple statistics, obtaining m channel statistic vectors;

a compressed feature channel weight value calculation subunit, configured to calculate the compressed feature channel weight value vector based on the m channel statistic vectors.

In some optional implementations, the compressed feature channel weight value calculation subunit is configured to:

input the m channel statistic vectors into at least one fully connected layer, outputting the channel importance corresponding to each of the m channels;

arrange the channel importances corresponding to the m channels by channel to obtain the compressed feature channel weight value vector.

In some optional implementations, the statistics include at least one of the following:

mean, variance, coefficient of variation, skewness, kurtosis, maximum, minimum, median, and quartiles.

In some optional implementations, the channel weight value calculation module 402 further includes a feature fusion unit, configured to:

add the n branch features to obtain the fused feature corresponding to the n branch features;

or,

concatenate the n branch features along the channel dimension to obtain the fused feature corresponding to the n branch features.

Further functional descriptions of the above modules and units are the same as those in the corresponding embodiments above and are not repeated here.

Combining the picture processing method described in the above embodiments, the process performed in the attention model can be divided into four stages: 1. feature compression; 2. feature splitting; 3. feature screening; 4. feature weighting.

Next, the above embodiments are described by way of example with reference to FIG. 5.

(1) Multi-branch feature extraction stage

The feature input of the multi-branch picture features is obtained from an arbitrary feature extraction module; the present invention does not limit the specific implementation form of the feature extraction module used.

It can be understood that, in the feature extraction stage, each branch plays a different role; for example, each branch provides features with a different receptive field, which can provide richer features in the fusion stage. However, while more branches (which usually means a wider network) provide richer features, they inevitably introduce a large amount of noise. In most cases, redundant features not only fail to improve the performance of the network but also impair the discriminative ability of the classifier. It is therefore necessary to design a mechanism that removes redundant features and retains the most discriminative ones; the attention model in the following stages achieves this effect.

In FIG. 5, it is assumed that the input feature map of each branch is c×H×W and that there are 4 branches in total.

(2) Feature compression stage

At this stage, the input multi-branch feature maps are fused. Two fusion operations can be used: a) adding all feature maps; b) concatenating all feature maps along the channel dimension (concat). The fused feature is denoted F, and its dimension is C×H×W.

When the 4 branch feature maps have the same number of channels, they can be added directly; when their channel numbers differ, they can be concatenated along the channel dimension.
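As an illustration, this fusion step can be sketched in a few lines of PyTorch. The sketch assumes the branch features are given as a list of tensors of shape (batch, c, H, W); the function name `fuse_branches` and the `mode` parameter are hypothetical names introduced here, not terminology from the patent.

```python
import torch

def fuse_branches(branch_feats, mode="add"):
    """Fuse a list of branch feature maps into a single feature F.

    branch_feats: list of tensors, each of shape (batch, c_i, H, W).
    mode: "add" when all branches share the same channel count,
          "concat" to join them along the channel dimension.
    """
    if mode == "add":
        # element-wise sum requires identical shapes across branches
        return torch.stack(branch_feats, dim=0).sum(dim=0)
    # channel-dimension concatenation also works for differing channel counts
    return torch.cat(branch_feats, dim=1)

# usage: fused = fuse_branches([b1, b2, b3, b4], mode="add")  # F of shape (batch, C, H, W)
```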

(3) Feature splitting stage

The fused feature contains a large amount of information. To select the most effective feature map channels from this information while suppressing the channels that contribute little to the final output, an effective channel selection mechanism, i.e., an attention mechanism, needs to be designed.

The dimension of the fused feature F is C×H×W, where C is the number of channels after fusion, H is the height after fusion, and W is the width after fusion. A conventional attention mechanism traverses each of the C channels, takes the mean of the corresponding H×W channel feature map to represent the importance of that channel, collects the means of all C channels into a vector, learns the importance of each channel by passing this vector through several fully connected layers, and finally performs the weighting.
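For contrast with the multi-statistic mechanism introduced below, the conventional mean-based channel attention described above can be sketched as follows; the module name `MeanChannelAttention`, the reduction ratio and the sigmoid output are illustrative assumptions and not part of the patent.

```python
import torch.nn as nn

class MeanChannelAttention(nn.Module):
    """Conventional channel attention: per-channel mean -> FC layers -> channel weights."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                   # x: (batch, C, H, W)
        v = x.mean(dim=(2, 3))              # per-channel mean, shape (batch, C)
        w = self.fc(v)                      # learned channel importance
        return x * w[:, :, None, None]      # channel-wise weighting
```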

The present invention proposes an attention mechanism based on multi-parameter probability statistics, which can truly achieve multi-branch, channel-level feature selection.

In the first step, the fused feature F is split along the channel dimension C. In the second step, multi-parameter probability statistics are computed for each channel. A traditional attention mechanism uses only the mean to measure the importance of a feature map, but statistics such as the variance, coefficient of variation and skewness can reflect the characteristics of the channel feature map more precisely. Using the mean alone as the basis for training channel importance is too coarse; therefore, the present invention uses multiple statistics to calculate the importance of each channel feature map.

The statistics are introduced below:

1. Mean: describes the average level of the data.

$$\bar{x} = \frac{1}{M}\sum_{i=1}^{M} x_i$$

where $x_i$ is the i-th pixel value, $M$ is the number of pixel values, and $\bar{x}$ is the mean.

2. Variance: reflects the fluctuation and stability of the data.

$$\sigma^2 = \frac{1}{M}\sum_{i=1}^{M} \left(x_i - \bar{x}\right)^2$$

where $x_i$ is the i-th pixel value, $M$ is the number of pixel values, $\bar{x}$ is the mean, and $\sigma^2$ is the variance.

3. Coefficient of variation: the ratio of the standard deviation to the mean, a dimensionless quantity used to describe the relative dispersion of the data.

$$c_v = \frac{\sigma}{\bar{x}}$$

where $\bar{x}$ is the mean, $\sigma$ is the standard deviation, and $c_v$ is the coefficient of variation.

4. Skewness: characterizes the symmetry of the data distribution.

$$g_1 = \frac{\frac{1}{M}\sum_{i=1}^{M}\left(x_i - \bar{x}\right)^3}{\sigma^3}$$

where $x_i$ is the i-th pixel value, $M$ is the number of pixel values, $\bar{x}$ is the mean, $\sigma$ is the standard deviation, and $g_1$ is the skewness.

5. Kurtosis: describes how steep the sample data distribution is relative to the normal distribution.

$$g_2 = \frac{\frac{1}{M}\sum_{i=1}^{M}\left(x_i - \bar{x}\right)^4}{\sigma^4}$$

where $x_i$ is the i-th pixel value, $M$ is the number of pixel values, $\bar{x}$ is the mean, $\sigma$ is the standard deviation, and $g_2$ is the kurtosis.

6. Maximum: the largest pixel value of the feature map.

7. Minimum: the smallest pixel value of the feature map.

8. Median: the median of the pixel values of the feature map, reflecting the center of the numerical distribution of the feature map.

9. Quartiles: the pixel values of the feature map are sorted in ascending order and divided into four equal parts; the first, second and third quartiles are the values at the 25%, 50% and 75% positions, respectively, and reflect the dispersion of the numerical distribution of the feature map.

These statistics are computed from the values of each channel of the feature map and describe the numerical distribution of the feature map.

For each channel, the multi-statistic information is calculated and assembled into a new channel statistic vector P. This indicator comprehensively reflects the detailed distribution of the data in that channel and is used to comprehensively reflect the importance of the channel.
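A minimal sketch of the per-channel multi-statistic computation is given below. It assumes the fused feature is a PyTorch tensor of shape (batch, C, H, W); the function name `channel_statistics`, the exact ordering of the statistics, and the use of the inter-quartile range to summarize the quartiles in a single scalar are illustrative assumptions.

```python
import torch

def channel_statistics(f, eps=1e-6):
    """Compute a multi-statistic vector P for every channel of f.

    f: tensor of shape (batch, C, H, W).
    Returns a tensor of shape (batch, C, 9): mean, variance, coefficient of
    variation, skewness, kurtosis, max, min, median, inter-quartile range.
    """
    x = f.flatten(2)                               # (batch, C, H*W)
    mean = x.mean(-1)
    var = x.var(-1, unbiased=False)
    std = var.clamp_min(eps).sqrt()
    cv = std / (mean.abs() + eps)                  # coefficient of variation
    z = (x - mean.unsqueeze(-1)) / std.unsqueeze(-1)
    skew = (z ** 3).mean(-1)
    kurt = (z ** 4).mean(-1)
    q1, med, q3 = torch.quantile(x, torch.tensor([0.25, 0.5, 0.75]), dim=-1)
    stats = [mean, var, cv, skew, kurt, x.amax(-1), x.amin(-1), med, q3 - q1]
    return torch.stack(stats, dim=-1)              # (batch, C, 9)
```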

The channel statistic vector P of the multi-statistic features is learned through a fully connected layer, which outputs a single number representing a nonlinear combination of the multiple statistics and reflecting the importance of that channel. A conventional attention mechanism directly takes the mean of each channel and recombines the means into a vector. The multi-statistic features proposed by the present invention reflect the channel importance information in more detail; however, since there are several statistics and it is not known in advance which of them best reflects the importance of the channel, a fully connected learning mechanism is used to dynamically learn a nonlinear combination of these statistics and thereby learn the importance of the channel.

All channels are traversed and processed as above, and finally the weight value learned for each channel is obtained. The weight values are arranged by channel to generate the fused feature channel weight value vector V, whose dimension is C×1.

The fused feature is weighted using the fused feature channel weight value vector V: according to the channel correspondence between V and the fused feature, the weights and the channels are multiplied one to one.
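Continuing the sketch above, the statistic vector of each channel can be mapped to a scalar weight by a small fully connected head and applied to the fused feature. The module name `MultiStatChannelAttention`, the hidden width and the sigmoid output are assumptions made for illustration; the patent only requires at least one fully connected layer per channel statistic vector. The sketch reuses the `channel_statistics` helper from the previous block.

```python
import torch.nn as nn

class MultiStatChannelAttention(nn.Module):
    """Learn one weight per channel from its multi-statistic vector, then weight F."""
    def __init__(self, num_stats=9, hidden=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(num_stats, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),
        )

    def forward(self, f):                        # f: fused feature (batch, C, H, W)
        p = channel_statistics(f)                # (batch, C, num_stats)
        v = self.fc(p).squeeze(-1)               # (batch, C): weight vector V
        return f * v[:, :, None, None], v        # channel-weighted fused feature and V
```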

This completes the intra-branch multi-probability channel weighting; further local spatial weighting can then be performed as follows.

First, the fused feature produced by the multi-probability channel weighting of the previous step is obtained, and the following processing continues on this fused feature.

1) Position generation: a local neighborhood selection network is built. The present invention predefines local neighborhood relationships of k shapes. For the fused feature, a local neighborhood selection network can be designed that performs the local region selection of the fused feature through 2 global downsampling operations and 2 fully connected layers.

2) Calculation of the local neighbor relationship vector: for the fused feature at each position i, its correlation with the k surrounding positions is calculated. Assuming the size of the feature map is H×W×C, the k positions in the local region around position i can be written as P_i = {j_1, j_2, ..., j_k}. For each position j, the feature vectors at positions i and j are mapped into d-dimensional vectors f_i and f_j by two 1×1 convolutional layers. The two vectors are then concatenated into a 2d-dimensional vector h_{ij} = [f_i, f_j], which is fed into a small neural network to produce a local neighbor relationship scalar representing the correlation between position i and position j.

Generation of the local neighbor relationship vector: for each position i, the correlations w_{ij} between i and its k surrounding positions are stacked column-wise into a k×1 column vector, which serves as the local neighbor relationship vector.

3) Normalization of the local neighbor relationship vector w_{ij}: for each position i, the correlations with its k surrounding positions are normalized with the Softmax function, again yielding a k×1 column vector.

4) Feature weighting: the local neighbor relationship vector w_{ij} is used to weight the input feature map at position i, yielding a C-dimensional local region weighted feature, computed as follows.

Each position i of the fused feature map is traversed. The k positions in the local region around position i are P_i = {j_1, j_2, ..., j_k}; their channel-dimension vectors are extracted from the fused feature map (of dimension H×W×C) to form the extracted local region feature T_i, of dimension k×C. At position i, the local neighbor relationship vector w_{ij} (of dimension k×1) and the local region feature T_i (of dimension k×C) are combined by matrix multiplication to realize the weighted fusion, i.e.

$$w_{ij}^{\top} T_i$$

which yields the local region weighted feature of position i, of dimension 1×C. All H×W positions are traversed to complete the local region weighting of all spatial positions.

Output feature map: the local region weighted features of all positions are assembled into an H×W×C tensor, which serves as the output feature map. This output feature map is the fused feature on which both the intra-branch multi-probability channel weighting and the local spatial weighting have been completed.
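The local spatial weighting of steps 1)–4) can be sketched as below. The sketch fixes a square 3×3 neighborhood (k = 9) instead of the learned neighborhood-shape selection network, and the module name `LocalNeighborAttention`, the embedding dimension d and the small relation MLP are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalNeighborAttention(nn.Module):
    """Weight each position by softmax-normalized relations to its k neighbors."""
    def __init__(self, channels, d=32, k=9):
        super().__init__()
        self.k = k
        self.embed_i = nn.Conv2d(channels, d, kernel_size=1)
        self.embed_j = nn.Conv2d(channels, d, kernel_size=1)
        self.relation = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(inplace=True),
                                      nn.Linear(d, 1))

    def forward(self, x):                                        # x: (B, C, H, W)
        B, C, H, W = x.shape
        fi = self.embed_i(x)                                     # (B, d, H, W)
        fj = self.embed_j(x)
        # gather the 3x3 neighborhood of every position (k = 9)
        fj_patches = F.unfold(fj, kernel_size=3, padding=1)      # (B, d*9, H*W)
        fj_patches = fj_patches.view(B, -1, self.k, H * W)       # (B, d, 9, H*W)
        fi_flat = fi.flatten(2).unsqueeze(2).expand(-1, -1, self.k, -1)
        h = torch.cat([fi_flat, fj_patches], dim=1)              # h_{ij}: (B, 2d, 9, H*W)
        w = self.relation(h.permute(0, 3, 2, 1))                 # (B, H*W, 9, 1)
        w = torch.softmax(w, dim=2)                              # normalize over the k neighbors
        # local region features T_i: channel vectors of the 9 neighbors
        t = F.unfold(x, kernel_size=3, padding=1).view(B, C, self.k, H * W)
        out = (t * w.permute(0, 3, 2, 1)).sum(dim=2)             # weighted sum over k
        return out.view(B, C, H, W)
```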

The fused feature on which the intra-branch multi-probability channel weighting and the local spatial weighting have been completed is downsampled along the height and width dimensions and then evenly split into 4 parts, each part corresponding to one group, denoted here as g_1, g_2, g_3 and g_4. The dimension of each group is C/4. In the next step, the split feature g_i of each group is used to train the channel importance of each branch.
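A minimal sketch of this downsample-and-split step follows, assuming the doubly weighted fused feature is a tensor of shape (batch, C, H, W) and that global average pooling is used as the height/width downsampling; the function name `downsample_and_split` and the choice of pooling are assumptions.

```python
import torch
import torch.nn.functional as F

def downsample_and_split(fused, n_branches=4):
    """Pool away H and W, then split the C channels into n equal groups g_1 ... g_n."""
    pooled = F.adaptive_avg_pool2d(fused, 1).flatten(1)   # (batch, C)
    return torch.chunk(pooled, n_branches, dim=1)         # n tensors of shape (batch, C/n)
```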

(4) Feature screening stage

First, according to the split feature g_i of each group, multiple branches are established. The branches are independent of one another and are used to calculate the importance of the channels of the input feature maps, i.e., the attention mechanism. Exemplarily, each branch contains 2 fully connected layers with the structure fully connected layer -> rectification -> fully connected layer -> rectification. This structure learns from the input split feature g_i to obtain the importance of the channels of the corresponding input feature map; through this operation, the preliminary branch feature channel weight value vector I_i is obtained.

For all I_i, the elements at corresponding positions are extracted and recombined into new reorganized channel weight value vectors V_i. Extracting the elements at corresponding positions means: given that the dimension of each I_i is C×1 and that there are N vectors I_1 to I_N corresponding to the N branches, I_1 to I_N are traversed and, for example, their first elements are taken (i.e., the value of the first element of every vector I_i), forming a new vector V_i of dimension N×1; in this example the dimension of V_i is 4×1.

Each reorganized vector V_i is then normalized with the softmax function. Finally, the normalized values replace the original feature vectors I_i one to one, i.e., the normalized values are restored to their original positions, yielding the branch feature channel weight value vectors W_1 to W_N.
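The feature screening stage can be sketched as below: per-branch fully connected heads produce the preliminary weight vectors I_i, which are then compared across branches channel by channel with a softmax. The class name `BranchScreening` and the choice of output size are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class BranchScreening(nn.Module):
    """Per-branch FC heads followed by cross-branch softmax comparison."""
    def __init__(self, group_dim, branch_channels, n_branches=4):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Linear(group_dim, branch_channels), nn.ReLU(inplace=True),
                          nn.Linear(branch_channels, branch_channels), nn.ReLU(inplace=True))
            for _ in range(n_branches)
        ])

    def forward(self, groups):                   # groups: list of (batch, group_dim)
        prelim = [head(g) for head, g in zip(self.heads, groups)]   # I_1 ... I_N
        stacked = torch.stack(prelim, dim=1)     # (batch, n_branches, channels)
        # compare the same channel across branches and normalize with softmax
        weights = torch.softmax(stacked, dim=1)
        return weights.unbind(dim=1)             # W_1 ... W_N, each (batch, channels)
```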

(5) Feature weighting stage

The input features are weighted using the trained and screened channel importance features W_1 to W_N. The weighting method includes two steps: in the first step, W_1 to W_N are multiplied element-wise, according to the channel correspondence, with the corresponding input feature maps b_1 to b_N, yielding the channel-weighted features Wb_1 to Wb_N; in the second step, all channel-weighted features are added together and the final result is output. This stage can be expressed by the following formula:

$$b_{out} = \sum_{i=1}^{N} W_i \odot b_i$$

where $\odot$ denotes the Hadamard product, $N$ is the number of branches, $W_i$ is the channel weight value vector corresponding to the i-th branch, $b_i$ is the branch feature corresponding to the i-th branch, and $b_{out}$ is the output feature.
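Putting the screening and weighting stages together, the final fusion can be sketched as follows; the function name `weighted_fusion` is an assumption, and broadcasting is used to apply the per-channel weights spatially.

```python
def weighted_fusion(branch_feats, branch_weights):
    """Hadamard-weight each branch by its channel weights, then sum the branches.

    branch_feats:   list of tensors, each (batch, C, H, W).
    branch_weights: list of tensors, each (batch, C).
    """
    out = 0
    for b, w in zip(branch_feats, branch_weights):
        out = out + b * w[:, :, None, None]   # channel-wise Hadamard product
    return out                                # output feature of the target picture
```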

It can be understood that the above process can also be summarized as: splitting the target picture into multiple branches; fusing the multi-branch features; performing intra-branch multi-parameter probability-statistic attention learning on the fused feature to obtain a channel-level weighted fused feature; weighting the channel-level weighted fused feature again at local spatial positions through the local neighborhood selection network; downsampling the doubly weighted fused feature; performing feature splitting and feature restoration on the downsampled fused feature; performing inter-branch attention reorganization, comparison and weighting on the restored channel weight value vectors to obtain the processed channel weight value vector of each branch; and using the channel weight value vector of each branch to perform weighted fusion of the branch features, obtaining the output features of the target picture.

Based on the above attention model, basic network layers can be constructed. Two typical network layers are proposed in the embodiments of the present invention: a first basic block structure and a second basic block structure.

(1) First basic block structure

The first basic block structure includes the attention model of the above embodiments and an addition module. The attention model is configured to output the output features of a target picture given the branch features extracted from the target picture under n branches as input, n being a positive integer. The addition module is configured to add the output features of the target picture and the target picture itself to obtain the output result of the first basic block structure for the target picture.

Exemplarily, as shown in FIG. 6, features of the target picture are extracted with convolution kernels of different sizes to obtain multiple branch features, the multiple branch features are input into the attention model to obtain the output features of the target picture, and the output features of the target picture and the target picture are then added to obtain the final output result.
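A sketch of the first basic block structure follows, assuming three parallel convolution branches with kernel sizes 1, 3 and 5; the class name `FirstBasicBlock`, the number of branches and the kernel sizes are assumptions, and the `attention` argument stands for the full multi-branch attention model described above.

```python
import torch.nn as nn

class FirstBasicBlock(nn.Module):
    """Multi-branch convolutions -> attention model -> residual add with the input."""
    def __init__(self, channels, attention: nn.Module, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        ])
        self.attention = attention             # attention model taking a list of branch features

    def forward(self, x):
        branch_feats = [conv(x) for conv in self.branches]
        out = self.attention(branch_feats)     # fused, attention-weighted features
        return out + x                         # add the target picture (input) back
```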

(2) Second basic block structure

The second basic block structure includes the attention model of the above embodiments and an addition module. The attention model is configured to output the output features of a target picture given the branch features extracted from the target picture under n branches as input, n being a positive integer. The addition module is configured to add the output features of the target picture and the batch-normalized result of the target picture to obtain the output result of the second basic block structure for the target picture.

Exemplarily, as shown in FIG. 7, features of the target picture are extracted with convolution kernels of different sizes to obtain multiple branch features, the multiple branch features are input into the attention model to obtain the output features of the target picture, and the output features of the target picture and the batch-normalized result of the target picture are then added to obtain the final output result.
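The second basic block differs only in that the skip path is batch-normalized before the addition; a sketch under the same assumptions as the first block follows.

```python
import torch.nn as nn

class SecondBasicBlock(nn.Module):
    """Like FirstBasicBlock, but the skip path is batch-normalized before the add."""
    def __init__(self, channels, attention: nn.Module, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        ])
        self.attention = attention
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        branch_feats = [conv(x) for conv in self.branches]
        out = self.attention(branch_feats)
        return out + self.bn(x)                # add the batch-normalized input
```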

It can be understood that, as shown in FIG. 6 and FIG. 7, the attention model designed by the present invention is plug-and-play: for a structure with multiple branches, it is inserted directly at the output position of the branches and finally yields the weighted feature map.

Exemplarily, the basic block structure provided by the embodiments of the present invention can be used for image classification; a specific use is shown in FIG. 8. The specific structure of the network may include: convolution -> batch normalization, rectification -> max pooling -> second basic block structure -> first basic block structure -> first basic block structure -> average pooling -> convolution -> second basic block structure -> first basic block structure -> first basic block structure -> first basic block structure -> global average pooling -> fully connected layer -> fully connected layer -> normalization -> classification probability.
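The FIG. 8-style classifier can be sketched by chaining the blocks above; all channel widths, the stem configuration, the number of classes and the `attention_factory` callable are assumptions introduced here for illustration rather than values given in the patent.

```python
import torch.nn as nn

def build_classifier(attention_factory, block_channels=64, num_classes=1000):
    """Assemble a FIG. 8-style classifier from the basic blocks sketched above."""
    att = attention_factory  # callable returning an attention model for a given channel width
    return nn.Sequential(
        nn.Conv2d(3, block_channels, 7, stride=2, padding=3),
        nn.BatchNorm2d(block_channels), nn.ReLU(inplace=True),
        nn.MaxPool2d(3, stride=2, padding=1),
        SecondBasicBlock(block_channels, att(block_channels)),
        FirstBasicBlock(block_channels, att(block_channels)),
        FirstBasicBlock(block_channels, att(block_channels)),
        nn.AvgPool2d(2),
        nn.Conv2d(block_channels, 2 * block_channels, 1),
        SecondBasicBlock(2 * block_channels, att(2 * block_channels)),
        FirstBasicBlock(2 * block_channels, att(2 * block_channels)),
        FirstBasicBlock(2 * block_channels, att(2 * block_channels)),
        FirstBasicBlock(2 * block_channels, att(2 * block_channels)),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(2 * block_channels, 512), nn.ReLU(inplace=True),
        nn.Linear(512, num_classes),
        nn.Softmax(dim=1),                      # classification probability
    )
```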

It can be understood that the embodiments of the present invention do not limit the specific application of the basic block structure; FIG. 8 is only an exemplary illustration.

An embodiment of the present invention further provides a computer device having the attention model shown in FIG. 4 above.

Referring to FIG. 9, FIG. 9 is a schematic structural diagram of a computer device provided in an optional embodiment of the present invention. As shown in FIG. 9, the computer device includes one or more processors 10, a memory 20, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are communicatively connected to one another by different buses and may be mounted on a common motherboard or in other ways as required. The processor can process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to an interface). In some optional implementations, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple computer devices may be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multi-processor system). One processor 10 is taken as an example in FIG. 9.

The processor 10 may be a central processing unit, a network processor, or a combination thereof. The processor 10 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field-programmable gate array, generic array logic, or any combination thereof.

The memory 20 stores instructions executable by the at least one processor 10, so that the at least one processor 10 executes the methods shown in the above embodiments.

The memory 20 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required by at least one function, and the data storage area may store data created according to the use of the computer device presenting an applet landing page, and the like. In addition, the memory 20 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some optional implementations, the memory 20 may optionally include memories remotely located relative to the processor 10, and these remote memories may be connected to the computer device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The memory 20 may include a volatile memory, for example a random access memory; the memory may also include a non-volatile memory, for example a flash memory, a hard disk, or a solid-state disk; the memory 20 may also include a combination of the above types of memory.

The computer device further includes a communication interface 30 for the computer device to communicate with other devices or a communication network.

An embodiment of the present invention further provides a computer-readable storage medium. The method according to the embodiments of the present invention described above may be implemented in hardware or firmware, or may be implemented as computer code that can be recorded on a storage medium, or as computer code originally stored on a remote storage medium or a non-transitory machine-readable storage medium and downloaded through a network to be stored on a local storage medium, so that the method described herein can be processed by such software stored on a storage medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, a flash memory, a hard disk, a solid-state disk, or the like; further, the storage medium may also include a combination of the above types of memory. It can be understood that a computer, processor, microprocessor controller, or programmable hardware includes a storage component that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the methods shown in the above embodiments are implemented.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present invention, and such modifications and variations all fall within the scope defined by the appended claims.

Claims (20)

1. A picture processing method, wherein the method comprises: obtaining n branch features extracted from a target picture under n branches, n being a positive integer; calculating a branch feature channel weight value vector corresponding to each branch, each branch feature channel weight value vector representing the channel importance of each channel of the corresponding branch feature; performing channel-level feature weighting on the corresponding branch features using the branch feature channel weight value vectors to obtain n channel-weighted branch features; and fusing the n channel-weighted branch features to obtain output features of the target picture.

2. The method according to claim 1, wherein calculating the branch feature channel weight value vector corresponding to each branch comprises: performing feature compression on the n branch features to obtain a compressed feature; calculating a compressed feature channel weight value vector based on the compressed feature, the compressed feature channel weight value vector representing the channel importance of each channel of the compressed feature; and calculating the branch feature channel weight value vector corresponding to each branch based on the compressed feature channel weight value vector.

3. The method according to claim 2, wherein calculating the branch feature channel weight value vector corresponding to each branch based on the compressed feature channel weight value vector comprises: performing channel-level feature weighting on the compressed feature based on the compressed feature channel weight value vector to obtain a weighted compressed feature; downsampling the weighted compressed feature along the height and width dimensions to obtain a downsampled compressed feature; splitting the downsampled compressed feature into n groups of split features, each group of split features corresponding to one branch; and performing feature restoration on the n groups of split features to obtain n branch feature channel weight value vectors corresponding to the n branches.

4. The method according to claim 3, wherein performing feature restoration on the n groups of split features to obtain the n branch feature channel weight value vectors corresponding to the n branches comprises: inputting the n groups of split features into at least one fully connected layer to output n preliminary branch feature channel weight value vectors; and performing inter-branch comparison weighting on the n preliminary branch feature channel weight value vectors to obtain the n branch feature channel weight value vectors.

5. The method according to claim 4, wherein, when the number of channels of the compressed feature is m, m being a positive integer, performing inter-branch comparison weighting on the n preliminary branch feature channel weight value vectors to obtain the n branch feature channel weight value vectors comprises: extracting the channel weight values of the n preliminary branch feature channel weight value vectors on the same channel to obtain m reorganized channel weight value vectors corresponding to the m channels; normalizing each reorganized channel weight value vector to obtain m normalized reorganized channel weight value vectors; and performing feature replacement on the n preliminary branch feature channel weight value vectors using the m normalized reorganized channel weight value vectors to obtain the n branch feature channel weight value vectors.

6. The method according to claim 3, wherein obtaining the weighted compressed feature further comprises the following weighting process: performing feature weighting at local spatial positions on the compressed feature on which the channel-level feature weighting has been completed to obtain the weighted compressed feature.

7. The method according to claim 6, wherein performing feature weighting at local spatial positions on the compressed feature on which the channel-level feature weighting has been completed to obtain the weighted compressed feature comprises: for any spatial position pixel in the compressed feature on which the channel-level feature weighting has been completed, calculating a local neighbor relationship vector corresponding to the spatial position pixel, the local neighbor relationship vector representing the correlation between the spatial position pixel and neighborhood spatial position pixels, the neighborhood spatial position pixels being the spatial position pixels surrounding the spatial position pixel; normalizing the local neighbor relationship vector corresponding to each spatial position pixel to obtain a normalized local neighbor relationship vector corresponding to each spatial position pixel; and performing feature weighting on the corresponding spatial position pixels using the normalized local neighbor relationship vectors to obtain the weighted compressed feature.

8. The method according to claim 7, wherein, before calculating the local neighbor relationship vector corresponding to any spatial position pixel in the compressed feature on which the channel-level feature weighting has been completed, the method further comprises: selecting, through a local neighborhood selection network, a target local neighborhood relationship from multiple local neighborhood relationships for the spatial position pixel, the local neighborhood relationship being used to define the relationship between a spatial position pixel and neighborhood spatial position pixels; and taking the compressed features that satisfy the selected target local neighborhood relationship with the spatial position pixel as the neighborhood spatial position pixels of the spatial position pixel.

9. The method according to claim 7, wherein calculating the local neighbor relationship vector corresponding to any spatial position pixel in the compressed feature on which the channel-level feature weighting has been completed comprises: for a target spatial position pixel, calculating the correlation between the target spatial position pixel and each of k neighborhood spatial position pixels to obtain k local neighbor relationship scalars of the target spatial position pixel, k being a positive integer; and concatenating the k local neighbor relationship scalars of the target spatial position pixel to obtain the local neighbor relationship vector corresponding to the target spatial position pixel.

10. The method according to claim 7, wherein performing feature weighting on the corresponding spatial position pixels using the normalized local neighbor relationship vectors to obtain the weighted compressed feature comprises: for a target spatial position pixel, extracting the channel-dimension vectors of the neighborhood spatial position pixels of the target spatial position pixel to form a local region feature of the target spatial position pixel; performing feature weighting between the local neighbor relationship vector corresponding to the target spatial position pixel and the local region feature of the target spatial position pixel to obtain a local region weighted feature of the target spatial position pixel; and concatenating the local weighted features of all spatial position pixels to obtain the weighted compressed feature.

11. The method according to claim 2, wherein, when the number of channels of the compressed feature is m, m being a positive integer, calculating the compressed feature channel weight value vector based on the compressed feature comprises: splitting the compressed feature at the channel level to obtain m channel feature maps; calculating the statistic information of each channel feature map under multiple statistics to obtain m channel statistic vectors; and calculating the compressed feature channel weight value vector based on the m channel statistic vectors.

12. The method according to claim 11, wherein calculating the compressed feature channel weight value vector based on the m channel statistic vectors comprises: inputting the m channel statistic vectors into at least one fully connected layer to output the channel importance corresponding to each of the m channels; and arranging the channel importances corresponding to the m channels by channel to obtain the compressed feature channel weight value vector.

13. The method according to claim 11, wherein the statistics comprise at least one of the following: mean, variance, coefficient of variation, skewness, kurtosis, maximum, minimum, median, and quartiles.

14. The method according to claim 2, wherein the calculation process of the compressed feature comprises: adding the n branch features to obtain a fused feature corresponding to the n branch features.

15. The method according to claim 2, wherein the calculation process of the compressed feature comprises: concatenating the n branch features along the channel dimension to obtain a fused feature corresponding to the n branch features.

16. An attention model, wherein the attention model comprises: an input module, configured to obtain n branch features extracted from a target picture under n branches, n being a positive integer; a channel weight value calculation module, configured to calculate a branch feature channel weight value vector corresponding to each branch, each branch feature channel weight value vector representing the channel importance of each channel of the corresponding branch feature; a feature weighting module, configured to perform channel-level feature weighting on the corresponding branch features using the branch feature channel weight value vectors to obtain n channel-weighted branch features; and an output module, configured to fuse the n channel-weighted branch features to obtain output features of the target picture.

17. A first basic block structure, wherein the first basic block structure comprises the attention model according to claim 16 and an addition module; the attention model is configured to output the output features of a target picture when the branch features extracted from the target picture under n branches are input, n being a positive integer; and the addition module is configured to add the output features of the target picture and the target picture to obtain an output result of the first basic block structure for the target picture.

18. A second basic block structure, wherein the second basic block structure comprises the attention model according to claim 16 and an addition module; the attention model is configured to output the output features of a target picture when the branch features extracted from the target picture under n branches are input, n being a positive integer; and the addition module is configured to add the output features of the target picture and a batch-normalized result of the target picture to obtain an output result of the second basic block structure for the target picture.

19. A computer device, comprising: a memory and a processor communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the picture processing method according to any one of claims 1 to 15.

20. A computer-readable storage medium, wherein computer instructions are stored on the computer-readable storage medium, and the computer instructions are used to cause a computer to perform the picture processing method according to any one of claims 1 to 15.
CN202310668217.9A 2023-06-07 2023-06-07 Picture processing method, system, equipment and medium Active CN116403064B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310668217.9A CN116403064B (en) 2023-06-07 2023-06-07 Picture processing method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310668217.9A CN116403064B (en) 2023-06-07 2023-06-07 Picture processing method, system, equipment and medium

Publications (2)

Publication Number Publication Date
CN116403064A true CN116403064A (en) 2023-07-07
CN116403064B CN116403064B (en) 2023-08-25

Family

ID=87016520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310668217.9A Active CN116403064B (en) 2023-06-07 2023-06-07 Picture processing method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN116403064B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117893840A (en) * 2024-03-15 2024-04-16 深圳市宗匠科技有限公司 Acne severity grading method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274999A (en) * 2020-02-17 2020-06-12 北京迈格威科技有限公司 Data processing method, image processing method, device and electronic equipment
CN113469072A (en) * 2021-07-06 2021-10-01 西安电子科技大学 Remote sensing image change detection method and system based on GSoP and twin fusion network
CN114708172A (en) * 2022-02-22 2022-07-05 北京旷视科技有限公司 Image fusion method, computer program product, storage medium, and electronic device
CN115690522A (en) * 2022-12-29 2023-02-03 湖北工业大学 Target detection method based on multi-pooling fusion channel attention and application thereof
CN116012581A (en) * 2022-12-19 2023-04-25 上海师范大学 Image segmentation method based on dual attention fusion
CN116167920A (en) * 2023-03-24 2023-05-26 浙江师范大学 Image compression and reconstruction method based on super-resolution and priori knowledge

Also Published As

Publication number Publication date
CN116403064B (en) 2023-08-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant