CN115205547A - A target image detection method, device, electronic device and storage medium - Google Patents
- Publication number: CN115205547A
- Application number: CN202210916890.5A
- Authority: CN (China)
- Prior art keywords: feature map, target, feature, deep, map
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- G06N3/08 — Computing arrangements based on neural networks; learning methods
- G06V10/7715 — Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; mappings, e.g. subspace methods
- G06V10/806 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
- G06V10/82 — Image or video recognition or understanding using neural networks
- G06V2201/07 — Target detection
Description
Technical Field
The present application relates to the field of image detection, and in particular to a target image detection method and apparatus, an electronic device, and a storage medium.
Background
In image review, reviewers face a huge volume of data every day. Review is usually performed manually, which is time-consuming and labor-intensive, or images are screened with a deep neural network. Such a network typically comprises a backbone network, a neck network, and a head network; the neck network serves as the bridge between the other two, so a good neck network transfers features more effectively.

In current image detection methods, the neck network generally adopts a Feature Pyramid Network (FPN), whose main role is to fuse shallow local features with deep semantic features. During feature fusion, however, different features often differ in importance, so the fused features may not be optimal, which degrades the head network's detection results. How to improve the accuracy of image detection has therefore become a technical problem that cannot be underestimated.
Summary of the Invention
In view of this, the purpose of the present application is to provide a target image detection method and apparatus, an electronic device, and a storage medium. By inputting the shallow feature map and the deep feature map of the target image into a feature pyramid network and weighting the two maps there with different weights to obtain a fused feature map, the detection efficiency and accuracy for the target image are improved.
An embodiment of the present application provides a target image detection method, the method comprising:

acquiring a target image to be detected;

inputting the target image to be detected into the backbone network of a pre-trained target detection model, performing feature extraction on the target image to be detected, and determining a shallow feature map and a deep feature map of the target image to be detected;

inputting the shallow feature map and the deep feature map into the feature pyramid network of the target detection model, weighting the shallow feature map and the deep feature map with different weights, and determining a fused feature map;

inputting the fused feature map into the head network of the target detection model, and determining the target item in the target image to be detected.
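The four steps above can be sketched as a minimal pipeline. Everything below is illustrative: the function names, array shapes, and fixed fusion weights are hypothetical stand-ins (NumPy arrays in place of real network layers), not this application's actual implementation.

```python
import numpy as np

def backbone(image):
    # Hypothetical stand-in: a real backbone would produce these maps
    # through successive convolutions. Shapes are (H, W, C).
    shallow = np.random.rand(32, 32, 64)   # fine-grained, local features
    deep = np.random.rand(32, 32, 64)      # coarse, semantic features
    return shallow, deep

def fpn_fuse(shallow, deep):
    # Placeholder for the weighted fusion of S103: each map is scaled
    # by a weight and the results are summed. Weights are fixed here
    # for illustration; in the described method they are learned.
    w1, w2 = 0.6, 0.4
    return w1 * shallow + w2 * deep

def head(fused):
    # Placeholder detection head returning a dummy box and label.
    return {"label": "target_item", "box": (0, 0, 32, 32)}

image = np.zeros((256, 256, 3))            # the target image to be detected
shallow, deep = backbone(image)
fused = fpn_fuse(shallow, deep)
result = head(fused)
```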
In a possible implementation, inputting the shallow feature map and the deep feature map into the feature pyramid network of the target detection model and weighting them with different weights to determine the fused feature map comprises:

inputting the shallow feature map and the deep feature map into the feature pyramid network for feature concatenation, and determining a concatenated feature map;

inputting the concatenated feature map into a convolution layer of the feature pyramid network, performing convolution on the concatenated feature map, and determining a first target feature map;

inputting the first target feature map into a pooling layer of the feature pyramid network, performing global average pooling on the first target feature map, and determining a second target feature map;

inputting the second target feature map into an activation layer of the feature pyramid network, performing activation on the second target feature map, and determining a first target weight and a second target weight;

inputting the first target weight and the second target weight into a fusion layer of the feature pyramid network, weighting the shallow feature map with the first target weight and the deep feature map with the second target weight, and determining the fused feature map.
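The five sub-steps above (concatenation, convolution, global average pooling, activation, weighted fusion) can be sketched in NumPy. This is a minimal illustration under assumed shapes: the matrix product stands in for a learned 1x1 convolution, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
shallow = rng.random((H, W, C))
deep = rng.random((H, W, C))

# (1) Concatenate along the channel axis: (H, W, 2C).
spliced = np.concatenate([shallow, deep], axis=-1)

# (2) Convolution layer stand-in: a (2C, 2C) matrix mixing channels,
# equivalent to a 1x1 convolution.
kernel = rng.random((2 * C, 2 * C))
first_target = spliced @ kernel                 # (H, W, 2C)

# (3) Global average pooling over the spatial dimensions: (2C,).
second_target = first_target.mean(axis=(0, 1))

# (4) Sigmoid activation -> per-channel weights in (0, 1).
weights = 1.0 / (1.0 + np.exp(-second_target))

# (5) Split the weight vector into the two weight groups and fuse:
# weight the shallow map, weight the deep map, add element-wise.
w_shallow, w_deep = weights[:C], weights[C:]
fused = w_shallow * shallow + w_deep * deep     # broadcast over (H, W, C)
```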
In a possible implementation, inputting the concatenated feature map into the convolution layer of the feature pyramid network, performing convolution on the concatenated feature map, and determining the first target feature map comprises:

acquiring an initialized two-dimensional matrix, where the dimensions of the initialized two-dimensional matrix are determined by the number of convolution kernels and the number of feature vectors of the concatenated feature map in different dimensions;

determining, based on the feature information corresponding to the feature vectors of the concatenated feature map in different dimensions, target values for the feature vectors of the concatenated feature map in different dimensions;

adding the target values of the feature vectors of the concatenated feature map in different dimensions to the initialized two-dimensional matrix to generate a target two-dimensional matrix;

having each convolution kernel select, according to the target values of the feature vectors of the concatenated feature map in different dimensions in the target two-dimensional matrix, the corresponding feature vectors of the concatenated feature map in different dimensions, and performing convolution to determine the first target feature map.
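As a rough sketch of the mechanism above, one possible reading: the target two-dimensional matrix has one row per convolution kernel, and its entries mark which feature vectors (channels, in this sketch) each kernel should use. The selection pattern, the channel mean used as a stand-in for the actual convolution, and all shapes below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
H, W, C = 8, 8, 6
num_kernels = 3
spliced = rng.random((H, W, C))

# Hypothetical "target 2D matrix": rows = kernels, columns = channels.
# A nonzero entry marks a channel the kernel should convolve.
target_matrix = np.zeros((num_kernels, C))
target_matrix[0, [0, 1]] = 1
target_matrix[1, [2, 3]] = 1
target_matrix[2, [4, 5]] = 1

# Each kernel filters out its selected channels and "convolves" them
# (here: a simple mean, standing in for a learned kernel), producing
# one output channel each.
outputs = []
for k in range(num_kernels):
    selected = spliced[:, :, target_matrix[k] > 0]
    outputs.append(selected.mean(axis=-1))
first_target = np.stack(outputs, axis=-1)   # (H, W, num_kernels)
```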
In a possible implementation, inputting the second target feature map into the activation layer of the feature pyramid network, performing activation on the second target feature map, and determining the first target weight and the second target weight comprises:

normalizing the second target feature map with a sigmoid activation function to determine a weight vector;

dividing the weight vector by dimension to determine the first target weight and the second target weight.
In a possible implementation, inputting the first target weight and the second target weight into the fusion layer of the feature pyramid network, weighting the shallow feature map with the first target weight and the deep feature map with the second target weight, and determining the fused feature map comprises:

weighting the shallow feature map with the first target weight to determine a weighted shallow feature map;

weighting the deep feature map with the second target weight to determine a weighted deep feature map;

adding the weighted shallow feature map and the weighted deep feature map element-wise to determine the fused feature map.
In a possible implementation, the target detection model is trained through the following steps:

acquiring a plurality of sample images and the label information corresponding to each sample image, and dividing the sample images into a training set and a validation set;

training an initial model multiple times with the training set and the label information corresponding to its sample images, and determining the target detection model;

testing the target detection model with the validation set to verify the detection accuracy of the target detection model.
An embodiment of the present application further provides a target image detection apparatus, the apparatus comprising:

an acquisition module, configured to acquire a target image to be detected;

a feature extraction module, configured to input the target image to be detected into the backbone network of a pre-trained target detection model, perform feature extraction on the target image to be detected, and determine a shallow feature map and a deep feature map of the target image to be detected;

a feature fusion module, configured to input the shallow feature map and the deep feature map into the feature pyramid network of the target detection model, weight the shallow feature map and the deep feature map with different weights, and determine a fused feature map;

a determination module, configured to input the fused feature map into the head network of the target detection model and determine the target item in the target image to be detected.
In a possible implementation, when inputting the shallow feature map and the deep feature map into the feature pyramid network of the target detection model, weighting them with different weights, and determining the fused feature map, the feature fusion module is specifically configured to:

input the shallow feature map and the deep feature map into the feature pyramid network for feature concatenation, and determine a concatenated feature map;

input the concatenated feature map into the convolution layer of the feature pyramid network, perform convolution on the concatenated feature map, and determine a first target feature map;

input the first target feature map into the pooling layer of the feature pyramid network, perform global average pooling on the first target feature map, and determine a second target feature map;

input the second target feature map into the activation layer of the feature pyramid network, perform activation on the second target feature map, and determine a first target weight and a second target weight;

input the first target weight and the second target weight into the fusion layer of the feature pyramid network, weight the shallow feature map with the first target weight and the deep feature map with the second target weight, and determine the fused feature map.
An embodiment of the present application further provides an electronic device, comprising a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device runs, the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, the steps of the target image detection method described above are performed.

An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the target image detection method described above are performed.
The embodiments of the present application provide a target image detection method and apparatus, an electronic device, and a storage medium. The detection method comprises: acquiring a target image to be detected; inputting the target image to be detected into the backbone network of a pre-trained target detection model, performing feature extraction on it, and determining a shallow feature map and a deep feature map of the target image to be detected; inputting the shallow feature map and the deep feature map into the feature pyramid network of the target detection model, weighting them with different weights, and determining a fused feature map; and inputting the fused feature map into the head network of the target detection model and determining the target item in the target image to be detected. By inputting the shallow and deep feature maps of the target image into the feature pyramid network and weighting them there with different weights to obtain the fused feature map, the detection efficiency and accuracy for the target image are improved.
To make the above objects, features, and advantages of the present application more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present application and should therefore not be regarded as limiting its scope. For those of ordinary skill in the art, other related drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of a target image detection method provided by an embodiment of the present application;

Fig. 2 is a schematic diagram of convolution by convolution kernels in a target image detection method provided by an embodiment of the present application;

Fig. 3 is a schematic diagram of the process of determining a fused feature map in a target image detection method provided by an embodiment of the present application;

Fig. 4 is a first schematic structural diagram of a target image detection apparatus provided by an embodiment of the present application;

Fig. 5 is a second schematic structural diagram of a target image detection apparatus provided by an embodiment of the present application;

Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. It should be understood that the drawings serve only the purposes of illustration and description and are not used to limit the protection scope of the present application; the schematic drawings are not drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments. It should be understood that the operations of a flowchart may be performed out of order, and steps without a logical dependency may be reversed in order or performed concurrently. In addition, under the guidance of this disclosure, those skilled in the art may add one or more other operations to a flowchart or remove one or more operations from it.

Moreover, the described embodiments are only some, not all, of the embodiments of the present application. The components of the embodiments, as generally described and illustrated in the drawings, may be arranged and designed in a variety of different configurations. The following detailed description of the embodiments provided in the accompanying drawings is therefore not intended to limit the claimed scope of the present application, but merely represents selected embodiments. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the present application.
To enable those skilled in the art to use this disclosure, the following embodiments are given in combination with the specific application scenario of detecting images. For those skilled in the art, the general principles defined here can be applied to other embodiments and application scenarios without departing from the spirit and scope of the present application.

The methods, apparatuses, electronic devices, or computer-readable storage media described below can be applied to any scenario in which images need to be detected; the embodiments of the present application do not limit the specific application scenario. Any solution using the target image detection method and apparatus, electronic device, and storage medium provided by the embodiments falls within the protection scope of the present application.
First, an applicable application scenario of the present application is introduced. The present application can be applied to the field of image detection technology.
Research has found that in current image detection methods the neck network generally adopts a Feature Pyramid Network (FPN), whose main role is to fuse shallow local features with deep semantic features. During feature fusion, however, different features often differ in importance, so the fused features may not be optimal, which degrades the head network's detection results. How to improve the accuracy of image detection has therefore become a technical problem that cannot be underestimated.

On this basis, an embodiment of the present application provides a target image detection method: the shallow and deep feature maps of the target image are input into a feature pyramid network, where they are weighted with different weights to obtain a fused feature map, improving the detection efficiency and accuracy for the target image.
Please refer to Fig. 1, a flowchart of a target image detection method provided by an embodiment of the present application. As shown in Fig. 1, the detection method provided by the embodiment comprises:

S101: Acquire a target image to be detected.

In this step, the target image to be detected is acquired. It may be an image of contraband to be reviewed by reviewers, or any other image that needs review and detection; the type of target image is not limited here.
S102: Input the target image to be detected into the backbone network of the pre-trained target detection model, perform feature extraction on the target image to be detected, and determine the shallow feature map and the deep feature map of the target image to be detected.

In this step, the target image to be detected is input into the backbone network of the pre-trained target detection model, which performs feature extraction and determines the shallow feature map and the deep feature map of the image.

Here, the shallow feature map is close to the input image and contains more pixel-level, fine-grained information, such as the color, texture, edge, and corner information of the target image to be detected. Shallow feature maps have small receptive fields with little overlap between them, which ensures that the network captures more detail.

The deep feature map contains coarse-grained, more abstract information, i.e., semantic information. Deep feature maps have larger receptive fields with more overlap between them; as the image information is compressed, what is obtained is holistic information about the target image to be detected.
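The growing receptive field of deeper layers can be illustrated with a small calculation. This uses the standard receptive-field recurrence for stacked convolutions, not anything specific to this application:

```python
# Receptive-field size for a stack of convolutions, illustrating why
# deep feature maps "see" a larger region of the input image.
def receptive_field(num_layers, kernel=3, stride=1):
    rf, jump = 1, 1
    for _ in range(num_layers):
        rf += (kernel - 1) * jump   # each layer widens the field
        jump *= stride              # strides compound the widening
    return rf

print(receptive_field(1))   # 3  -> shallow layer: local edges/texture
print(receptive_field(5))   # 11 -> deeper layer: larger context
```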
In a possible implementation, the target detection model is trained through the following steps:

A: Acquire a plurality of sample images and the label information corresponding to each sample image, and divide the sample images into a training set and a validation set.

Here, a plurality of sample images and their corresponding label information are collected from the network, and the sample images are divided into a training set and a validation set at a ratio of 9:1.

B: Train an initial model multiple times with the training set and the label information corresponding to its sample images, and determine the target detection model.

Here, an initial training model, a neural network model, is first constructed. It is trained multiple times with the training set and the label information corresponding to the sample images in the training set, yielding the target detection model.

C: Test the target detection model with the validation set to verify the detection accuracy of the target detection model.

Here, after training is completed, the target detection model is tested with the validation set to verify its performance. If the performance is high, the model can detect target images; if it is low, training of the model must continue.

In a specific embodiment, images of a plurality of sample firearms are acquired and compiled into a detection image dataset, which is divided into a training set and a validation set. An initial training model is constructed and trained multiple times with the training set and the label information corresponding to its sample firearm images, yielding the target detection model. After training, the model is tested with the validation set to verify its performance, so that the target detection model can detect contraband firearms in target images.
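The 9:1 split described in step A can be sketched as follows; the file and label names are placeholders:

```python
import random

# 9:1 train/validation split of labeled samples. Each sample pairs an
# image path with its label information.
samples = [(f"img_{i}.jpg", f"label_{i}") for i in range(1000)]
random.seed(42)           # fixed seed so the split is reproducible
random.shuffle(samples)

split = int(len(samples) * 0.9)
train_set, val_set = samples[:split], samples[split:]
print(len(train_set), len(val_set))   # 900 100
```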
S103: Input the shallow feature map and the deep feature map into the feature pyramid network of the target detection model, weight the shallow feature map and the deep feature map with different weights, and determine a fused feature map.
In this step, the shallow feature map and the deep feature map obtained from the backbone network are input into the feature pyramid network, where they are weighted with different weights and fused to determine the fused feature map.
Here, the weights applied to the shallow feature map and to the deep feature map are different.
In a possible implementation, inputting the shallow feature map and the deep feature map into the feature pyramid network of the target detection model, weighting the shallow feature map and the deep feature map with different weights, and determining the fused feature map includes:
a: Input the shallow feature map and the deep feature map into the feature pyramid network for feature splicing to determine a spliced feature map.
Here, the shallow feature map and the deep feature map are input into the feature pyramid network and spliced together to obtain the spliced feature map.
The shallow feature map and the deep feature map have matching dimensions: if both have dimension W*H*C, the spliced feature map has dimension W*H*2C.
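The splicing step above amounts to concatenation along the channel axis; a minimal NumPy sketch, with arbitrary illustrative sizes for W, H, and C:

```python
import numpy as np

H, W, C = 4, 4, 8
shallow = np.random.rand(H, W, C)  # shallow feature map, W*H*C
deep = np.random.rand(H, W, C)     # deep feature map, W*H*C

# Splice along the channel axis: W*H*C + W*H*C -> W*H*2C
spliced = np.concatenate([shallow, deep], axis=-1)
print(spliced.shape)  # (4, 4, 16)
```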
b: Input the spliced feature map into the convolution layer of the feature pyramid network, convolve the spliced feature map, and determine a first target feature map.
Here, the spliced feature map is input into the convolution layer of the feature pyramid network and convolved to determine the first target feature map.
The first target feature map is the feature map obtained by convolving the spliced feature map; its dimensions match those of the spliced feature map.
In a possible implementation, inputting the spliced feature map into the convolution layer of the feature pyramid network, convolving the spliced feature map, and determining the first target feature map includes:
(1): Obtain an initialized two-dimensional matrix, where the dimensions of the initialized two-dimensional matrix are determined by the number of convolution kernels and the feature vectors of the spliced feature map in different dimensions.
If the spliced feature map has dimension W*H*2C, its feature vectors in different dimensions are the feature vectors corresponding to the spliced feature map in the W*H*1, W*H*2, ..., W*H*2C dimensions.
The number of convolution kernels in the initialized two-dimensional matrix matches the number of channels of the spliced feature map: if the spliced feature map has dimension W*H*2C, the initialized two-dimensional matrix has dimension 2C*2C.
The columns of the initialized two-dimensional matrix represent the 2C convolution kernels, and the rows represent the feature vectors of the spliced feature map in the 2C different dimensions.
(2): Based on the feature information corresponding to the feature vectors of the spliced feature map in different dimensions, determine target values for the feature vectors of the spliced feature map in different dimensions.
Here, the importance of the feature information corresponding to the feature vectors of the spliced feature map in different dimensions is used to determine the target values of those feature vectors.
Here, during training, the initialized two-dimensional matrix learns, according to the importance of the feature information corresponding to the feature vectors in different dimensions, which feature vectors of the spliced feature map it considers to require convolution: the target values of feature vectors in dimensions that require convolution are set greater than 0, and the target values of feature vectors in dimensions that do not require convolution are set less than 0.
The importance of the feature information corresponding to the feature vectors in different dimensions is determined from expert experience or from network learning.
If the feature information corresponding to a feature vector in a given dimension is of high importance, the target value of that feature vector is set greater than 0; if of low importance, the target value is set less than 0.
For example, suppose the spliced feature map comes from a contraband image of a knife. The feature information of the image differs across dimensions: if, in a certain dimension, the feature information includes color and texture, that dimension's feature information is of lower importance; if, in a certain dimension, the feature information includes the contour of the knife, that dimension's feature information is of higher importance.
(3): Add the target values of the feature vectors of the spliced feature map in the different dimensions to the initialized two-dimensional matrix to generate a target two-dimensional matrix.
Here, the target values of the feature vectors of the spliced feature map in different dimensions are added to the initialized two-dimensional matrix to generate the target two-dimensional matrix.
Each row of the target two-dimensional matrix holds the feature vectors of the spliced feature map in the 2C different dimensions together with their corresponding target values, and each column corresponds to one of the 2C convolution kernels.
(4): Each convolution kernel selects, according to the target values in the target two-dimensional matrix for the feature vectors of the spliced feature map in different dimensions, the corresponding feature vectors in different dimensions and convolves them to determine the first target feature map.
Here, each convolution kernel selects the feature vectors of the spliced feature map in the corresponding dimensions according to the target values in the target two-dimensional matrix and convolves them to determine the first target feature map.
Here, each convolution kernel only convolves the feature vectors of the spliced feature map in the dimensions determined to require convolution; it does not convolve the feature vectors in every dimension.
The values in the target two-dimensional matrix M are continuously updated during training. If M(i,j) >= 0, the i-th convolution kernel convolves the feature vector of the spliced feature map in the j-th dimension; if M(i,j) < 0, the i-th convolution kernel does not convolve that feature vector.
After each convolution kernel convolves its corresponding feature vectors in different dimensions, multiple convolved feature maps are obtained, and these are spliced together to obtain the first target feature map.
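Steps (1)–(4) can be sketched with 1*1 kernels, where each kernel sums only over the channels whose entry in the target matrix M is non-negative. This is an illustrative simplification of the selection rule M(i,j) >= 0 described above; the kernel size, random values, and small sizes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C2 = 4, 4, 6                 # C2 = 2C channels of the spliced map
spliced = rng.random((H, W, C2))

# Target two-dimensional matrix M of dimension 2C*2C: M[i, j] >= 0 means
# kernel i convolves the feature vector in dimension j.
M = rng.standard_normal((C2, C2))
kernels = rng.standard_normal((C2, C2))  # 1*1 kernel weight per (kernel, channel)

# Each kernel sums only its selected channels; stacking the 2C results
# reproduces a W*H*2C first target feature map.
first_target = np.zeros((H, W, C2))
for i in range(C2):
    selected = M[i] >= 0  # dimensions kernel i convolves
    first_target[..., i] = np.tensordot(spliced[..., selected],
                                        kernels[i, selected],
                                        axes=([-1], [0]))

print(first_target.shape)  # (4, 4, 6)
```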
Here, please refer to FIG. 2, which is a schematic diagram of the convolution performed by the convolution kernels in a target image detection method provided by an embodiment of the present application. As shown in FIG. 2, a black dot indicates that the target value of the feature vector of the spliced feature map in that dimension of the target two-dimensional matrix is greater than 0, and a white dot indicates that it is less than 0. The first square represents the feature vectors of the spliced feature map in the different dimensions corresponding to the target two-dimensional matrix, and the second square represents the convolution kernels corresponding to the target matrix. For example, the first, third, fourth, and seventh black dots in the first row of the target two-dimensional matrix indicate that the first convolution kernel convolves the feature vectors of the spliced feature map in the first, third, fourth, and seventh dimensions; the first, fourth, fifth, and eighth black dots in the second row indicate that the second convolution kernel convolves the feature vectors in the first, fourth, fifth, and eighth dimensions; and so on. This avoids the technical problem in FPN feature fusion where all feature vectors are fused with the same weight, which makes the fused features suboptimal. In this solution, only the feature vectors in the dimensions selected for each convolution kernel are convolved, rather than the feature vectors in all dimensions, giving the first target feature map richer feature semantics.
c: Input the first target feature map into the pooling layer of the feature pyramid network, apply global average pooling to the first target feature map, and determine a second target feature map.
Here, the first target feature map is input into the pooling layer, and global average pooling is applied to determine the second target feature map.
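Global average pooling reduces each channel of the W*H*2C first target feature map to a single value, yielding a 1*1*2C map; a minimal sketch with illustrative sizes:

```python
import numpy as np

first_target = np.arange(4 * 4 * 6, dtype=float).reshape(4, 4, 6)  # W*H*2C

# Average over the spatial dimensions, keeping them as size-1 axes: 1*1*2C
second_target = first_target.mean(axis=(0, 1), keepdims=True)
print(second_target.shape)  # (1, 1, 6)
```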
d: Input the second target feature map into the activation layer of the feature pyramid network, apply activation processing to the second target feature map, and determine a first target weight and a second target weight.
Here, the second target feature map is input into the activation layer of the feature pyramid network and activated to determine the first target weight and the second target weight.
In a possible implementation, inputting the second target feature map into the activation layer of the feature pyramid network, activating the second target feature map, and determining the first target weight and the second target weight includes:
1: Normalize the second target feature map with the sigmoid activation function to determine a weight vector.
Here, the sigmoid activation function is used to normalize the second target feature map and determine the weight vector.
2: Divide the weight vector along its dimension to determine the first target weight and the second target weight.
Here, the weight vector is divided along its dimension to determine the first target weight and the second target weight. For example, if the weight vector has dimension 1*1*2C, it is split into a first target weight and a second target weight, each of dimension 1*1*C. Although the two target weights have the same dimensions, the weight values they contain differ.
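The activation and division steps above can be sketched as follows: the 1*1*2C second target feature map is passed through a sigmoid and then split into two 1*1*C target weights (the input values are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

C = 4
second_target = np.linspace(-2, 2, 2 * C).reshape(1, 1, 2 * C)  # 1*1*2C

weight_vector = sigmoid(second_target)        # normalized into (0, 1)
w1, w2 = np.split(weight_vector, 2, axis=-1)  # two 1*1*C target weights
print(w1.shape, w2.shape)  # (1, 1, 4) (1, 1, 4)
```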
e: Input the first target weight and the second target weight into the fusion layer of the feature pyramid network, weight the shallow feature map with the first target weight and the deep feature map with the second target weight, and determine the fused feature map.
Here, the first target weight and the second target weight are input into the fusion layer of the feature pyramid network, the shallow feature map and the deep feature map are weighted, and the fused feature map is determined.
In a possible implementation, inputting the first target weight and the second target weight into the fusion layer of the feature pyramid network, weighting the shallow feature map with the first target weight and the deep feature map with the second target weight, and determining the fused feature map includes:
1): Weight the shallow feature map with the first target weight to determine a weighted shallow feature map.
Here, the first target weight is used to weight the shallow feature map and determine the weighted shallow feature map.
2): Weight the deep feature map with the second target weight to determine a weighted deep feature map.
Here, the second target weight is used to weight the deep feature map and determine the weighted deep feature map.
3): Add the weighted shallow feature map and the weighted deep feature map element-wise to determine the fused feature map.
Here, the weighted shallow feature map and the weighted deep feature map are added together to determine the fused feature map.
Here, alternatively, the first target weight may be used to weight the deep feature map to determine the weighted deep feature map, and the second target weight used to weight the shallow feature map to determine the weighted shallow feature map.
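The weighted fusion of steps 1)–3) can be sketched as a channel-wise weighted sum, broadcasting the 1*1*C weights over the W*H*C maps (the constant maps and weight values are illustrative assumptions):

```python
import numpy as np

H, W, C = 4, 4, 3
shallow = np.ones((H, W, C))
deep = 2 * np.ones((H, W, C))
w1 = np.full((1, 1, C), 0.25)  # first target weight, applied to the shallow map
w2 = np.full((1, 1, C), 0.75)  # second target weight, applied to the deep map

# Weight each map, then add element-wise to obtain the fused feature map.
fused = w1 * shallow + w2 * deep
print(fused[0, 0, 0])  # 1.75
```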
Further, please refer to FIG. 3, which is a schematic diagram of the fused feature map determination process in a target image detection method provided by an embodiment of the present application. As shown in FIG. 3, the shallow feature map and the deep feature map each have dimension W*H*C; they are spliced to obtain the spliced feature map of dimension W*H*2C. Each convolution kernel selects the corresponding feature vectors in different dimensions according to the target two-dimensional matrix and convolves them to determine the first target feature map of dimension W*H*2C. The first target feature map is pooled to determine the second target feature map of dimension 1*1*2C, which is then activated to determine the first target weight (1*1*C) and the second target weight (1*1*C). The shallow feature map is weighted with the first target weight, the deep feature map is weighted with the second target weight, and the two weighted maps are added to determine the fused feature map. Because the feature semantics of the shallow feature map and the deep feature map differ, weighting them with equal weights would make the semantic information of the fused feature map inaccurate. By weighting the shallow feature map with the determined first target weight and the deep feature map with the second target weight, this application avoids the technical problem in FPN feature fusion where all feature vectors are fused with the same weight, which makes the fused features suboptimal, and improves the accuracy of the semantic information of the fused feature map.
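Putting the FIG. 3 steps together, an end-to-end sketch of the fusion pipeline; random weights and the sign-based channel selection stand in for the learned target matrix and trained kernels, so all numeric values here are illustrative assumptions:

```python
import numpy as np

def fuse(shallow, deep, M, kernels):
    """Fuse W*H*C shallow and deep maps following the FIG. 3 pipeline sketch."""
    spliced = np.concatenate([shallow, deep], axis=-1)  # W*H*2C
    C2 = spliced.shape[-1]
    # Selective 1*1 convolution driven by the target matrix M (2C*2C).
    first = np.zeros_like(spliced)
    for i in range(C2):
        sel = M[i] >= 0
        first[..., i] = np.tensordot(spliced[..., sel], kernels[i, sel],
                                     axes=([-1], [0]))
    second = first.mean(axis=(0, 1), keepdims=True)  # global average pool, 1*1*2C
    weights = 1.0 / (1.0 + np.exp(-second))          # sigmoid normalization
    w1, w2 = np.split(weights, 2, axis=-1)           # two 1*1*C target weights
    return w1 * shallow + w2 * deep                  # fused feature map, W*H*C

rng = np.random.default_rng(1)
H, W, C = 4, 4, 3
shallow, deep = rng.random((H, W, C)), rng.random((H, W, C))
M = rng.standard_normal((2 * C, 2 * C))
kernels = rng.standard_normal((2 * C, 2 * C))
fused = fuse(shallow, deep, M, kernels)
print(fused.shape)  # (4, 4, 3)
```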
S104: Input the fused feature map into the head network of the target detection model and determine the target item in the target image to be detected.
In this step, the fused feature map is input into the head network of the target detection model to determine the target item in the target image to be detected.
Here, if the target image is a contraband image that an auditor needs to review, the identified target item may be a knife, a firearm, or other contraband.
In a specific embodiment, the target image to be detected is obtained and input into the backbone network of the target detection model, which outputs the shallow feature map and the deep feature map. The shallow feature map and the deep feature map are input into the feature pyramid network of the target detection model, weighted with different weights, and the fused feature map is output; the feature pyramid network adds a two-dimensional matrix so that the shallow feature map and the deep feature map are fused with unequal weights during feature fusion. The fused feature map is input into the head network of the target detection model, which makes a prediction on the fused feature map to determine the contraband included in the target image to be detected.
An embodiment of the present application provides a target image detection method, including: obtaining a target image to be detected; inputting the target image into the backbone network of a pre-trained target detection model, performing feature extraction, and determining the shallow feature map and the deep feature map of the target image; inputting the shallow feature map and the deep feature map into the feature pyramid network of the target detection model and weighting them with different weights to determine a fused feature map; and inputting the fused feature map into the head network of the target detection model to determine the target item in the target image. By inputting the shallow feature map and the deep feature map of the target image into the feature pyramid network and weighting them with different weights to obtain the fused feature map, the detection efficiency and accuracy of the target image are improved.
Please refer to FIG. 4 and FIG. 5. FIG. 4 is the first structural schematic diagram of a target image detection apparatus provided by an embodiment of the present application, and FIG. 5 is the second. As shown in FIG. 4, the target image detection apparatus 400 includes:
an obtaining module 410, configured to obtain a target image to be detected;
a feature extraction module 420, configured to input the target image to be detected into the backbone network of a pre-trained target detection model, perform feature extraction on the target image, and determine the shallow feature map and the deep feature map of the target image;
a feature fusion module 430, configured to input the shallow feature map and the deep feature map into the feature pyramid network of the target detection model and weight them with different weights to determine a fused feature map;
a determination module 440, configured to input the fused feature map into the head network of the target detection model and determine the target item in the target image to be detected.
Further, when inputting the shallow feature map and the deep feature map into the feature pyramid network of the target detection model and weighting them with different weights to determine the fused feature map, the feature fusion module 430 is specifically configured to:
input the shallow feature map and the deep feature map into the feature pyramid network for feature splicing to determine a spliced feature map;
input the spliced feature map into the convolution layer of the feature pyramid network and convolve it to determine a first target feature map;
input the first target feature map into the pooling layer of the feature pyramid network and apply global average pooling to determine a second target feature map;
input the second target feature map into the activation layer of the feature pyramid network and apply activation processing to determine a first target weight and a second target weight;
input the first target weight and the second target weight into the fusion layer of the feature pyramid network, weight the shallow feature map with the first target weight and the deep feature map with the second target weight, and determine the fused feature map.
Further, when inputting the spliced feature map into the convolution layer of the feature pyramid network and convolving it to determine the first target feature map, the feature fusion module 430 is specifically configured to:
obtain an initialized two-dimensional matrix, where the dimensions of the initialized two-dimensional matrix are determined by the number of convolution kernels and the number of feature vectors of the spliced feature map in different dimensions;
determine, based on the feature information corresponding to the feature vectors of the spliced feature map in different dimensions, the target values of those feature vectors;
add the target values of the feature vectors of the spliced feature map in the different dimensions to the initialized two-dimensional matrix to generate a target two-dimensional matrix;
for each convolution kernel, select, according to the target values in the target two-dimensional matrix, the corresponding feature vectors of the spliced feature map in different dimensions and convolve them to determine the first target feature map.
Further, when inputting the second target feature map into the activation layer of the feature pyramid network and activating it to determine the first target weight and the second target weight, the feature fusion module 430 is specifically configured to:
normalize the second target feature map with the sigmoid activation function to determine a weight vector;
divide the weight vector along its dimension to determine the first target weight and the second target weight.
Further, when inputting the first target weight and the second target weight into the fusion layer of the feature pyramid network, weighting the shallow feature map with the first target weight and the deep feature map with the second target weight, and determining the fused feature map, the feature fusion module 430 is specifically configured to:
weight the shallow feature map with the first target weight to determine a weighted shallow feature map;
weight the deep feature map with the second target weight to determine a weighted deep feature map;
add the weighted shallow feature map and the weighted deep feature map element-wise to determine the fused feature map.
Further, as shown in FIG. 5, the target image detection apparatus 400 further includes a model training module 450, which trains the target detection model through the following steps:
obtain multiple sample pictures and the label information corresponding to each sample picture, and divide the multiple sample pictures into a training set and a validation set;
train an initial training model multiple times using the training set and the label information corresponding to the sample pictures in the training set to determine the target detection model;
test the target detection model using the validation set to verify its detection accuracy.
An embodiment of the present application provides a target image detection apparatus, including: an obtaining module, configured to obtain a target image to be detected; a feature extraction module, configured to input the target image into the backbone network of a pre-trained target detection model, perform feature extraction, and determine the shallow feature map and the deep feature map of the target image; a feature fusion module, configured to input the shallow feature map and the deep feature map into the feature pyramid network of the target detection model and weight them with different weights to determine a fused feature map; and a determination module, configured to input the fused feature map into the head network of the target detection model and determine the target item in the target image. By inputting the shallow feature map and the deep feature map of the target image into the feature pyramid network and weighting them with different weights to obtain the fused feature map, the detection efficiency and accuracy of the target image are improved.
Please refer to FIG. 6, which is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 6, the electronic device 600 includes a processor 610, a memory 620, and a bus 630.
The memory 620 stores machine-readable instructions executable by the processor 610. When the electronic device 600 runs, the processor 610 communicates with the memory 620 through the bus 630. When the machine-readable instructions are executed by the processor 610, the steps of the target image detection method in the method embodiment shown in FIG. 1 can be performed. For specific implementations, refer to the method embodiment, which will not be repeated here.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the target image detection method in the method embodiment shown in FIG. 1 can be performed. For specific implementations, refer to the method embodiment, which will not be repeated here.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that, for the convenience and brevity of description, the specific working process of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which will not be repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other ways of division. As another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a processor-executable, non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are merely specific implementations of the present application, used to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field may, within the technical scope disclosed by the present application, still modify the technical solutions described in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of their technical features; such modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210916890.5A CN115205547A (en) | 2022-08-01 | 2022-08-01 | A target image detection method, device, electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210916890.5A CN115205547A (en) | 2022-08-01 | 2022-08-01 | A target image detection method, device, electronic device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115205547A true CN115205547A (en) | 2022-10-18 |
Family
ID=83585999
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210916890.5A Pending CN115205547A (en) | 2022-08-01 | 2022-08-01 | A target image detection method, device, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115205547A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115564320A (en) * | 2022-12-06 | 2023-01-03 | 成都智元汇信息技术股份有限公司 | Multi-intelligent-algorithm-oriented scheduling management method, device and medium |
CN116704206A (en) * | 2023-06-12 | 2023-09-05 | 中电金信软件有限公司 | Image processing method, device, computer equipment and storage medium |
CN116935179A (en) * | 2023-09-14 | 2023-10-24 | 海信集团控股股份有限公司 | Target detection method and device, electronic equipment and storage medium |
CN117934900A (en) * | 2023-12-13 | 2024-04-26 | 北京城建勘测设计研究院有限责任公司 | A product intelligent identification method and device |
CN118863747A (en) * | 2024-07-01 | 2024-10-29 | 苏州易康萌思信息技术有限公司 | Aviation logistics management methods for cross-border e-commerce |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109829506A (en) * | 2019-02-18 | 2019-05-31 | 南京旷云科技有限公司 | Image processing method, device, electronic equipment and computer storage medium |
CN110705562A (en) * | 2019-08-22 | 2020-01-17 | 清华大学 | Pyramid pooling multi-scale feature learning method adopting attention mechanism |
CN110717553A (en) * | 2019-06-20 | 2020-01-21 | 江苏德劭信息科技有限公司 | Traffic contraband identification method based on self-attenuation weight and multiple local constraints |
CN111104962A (en) * | 2019-11-05 | 2020-05-05 | 北京航空航天大学青岛研究院 | Semantic segmentation method and device for image, electronic equipment and readable storage medium |
CN111553406A (en) * | 2020-04-24 | 2020-08-18 | 上海锘科智能科技有限公司 | Target detection system, method and terminal based on improved YOLO-V3 |
CN112990166A (en) * | 2021-05-19 | 2021-06-18 | 北京远鉴信息技术有限公司 | Face authenticity identification method and device and electronic equipment |
CN113255699A (en) * | 2021-06-10 | 2021-08-13 | 浙江华睿科技有限公司 | Small target object image detection method and device, electronic equipment and storage medium |
CN113327226A (en) * | 2021-05-07 | 2021-08-31 | 北京工业大学 | Target detection method and device, electronic equipment and storage medium |
WO2021208726A1 (en) * | 2020-11-23 | 2021-10-21 | 平安科技(深圳)有限公司 | Target detection method and apparatus based on attention mechanism, and computer device |
CN114202672A (en) * | 2021-12-09 | 2022-03-18 | 南京理工大学 | A small object detection method based on attention mechanism |
CN114782882A (en) * | 2022-06-23 | 2022-07-22 | 杭州电子科技大学 | Video target behavior anomaly detection method and system based on multimodal feature fusion |
2022
- 2022-08-01 CN CN202210916890.5A patent/CN115205547A/en active Pending
Non-Patent Citations (2)
Title |
---|
MINGXING TAN et al., "EfficientDet: Scalable and Efficient Object Detection", arXiv, 27 January 2020, pages 1-10 * |
ZHANG Shihui et al., "Object detection method based on deep learning using weighted fusion of feature maps", Acta Metrologica Sinica, vol. 41, no. 11, 30 October 2020, pages 1344-1351 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115564320A (en) * | 2022-12-06 | 2023-01-03 | 成都智元汇信息技术股份有限公司 | Multi-intelligent-algorithm-oriented scheduling management method, device and medium |
CN115564320B (en) * | 2022-12-06 | 2023-04-07 | 成都智元汇信息技术股份有限公司 | Multi-intelligent-algorithm-oriented scheduling management method, device and medium |
CN116704206A (en) * | 2023-06-12 | 2023-09-05 | 中电金信软件有限公司 | Image processing method, device, computer equipment and storage medium |
CN116935179A (en) * | 2023-09-14 | 2023-10-24 | 海信集团控股股份有限公司 | Target detection method and device, electronic equipment and storage medium |
CN116935179B (en) * | 2023-09-14 | 2023-12-08 | 海信集团控股股份有限公司 | Target detection method and device, electronic equipment and storage medium |
CN117934900A (en) * | 2023-12-13 | 2024-04-26 | 北京城建勘测设计研究院有限责任公司 | A product intelligent identification method and device |
CN118863747A (en) * | 2024-07-01 | 2024-10-29 | 苏州易康萌思信息技术有限公司 | Aviation logistics management methods for cross-border e-commerce |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109978893B (en) | Training method, device, equipment and storage medium of image semantic segmentation network | |
CN115205547A (en) | A target image detection method, device, electronic device and storage medium | |
CN108764292B (en) | Deep learning image target mapping and positioning method based on weak supervision information | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN110414344B (en) | A video-based person classification method, intelligent terminal and storage medium | |
CN112529005B (en) | Target detection method based on semantic feature consistency supervision pyramid network | |
CN110059728B (en) | RGB-D image visual saliency detection method based on attention model | |
KR102140805B1 (en) | Neural network learning method and apparatus for object detection of satellite images | |
CN109118504B (en) | Image edge detection method, device and equipment based on neural network | |
KR20160083127A (en) | Method and system for face image recognition | |
CN112580458A (en) | Facial expression recognition method, device, equipment and storage medium | |
CN111461213A (en) | A training method of a target detection model and a fast target detection method | |
CN110210492B (en) | Stereo image visual saliency detection method based on deep learning | |
CN111292377B (en) | Target detection method, device, computer equipment and storage medium | |
CN112418256B (en) | Classification, model training, information search method, system and equipment | |
CN112149662A (en) | A Multimodal Fusion Saliency Detection Method Based on Dilated Convolution Blocks | |
CN108665509A (en) | A kind of ultra-resolution ratio reconstructing method, device, equipment and readable storage medium storing program for executing | |
CN117456389B (en) | YOLOv5 s-based improved unmanned aerial vehicle aerial image dense and small target identification method, system, equipment and medium | |
CN111507288A (en) | Image detection method, image detection device, computer equipment and storage medium | |
CN116912574A (en) | Multi-scale target perception classification method and system based on twin network | |
CN116245861A (en) | Cross multi-scale-based non-reference image quality evaluation method | |
CN114724046B (en) | Optical remote sensing image detection method, device and storage medium | |
CN111179212A (en) | Method for realizing micro target detection chip integrating distillation strategy and deconvolution | |
CN116975487A (en) | Abnormal webpage detection method, device, computer equipment and storage medium | |
CN116071625A (en) | Training method of deep learning model, target detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||