CN115100417A - Image processing method, storage medium, and electronic device


Info

Publication number
CN115100417A
CN115100417A (application CN202210662381.4A)
Authority
CN
China
Prior art keywords
target
feature
image
sample
detection
Prior art date
Legal status (assumed; not a legal conclusion): Pending
Application number
CN202210662381.4A
Other languages
Chinese (zh)
Inventor
孙修宇
姜奕祺
许贤哲
Current Assignee (the listed assignee may be inaccurate)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202210662381.4A
Publication of CN115100417A
Legal status: Pending


Classifications

    • G — PHYSICS › G06 — COMPUTING; CALCULATING OR COUNTING › G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/40 — Extraction of image or video features
    • G06V 10/16 — Image acquisition using multiple overlapping images; image stitching
    • G06V 10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V 10/806 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
    • G06V 10/82 — Recognition using pattern recognition or machine learning with neural networks
    • G06V 20/176 — Terrestrial scenes: urban or other man-made structures
    • G06V 20/188 — Terrestrial scenes: vegetation
    • G06V 2201/07 — Indexing scheme: target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image processing method, a storage medium, and an electronic device. The method comprises the following steps: acquiring a target image, wherein the target image contains a target object; performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object. The invention solves the technical problem of low image-detection accuracy in the related art.

Description

Image processing method, storage medium, and electronic device
Technical Field
The present invention relates to the field of image processing, and in particular, to an image processing method, a storage medium, and an electronic device.
Background
At present, computer vision tasks generally adopt deep learning to detect objects in images. The high complexity of deep learning models leads to low detection efficiency, while low-complexity detection approaches suffer from poor detection accuracy.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the application provides an image processing method, a storage medium, and an electronic device, which are used for at least solving the technical problem of low accuracy of image detection in the related art.
According to an aspect of an embodiment of the present application, there is provided an image processing method including: acquiring a target image, wherein the target image contains a target object; performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object.
According to an aspect of an embodiment of the present application, there is provided an image processing method including: acquiring a target remote sensing image, wherein the target remote sensing image contains a target object; performing multi-scale feature extraction on the target remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object.
According to an aspect of an embodiment of the present application, there is provided an image processing method including: acquiring an agricultural remote sensing image, wherein the agricultural remote sensing image contains crops; performing multi-scale feature extraction on the agricultural remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the crops.
According to an aspect of an embodiment of the present application, there is provided an image processing method including: acquiring a building remote sensing image, wherein the building remote sensing image contains a target building; performing multi-scale feature extraction on the building remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target building.
According to an aspect of an embodiment of the present application, there is provided an image processing method including: a cloud server acquires a target image, wherein the target image contains a target object; the cloud server performs multi-scale feature extraction on the target image to obtain a plurality of first feature maps; the cloud server performs feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and the cloud server detects the at least one second feature map to obtain a detection result of the target object.
In the embodiment of the application, a target image containing a target object is first acquired; multi-scale feature extraction is performed on the target image to obtain a plurality of first feature maps; feature fusion is performed on the plurality of first feature maps using a multi-branch network structure to obtain at least one second feature map, where the multi-branch network structure fuses the first feature maps through a plurality of branches; and the at least one second feature map is detected to obtain a detection result of the target object, thereby improving the accuracy of the detection result. It is easy to note that performing feature fusion on the first feature maps through a plurality of branches improves the accuracy of the resulting second feature maps, and the plurality of branches also reduces the number of parameters in the fusion process, improving fusion efficiency and thereby solving the technical problem of low image-detection accuracy in the related art.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention to a proper form. In the drawings:
fig. 1 is a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing an image processing method according to an embodiment of the present application;
fig. 2 is a flowchart of an image processing method according to embodiment 1 of the present application;
FIG. 3 is a schematic diagram of a Giraffe target detector according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a detector according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a multi-branch network structure according to an embodiment of the present application;
FIG. 6 is a schematic diagram of detection by a target detection model according to an embodiment of the present application;
FIG. 7 is a schematic diagram of object detection model training according to an embodiment of the present application;
fig. 8 is a flowchart of an image processing method according to embodiment 2 of the present application;
fig. 9 is a flowchart of an image processing method according to embodiment 3 of the present application;
fig. 10 is a flowchart of an image processing method according to embodiment 4 of the present application;
FIG. 11 is a flowchart of an image processing method according to embodiment 5 of the present application;
fig. 12 is a schematic diagram of an image processing apparatus according to embodiment 6 of the present application;
fig. 13 is a schematic diagram of an image processing apparatus according to embodiment 7 of the present application;
fig. 14 is a schematic diagram of an image processing apparatus according to embodiment 8 of the present application;
fig. 15 is a schematic diagram of an image processing apparatus according to embodiment 9 of the present application;
fig. 16 is a block diagram of a computer terminal according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present invention better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some terms or terms appearing in the description of the embodiments of the present application are applicable to the following explanations:
GFPN: Generalized Feature Pyramid Network, a network structure used in the feature-fusion part of a detector to perform feature fusion across different scales.
GFocalV2: Generalized Focal Loss V2, a network structure used in target detection for matching targets with predicted values and for feature detection.
YOLOv1/YOLOv2/YOLOv3/YOLOv4/YOLOv5/YOLOX: a series of image-based object detection methods.
At present, image-based object detection is a basic technology in machine vision tasks and is widely applied in industries such as remote sensing, security, land resources, water conservancy, and retail. Deep-learning-based object detection is currently the mainstream approach, but its computational complexity is often very high, making it difficult to meet practical requirements.
The application provides an image processing method which can improve the accuracy of a detection result while improving the image detection efficiency.
Example 1
There is also provided, in accordance with an embodiment of the present application, an image processing method embodiment, it should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than here.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Fig. 1 is a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing an image processing method according to an embodiment of the present application. As shown in fig. 1, the computer terminal 10 (or mobile device 10) may include one or more processors (shown as 102a, 102b, …, 102n in the figure), which may include but are not limited to a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. In addition, the computer terminal may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the bus), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
It should be noted that the one or more processors and/or other data processing circuitry described above may be generally referred to herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the image processing method in the embodiment of the present application, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory 104, so as to implement the image processing method described above. The memory 104 may include high speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 10 (or mobile device).
It should be noted here that in some alternative embodiments, the computer device (or mobile device) shown in fig. 1 described above may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be noted that fig. 1 is only one example of a specific example and is intended to illustrate the types of components that may be present in the computer device (or mobile device) described above.
Under the above operating environment, the present application provides an image processing method as shown in fig. 2. Fig. 2 is a flowchart of an image processing method according to embodiment 1 of the present application. As shown in fig. 2, the method includes:
Step S202, a target image is acquired.
Wherein the target image contains the target object.
The target image may be an image including a target object to be detected, wherein the target image may be a remote sensing image acquired by an unmanned aerial vehicle and/or a satellite, and the target image may also be an image obtained by shooting by a shooting device.
The target object may be a specific object to be detected in the target image.
In an agricultural scene, the target image can be an agricultural remote sensing image, and the target object can be a crop to be detected.
In a building scene, the target image can be a building remote sensing image, and the target object can be a building to be detected.
In an alternative embodiment, in order to better process the target image, the acquired target image may be transmitted to a corresponding processing device for processing, for example, directly transmitted to a computer terminal (e.g., a laptop, a personal computer, etc.) of the user for processing, or transmitted to a cloud server through the computer terminal of the user for processing. It should be noted that, since a large amount of computing resources are required for processing the target image, in the embodiment of the present application, the processing device is taken as a cloud server as an example for description.
For example, in order to facilitate the user to upload the target image, an interactive interface may be provided for the user, where the interactive interface includes controls such as "select image", "upload", "image display", and the like, and the user may click the "select image" button to determine the target image that needs to be uploaded, and upload the target image to the cloud server for processing by clicking the "upload" button. In addition, in order to facilitate the user to confirm whether the selected target image is a target image to be processed, the selected target image may be displayed in the "image display" area, and after the user confirms that there is no error, data may be uploaded by clicking the "upload" button.
It should be noted that data interaction can be performed between the client and the cloud server through a specific interface, and the client can transmit the description page of the target object selected by the user into the interface function and use the description page as a parameter of the interface, so as to achieve the purpose of uploading the description page of the target object to the cloud server.
Step S204, performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps.
In an optional embodiment, multi-scale feature extraction may be performed on the target image so as to obtain a plurality of first feature maps of different scales, and the plurality of first feature maps may be fused through a preset feature fusion policy so as to improve detection accuracy of the feature maps.
In another alternative embodiment, a feature extraction layer in a GiraffeDet target detector can be used for performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps, where the scales of the plurality of first feature maps are different. Optionally, the feature extraction layer includes a plurality of scaling layers (scales) of different sizes, and multi-scale feature extraction may be performed on the target image according to the scaling layers to obtain the plurality of first feature maps.
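As a rough illustration of this step (a minimal PyTorch-style sketch under assumed layer sizes, not GiraffeDet's actual feature extraction layer), multi-scale extraction can be realized as a stack of strided stages that each emit one feature map:

```python
import torch
import torch.nn as nn

class MultiScaleExtractor(nn.Module):
    """Stack of strided stages; each stage halves the spatial size and
    emits one feature map, giving first feature maps at several scales."""
    def __init__(self, channels=(3, 32, 64, 128, 256)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            )
            for c_in, c_out in zip(channels[:-1], channels[1:])
        )

    def forward(self, x):
        maps = []
        for stage in self.stages:
            x = stage(x)
            maps.append(x)  # one first feature map per scale
        return maps

extractor = MultiScaleExtractor()
first_feature_maps = extractor(torch.randn(1, 3, 640, 640))
print([f.shape[-1] for f in first_feature_maps])  # [320, 160, 80, 40]
```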
Step S206, the multi-branch network structure is used for carrying out feature fusion on the plurality of first feature maps to obtain at least one second feature map.
The multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches.
The multi-branch network structure may include two or more branches, and the number of the branches may be set by a user, or may be flexibly adjusted according to a requirement.
In an optional embodiment, the multiple first feature maps may be repeatedly feature-fused by using a multi-branch network according to a preset feature-fusion policy to obtain multiple second feature maps, where the feature-fusion policy may be a feature-fusion policy used by GiraffeDet.
FIG. 3 is a schematic diagram of a Giraffe target detector according to an embodiment of the application. As shown in FIG. 3, S1-S5 are a plurality of first feature maps of different scales extracted by the feature extraction layer. The plurality of first feature maps can be fused repeatedly, and the feature maps obtained by fusion are connected in a log2(N) pattern to obtain the feature maps that finally need to be detected. In the fusion part, feature maps of different sizes may be up-sampled or down-sampled in the manner shown in FIG. 3, where a downward arrow indicates down-sampling and an upward arrow indicates up-sampling. The first feature maps of different sizes may then be spliced, and the spliced feature maps input into the corresponding multi-branch network structures; the multi-branch network structures fuse the first feature maps of different sizes to obtain fused features, such as S5_0. The first feature maps of different sizes and the fused features are fused repeatedly, and the fused features are stacked to obtain the final features, such as S5_N, where N represents the number of stacking steps. In FIG. 3, S5_0 is obtained by fusing S4 and S5, S4_0 is obtained by fusing S3, S4 and S5, and S5_1 is obtained by fusing S5, S5_0 and S4_0, where a skip connection exists between S5 and S5_1; skip connections are made according to the log2(N) pattern.
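For illustration, the log2(N) connection pattern described above can be sketched as follows (a hypothetical helper, assuming each layer links back to layers at distances of 1, 2, 4, …; the exact rule used by the patent may differ):

```python
def log2n_link_sources(k):
    """Indices of earlier layers feeding layer k under a log2(N)-style
    link pattern: k-1, k-2, k-4, ... while the index stays non-negative."""
    sources, step = [], 1
    while k - step >= 0:
        sources.append(k - step)
        step *= 2
    return sources

# Example: layer 7 receives skip connections from layers 6, 5, and 3.
print(log2n_link_sources(7))  # [6, 5, 3]
```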
In an alternative embodiment, the framework for detecting the target object in the target image may be backbone (feature extraction network) - neck (fusion of shallow high-resolution detail feature maps with low-resolution semantic feature maps) - head (detector). At present, the computation ratio of backbone : neck : head is generally 2-4 : 1 : 2, in which the share of the fusion part is small, so subsequent detection precision is low. To solve this problem, the ratio is adjusted to 1 : 5 : 1, increasing the computation share of the fusion part, which can further improve detection precision.
The above ratio is a distribution of computation. Assuming a total budget of 100 GFLOPs under the conventional 2 : 1 : 2 split, the backbone part accounts for about 2/5 × 100 = 40 GFLOPs; the specific computation is the standard convolution operation in the neural network.
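As a worked example of this computation split (the helper below is illustrative only, not part of the patent):

```python
def split_budget(total_gflops, ratio):
    """Split a total compute budget across backbone, neck and head
    according to the given ratio."""
    s = float(sum(ratio))
    return tuple(total_gflops * r / s for r in ratio)

# Conventional split 2:1:2 -> backbone gets 2/5 * 100 = 40 GFLOPs.
print(split_budget(100, (2, 1, 2)))  # (40.0, 20.0, 40.0)
# Split used here, 1:5:1 -> the fusion (neck) share rises to ~71 GFLOPs.
print(split_budget(100, (1, 5, 1)))  # (~14.3, ~71.4, ~14.3)
```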
Step S208, detecting the at least one second feature map to obtain a detection result of the target object.
The detection result may be a target detection frame, where the target detection frame is used to label a target object in a target image; the detection result may further include a category of the target object, wherein the category of the target object may be marked beside the target detection box or elsewhere.
In an optional embodiment, at least one second feature map may be obtained by fusing the plurality of first feature maps. Second feature maps of different scales may be detected with different detectors that share the same structure — that is, the length, width, and height corresponding to the different detectors are the same — but whose parameters differ. Because second feature maps of different scales have different precision, detecting each scale with a detector of matching precision further improves the detection effect at that scale, thereby improving its detection accuracy: the parameters of a detector can be adjusted to give it higher precision for detecting a higher-precision second feature map, or lower precision for detecting a lower-precision second feature map. With this arrangement, higher detection precision can be obtained under the same amount of computation.
In an alternative embodiment, GFocalV2 Head (detection algorithm) is used as the base detector structure.
FIG. 4 is a schematic diagram of a detector according to an embodiment of the present application. The left side shows the conventional way of using detectors, in which the same detector is used for feature maps of different scales; the right side shows the way detectors are used in the present application, in which detectors with the same structure but different parameters are used for feature maps of different scales, improving detection accuracy.
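A minimal PyTorch-style sketch of this per-scale design (an illustration only; the head architecture, channel counts, and class count are assumptions, not the patent's GFocalV2 head): one head instance is created per scale, so the structure is shared while the parameters are independent.

```python
import torch
import torch.nn as nn

class ScaleHead(nn.Module):
    """One detection head; the same architecture is reused at every scale."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.cls = nn.Conv2d(channels, num_classes, 1)  # class logits per location
        self.reg = nn.Conv2d(channels, 4, 1)            # box offsets per location

    def forward(self, x):
        return self.cls(x), self.reg(x)

# Same structure, independent parameters: one head instance per pyramid level.
heads = nn.ModuleList(ScaleHead(256, 80) for _ in range(3))
features = [torch.randn(1, 256, s, s) for s in (80, 40, 20)]
outputs = [head(f) for head, f in zip(heads, features)]
```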
In another alternative embodiment, a target image may be acquired, wherein the target image contains a target object; multi-scale feature extraction can be carried out on a target image by utilizing a feature extraction layer in a target detection model to obtain a plurality of first feature maps; the method comprises the steps that a multi-branch network structure in a target detection model is used for carrying out feature fusion on a plurality of first feature graphs to obtain at least one second feature graph, wherein the multi-branch network structure is used for carrying out feature fusion on the plurality of first feature graphs through a plurality of branches; the at least one second feature map may be detected by using a detector in the target detection model, so as to obtain a detection result of the target object.
Through the above steps, a target image containing a target object is first acquired; multi-scale feature extraction is performed on the target image to obtain a plurality of first feature maps; feature fusion is performed on the plurality of first feature maps using a multi-branch network structure to obtain at least one second feature map, where the multi-branch network structure fuses the first feature maps through a plurality of branches; and the at least one second feature map is detected to obtain a detection result of the target object, thereby improving the accuracy of the detection result. It is easy to note that performing feature fusion on the first feature maps through multiple branches improves the accuracy of the resulting second feature maps, and the multiple branches also reduce the number of parameters in the fusion process, improving fusion efficiency and thereby solving the technical problem of low image-detection accuracy in the related art.
In the above embodiments of the present application, the multi-branch network structure includes: the device comprises a first branch and a second branch, wherein the output of the first branch is connected with the output of the second branch.
The first branch may include N convolution blocks composed of 1 × 1 and 3 × 3 convolutional layers, where the N convolution blocks may be connected in sequence; the second branch may include a 1 × 1 convolutional layer. It should be noted that the 3 × 3 convolution in the first branch may be replaced with another convolutional layer, for example 5 × 5, but the present invention is not limited thereto.
In an alternative embodiment, the multi-branch network structure may include: a first convolutional layer, the first branch, and the second branch, where the output of the first convolutional layer is connected with the input of the first branch and the input of the second branch, and the output of the first branch is connected with the output of the second branch.
The first convolutional layer may be 1 × 1.
Fig. 5 is a schematic diagram of a multi-branch network structure according to an embodiment of the present application. As shown in fig. 5, the plurality of first feature maps to be fused may be processed and then merged to obtain a merged feature map; the merged feature map may be input into the first branch and the second branch respectively for processing, obtaining two output feature maps; and the two output feature maps may be spliced to obtain a second feature map.
In another optional embodiment, the multi-branch network structure may further include a plurality of branches, which are not limited herein, wherein the first branch may include a plurality of sub-branches, and the second branch may also include a plurality of sub-branches.
In the above embodiment of the present application, performing feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain at least one second feature map includes: performing channel merging on the plurality of first feature maps to obtain a merged feature map; performing convolution processing on the merged feature map by using the first branch to obtain a first output feature; performing convolution processing on the merged feature map by using the second branch to obtain a second output feature; and performing channel merging on the first output feature and the second output feature to obtain the at least one second feature map.
The first branch may include convolution layers with different convolution kernels; processing the merged feature map with convolution layers of different kernel sizes can reduce the computation of the convolution operation and improve fusion efficiency.
In an alternative embodiment, the first convolution layer may be used to directly perform channel merging on the plurality of first feature maps to obtain the merged feature map. Specifically, a convolution operation may be performed on the plurality of first feature maps by using the first convolution layer to obtain a plurality of third feature maps, so that the sizes of the plurality of first feature maps are unified, facilitating the subsequent merging process. The channels of the plurality of third feature maps may then be merged, achieving the effect of splicing the third feature maps and obtaining the merged feature map. Convolution processing may be performed on the merged feature map by using the first branch to obtain the first output feature, and by using the second branch to obtain the second output feature; the channels of the first output feature and the second output feature may then be merged to obtain the second feature map.
Furthermore, the second feature map may be used as a first feature map, and the multi-branch network may continue to process the plurality of first feature maps to obtain a new second feature map; this step may be repeated multiple times to obtain a plurality of second feature maps.
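Putting the pieces above together, a minimal PyTorch sketch of such a multi-branch fusion block might look as follows (channel counts, block count, and the absence of normalization/activation layers are assumptions; the patent does not specify them):

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """1x1 convolution followed by a 3x3 convolution (the sub-convolution
    layers with different kernels described above)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 1)
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return self.conv3(self.conv1(x))

class MultiBranchFusion(nn.Module):
    """Channel-merge the inputs, process them through two branches, and
    concatenate the branch outputs to form a second feature map."""
    def __init__(self, in_channels, mid_channels, num_blocks):
        super().__init__()
        # First convolutional layer (1x1): reduces the merged channels.
        self.reduce = nn.Conv2d(in_channels, mid_channels, 1)
        # First branch: N sequential convolution blocks.
        self.branch1 = nn.Sequential(*[ConvBlock(mid_channels) for _ in range(num_blocks)])
        # Second branch: a single 1x1 convolution (the "third convolution layer").
        self.branch2 = nn.Conv2d(mid_channels, mid_channels, 1)

    def forward(self, feature_maps):
        # Inputs are assumed already resampled to a common spatial size.
        merged = self.reduce(torch.cat(feature_maps, dim=1))  # channel merging
        out1 = self.branch1(merged)   # first output feature
        out2 = self.branch2(merged)   # second output feature
        return torch.cat([out1, out2], dim=1)  # second feature map

fusion = MultiBranchFusion(in_channels=512, mid_channels=128, num_blocks=2)
x = [torch.randn(1, 256, 40, 40), torch.randn(1, 256, 40, 40)]
second_feature_map = fusion(x)  # shape: (1, 256, 40, 40)
```

Feeding the resulting second feature map back in as a first feature map and rerunning the block corresponds to the repeated fusion described above.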
In the above embodiments of the present application, the first branch includes at least one convolution block, wherein each convolution block includes a plurality of sub-convolution layers, and the convolution kernels of the plurality of sub-convolution layers are different.
In an alternative embodiment, the number of convolution blocks may be determined according to the number of input first feature maps: if the number of input first feature maps is N, the number of convolution blocks is N.
The convolution kernels of the sub-convolution layers have different sizes; within a convolution block, the kernel of the earlier sub-convolution layer may be smaller than the kernel of the later sub-convolution layer.
The plurality of sub-convolution layers may be 1 × 1 and 3 × 3, respectively.
The sub-convolution layers included in the different convolution blocks may be the same, or may be different.
In the above embodiment of the present application, performing convolution processing on the merged feature map by using the first branch to obtain the first output feature includes: performing convolution processing on the merged feature map by using at least one convolution block to obtain the first output feature.
The convolution block includes a first sub-convolution layer and a second sub-convolution layer, wherein the first sub-convolution layer may be a convolution layer with a smaller convolution kernel, and the second sub-convolution layer may be a convolution layer with a larger convolution kernel. The first sub-convolution layer may be 1 × 1 and the second sub-convolution layer may be 3 × 3.
In an alternative embodiment, in the case of including one convolution block, the first sub-convolution layer in the convolution block is used to perform a convolution operation on the merged feature map to obtain a processing result, and the processing result may be input into the second sub-convolution layer and subjected to a convolution operation by the second sub-convolution layer to obtain the first output feature.
In the above embodiment of the present application, performing convolution processing on the merged feature map by using the second branch to obtain the second output feature includes: performing convolution processing on the merged feature map by using the third convolution layer to obtain the second output feature.
The third convolution layer may be 1 × 1.
In the above embodiment of the present application, the method further includes: and detecting the target image by using a target detection model to obtain a detection result of the target object, wherein the target detection model is obtained by training based on the target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through a sample detection frame.
In an alternative embodiment, the target image may be input into the target detection model, and the target detection model is used to detect the target image to obtain the detection result of the target object. The main framework of the target detection model can be backbone-neck-head: the backbone performs multi-scale feature extraction on the target image to obtain a plurality of first feature maps; the neck performs feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain at least one second feature map, where the multi-branch network structure fuses the first feature maps through a plurality of branches; and finally the head detects the at least one second feature map to obtain the detection result of the target object.
In an alternative embodiment, the target detection model may be a common detection model, but the sample images used for its training differ from conventionally used sample images: data enhancement is mainly performed on a plurality of sample images through sample detection frames to obtain the target sample images.
In another optional embodiment, the target detection model may include a feature extraction layer, a multi-branch network structure, and a detection layer, where the feature extraction layer is configured to perform multi-scale feature extraction on a target image to obtain a plurality of first feature maps, the multi-branch network structure is configured to fuse the plurality of first feature maps to obtain at least one second feature map, and the detection layer is configured to detect the at least one second feature map to obtain a detection result of the target object. The multi-branch network structure may comprise a first branch and a second branch, wherein an output of the first branch is connected to an output of the second branch.
Fig. 6 is a schematic diagram of detection performed by a target detection model according to an embodiment of the present application, where a target image to be detected may be input into the target detection model, and the target detection model may output a detection result of a target object in the target image, where the detection result is obtained by labeling the target object with a target detection frame.
In the above embodiment of the present application, the method further includes: obtaining a plurality of sample images and sample detection frames corresponding to the plurality of sample images, wherein the sample detection frames are used for marking target objects in the sample images; determining a preset number of sample detection frames in a plurality of sample images as target detection frames; performing data enhancement on a target object corresponding to the target detection frame to obtain a target sample image; and training the initial detection model by using the target sample image to obtain a target detection model.
In an alternative embodiment, the plurality of sample images may be mixed to obtain a mixed image, and then a preset number of sample detection frames in the plurality of sample detection frames included in the mixed image may be determined as the target detection frame.
In another optional embodiment, a plurality of sample images and the sample detection frames corresponding to them may be randomly selected from a training data set, and data enhancement may be performed on the sample images at the box level. Global data enhancement may first be performed once on the plurality of sample images to obtain a plurality of enhanced sample images; a preset number of sample detection frames are then randomly selected from the enhanced sample images as target detection frames for local data enhancement, and data enhancement is performed on the target objects corresponding to the target detection frames to obtain target sample images. Finally, the initial detection model is trained with the target sample images to obtain the target detection model.
In another optional embodiment, a preset number of sample detection frames may first be randomly selected from the plurality of sample images as target detection frames for local data enhancement, and data enhancement may be performed on the target objects corresponding to the target detection frames to obtain a plurality of enhanced sample images; global data enhancement is then performed once on the plurality of enhanced sample images to obtain the target sample images; finally, the initial detection model is trained with the target sample images to obtain the target detection model.
The global data enhancement may be color changing, rotating, contrast enhancing, random erasing, scaling, cropping, etc. for a plurality of sample images. The local data enhancement may be color changing, rotating, contrast enhancing, random erasing, scaling, etc. of the target object in the target detection frame.
In the above embodiments of the present application, determining a preset number of sample detection frames in a plurality of sample images as target detection frames includes: splicing the plurality of sample images to obtain an initial sample image; mixing the initial sample image and a preset sample image to obtain a mixed image; and determining a preset number of sample detection frames in the mixed image as target detection frames.
In an optional embodiment, a plurality of sample images may be sequentially stitched to obtain an initial sample image; optionally, a plurality of sample images may be spliced to a white background image, the white background image is filled to obtain an initial sample image, and if the white background image is not filled by the plurality of sample images, a part of the sample images may be continuously extracted from the training data to fill the white background image until the white background image is filled. Optionally, a plurality of sample images may be randomly stitched. The plurality of sample images may be stitched in other manners, which is not limited herein.
The preset number may be a fixed value set in advance, or may be determined according to the number of sample detection frames; for example, the preset number may be 30% of the sample detection frames.
In another alternative embodiment, a mosaic module (mosaic) may be used to sequentially stitch a plurality of sample images to obtain an initial sample image, where mosaic is used to enrich the detection background and the detection object in the target image so as to enrich the data set. After the initial sample image is obtained, the initial sample image may be scaled and cropped to obtain a processed initial sample image, and the processed initial sample image and the preset sample image may be mixed by using a mixing module (mixup) to obtain a mixed image, so as to further enhance data, thereby enriching features in the sample image. A preset number of sample detection frames among a plurality of sample detection frames included in the mixed image may be determined as the target detection frame.
Fig. 7 is a schematic diagram of target detection model training according to an embodiment of the present application, where first, a plurality of sample images including sample detection frames are input into a mosaic module, the mosaic module is used to splice the plurality of sample images to obtain an initial sample image, then a mixing module is used to mix the initial sample image with a preset sample image to obtain a mixed image, and finally, a regional level module is used to determine a preset number of sample detection frames in the mixed image as target detection frames, and data enhancement is performed on target objects corresponding to the target detection frames to obtain target sample images; and training the initial detection model by using the target sample image to obtain a target detection model.
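A compact sketch of this training-data pipeline (illustrative only: the 2 × 2 mosaic layout, the mixup blend, the 30% box fraction, and the flip used for local enhancement are assumptions, not the patent's exact implementation):

```python
import random
import numpy as np

def mosaic_2x2(images):
    """Stitch four equally sized sample images into one initial sample image."""
    (a, b), (c, d) = images[:2], images[2:4]
    return np.vstack([np.hstack([a, b]), np.hstack([c, d])])

def mixup(img_a, img_b, alpha=0.5):
    """Blend the initial sample image with a preset sample image."""
    return (alpha * img_a + (1 - alpha) * img_b).astype(img_a.dtype)

def enhance_boxes(image, boxes, fraction=0.3):
    """Box-level enhancement: pick a preset fraction of detection frames
    and locally augment the object inside each (here: horizontal flip)."""
    chosen = random.sample(boxes, max(1, int(len(boxes) * fraction)))
    for (x1, y1, x2, y2) in chosen:
        image[y1:y2, x1:x2] = image[y1:y2, x1:x2][:, ::-1].copy()
    return image

# Build one target sample image from four sample images plus a preset image.
samples = [np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8) for _ in range(4)]
preset = np.random.randint(0, 255, (128, 128, 3), dtype=np.uint8)
mixed = mixup(mosaic_2x2(samples).astype(np.float32), preset.astype(np.float32))
target_sample = enhance_boxes(mixed.astype(np.uint8), [(10, 10, 40, 40), (60, 60, 100, 100)])
```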
In the above embodiment of the present application, the method further includes: outputting the detection result; receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; and updating the merged feature map based on the first feedback result.
In an optional embodiment, the detection result may be output and displayed on the user's client. The user may modify the channels in the merged feature map according to the detection result to obtain a first feedback result, so that the merged feature map is updated according to the first feedback result; detection may then be performed on the updated merged feature map to obtain a detection result with higher accuracy.
The neural network structure proposed in the present application can use a lightweight network structure (CSP-Darknet) as the backbone, a CSP-GFPN with a high computation share as the neck, and a scale-wise GFocalV2 as the head. Combined with a learning-based data-enhancement training method, the target object in the target image is detected quickly and accurately, greatly reducing resource usage when the neural network structure is deployed.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that acts and modules are not required to practice the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the image processing method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
There is also provided, in accordance with an embodiment of the present application, an image processing method embodiment, it should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than here.
Fig. 8 is a flowchart of an image processing method according to embodiment 2 of the present application, and as shown in fig. 8, the method may include the following steps:
Step S802, acquiring a target remote sensing image.
The target remote sensing image contains a target object.
Step S804, performing multi-scale feature extraction on the target remote sensing image to obtain a plurality of first feature maps.
Step S806, performing feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain at least one second feature map.
The multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches.
Step S808, detecting the at least one second feature map to obtain a detection result of the target object.
In the above embodiments of the present application, the multi-branch network structure includes: the device comprises a first branch and a second branch, wherein the output of the first branch is connected with the output of the second branch.
In the above embodiment of the present application, performing feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain at least one second feature map includes: performing channel merging on the plurality of first feature maps to obtain a merged feature map; performing convolution processing on the merged feature map by using the first branch to obtain a first output feature; performing convolution processing on the merged feature map by using the second branch to obtain a second output feature; and performing channel merging on the first output feature and the second output feature to obtain the at least one second feature map.
In the above embodiments of the present application, the first branch includes at least one convolution block, wherein each convolution block includes a plurality of sub-convolution layers, and the convolution kernels of the plurality of sub-convolution layers are different.
In the above embodiment of the present application, the method further includes: and detecting the target image by using a target detection model to obtain a detection result of the target object, wherein the target detection model is obtained by training based on the target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through a sample detection frame.
In the above embodiment of the present application, the method further includes: obtaining a plurality of sample images and sample detection frames corresponding to the plurality of sample images, wherein the sample detection frames are used for marking target objects in the sample images; determining a preset number of sample detection frames in a plurality of sample images as target detection frames; performing data enhancement on a target object corresponding to the target detection frame to obtain a target sample image; and training the initial detection model by using the target sample image to obtain a target detection model.
In the above embodiment of the present application, determining a preset number of sample detection frames in a plurality of sample images as target detection frames includes: splicing the plurality of sample images to obtain an initial sample image; mixing the initial sample image and a preset sample image to obtain a mixed image; and determining a preset number of sample detection frames in the mixed image as target detection frames.
In the above embodiment of the present application, the method further includes: outputting the detection result; receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; and updating the merged feature map based on the first feedback result.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 3
There is also provided, in accordance with an embodiment of the present application, an image processing method embodiment, it should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than here.
Fig. 9 is a flowchart of an image processing method according to embodiment 3 of the present application, and as shown in fig. 9, the method may include the steps of:
and step S902, acquiring an agricultural remote sensing image.
Wherein the agricultural remote sensing image comprises crops.
And step S904, carrying out multi-scale feature extraction on the agricultural remote sensing image to obtain a plurality of first feature maps.
Step S906, performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map.
The multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches.
Step S908, detecting the at least one second feature map to obtain a detection result of the crops.
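Taken together, steps S902 to S908 compose a simple pipeline. The sketch below strings them together, where backbone, fusion, and head are placeholders for the multi-scale extractor, the multi-branch structure, and the detection head; none of these components are concretely specified by this flow, so the wiring is an assumption.

```python
import torch

def detect_crops(remote_sensing_image, backbone, fusion, head):
    # remote_sensing_image: a CHW tensor of one agricultural remote sensing image.
    x = remote_sensing_image.unsqueeze(0)     # add a batch dimension
    with torch.no_grad():
        first_maps = backbone(x)              # S904: multi-scale feature extraction
        second_map = fusion(first_maps)       # S906: multi-branch feature fusion
        return head(second_map)               # S908: detection result of the crops
```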
In the above embodiments of the present application, the multi-branch network structure includes a first branch and a second branch, wherein the output of the first branch is connected with the output of the second branch.
In the above embodiments of the present application, performing feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain the at least one second feature map includes: performing channel merging on the plurality of first feature maps to obtain a merged feature map; performing convolution processing on the merged feature map by using the first branch to obtain a first output feature; performing convolution processing on the merged feature map by using the second branch to obtain a second output feature; and performing channel merging on the first output feature and the second output feature to obtain the at least one second feature map.
In the above embodiments of the present application, the first branch includes at least one convolution block, wherein each convolution block includes a plurality of sub-convolution layers, and the sub-convolution layers have different convolution kernels.
In the above embodiments of the present application, the method further includes: detecting the agricultural remote sensing image by using a target detection model to obtain the detection result of the crops, wherein the target detection model is trained based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through sample detection frames.
In the above embodiment of the present application, the method further includes: obtaining a plurality of sample images and sample detection frames corresponding to the plurality of sample images, wherein the sample detection frames are used for marking crops in the sample images; determining a preset number of sample detection frames in the plurality of sample images as target detection frames; performing data enhancement on crops corresponding to the target detection frame to obtain a target sample image; and training the initial detection model by using the target sample image to obtain a target detection model.
In the above embodiments of the present application, determining a preset number of sample detection frames in a plurality of sample images as target detection frames includes: splicing the plurality of sample images to obtain an initial sample image; mixing the initial sample image and a preset sample image to obtain a mixed image; and determining a preset number of sample detection frames in the mixed image as target detection frames.
In the above embodiments of the present application, the method further includes: outputting the detection result; receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; and updating the merged feature map based on the first feedback result.
It should be noted that the preferred embodiments described in the foregoing examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 4
There is also provided, in accordance with an embodiment of the present application, an image processing method embodiment. It should be noted that the steps illustrated in the flowchart of the accompanying drawings may be executed in a computer system, such as one executing a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from the order here.
Fig. 10 is a flowchart of an image processing method according to embodiment 4 of the present application, and as shown in fig. 10, the method may include the following steps:
Step S1002, acquiring a building remote sensing image.
Wherein the building remote sensing image comprises a target building.
Step S1004, performing multi-scale feature extraction on the building remote sensing image to obtain a plurality of first feature maps.
Step S1006, performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map.
The multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches.
Step S1008, detecting the at least one second feature map to obtain a detection result of the target building.
In the above embodiments of the present application, the multi-branch network structure includes a first branch and a second branch, wherein the output of the first branch is connected with the output of the second branch.
In the above embodiments of the present application, performing feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain the at least one second feature map includes: performing channel merging on the plurality of first feature maps to obtain a merged feature map; performing convolution processing on the merged feature map by using the first branch to obtain a first output feature; performing convolution processing on the merged feature map by using the second branch to obtain a second output feature; and performing channel merging on the first output feature and the second output feature to obtain the at least one second feature map.
In the above embodiments of the present application, the first branch includes at least one convolution block, wherein each convolution block includes a plurality of sub-convolution layers, and the sub-convolution layers have different convolution kernels.
In the above embodiments of the present application, the method further includes: detecting the building remote sensing image by using a target detection model to obtain the detection result of the target building, wherein the target detection model is trained based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through sample detection frames.
In the above embodiment of the present application, the method further includes: obtaining a plurality of sample images and sample detection frames corresponding to the plurality of sample images, wherein the sample detection frames are used for marking a target building in the sample images; determining a preset number of sample detection frames in the plurality of sample images as target detection frames; performing data enhancement on a target building corresponding to the target detection frame to obtain a target sample image; and training the initial detection model by using the target sample image to obtain a target detection model.
In the above embodiments of the present application, determining a preset number of sample detection frames in a plurality of sample images as target detection frames includes: splicing the plurality of sample images to obtain an initial sample image; mixing the initial sample image and a preset sample image to obtain a mixed image; and determining a preset number of sample detection frames in the mixed image as target detection frames.
In the above embodiments of the present application, the method further includes: outputting the detection result; receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; and updating the merged feature map based on the first feedback result.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 5
There is also provided, in accordance with an embodiment of the present application, an image processing method embodiment. It should be noted that the steps illustrated in the flowchart of the accompanying drawings may be executed in a computer system, such as one executing a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from the order here.
Fig. 11 is a flowchart of an image processing method according to embodiment 5 of the present application, and as shown in fig. 11, the method may include the following steps:
Step S1102, the cloud server acquires a target image.
Wherein the target image contains the target object.
Step S1104, the cloud server performs multi-scale feature extraction on the target image to obtain a plurality of first feature maps.
Step S1106, the cloud server performs feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map.
The multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches.
Step S1108, the cloud server detects the at least one second feature map to obtain a detection result of the target object.
In the above embodiments of the present application, the multi-branch network structure includes: a first branch, a second branch, wherein the output of the first branch is connected with the output of the second branch.
In the above embodiments of the present application, the performing, by the cloud server, feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain the at least one second feature map includes: the cloud server performs channel merging on the plurality of first feature maps to obtain a merged feature map; the cloud server performs convolution processing on the merged feature map by using the first branch to obtain a first output feature; the cloud server performs convolution processing on the merged feature map by using the second branch to obtain a second output feature; and the cloud server performs channel merging on the first output feature and the second output feature to obtain the at least one second feature map.
In the above embodiments of the present application, the first branch includes at least one convolution block, wherein each convolution block includes a plurality of sub-convolution layers, and the sub-convolution layers have different convolution kernels.
In the above embodiments of the present application, the method further includes: the cloud server detects the target image by using the target detection model to obtain the detection result of the target object, wherein the target detection model is trained based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through sample detection frames.
In the above embodiment of the present application, the method further includes: the cloud server acquires a plurality of sample images and sample detection frames corresponding to the sample images, wherein the sample detection frames are used for marking target objects in the sample images; the cloud server determines a preset number of sample detection frames in the plurality of sample images as target detection frames; the cloud server performs data enhancement on a target object corresponding to the target detection frame to obtain a target sample image; and the cloud server trains the initial detection model by using the target sample image to obtain a target detection model.
In the above embodiment of the present application, the determining, by the cloud server, that a preset number of sample detection frames in the plurality of sample images are target detection frames includes: the cloud server splices the plurality of sample images to obtain an initial sample image; the cloud server mixes the initial sample image and a preset sample image to obtain a mixed image; and the cloud server determines a preset number of sample detection frames in the mixed image as target detection frames.
In the above embodiments of the present application, the method further includes: the cloud server outputs the detection result; the cloud server receives a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; and the cloud server updates the merged feature map based on the first feedback result.
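Purely as an illustration of the cloud-side deployment, the sketch below exposes the pipeline behind an HTTP endpoint using Flask. The /detect route, the pipeline placeholder, and the JSON shape of the detection result are hypothetical, not part of the present application.

```python
import io
import torch
from flask import Flask, request, jsonify
from PIL import Image
from torchvision.transforms.functional import to_tensor

app = Flask(__name__)

def pipeline(image_tensor):
    # Placeholder for the trained model: multi-scale feature extraction,
    # multi-branch feature fusion, and detection (see the sketches above).
    raise NotImplementedError

@app.post("/detect")
def detect():
    # The cloud server acquires the target image from the request body.
    image = Image.open(io.BytesIO(request.get_data())).convert("RGB")
    with torch.no_grad():
        result = pipeline(to_tensor(image))   # steps S1104 to S1108
    # The detection result is assumed to be JSON-serializable in this sketch.
    return jsonify(result)
```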
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 6
According to an embodiment of the present application, there is also provided an image processing apparatus for implementing the image processing method, and fig. 12 is a schematic diagram of an image processing apparatus according to embodiment 6 of the present application, as shown in fig. 12, the apparatus 1200 includes: an acquisition module 1202, an extraction module 1204, a fusion module 1206, and a detection module 1208.
The acquisition module is used for acquiring a target remote sensing image, wherein the target remote sensing image comprises a target object; the extraction module is used for performing multi-scale feature extraction on the target remote sensing image to obtain a plurality of first feature maps; the fusion module is used for performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; the detection module is used for detecting the at least one second feature map to obtain a detection result of the target object.
It should be noted here that the acquisition module 1202, the extraction module 1204, the fusion module 1206, and the detection module 1208 correspond to steps S802 to S808 in embodiment 2, and the four modules are the same as the corresponding steps in the implementation examples and application scenarios, but are not limited to the disclosure in embodiment 1. It should be noted that the above modules may run in the computer terminal 10 provided in embodiment 1 as a part of the apparatus.
In the above embodiments of the present application, the multi-branch network structure includes a first branch and a second branch, wherein the output of the first branch is connected with the output of the second branch.
In the above embodiments of the present application, the fusion module includes: a merging unit and a first processing unit.
The merging unit is used for merging the channels of the first feature maps to obtain a merged feature map; the first processing unit is used for carrying out convolution processing on the merged feature map by using the first branch to obtain a first output feature; the first processing unit is also used for carrying out convolution processing on the merged feature map by using the second branch to obtain a second output feature; the merging unit is further configured to perform channel merging on the first output feature and the second output feature to obtain at least one second feature map.
In the above embodiments of the present application, the first branch includes at least one convolution block, wherein each convolution block includes a plurality of sub-convolution layers, and the sub-convolution layers have different convolution kernels.
In the above embodiments of the present application, the detection module is further configured to detect a target image by using a target detection model to obtain a detection result of the target object, wherein the target detection model is trained based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through sample detection frames.
In the above embodiments of the present application, the apparatus further includes: an acquisition module, a determination module, an enhancement module, and a training module.
The acquisition module is used for acquiring a plurality of sample images and sample detection frames corresponding to the sample images, wherein the sample detection frames are used for marking target objects in the sample images; the determining module is used for determining a preset number of sample detection frames in the plurality of sample images as target detection frames; the enhancement module is used for enhancing data of a target object corresponding to the target detection frame to obtain a target sample image; the training module is used for training the initial detection model by using the target sample image to obtain a target detection model.
In the above embodiments of the present application, the determining module includes: a splicing unit, a mixing unit, and a determining unit.
The splicing unit is used for splicing a plurality of sample images to obtain an initial sample image; the mixing unit is used for mixing the initial sample image and the preset sample image to obtain a mixed image; the determining unit is used for determining a preset number of sample detection frames in the mixed image as target detection frames.
In the above embodiments of the present application, the apparatus further includes: an output module, a receiving module, and an updating module.
The output module is used for outputting the detection result; the receiving module is used for receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; the updating module is used for updating the merged feature map based on the first feedback result.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 7
According to an embodiment of the present application, there is also provided an image processing apparatus for implementing the image processing method, and fig. 13 is a schematic diagram of an image processing apparatus according to embodiment 7 of the present application. As shown in fig. 13, the apparatus 1300 includes: an acquisition module 1302, an extraction module 1304, a fusion module 1306, and a detection module 1308.
The acquisition module is used for acquiring an agricultural remote sensing image, wherein the agricultural remote sensing image comprises crops; the extraction module is used for performing multi-scale feature extraction on the agricultural remote sensing image to obtain a plurality of first feature maps; the fusion module is used for performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; the detection module is used for detecting the at least one second feature map to obtain a detection result of the crops.
It should be noted here that the acquisition module 1302, the extraction module 1304, the fusion module 1306, and the detection module 1308 correspond to steps S902 to S908 in embodiment 3, and the four modules are the same as the corresponding steps in the implementation examples and application scenarios, but are not limited to the disclosure in embodiment 1 described above. It should be noted that the above modules may run in the computer terminal 10 provided in embodiment 1 as a part of the apparatus.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 8
According to an embodiment of the present application, there is also provided an image processing apparatus for implementing the image processing method, and fig. 14 is a schematic diagram of an image processing apparatus according to embodiment 8 of the present application, as shown in fig. 14, the apparatus 1400 includes: an acquisition module 1402, an extraction module 1404, a fusion module 1406, and a detection module 1408.
The acquisition module is used for acquiring a building remote sensing image, wherein the building remote sensing image comprises a target building; the extraction module is used for performing multi-scale feature extraction on the building remote sensing image to obtain a plurality of first feature maps; the fusion module is used for performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; the detection module is used for detecting the at least one second feature map to obtain a detection result of the target building.
It should be noted here that the above-mentioned acquisition module 1402, extraction module 1404, fusion module 1406, and detection module 1408 correspond to steps S1002 to S1008 in embodiment 4, and the four modules are the same as the corresponding steps in the implementation examples and application scenarios, but are not limited to the disclosure in embodiment 1. It should be noted that the above modules may run in the computer terminal 10 provided in embodiment 1 as a part of the apparatus.
In the above embodiments of the present application, the detection module is further configured to detect the building remote sensing image by using a target detection model to obtain the detection result of the target building, wherein the target detection model is trained based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through sample detection frames.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 9
According to an embodiment of the present application, there is also provided an image processing apparatus for implementing the image processing method, and fig. 15 is a schematic diagram of an image processing apparatus according to embodiment 9 of the present application. As shown in fig. 15, the apparatus 1500 includes: an acquisition module 1502, an extraction module 1504, a fusion module 1506, and a detection module 1508.
The acquisition module is used for acquiring a target remote sensing image through a cloud server, wherein the target remote sensing image comprises a target object; the extraction module is used for performing multi-scale feature extraction on the target remote sensing image through the cloud server to obtain a plurality of first feature maps; the fusion module is used for performing feature fusion on the plurality of first feature maps by using a multi-branch network structure through the cloud server to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; the detection module is used for detecting the at least one second feature map through the cloud server to obtain a detection result of the target object.
It should be noted here that the acquisition module 1502, the extraction module 1504, the fusion module 1506, and the detection module 1508 correspond to steps S1102 to S1108 in embodiment 5, and the four modules are the same as the corresponding steps in the implementation examples and application scenarios, but are not limited to the disclosure in embodiment 1. It should be noted that the above modules may run in the computer terminal 10 provided in embodiment 1 as a part of the apparatus.
It should be noted that the preferred embodiments described in the above examples of the present application are the same as the schemes, application scenarios, and implementation procedures provided in example 1, but are not limited to the schemes provided in example 1.
Example 10
An embodiment of the present application further provides an electronic device. The electronic device may be a computer terminal, and the computer terminal may be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute program codes of the following steps in the image processing method: acquiring a target image, wherein the target image comprises a target object; performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object.
Optionally, fig. 16 is a block diagram of a computer terminal according to an embodiment of the present application. As shown in fig. 16, the computer terminal A may include: one or more processors (only one is shown) and a memory.
The memory may be configured to store software programs and modules, such as program instructions/modules corresponding to the image processing method and apparatus in the embodiments of the present application, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, so as to implement the image processing method. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located from the processor, and these remote memories may be connected to the computer terminal A through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor can call the information and application program stored in the memory through the transmission device to execute the following steps: acquiring a target image, wherein the target image comprises a target object; performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object.
Optionally, the processor may further execute the program code of the following steps: performing channel merging on the plurality of first feature maps to obtain a merged feature map; performing convolution processing on the merged feature map by using the first branch to obtain a first output feature; performing convolution processing on the merged feature map by using the second branch to obtain a second output feature; and performing channel merging on the first output feature and the second output feature to obtain at least one second feature map.
Optionally, the processor may further execute the program code of the following steps: detecting the target image by using a target detection model to obtain the detection result of the target object, wherein the target detection model is trained based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through sample detection frames.
Optionally, the processor may further execute the program code of the following steps: obtaining a plurality of sample images and sample detection frames corresponding to the plurality of sample images, wherein the sample detection frames are used for marking target objects in the sample images; determining a preset number of sample detection frames in the plurality of sample images as target detection frames; performing data enhancement on a target object corresponding to the target detection frame to obtain a target sample image; and training the initial detection model by using the target sample image to obtain a target detection model.
Optionally, the processor may further execute the program code of the following steps: splicing the plurality of sample images to obtain an initial sample image; mixing the initial sample image and a preset sample image to obtain a mixed image; and determining a preset number of sample detection frames in the mixed image as target detection frames.
Optionally, the processor may further execute the program code of the following steps: outputting the detection result; receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; and updating the merged feature map based on the first feedback result.
The processor can call the information and application program stored in the memory through the transmission device to execute the following steps: acquiring a target remote sensing image, wherein the target remote sensing image comprises a target object; performing multi-scale feature extraction on the target remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object.
The processor can call the information and application program stored in the memory through the transmission device to execute the following steps: acquiring an agricultural remote sensing image, wherein the agricultural remote sensing image comprises crops; performing multi-scale feature extraction on the agricultural remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the crops.
The processor can call the information and application program stored in the memory through the transmission device to execute the following steps: obtaining a building remote sensing image, wherein the building remote sensing image comprises a target building; performing multi-scale feature extraction on the building remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target building.
The processor can call the information and application program stored in the memory through the transmission device to execute the following steps: the cloud server acquires a target image, wherein the target image comprises a target object; the cloud server performs multi-scale feature extraction on the target image to obtain a plurality of first feature maps; the cloud server performs feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and the cloud server detects the at least one second feature map to obtain a detection result of the target object.
By adopting the embodiments of the present application, a target image containing a target object is first acquired; multi-scale feature extraction is performed on the target image to obtain a plurality of first feature maps; feature fusion is performed on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure performs the feature fusion through a plurality of branches; and the at least one second feature map is detected to obtain a detection result of the target object, thereby improving the accuracy of the detection result of the target object. It is easy to note that performing feature fusion on the first feature maps through a plurality of branches improves the accuracy of the obtained at least one second feature map, and the plurality of branches also reduce the number of parameters in the fusion process, which improves fusion efficiency, thereby solving the technical problem of low accuracy in detecting images in the related art.
It can be understood by those skilled in the art that the structure shown in fig. 16 is only an illustration, and the computer terminal may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 16 does not limit the structure of the electronic device. For example, the computer terminal A may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in fig. 16, or have a different configuration from that shown in fig. 16.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 11
The embodiment of the invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store a program code executed by the image processing method provided in the first embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring a target image, wherein the target image comprises a target object; performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object.
Optionally, the storage medium is further configured to store program code for performing the following steps: performing channel merging on the plurality of first feature maps to obtain a merged feature map; performing convolution processing on the merged feature map by using the first branch to obtain a first output feature; performing convolution processing on the merged feature map by using the second branch to obtain a second output feature; and performing channel merging on the first output feature and the second output feature to obtain at least one second feature map.
Optionally, the storage medium is further configured to store program code for performing the following steps: detecting the target image by using a target detection model to obtain a detection result of the target object, wherein the target detection model is trained based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through sample detection frames.
Optionally, the storage medium is further configured to store program code for performing the following steps: obtaining a plurality of sample images and sample detection frames corresponding to the plurality of sample images, wherein the sample detection frames are used for marking target objects in the sample images; determining a preset number of sample detection frames in a plurality of sample images as target detection frames; performing data enhancement on a target object corresponding to the target detection frame to obtain a target sample image; and training the initial detection model by using the target sample image to obtain a target detection model.
Optionally, the storage medium is further configured to store program code for performing the following steps: splicing the plurality of sample images to obtain an initial sample image; mixing the initial sample image and a preset sample image to obtain a mixed image; and determining a preset number of sample detection frames in the mixed image as target detection frames.
Optionally, the storage medium is further configured to store program code for performing the following steps: outputting a detection result; receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result; and updating the merged feature map based on the first feedback result.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring a target remote sensing image, wherein the target remote sensing image comprises a target object; performing multi-scale feature extraction on the target remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target object.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring an agricultural remote sensing image, wherein the agricultural remote sensing image comprises crops; performing multi-scale feature extraction on the agricultural remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the crops.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: obtaining a building remote sensing image, wherein the building remote sensing image comprises a target building; performing multi-scale feature extraction on the building remote sensing image to obtain a plurality of first feature maps; performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and detecting the at least one second feature map to obtain a detection result of the target building.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: the cloud server acquires a target image, wherein the target image comprises a target object; the cloud server performs multi-scale feature extraction on the target image to obtain a plurality of first feature maps; the cloud server performs feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches; and the cloud server detects the at least one second feature map to obtain a detection result of the target object.
By adopting the embodiments of the present application, a target image containing a target object is first acquired; multi-scale feature extraction is performed on the target image to obtain a plurality of first feature maps; feature fusion is performed on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure performs the feature fusion through a plurality of branches; and the at least one second feature map is detected to obtain a detection result of the target object, thereby improving the accuracy of the detection result of the target object. It is easy to note that performing feature fusion on the first feature maps through a plurality of branches improves the accuracy of the obtained at least one second feature map, and the plurality of branches also reduce the number of parameters in the fusion process, which improves fusion efficiency, thereby solving the technical problem of low accuracy in detecting images in the related art.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described device embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed coupling or direct coupling or communication connection between each other may be an indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may also be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be construed as the protection scope of the present invention.

Claims (14)

1. An image processing method, comprising:
acquiring a target image, wherein the target image comprises a target object;
performing multi-scale feature extraction on the target image to obtain a plurality of first feature maps;
performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches;
and detecting the at least one second feature map to obtain a detection result of the target object.
2. The method of claim 1, wherein the multi-branch network structure comprises: a first branch and a second branch, wherein an output of the first branch is connected with an output of the second branch.
3. The method of claim 2, wherein performing feature fusion on the plurality of first feature maps by using the multi-branch network structure to obtain the at least one second feature map comprises:
performing channel merging on the plurality of first feature maps to obtain a merged feature map;
performing convolution processing on the merged feature map by using the first branch to obtain a first output feature;
performing convolution processing on the merged feature map by using the second branch to obtain a second output feature;
and performing channel merging on the first output feature and the second output feature to obtain the at least one second feature map.
4. The method of claim 3, wherein the first branch comprises at least one convolution block, wherein each convolution block comprises a plurality of sub-convolution layers, and the sub-convolution layers have different convolution kernels.
5. The method of claim 1, further comprising:
and detecting the target image by using a target detection model to obtain the detection result of the target object, wherein the target detection model is obtained by training based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through a sample detection frame.
6. The method of claim 5, further comprising:
obtaining the plurality of sample images and the sample detection frames corresponding to the plurality of sample images, wherein the sample detection frames are used for marking target objects in the sample images;
determining a preset number of sample detection frames in the plurality of sample images as target detection frames;
performing data enhancement on the target object corresponding to the target detection frame to obtain a target sample image;
and training an initial detection model by using the target sample image to obtain the target detection model.
7. The method of claim 6, wherein determining a preset number of sample detection frames in the plurality of sample images as target detection frames comprises:
splicing the plurality of sample images to obtain an initial sample image;
mixing the initial sample image and a preset sample image to obtain a mixed image;
determining the preset number of sample detection frames in the mixed image as the target detection frames.
8. The method of claim 3, further comprising:
outputting the detection result;
receiving a first feedback result, wherein the first feedback result is obtained by modifying a channel in the merged feature map according to the detection result;
updating the merged feature map based on the first feedback result.
9. An image processing method, comprising:
acquiring a target remote sensing image, wherein the target remote sensing image comprises a target object;
carrying out multi-scale feature extraction on the target remote sensing image to obtain a plurality of first feature maps;
performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches;
and detecting the at least one second feature map to obtain a detection result of the target object.
10. An image processing method, comprising:
obtaining a building remote sensing image, wherein the building remote sensing image comprises a target building;
carrying out multi-scale feature extraction on the building remote sensing image to obtain a plurality of first feature maps;
performing feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches;
and detecting the at least one second feature map to obtain a detection result of the target building.
11. The method of claim 10, further comprising:
and detecting the building remote sensing image by using a target detection model to obtain the detection result of the target building, wherein the target detection model is obtained by training based on a target sample image, and the target sample image is obtained by performing data enhancement on a plurality of sample images through a sample detection frame.
12. An image processing method, comprising:
the method comprises the steps that a cloud server obtains a target image, wherein the target image comprises a target object;
the cloud server performs multi-scale feature extraction on the target image to obtain a plurality of first feature maps;
the cloud server performs feature fusion on the plurality of first feature maps by using a multi-branch network structure to obtain at least one second feature map, wherein the multi-branch network structure is used for performing feature fusion on the plurality of first feature maps through a plurality of branches;
and the cloud server detects the at least one second feature map to obtain a detection result of the target object.
13. A computer-readable storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the method of any one of claims 1 to 12.
14. An electronic device, comprising: a memory and a processor for executing a program stored in the memory, wherein the program when executed performs the method of any one of claims 1 to 12.
CN202210662381.4A 2022-06-13 2022-06-13 Image processing method, storage medium, and electronic device Pending CN115100417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210662381.4A CN115100417A (en) 2022-06-13 2022-06-13 Image processing method, storage medium, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210662381.4A CN115100417A (en) 2022-06-13 2022-06-13 Image processing method, storage medium, and electronic device

Publications (1)

Publication Number Publication Date
CN115100417A true CN115100417A (en) 2022-09-23

Family

ID=83291195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210662381.4A Pending CN115100417A (en) 2022-06-13 2022-06-13 Image processing method, storage medium, and electronic device

Country Status (1)

Country Link
CN (1) CN115100417A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111739062A (en) * 2020-06-05 2020-10-02 北京航空航天大学 Target detection method and system based on feedback mechanism
CN113158865A (en) * 2021-04-14 2021-07-23 杭州电子科技大学 Wheat ear detection method based on EfficientDet
CN114399643A (en) * 2021-12-13 2022-04-26 阿里巴巴(中国)有限公司 Image processing method, storage medium, and computer terminal
CN114462469A (en) * 2021-12-20 2022-05-10 浙江大华技术股份有限公司 Training method of target detection model, target detection method and related device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YIQI JIANG et al.: "GIRAFFEDET: A HEAVY-NECK PARADIGM FOR OBJECT DETECTION", arXiv, 9 February 2022 (2022-02-09), pages 3 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117495837A (en) * 2023-11-17 2024-02-02 哈尔滨工程大学 Intelligent detection method for three-dimensional appearance defects of bearing

Similar Documents

Publication Publication Date Title
AU2018211356B2 (en) Image completion with improved deep neural networks
CN108764039B (en) Neural network, building extraction method of remote sensing image, medium and computing equipment
CN112651475B (en) Two-dimensional code display method, device, equipment and medium
CN111553362A (en) Video processing method, electronic equipment and computer readable storage medium
CN111399831A (en) Page display method and device, storage medium and electronic device
CN113724128A (en) Method for expanding training sample
CN115100417A (en) Image processing method, storage medium, and electronic device
CN117237755A (en) Target detection model training method and device, and image detection method and device
CN114926754A (en) Image detection method, storage medium and processor
CN114359565A (en) Image detection method, storage medium and computer terminal
CN109615620A (en) The recognition methods of compression of images degree, device, equipment and computer readable storage medium
CN113470051B (en) Image segmentation method, computer terminal and storage medium
CN114359676B (en) Method, device and storage medium for training target detection model and constructing sample set
CN112667942A (en) Animation generation method, device and medium
CN114399643A (en) Image processing method, storage medium, and computer terminal
CN115690592A (en) Image processing method and model training method
CN115641397A (en) Method and system for synthesizing and displaying virtual image
CN114998694A (en) Method, apparatus, device, medium and program product for training image processing model
CN115620034A (en) Object tracking method, device, equipment and storage medium
CN114299073A (en) Image segmentation method, image segmentation device, storage medium, and computer program
CN114266723A (en) Image processing method, image processing device, storage medium and computer terminal
CN113568735A (en) Data processing method and system
CN113487480A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110929866B (en) Training method, device and system of neural network model
US20200126517A1 (en) Image adjustment method, apparatus, device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination