CN117705059B - Positioning method and system for remote sensing mapping image of natural resource - Google Patents
Positioning method and system for remote sensing mapping image of natural resource
- Publication number
- CN117705059B (application CN202311723277.2A)
- Authority
- CN
- China
- Prior art keywords
- interest
- description
- feature vectors
- region
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 43
- 238000013507 mapping Methods 0.000 title claims abstract description 38
- 239000013598 vector Substances 0.000 claims abstract description 201
- 238000013527 convolutional neural network Methods 0.000 claims description 13
- 238000012937 correction Methods 0.000 claims description 13
- 238000000605 extraction Methods 0.000 claims description 10
- 238000013135 deep learning Methods 0.000 claims description 9
- 238000011176 pooling Methods 0.000 claims description 8
- 230000004913 activation Effects 0.000 claims description 5
- 230000004927 fusion Effects 0.000 claims description 5
- 238000004458 analytical method Methods 0.000 claims description 4
- 230000008569 process Effects 0.000 description 9
- 238000005516 engineering process Methods 0.000 description 7
- 230000006870 function Effects 0.000 description 7
- 238000010586 diagram Methods 0.000 description 6
- 210000002569 neuron Anatomy 0.000 description 6
- 230000000694 effects Effects 0.000 description 5
- 238000012545 processing Methods 0.000 description 5
- 238000012986 modification Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000003058 natural language processing Methods 0.000 description 3
- 238000013528 artificial neural network Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 239000000284 extract Substances 0.000 description 2
- 238000007477 logistic regression Methods 0.000 description 2
- 230000015654 memory Effects 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- 230000002411 adverse Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C11/00—Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C11/00—Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
- G01C11/04—Interpretation of pictures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/17—Terrestrial scenes taken from planes or by drones
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Remote Sensing (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Radar, Positioning & Navigation (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Astronomy & Astrophysics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
Description
Technical Field
The present application relates to the field of image positioning, and more specifically to a method and system for positioning natural-resource remote sensing mapping images.
Background Art
Remote sensing mapping is a technology that uses satellites, aircraft, unmanned aerial vehicles, and other equipment to observe and measure the natural resources on the Earth's surface from a distance. Remote sensing mapping can provide large-area, high-resolution, multi-temporal image data, offering important information support for the investigation, evaluation, monitoring, and management of natural resources.
However, the scale and complexity of remote sensing image data also pose challenges for image retrieval and positioning. Traditional image positioning methods suffer from low positioning accuracy and low processing efficiency when handling large-scale, complex natural-resource image data. An optimized method and system for positioning natural-resource remote sensing mapping images is therefore desired.
Summary of the Invention
The present application is proposed in order to solve the above technical problems. Embodiments of the present application provide a method and system for positioning natural-resource remote sensing mapping images that incorporate natural language processing technology, so that image regions matching the semantic information contained in an image positioning requirement description entered by a user can be retrieved from massive remote sensing image data, thereby achieving intelligent positioning.
According to one aspect of the present application, a method for positioning natural-resource remote sensing mapping images is provided, comprising:
acquiring remote sensing image data;
performing region-of-interest-based feature description on the remote sensing image data to obtain a plurality of region-of-interest description feature vectors;
acquiring an image positioning requirement description;
semantically encoding the image positioning requirement description to obtain an image positioning requirement description feature vector; and
determining a positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector.
According to another aspect of the present application, a system for positioning natural-resource remote sensing mapping images is provided, comprising:
a remote sensing image data acquisition module, configured to acquire remote sensing image data;
a region-of-interest feature description module, configured to perform region-of-interest-based feature description on the remote sensing image data to obtain a plurality of region-of-interest description feature vectors;
an image positioning requirement description acquisition module, configured to acquire an image positioning requirement description;
a semantic encoding module, configured to semantically encode the image positioning requirement description to obtain an image positioning requirement description feature vector; and
a positioning result analysis module, configured to determine a positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector.
Compared with the prior art, the method and system for positioning natural-resource remote sensing mapping images provided by the present application first acquire remote sensing image data; next, perform region-of-interest-based feature description on the remote sensing image data to obtain a plurality of region-of-interest description feature vectors; then acquire an image positioning requirement description; next, semantically encode the image positioning requirement description to obtain an image positioning requirement description feature vector; and finally determine a positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector. In this way, image regions matching the semantic information contained in the user's image positioning requirement description can be retrieved from massive remote sensing image data, achieving intelligent positioning.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required in the description of the embodiments are briefly introduced below. The drawings are not drawn to scale; the emphasis is on illustrating the subject matter of the present application.
FIG. 1 is a flowchart of a method for positioning natural-resource remote sensing mapping images according to an embodiment of the present application.
FIG. 2 is a schematic architecture diagram of the method according to an embodiment of the present application.
FIG. 3 is a flowchart of sub-step S120 of the method according to an embodiment of the present application.
FIG. 4 is a flowchart of sub-step S150 of the method according to an embodiment of the present application.
FIG. 5 is a flowchart of sub-step S152 of the method according to an embodiment of the present application.
FIG. 6 is a block diagram of a system for positioning natural-resource remote sensing mapping images according to an embodiment of the present application.
FIG. 7 is an application scenario diagram of the method according to an embodiment of the present application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort also fall within the scope of protection of the present application.
As used in the present application and the claims, unless the context clearly indicates otherwise, the words "a", "an", and/or "the" do not specifically denote the singular and may also include the plural. In general, the terms "include" and "comprise" merely indicate the inclusion of explicitly identified steps and elements; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements.
Although the present application makes various references to certain modules of the system according to the embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative, and different aspects of the system and method may use different modules.
Flowcharts are used in the present application to illustrate the operations performed by the system according to the embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed exactly in the order shown; the various steps may instead be processed in reverse order or simultaneously as required. Other operations may also be added to these processes, or one or more steps may be removed from them.
Exemplary embodiments according to the present application will be described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application, and it should be understood that the present application is not limited to the exemplary embodiments described here.
In view of the above technical problems, the technical concept of the present application is to incorporate natural language processing technology so that image regions matching the semantic information contained in the image positioning requirement description entered by the user can be retrieved from massive remote sensing image data, thereby achieving intelligent positioning.
Based on this, FIG. 1 is a flowchart of a method for positioning natural-resource remote sensing mapping images according to an embodiment of the present application, and FIG. 2 is a schematic architecture diagram of the method. As shown in FIG. 1 and FIG. 2, the method includes the steps of: S110, acquiring remote sensing image data; S120, performing region-of-interest-based feature description on the remote sensing image data to obtain a plurality of region-of-interest description feature vectors; S130, acquiring an image positioning requirement description; S140, semantically encoding the image positioning requirement description to obtain an image positioning requirement description feature vector; and S150, determining a positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector.
It should be understood that in step S110, remote sensing image data used for positioning is acquired. Such data is usually obtained by satellites, aircraft, or other remote sensing platforms and provides image information of the Earth's surface. In step S120, the regions of interest in the remote sensing image data are described by features. A region of interest may be a region with specific attributes or targets, such as buildings, roads, or water bodies; by extracting the features of these regions of interest, a description feature vector of each region can be obtained. In step S130, a description of the image positioning requirement is acquired. This may be a user's description of a certain target or area in the remote sensing image, such as "find the city center" or "locate the confluence of the rivers", and may be expressed in natural language or in other forms. In step S140, the image positioning requirement description is semantically encoded and converted into an image positioning requirement description feature vector; the semantic encoding may use natural language processing or other techniques to convert the natural-language description into a vector representation that a computer can process. In step S150, the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector are used to determine the positioning result, which can be achieved by computing the similarity or degree of matching between the region-of-interest description feature vectors and the image positioning requirement description feature vector. Based on this similarity or degree of matching, the regions of interest that best match the image positioning requirement can be identified, yielding the positioning result. The combination of these steps supports the positioning of remote sensing images: by extracting, encoding, and matching features of the remote sensing image data against the image positioning requirement, the locations of the regions of interest and the positioning result can be determined.
Specifically, in the technical solution of the present application, remote sensing image data is first acquired, and a plurality of regions of interest are extracted from the remote sensing image data. A region of interest is a region of the remote sensing image data with specific features; such regions may include particular types of ground objects, landforms, vegetation, water bodies, and the like. Extracting a plurality of regions of interest from the remote sensing image data narrows the retrieval scope of the subsequent model and prevents a large amount of interfering information and noise from adversely affecting the positioning result.
In one specific example of the present application, the plurality of regions of interest may be extracted from the remote sensing image data manually. Specifically, a professional remote sensing image interpreter may use image processing software to manually select regions of interest in the remote sensing image. This method requires manual participation and places certain demands on the operator's expertise and experience, but it can accurately select regions with specific features. In another specific example of the present application, the plurality of regions of interest may be extracted from the remote sensing image data using computer vision and image processing techniques, for example by constructing a target recognition network.
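As a purely illustrative sketch that is not part of the disclosed embodiments, one simple automated way to obtain candidate regions of interest is to tile the image into fixed-size patches; the tile size, stride, and function name below are assumptions introduced only for illustration, and a trained target recognition network would normally replace this step.

```python
# Hypothetical sketch: regular tiling of a remote sensing image into
# candidate regions of interest. Tile size and stride are illustrative.
import numpy as np

def extract_candidate_rois(image: np.ndarray, tile: int = 256, stride: int = 256):
    """Return a list of (row, col, patch) candidate regions of interest."""
    rois = []
    h, w = image.shape[:2]
    for r in range(0, h - tile + 1, stride):
        for c in range(0, w - tile + 1, stride):
            rois.append((r, c, image[r:r + tile, c:c + tile]))
    return rois
```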
Then, the plurality of regions of interest are each passed through a feature descriptor based on a convolutional neural network model to obtain a plurality of region-of-interest description feature vectors. That is, the multi-layer convolution and pooling operations of the convolutional neural network model are used to extract high-level semantic features of each region of interest; these high-level semantic features can better describe the morphology, texture, structure, and other characteristics of the region.
Accordingly, in step S120, as shown in FIG. 3, performing region-of-interest-based feature description on the remote sensing image data to obtain a plurality of region-of-interest description feature vectors includes: S121, extracting a plurality of regions of interest from the remote sensing image data; and S122, performing feature extraction and feature description on the plurality of regions of interest using a deep learning network model to obtain the plurality of region-of-interest description feature vectors.
In step S122, the deep learning network model is a feature descriptor based on a convolutional neural network model, wherein the feature descriptor includes an input layer, a convolution layer, an activation layer, a pooling layer, and an output layer. Specifically, performing feature extraction and feature description on the plurality of regions of interest using the deep learning network model to obtain the plurality of region-of-interest description feature vectors includes: passing each of the plurality of regions of interest through the feature descriptor based on the convolutional neural network model to obtain the plurality of region-of-interest description feature vectors.
It should be understood that a convolutional neural network (CNN) is a deep learning network model mainly used to process data with a grid structure. It is widely applied in computer vision, can effectively capture local spatial features in images, and exhibits translation invariance. The main components of a convolutional neural network model are as follows. 1. Input layer: accepts an image or a region of interest as input. 2. Convolution layer: extracts features from the input data through a series of convolution operations, in which a convolution kernel (also called a filter) is slid over the input data in a window-wise manner to capture local features. 3. Activation layer: introduces a nonlinear transformation to increase the expressive power of the network; a commonly used activation function is the ReLU (Rectified Linear Unit). 4. Pooling layer: reduces the size of the feature map through downsampling while retaining important features; common pooling operations include max pooling and average pooling. 5. Output layer: converts the feature maps obtained through the convolution, activation, and pooling operations into the final feature vector representation. An advantage of the convolutional neural network model is that it can automatically learn feature representations from images without a manually designed feature extractor. Through training, the network learns features that are useful for the task, thereby achieving feature extraction and description of the regions of interest. In remote sensing image positioning, a feature descriptor based on a convolutional neural network model can extract semantic features of the regions of interest for use in the subsequent determination of the positioning result.
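The following is an illustrative sketch, not taken from the disclosed embodiments, of such a descriptor in PyTorch; the channel counts, kernel sizes, and the 128-dimensional output are assumptions chosen only to show the input, convolution, activation, pooling, and output structure described above.

```python
# Hypothetical CNN-based region-of-interest feature descriptor.
import torch
import torch.nn as nn

class RoiFeatureDescriptor(nn.Module):
    def __init__(self, out_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # convolution layer
            nn.ReLU(inplace=True),                        # activation layer
            nn.MaxPool2d(2),                              # pooling layer
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),                      # global average pooling
        )
        self.output = nn.Linear(64, out_dim)              # output layer

    def forward(self, roi: torch.Tensor) -> torch.Tensor:
        # roi: (batch, 3, H, W) -> (batch, out_dim) description feature vector
        x = self.features(roi).flatten(1)
        return self.output(x)
```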
At the same time, an image positioning requirement description is acquired, and the image positioning requirement description is semantically encoded to obtain an image positioning requirement description feature vector. Here, the image positioning requirement description is a natural-language description; natural language is how humans typically understand and express information, but a computer cannot directly understand or recognize text data. By semantically encoding the image positioning requirement description, the natural language can be converted into a numerical feature vector that a computer can process, and the semantic information in the description can be extracted, so that the specific requirements and expectations of the image positioning task are accurately understood and the target and scope of positioning are made explicit.
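As a minimal illustrative sketch under the assumption of a bag-of-embeddings encoder (the disclosure does not specify a particular encoder; the vocabulary size, dimensionality, and class name below are assumptions), the semantic encoding step could look like this; any text encoder producing a fixed-length vector could be substituted.

```python
# Hypothetical semantic encoder for the image positioning requirement description.
import torch
import torch.nn as nn

class RequirementEncoder(nn.Module):
    def __init__(self, vocab_size: int = 10000, dim: int = 128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) integer token indices
        # returns: (batch, dim) requirement description feature vector
        return self.embedding(token_ids).mean(dim=1)
```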
Next, a projection layer is used to fuse the image positioning requirement description feature vector with each of the plurality of region-of-interest description feature vectors to obtain a plurality of semantic matching feature vectors. That is, the projection layer maps the image positioning requirement description feature vector and each region-of-interest description feature vector into the same semantic feature space for semantic matching.
Furthermore, the plurality of semantic matching feature vectors are passed through a classifier to obtain a plurality of classification results, each of which indicates whether the degree of matching exceeds a predetermined threshold. In a practical application scenario of the present application, each region of interest corresponding to a corrected semantic matching feature vector whose degree of matching exceeds the predetermined threshold is taken as a positioning region, yielding a plurality of positioning regions, and the plurality of positioning regions are taken as the positioning result.
Accordingly, in step S150, as shown in FIG. 4, determining the positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector includes: S151, fusing the image positioning requirement description feature vector with each of the plurality of region-of-interest description feature vectors to obtain a plurality of semantic matching feature vectors; and S152, determining, based on the plurality of semantic matching feature vectors, the positioning result that satisfies the image positioning requirement description.
It should be understood that in step S151, the image positioning requirement description feature vector is fused with each region-of-interest description feature vector to obtain a plurality of semantic matching feature vectors. The fusion can be performed in different ways, for example by concatenating the two feature vectors, by weighted summation, or by other fusion strategies, so that the image positioning requirement and the characteristics of the region of interest are considered jointly. In step S152, the plurality of semantic matching feature vectors are used to determine the final positioning result that satisfies the image positioning requirement description. A similarity measure or another matching algorithm can be used to evaluate how well each semantic matching feature vector matches the image positioning requirement; based on this degree of matching, the region of interest corresponding to the feature vector with the highest matching degree can be selected as the positioning result, or further decision-making and reasoning can be performed to determine the final positioning result. Through the combination of these two steps, the image positioning requirement is fused and matched with the characteristics of the regions of interest to obtain a positioning result that satisfies the requirement. This fusion and matching process helps determine the degree of semantic matching between the regions of interest and the image positioning requirement, thereby providing an accurate positioning result.
In step S151, fusing the image positioning requirement description feature vector with each of the plurality of region-of-interest description feature vectors to obtain the plurality of semantic matching feature vectors includes: using a projection layer to fuse the image positioning requirement description feature vector with each of the plurality of region-of-interest description feature vectors to obtain the plurality of semantic matching feature vectors.
It is worth mentioning that a projection layer is a type of layer in a neural network used to map input data from one feature space to another. In this case, the projection layer is used to map the image positioning requirement description feature vector and the region-of-interest description feature vectors so as to obtain the semantic matching feature vectors. Specifically, the projection layer may be a fully connected layer, also called a linear layer or dense layer. Each neuron in a fully connected layer is connected to all neurons of the previous layer and applies a linear transformation through weights and biases. In this case, the projection layer feeds the image positioning requirement description feature vector and the region-of-interest description feature vector into fully connected layers and maps them to another feature space through the learned weights and biases. Through this mapping, features from different feature spaces can be fused into a plurality of semantic matching feature vectors. These feature vectors contain semantic information about the image positioning requirement and the regions of interest and support the subsequent matching and determination of the positioning result. The design and parameter learning of the projection layer can be adjusted and optimized for the specific task and data to obtain the best fusion effect.
In one specific example of the present application, using the projection layer to fuse the image positioning requirement description feature vector with each of the plurality of region-of-interest description feature vectors to obtain the plurality of semantic matching feature vectors includes: fusing the image positioning requirement description feature vector with each of the plurality of region-of-interest description feature vectors according to the following projection formula to obtain the plurality of semantic matching feature vectors, the projection formula being:

$$V_f = \mathrm{Proj}\big([V_1; V_2]\big)$$
where $V_f$ is the semantic matching feature vector, $V_1$ is the image positioning requirement description feature vector, $V_2$ is each region-of-interest description feature vector, $[\cdot;\cdot]$ denotes concatenation, and $\mathrm{Proj}(\cdot)$ denotes the projection mapping of the vector.
Here, the image positioning requirement description feature vector and the region-of-interest description feature vectors are mapped into the same semantic feature space through a shared projection layer, so that the high-dimensional feature distribution manifolds of the two are constrained to the same metric scale, allowing them to be compared and matched directly.
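A minimal sketch of such a shared projection layer is given below, assuming a single linear projection over the concatenated vectors and a 128-dimensional shared space; both the module name and the dimensions are assumptions made for illustration.

```python
# Hypothetical projection-layer fusion of V1 (requirement description vector)
# and V2 (region-of-interest description vector) into a semantic matching
# vector Vf in a shared semantic space.
import torch
import torch.nn as nn

class ProjectionFusion(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)   # shared projection over [V1; V2]

    def forward(self, v1: torch.Tensor, v2: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([v1, v2], dim=-1))
```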
Further, in step S152, as shown in FIG. 5, determining, based on the plurality of semantic matching feature vectors, the positioning result that satisfies the image positioning requirement description includes: S1521, performing feature distribution correction on the plurality of semantic matching feature vectors to obtain a plurality of corrected semantic matching feature vectors; S1522, passing the plurality of corrected semantic matching feature vectors through a classifier to obtain a plurality of classification results, each classification result indicating whether the degree of matching exceeds a predetermined threshold; S1523, taking each region of interest corresponding to a corrected semantic matching feature vector whose degree of matching exceeds the predetermined threshold as a positioning region to obtain a plurality of positioning regions; and S1524, taking the plurality of positioning regions as the positioning result.
It should be understood that in step S1521, feature distribution correction is performed on the plurality of semantic matching feature vectors. The purpose of the correction is to normalize, standardize, or otherwise process the feature vectors so that their distribution in the feature space becomes more consistent or closer to some expected distribution, which improves the accuracy of comparison and matching between the feature vectors. In step S1522, the corrected semantic matching feature vectors are fed into a classifier for classification. The classifier may be a binary classifier that judges whether the degree of matching of a feature vector exceeds a predetermined threshold; it may be trained with a machine learning algorithm on labeled sample data to decide whether a feature vector satisfies the matching condition. The classification result may be binary (match / no match) or a probability value indicating the confidence of the match. In step S1523, the regions of interest corresponding to the corrected semantic matching feature vectors whose degree of matching exceeds the predetermined threshold are determined as positioning regions; these regions represent areas that match the image positioning requirement description and can be regarded as possible positioning results. In step S1524, the plurality of positioning regions, screened and determined in the preceding steps according to the matching degree and the threshold, are taken as the final positioning result; they represent possible locations that match the image positioning requirement description, and taking them as the positioning result provides multiple candidate locations for further analysis and decision-making. Through the combination of these steps, the positioning result that satisfies the image positioning requirement description can be determined based on the correction, classification, and threshold judgment of the semantic matching feature vectors. This approach improves the accuracy and robustness of positioning while providing multiple possible positioning regions for selection and further processing.
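Purely as an illustrative sketch of the threshold-based selection in steps S1522 to S1524 (the threshold value and the function interface are assumptions introduced here), the positioning regions could be selected as follows.

```python
# Hypothetical selection of positioning regions: keep each region of interest
# whose matching probability exceeds the predetermined threshold.
def select_positioning_regions(rois, match_probs, threshold: float = 0.5):
    """rois: candidate regions of interest; match_probs: matching probability per ROI."""
    return [roi for roi, p in zip(rois, match_probs) if p > threshold]
```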
In the above technical solution, the image positioning requirement description feature vector expresses the encoded textual semantic features of the image positioning requirement description, while the plurality of region-of-interest description feature vectors express the image semantic features of the plurality of regions of interest. Therefore, when the projection layer is used to fuse the image positioning requirement description feature vector with each of the region-of-interest description feature vectors, the cross-modal semantic feature difference between the two may cause the mapping of the semantic features to be sparse in the shared dimensions, which degrades the expressiveness of the resulting semantic matching feature vectors. It is therefore desirable to optimize the feature projection mapping based on the feature expression saliency and criticality of the image positioning requirement description feature vector and of each region-of-interest description feature vector, so as to improve the expressiveness of the plurality of semantic matching feature vectors. On this basis, the applicant of the present application corrects the image positioning requirement description feature vector and each region-of-interest description feature vector.
Accordingly, in step S1521, performing feature distribution correction on the plurality of semantic matching feature vectors to obtain the plurality of corrected semantic matching feature vectors includes: computing a plurality of correction feature vectors from the image positioning requirement description feature vector and each of the plurality of region-of-interest description feature vectors according to the following correction formula:

$$V_c = \alpha\big(v_{1\max}^{-1}\odot\sqrt{V_1}\big)\;\ominus\;\beta\big(v_{2\max}^{-1}\odot\sqrt{V_2}\big)$$
where $V_1$ is the image positioning requirement description feature vector, $V_2$ is each of the plurality of region-of-interest description feature vectors, $\sqrt{\cdot}$ denotes the position-wise square root of a feature vector, $v_{1\max}^{-1}$ and $v_{2\max}^{-1}$ are the reciprocals of the maximum feature values of the feature vectors $V_1$ and $V_2$ respectively, $\alpha$ and $\beta$ are weight hyperparameters, $\odot$ denotes position-wise multiplication, $\ominus$ denotes vector subtraction, and $V_c$ is each of the plurality of correction feature vectors; and fusing the plurality of semantic matching feature vectors with the plurality of correction feature vectors respectively to obtain the plurality of corrected semantic matching feature vectors.
Here, pre-segmented local groups of the set of feature values are obtained from the square-root values of the individual feature values of the image positioning requirement description feature vector and the region-of-interest description feature vector, and the key maximum-value features of the two feature vectors are regressed from them. In this way, following the idea of farthest-point sampling, the position-wise saliency distribution of the feature values can be enhanced, so that the sparse correspondence between the feature vectors is controlled through key features with a salient distribution, and the correction feature vector $V_c$ restores the original manifold geometry of the image positioning requirement description feature vector and the region-of-interest description feature vector. Fusing the correction feature vector $V_c$ with the semantic matching feature vector then improves the expressiveness of the semantic matching feature vector, thereby improving the accuracy of the classification result obtained through the classifier.
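A hypothetical sketch of this correction step is given below. Because the closed-form formula above is itself reconstructed from the listed symbols, the code, the additive fusion with the semantic matching vector, the clamping before the square root, and the small epsilon for numerical stability are all assumptions made for illustration.

```python
# Hypothetical feature distribution correction and fusion.
import torch

def correction_vector(v1: torch.Tensor, v2: torch.Tensor,
                      alpha: float = 0.5, beta: float = 0.5) -> torch.Tensor:
    # reciprocals of the maximum feature values (epsilon avoids division by zero)
    v1_max_inv = 1.0 / (v1.max(dim=-1, keepdim=True).values + 1e-6)
    v2_max_inv = 1.0 / (v2.max(dim=-1, keepdim=True).values + 1e-6)
    # position-wise square root; negative entries clamped to zero for validity
    t1 = torch.sqrt(torch.clamp(v1 * v1_max_inv, min=0.0))
    t2 = torch.sqrt(torch.clamp(v2 * v2_max_inv, min=0.0))
    return alpha * t1 - beta * t2            # correction vector Vc

def corrected_matching_vector(vf: torch.Tensor, vc: torch.Tensor) -> torch.Tensor:
    # fuse the semantic matching vector with its correction vector;
    # simple addition is an assumed fusion choice
    return vf + vc
```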
Further, in step S1522, passing the plurality of corrected semantic matching feature vectors through the classifier to obtain the plurality of classification results, each of which indicates whether the degree of matching exceeds the predetermined threshold, includes: performing fully connected encoding on each of the plurality of corrected semantic matching feature vectors using the fully connected layer of the classifier to obtain a plurality of encoded classification feature vectors; and inputting each of the plurality of encoded classification feature vectors into the Softmax classification function of the classifier to obtain the plurality of classification results.
It should be understood that the role of a classifier is to learn classification rules from given categories and known training data and then to classify (or predict) unknown data. Logistic regression and SVMs are often used to solve binary classification problems; for multi-class classification they can also be used by combining multiple binary classifiers, but this is error-prone and inefficient, so a commonly used multi-class method is the Softmax classification function.
It is worth mentioning that fully connected encoding refers to the process of encoding input data through a fully connected layer. A fully connected layer is a common layer type in neural networks in which each neuron is connected to all neurons of the previous layer. In fully connected encoding, every feature of the input data is connected to every neuron in the fully connected layer, and the output of each neuron is computed from a combination of weights and a bias. The role of fully connected encoding is to convert the input data into a higher-level representation so as to extract richer features from the data; through the nonlinear transformation and parameter learning of the fully connected layer, complex relationships and patterns in the input data can be captured. By feeding the corrected semantic matching feature vectors into the fully connected layer for encoding, the feature vectors can be converted into more expressive encoded classification feature vectors. After these encoded classification feature vectors pass through the Softmax classification function, a plurality of classification results are obtained, indicating whether the degree of matching exceeds the predetermined threshold. The combination of fully connected encoding and the classifier allows the semantic matching to be classified and judged more accurately.
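The classifier could be sketched as follows; the hidden size, the two-class (match / no match) output, and the module name are illustrative assumptions rather than part of the disclosed embodiments.

```python
# Hypothetical classifier: fully connected encoding followed by Softmax.
import torch
import torch.nn as nn

class MatchClassifier(nn.Module):
    def __init__(self, dim: int = 128, hidden: int = 64):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(inplace=True))
        self.head = nn.Linear(hidden, 2)   # classes: no-match / match

    def forward(self, vf_corrected: torch.Tensor) -> torch.Tensor:
        logits = self.head(self.fc(vf_corrected))
        return torch.softmax(logits, dim=-1)   # per-class matching probabilities
```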
In summary, the method for positioning natural-resource remote sensing mapping images according to the embodiments of the present application has been described. It can retrieve, from massive remote sensing image data, image regions matching the semantic information contained in the image positioning requirement description entered by the user, thereby achieving intelligent positioning.
FIG. 6 is a block diagram of a system 100 for positioning natural-resource remote sensing mapping images according to an embodiment of the present application. As shown in FIG. 6, the system 100 includes: a remote sensing image data acquisition module 110, configured to acquire remote sensing image data; a region-of-interest feature description module 120, configured to perform region-of-interest-based feature description on the remote sensing image data to obtain a plurality of region-of-interest description feature vectors; an image positioning requirement description acquisition module 130, configured to acquire an image positioning requirement description; a semantic encoding module 140, configured to semantically encode the image positioning requirement description to obtain an image positioning requirement description feature vector; and a positioning result analysis module 150, configured to determine a positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector.
In one example, in the above system 100, the region-of-interest feature description module 120 includes: a region-of-interest extraction unit, configured to extract a plurality of regions of interest from the remote sensing image data; and a feature extraction and description unit, configured to perform feature extraction and feature description on the plurality of regions of interest using a deep learning network model to obtain the plurality of region-of-interest description feature vectors.
Here, those skilled in the art will understand that the specific functions and operations of the individual modules of the above system 100 have already been described in detail in the description of the positioning method with reference to FIG. 1 to FIG. 5, and their repeated description is therefore omitted.
As described above, the system 100 according to the embodiments of the present application can be implemented in various wireless terminals, for example a server equipped with the positioning algorithm for natural-resource remote sensing mapping images. In one example, the system 100 can be integrated into the wireless terminal as a software module and/or a hardware module. For example, it can be a software module in the operating system of the wireless terminal, or an application developed for the wireless terminal; of course, it can equally be one of the many hardware modules of the wireless terminal.
Alternatively, in another example, the system 100 and the wireless terminal may be separate devices, and the system 100 may be connected to the wireless terminal through a wired and/or wireless network and transmit interactive information in an agreed data format.
FIG. 7 is an application scenario diagram of the method for positioning natural-resource remote sensing mapping images according to an embodiment of the present application. As shown in FIG. 7, in this application scenario, remote sensing image data (for example, D1 in FIG. 7) and an image positioning requirement description (for example, D2 in FIG. 7) are first acquired and then input into a server (for example, S in FIG. 7) on which the positioning algorithm is deployed; the server uses the algorithm to process the remote sensing image data and the image positioning requirement description to obtain a plurality of classification results indicating whether the degree of matching exceeds a predetermined threshold.
According to another aspect of the present application, a non-volatile computer-readable storage medium is further provided, on which computer-readable instructions are stored; when the instructions are executed by a computer, the method described above can be performed.
Program portions of the technology may be regarded as "products" or "articles of manufacture" existing in the form of executable code and/or associated data, embodied in or carried by a computer-readable medium. Tangible, permanent storage media may include the memory or storage used by any computer, processor, or similar device or associated module, for example various semiconductor memories, tape drives, disk drives, or any similar device capable of providing storage functionality for software.
All of the software, or portions of it, may at times be communicated over a network such as the Internet or another communication network. Such communication can load the software from one computer device or processor onto another. As used here, unless a tangible "storage" medium is specifically restricted, other terms referring to a computer- or machine-"readable medium" denote media that participate in the execution of any instructions by a processor.
In addition, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present application may be implemented entirely in hardware, entirely in software (including firmware, resident software, microcode, and the like), or in a combination of hardware and software, any of which may be referred to as a "data block", "module", "engine", "unit", "component", or "system". Furthermore, aspects of the present application may take the form of a computer program product embodied in one or more computer-readable media and containing computer-readable program code.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as is commonly understood by a person of ordinary skill in the art to which this application belongs. It should further be understood that terms such as those defined in ordinary dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant art, and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The foregoing is a description of the present application and is not to be construed as a limitation thereof. Although several exemplary embodiments of the present application have been described, those skilled in the art will readily appreciate that many modifications can be made to the exemplary embodiments without departing from the novel teachings and advantages of the present application. Accordingly, all such modifications are intended to fall within the scope of the present application as defined by the claims. It should be understood that the foregoing describes the present application and is not to be construed as limited to the particular embodiments disclosed; modifications to the disclosed embodiments, as well as other embodiments, are intended to fall within the scope of the appended claims. The present application is defined by the claims and their equivalents.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311723277.2A CN117705059B (en) | 2023-12-14 | 2023-12-14 | Positioning method and system for remote sensing mapping image of natural resource |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311723277.2A CN117705059B (en) | 2023-12-14 | 2023-12-14 | Positioning method and system for remote sensing mapping image of natural resource |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117705059A (en) | 2024-03-15 |
CN117705059B (en) | 2024-09-17 |
Family
ID=90149303
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311723277.2A Active CN117705059B (en) | 2023-12-14 | 2023-12-14 | Positioning method and system for remote sensing mapping image of natural resource |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117705059B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118230175B (en) * | 2024-05-23 | 2024-08-13 | 济南市勘察测绘研究院 | Real estate mapping data processing method and system based on artificial intelligence |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115630236A (en) * | 2022-10-19 | 2023-01-20 | 感知天下(北京)信息科技有限公司 | Global fast retrieval positioning method of passive remote sensing image, storage medium and equipment |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102073748B (en) * | 2011-03-08 | 2012-07-25 | 武汉大学 | Visual keyword based remote sensing image semantic searching method |
CN107563438B (en) * | 2017-08-31 | 2019-08-30 | 西南交通大学 | A Fast and Robust Multimodal Remote Sensing Image Matching Method and System |
US20200401617A1 (en) * | 2019-06-21 | 2020-12-24 | White Raven Ltd | Visual positioning system |
CN112766199B (en) * | 2021-01-26 | 2022-04-29 | 武汉大学 | Hyperspectral image classification method based on self-adaptive multi-scale feature extraction model |
CN114972737B (en) * | 2022-06-08 | 2024-03-15 | 湖南大学 | Remote sensing image target detection system and method based on prototype contrast learning |
CN117218201A (en) * | 2023-10-11 | 2023-12-12 | 中国人民解放军战略支援部队信息工程大学 | Unmanned aerial vehicle image positioning precision improving method and system under GNSS refusing condition |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115630236A (en) * | 2022-10-19 | 2023-01-20 | 感知天下(北京)信息科技有限公司 | Global fast retrieval positioning method of passive remote sensing image, storage medium and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN117705059A (en) | 2024-03-15 |
Similar Documents
Publication | Title |
---|---|
CN109800648B (en) | Face detection and recognition method and device based on face key point correction |
CN111079847B (en) | Remote sensing image automatic labeling method based on deep learning |
CN111368896A (en) | A classification method of hyperspectral remote sensing images based on dense residual 3D convolutional neural network |
CN114926746A (en) | SAR image change detection method based on multi-scale differential feature attention mechanism |
CN104504366A (en) | System and method for smiling face recognition based on optical flow features |
JP6892606B2 (en) | Positioning device, position identification method and computer program |
CN113095370A (en) | Image recognition method and device, electronic equipment and storage medium |
US11587323B2 (en) | Target model broker |
CN117705059B (en) | Positioning method and system for remote sensing mapping image of natural resource |
CN109492610B (en) | Pedestrian re-identification method and device and readable storage medium |
CN116152587A (en) | Expression recognition model training method, facial expression recognition method and device |
CN117056902A (en) | Password management method and system for Internet of things |
CN111563528A (en) | SAR image classification method based on multi-scale feature learning network and bilateral filtering |
CN115063831A (en) | A high-performance pedestrian retrieval and re-identification method and device |
CN112115994A (en) | Training method and device of image recognition model, server and storage medium |
CN118037423A (en) | Method and system for evaluating repayment willingness of farmers after agricultural loans |
CN116402777B (en) | Power equipment detection method and system based on machine vision |
CN117853596A (en) | Unmanned aerial vehicle remote sensing mapping method and system |
Priya et al. | An Enhanced Animal Species Classification and Prediction Engine using CNN |
CN117496126A (en) | Automatic image positioning system and method based on keywords |
CN117079017A (en) | Credible small sample image identification and classification method |
CN114417938B (en) | Electromagnetic target classification method embedded by using knowledge vector |
CN113191134B (en) | Document quality verification method, device, equipment and medium based on attention mechanism |
Jun et al. | Two-view correspondence learning via complex information extraction |
CN118586434B (en) | Double-graph-driven supervised modeling method for multi-source domain migration |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |