CN117705059B - Positioning method and system for remote sensing mapping image of natural resource - Google Patents
- Publication number: CN117705059B (application CN202311723277.2A)
- Authority: CN (China)
- Prior art keywords: description, interest, region, feature, feature vectors
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G01C11/00 — Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
- G01C11/04 — Interpretation of pictures
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/08 — Learning methods
- G06V10/22 — Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/82 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V20/13 — Satellite images
- G06V20/17 — Terrestrial scenes taken from planes or by drones
Abstract
A method and system for positioning natural resource remote sensing mapping images are disclosed. Remote sensing image data are first acquired and given a feature description based on the region of interest to obtain a plurality of region of interest description feature vectors. An image positioning requirement description is then acquired and semantically encoded into an image positioning requirement description feature vector. Finally, a positioning result is determined based on the plurality of region of interest description feature vectors and the image positioning requirement description feature vector. In this way, image regions matching the semantic information contained in a user-input image positioning requirement description can be retrieved from massive remote sensing image data, realizing intelligent positioning.
Description
Technical Field
The application relates to the field of image positioning, and in particular to a method and system for positioning natural resource remote sensing mapping images.
Background
Remote sensing mapping is a technology for observing and measuring natural resources on the surface of the earth from a long distance by using equipment such as satellites, aircrafts, unmanned aerial vehicles and the like. The remote sensing mapping can provide image data with large range, high resolution and multiple time phases, and provides important information support for investigation, evaluation, monitoring and management of natural resources.
However, the scale and complexity of remote sensing image data also pose challenges for image retrieval and positioning. Traditional image positioning methods suffer from low positioning accuracy and low processing efficiency when handling large-scale, complex natural resource image data. An optimized method and system for positioning natural resource remote sensing mapping images are therefore desirable.
Disclosure of Invention
The present application has been made to solve the above technical problems. Embodiments of the application provide a method and system for positioning natural resource remote sensing mapping images that combine natural language processing techniques to retrieve, from massive remote sensing image data, image regions matching the semantic information contained in a user-input image positioning requirement description, thereby realizing intelligent positioning.
According to one aspect of the present application, there is provided a natural resource remote sensing mapping image positioning method, including:
acquiring remote sensing image data;
performing feature description based on the region of interest on the remote sensing image data to obtain a plurality of region of interest description feature vectors;
acquiring an image positioning requirement description;
performing semantic coding on the image positioning requirement description to obtain an image positioning requirement description feature vector; and
determining a positioning result based on the plurality of region of interest description feature vectors and the image positioning requirement description feature vector.
According to another aspect of the present application, there is provided a natural resource remote sensing mapping image positioning system, comprising:
a remote sensing image data acquisition module for acquiring remote sensing image data;
a region-of-interest feature description module for performing feature description based on the region of interest on the remote sensing image data to obtain a plurality of region of interest description feature vectors;
an image positioning requirement description acquisition module for acquiring an image positioning requirement description;
a semantic coding module for performing semantic coding on the image positioning requirement description to obtain an image positioning requirement description feature vector; and
a positioning result analysis module for determining a positioning result based on the plurality of region of interest description feature vectors and the image positioning requirement description feature vector.
Compared with the prior art, the method and system for positioning natural resource remote sensing mapping images provided by the application first acquire remote sensing image data and perform feature description based on the region of interest to obtain a plurality of region of interest description feature vectors; an image positioning requirement description is then acquired and semantically encoded into an image positioning requirement description feature vector; finally, a positioning result is determined based on the plurality of region of interest description feature vectors and the image positioning requirement description feature vector. Image regions matching the semantic information contained in the user-input image positioning requirement description can thus be retrieved from massive remote sensing image data, realizing intelligent positioning.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for describing the embodiments are briefly introduced below. The following drawings are not drawn to scale with respect to actual dimensions; emphasis is instead placed on illustrating the gist of the present application.
Fig. 1 is a flowchart of a method for positioning a remote sensing mapping image of natural resources according to an embodiment of the application.
Fig. 2 is a schematic diagram of a positioning method of a remote sensing mapping image of natural resources according to an embodiment of the application.
Fig. 3 is a flowchart of a sub-step S120 of the positioning method of the remote sensing mapping image of natural resources according to an embodiment of the application.
Fig. 4 is a flowchart of sub-step S150 of the positioning method of the remote sensing mapping image of natural resources according to an embodiment of the application.
Fig. 5 is a flowchart of the substep S152 of the positioning method of the remote sensing mapping image of natural resources according to an embodiment of the application.
Fig. 6 is a block diagram of a natural resource remote sensing mapping image positioning system according to an embodiment of the application.
Fig. 7 is an application scenario diagram of a positioning method for remote sensing mapping images of natural resources according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are also within the scope of the application.
As used in the specification and claims, the terms "a," "an," "the," and/or "said" do not refer specifically to the singular and may include the plural unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may also include other steps or elements.
Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative, and different aspects of the systems and methods may use different modules.
A flowchart is used in the present application to describe the operations performed by a system according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in order precisely. Rather, the various steps may be processed in reverse order or simultaneously, as desired. Also, other operations may be added to or removed from these processes.
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
Aiming at the technical problems, the technical concept of the application is to combine the natural language processing technology to search the matched image area from massive remote sensing image data according to semantic information contained in the image positioning requirement description input by a user so as to realize intelligent positioning.
Based on this, fig. 1 is a flowchart of a positioning method for a remote sensing mapping image of natural resources according to an embodiment of the application. Fig. 2 is a schematic diagram of a positioning method of a remote sensing mapping image of natural resources according to an embodiment of the application. As shown in fig. 1 and fig. 2, the method for positioning a natural resource remote sensing mapping image according to an embodiment of the present application includes the steps of: s110, acquiring remote sensing image data; s120, performing feature description based on the region of interest on the remote sensing image data to obtain a plurality of feature vectors for describing the region of interest; s130, acquiring image positioning requirement description; s140, carrying out semantic coding on the image positioning requirement description to obtain an image positioning requirement description feature vector; and S150, determining a positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector.
It should be understood that in step S110, remote sensing image data for positioning is acquired, and the remote sensing image data is typically acquired through a satellite, an aircraft or other remote sensing platform, so as to provide image information of the earth surface. In step S120, a feature description is performed on the region of interest in the remote sensing image data, where the region of interest may be a region with a specific attribute or target, such as a building, a road, a water body, etc., and the feature vector describing each region may be obtained by extracting the features of the region of interest. In step S130, a demand description of image positioning is acquired, which may be a description of a certain target or area in the remote sensing image by the user, for example, "find city center" or "locate intersection of river", and the image positioning demand description may be a description in natural language or other forms. In step S140, the image positioning requirement description is semantically encoded and converted into an image positioning requirement description feature vector, and the semantic encoding may use natural language processing or other techniques to convert the natural language description into a computer-processable vector representation. In step S150, the positioning result is determined using a plurality of region of interest description feature vectors and image positioning requirement description feature vectors, which may be achieved by calculating the degree of similarity or matching between the region of interest description feature vectors and the image positioning requirement description feature vectors. According to the similarity or the matching degree, the method can determine which regions of interest have higher matching degree with the image positioning requirement, so that a positioning result is obtained. 
The combination of the steps can help to realize the positioning of the remote sensing image, and the position and the positioning result of the region of interest can be determined by carrying out feature extraction, coding and matching on the remote sensing image data and the image positioning requirement.
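The five steps above can be condensed into a small sketch: given descriptor vectors for each region of interest and an encoded requirement vector, score each region by similarity and keep those above a threshold. This is a minimal illustration under invented assumptions, not the patent's implementation; the toy vectors, the cosine measure, the 0.8 threshold, and the function names are all made up for the example.

```python
def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    num = sum(x * y for x, y in zip(a, b))
    da = sum(x * x for x in a) ** 0.5
    db = sum(x * x for x in b) ** 0.5
    return num / (da * db) if da and db else 0.0

def locate(roi_vectors, query_vector, threshold=0.8):
    """Steps S110-S150 condensed: return indices of regions of interest whose
    descriptor matches the encoded positioning requirement above a threshold."""
    return [i for i, v in enumerate(roi_vectors)
            if cosine(v, query_vector) > threshold]

# Toy 2-D descriptors; real region-of-interest vectors come from the CNN descriptor.
rois = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [1.0, 0.0]  # toy encoded "image positioning requirement description"
print(locate(rois, query))  # [0, 2]
```

In the full method the similarity step is replaced by the projection-layer fusion and classifier described later, but the input/output contract is the same: region vectors plus a requirement vector in, matching region indices out.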
Specifically, in the technical scheme of the application, remote sensing image data are firstly obtained; and extracting a plurality of regions of interest from the remote sensing image data. The region of interest is a region with specific characteristics in the remote sensing image data, and the region of interest may include a specific type of ground feature, a landform, vegetation, a water body and the like. By extracting a plurality of interested areas from the remote sensing image data, the search range of a subsequent model can be reduced, and adverse effects of a large amount of interference information and noise on a positioning result are avoided.
In one specific example of the present application, the plurality of regions of interest may be extracted from the remote sensing image data by manual cropping. Specifically, a professional remote sensing image interpreter may manually select the regions of interest in the remote sensing image using image processing software. This approach requires manual intervention as well as the expertise and experience of the operator, but allows precise selection of regions with specific characteristics. In yet another specific example, the plurality of regions of interest may be extracted using computer vision and image processing techniques, for example by constructing a target recognition network.
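As a toy stand-in for the automated route, the sketch below slides a fixed window over a 2-D intensity grid and keeps windows whose mean value passes a threshold. A real target recognition network would replace the mean-intensity test; the window size, stride, and threshold here are arbitrary assumptions for illustration only.

```python
def extract_rois(image, win=2, stride=2, min_mean=0.5):
    """Slide a win x win window over a 2-D 'image' (list of lists) and keep
    windows whose mean intensity reaches min_mean. Each hit is returned as
    (top, left, height, width) -- a crude placeholder for a detector."""
    h, w = len(image), len(image[0])
    rois = []
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            patch = [image[top + i][left + j]
                     for i in range(win) for j in range(win)]
            if sum(patch) / len(patch) >= min_mean:
                rois.append((top, left, win, win))
    return rois

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
print(extract_rois(img))  # [(0, 2, 2, 2)] -- only the bright top-right block
```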
The plurality of regions of interest are then passed through a feature descriptor based on a convolutional neural network model, respectively, to obtain a plurality of region of interest descriptive feature vectors. That is, the multi-layer convolution and pooling operations of the convolutional neural network model are utilized to extract advanced semantic features of each region of interest. These high-level semantic features may better describe morphological, texture, structural, etc. features of the region, among others.
Accordingly, in step S120, as shown in fig. 3, performing a feature description based on the region of interest on the remote sensing image data to obtain a plurality of feature vectors describing the region of interest, including: s121, extracting a plurality of regions of interest from the remote sensing image data; and S122, performing feature extraction and feature description on the plurality of regions of interest by using a deep learning network model to obtain a plurality of region-of-interest description feature vectors.
In step S122, the deep learning network model is a feature descriptor based on a convolutional neural network model; the feature descriptor based on the convolutional neural network model comprises an input layer, a convolutional layer, an activation layer, a pooling layer and an output layer. Specifically, performing feature extraction and feature description on the multiple regions of interest by using a deep learning network model to obtain multiple region of interest description feature vectors, including: and respectively passing the plurality of regions of interest through the feature descriptors based on the convolutional neural network model to obtain the plurality of region of interest description feature vectors.
It should be appreciated that a convolutional neural network (CNN) is a deep learning network model mainly used for processing data with a grid structure. CNNs are widely used in the field of computer vision, can effectively capture spatially local features in images, and exhibit translation invariance. The main components of the convolutional neural network model are as follows: 1. Input layer: accepts the image or region of interest as input. 2. Convolution layer: extracts features from the input data through a series of convolution operations; a convolution operation slides a convolution kernel (also called a filter) over the input data to capture local features. 3. Activation layer: introduces a nonlinear transformation to increase the expressive power of the network; a common activation function is the ReLU (Rectified Linear Unit). 4. Pooling layer: reduces the size of the feature map through a downsampling operation while retaining the important features; common pooling operations include max pooling and average pooling. 5. Output layer: converts the feature maps obtained through convolution, activation, pooling and other operations into a final feature vector representation. An advantage of the convolutional neural network model is that it can automatically learn feature representations in images without a manually designed feature extractor. Through training, the network learns features useful for the task, thereby achieving feature extraction and description of the region of interest. In remote sensing image positioning, semantic features of the region of interest can be extracted with the feature descriptor based on the convolutional neural network model and used for subsequent determination of the positioning result.
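The conv → ReLU → max-pool → flatten chain of components 1–5 can be sketched in plain NumPy. This is a single-channel, single-kernel toy, not the patent's trained descriptor: the 6×6 region, the 3×3 averaging kernel, and the function names are invented for the example.

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Activation layer: element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Pooling layer: non-overlapping max pooling over size x size blocks."""
    h2, w2 = x.shape[0] // size, x.shape[1] // size
    return x[:h2 * size, :w2 * size].reshape(h2, size, w2, size).max(axis=(1, 3))

def describe_roi(roi, kernel):
    """Conv -> ReLU -> max-pool -> flatten: a toy feature descriptor."""
    return max_pool(relu(conv2d(roi, kernel))).ravel()

roi = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 region of interest
kernel = np.ones((3, 3)) / 9.0                  # 3x3 averaging kernel
vec = describe_roi(roi, kernel)
print(vec)  # [14. 16. 26. 28.] -- 4x4 conv map pooled to 2x2, flattened
```

A trained CNN would stack many such layers with learned kernels; the flattened output plays the role of the region of interest description feature vector.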
Meanwhile, acquiring an image positioning requirement description; and carrying out semantic coding on the image positioning requirement description to obtain an image positioning requirement description feature vector. Here, the image localization requirement description is a natural language description, which is a way for human beings to understand and express information, but a computer cannot directly understand and recognize text data. By carrying out semantic coding on the image positioning requirement description, natural language can be converted into a numerical feature vector which can be processed by a computer, semantic information in the image positioning requirement description is extracted, so that specific requirements and expectations of an image positioning task can be accurately understood, and a target and a range of positioning are clearly defined.
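A minimal way to picture the semantic coding step is a hashed bag-of-words: each token of the natural-language requirement is hashed into one of a fixed number of buckets and the counts are L2-normalised. This is a deliberately crude stand-in under stated assumptions; a real system would use a trained language model, and the 16-dimensional size is arbitrary.

```python
import hashlib

def encode_description(text, dim=16):
    """Hash each token of a natural-language query into a fixed-size vector
    (a crude placeholder for the semantic encoder in the patent)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16)
        vec[h % dim] += 1.0
    # L2-normalise so descriptions of different lengths are comparable
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec

q = encode_description("find the city center")
print(len(q))  # 16
```

Whatever encoder is used, the contract is the same as here: free-form text in, a fixed-length numeric feature vector out, so that it can be compared against region descriptors.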
Then, a projection layer is used for respectively fusing the image positioning requirement description feature vector and each region of interest description feature vector in the plurality of region of interest description feature vectors to obtain a plurality of semantic matching feature vectors. That is, the image positioning requirement description feature vector and each region of interest description feature vector are mapped into the same semantic feature space through a projection layer for semantic matching.
Further, the plurality of semantic matching feature vectors are passed through a classifier to obtain a plurality of classification results, each of which is used to indicate whether the degree of matching exceeds a predetermined threshold. In the actual application scene of the application, each interested area corresponding to each corrected semantic matching feature vector with the matching degree exceeding a preset threshold is used as a positioning area to obtain a plurality of positioning areas; and taking the plurality of positioning areas as the positioning result.
Accordingly, in step S150, as shown in fig. 4, determining a positioning result based on the plurality of region of interest description feature vectors and the image positioning requirement description feature vector includes: s151, fusing the image positioning demand description feature vector and each region of interest description feature vector in the plurality of region of interest description feature vectors to obtain a plurality of semantic matching feature vectors; and S152, determining the positioning result meeting the image positioning requirement description based on the semantic matching feature vectors.
It should be understood that in step S151, the image positioning requirement description feature vector and each region of interest description feature vector are fused to obtain a plurality of semantic matching feature vectors, and the fusion may be performed by different methods, for example, splicing, weighting and summing the two feature vectors or using other fusion strategies, so as to comprehensively consider the image positioning requirement and the features of the region of interest. In step S152, a final positioning result is determined using the plurality of semantic matching feature vectors to satisfy the image positioning requirement description. The matching degree of each semantic matching feature vector and the image positioning requirement can be evaluated by using a similarity measure or other matching algorithm, and according to the matching degree, a region of interest corresponding to the feature vector with the highest matching degree can be selected as a positioning result, or further decision and reasoning can be carried out to determine a final positioning result. Through the combination of the two steps, the image positioning requirement and the characteristics of the region of interest can be fused and matched, so that the positioning result meeting the requirement is obtained. The fusion and matching process can help to determine the semantic matching degree of the region of interest and the image positioning requirement, and further provide an accurate positioning result.
In step S151, fusing the image positioning requirement description feature vector and each of the plurality of region of interest description feature vectors to obtain a plurality of semantic matching feature vectors, including: and respectively fusing the image positioning requirement description feature vector and each region of interest description feature vector in the plurality of region of interest description feature vectors by using a projection layer to obtain the plurality of semantic matching feature vectors.
It is worth mentioning that the Projection Layer (Projection Layer) is a Layer in a neural network for mapping input data from one feature space to another. In this case, the projection layer is configured to map the image positioning requirement description feature vector and the region of interest description feature vector to obtain a semantic matching feature vector. In particular, the projection layer may be a fully connected layer (Fully Connected Layer), also referred to as a linear layer or dense layer. Each neuron in the fully connected layer is connected to all neurons in the previous layer, and is linearly transformed by weight and bias. In this case, the projection layer inputs the image localization requirement description feature vector and the region of interest description feature vector into two fully connected layers, respectively, which are mapped to another feature space by the learned weights and offsets. Through the mapping of the projection layer, the features in different feature spaces can be fused to obtain a plurality of semantic matching feature vectors. These feature vectors will contain semantic information about the image localization requirements and the region of interest, facilitating subsequent matching and determination of the localization results. The design and parameter learning of the projection layer can be adjusted and optimized according to specific tasks and data so as to obtain the optimal fusion effect.
In a specific example of the present application, using a projection layer to respectively fuse the image positioning requirement description feature vector and each region of interest description feature vector of the plurality of region of interest description feature vectors to obtain the plurality of semantic matching feature vectors includes: fusing the image positioning requirement description feature vector and each of the plurality of region of interest description feature vectors using the following projection formula to obtain the plurality of semantic matching feature vectors; wherein the projection formula is:
V_f = Proj([V_1; V_2])
wherein V_f is the semantic matching feature vector, V_1 is the image positioning requirement description feature vector, V_2 is each region of interest description feature vector, [·;·] denotes vector concatenation, and Proj(·) denotes the projection mapping of the vector.
Here, the image positioning requirement description feature vector and the region of interest description feature vector are mapped into the same semantic feature space through the shared projection layer, so that their high-dimensional feature distribution manifolds are bound to the same metric scale and the two can be directly compared and matched.
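To make the shared-projection idea concrete, the following is a minimal NumPy sketch of fusing the two description vectors by concatenation followed by a learned linear map. The dimensions, the random weights, and the function names are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the patent text).
TEXT_DIM, IMG_DIM, SHARED_DIM = 128, 256, 64

# Weights of the shared projection (fully connected) layer; learned in
# practice, randomly initialized here for the sketch.
W = rng.standard_normal((SHARED_DIM, TEXT_DIM + IMG_DIM)) * 0.01
b = np.zeros(SHARED_DIM)

def project_fuse(v1: np.ndarray, v2: np.ndarray) -> np.ndarray:
    """Fuse the two modality vectors: V_f = Proj([V_1; V_2])."""
    concat = np.concatenate([v1, v2])  # cascade [V_1; V_2]
    return W @ concat + b              # linear map into the shared space

# One text-side requirement vector and several region-of-interest vectors.
v_text = rng.standard_normal(TEXT_DIM)
roi_vectors = [rng.standard_normal(IMG_DIM) for _ in range(3)]

semantic_matching = [project_fuse(v_text, v_roi) for v_roi in roi_vectors]
print(len(semantic_matching), semantic_matching[0].shape)  # 3 (64,)
```

Because the same W and b are applied to every region of interest, all fused vectors land in one shared metric space, which is what allows them to be compared directly in the later matching steps.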
Further, in step S152, as shown in fig. 5, determining the positioning result according to the image positioning requirement description based on the plurality of semantic matching feature vectors includes: s1521, performing feature distribution correction on the plurality of semantic matching feature vectors to obtain a plurality of corrected semantic matching feature vectors; s1522, passing the corrected semantic matching feature vectors through a classifier to obtain a plurality of classification results, wherein each classification result is used for indicating whether the matching degree exceeds a preset threshold; s1523, taking each region of interest corresponding to each corrected semantic matching feature vector with the matching degree exceeding a preset threshold value as a positioning region to obtain a plurality of positioning regions; and S1524, taking the plurality of positioning areas as the positioning result.
It should be appreciated that in step S1521, feature distribution correction is performed on the plurality of semantic matching feature vectors. The purpose of feature distribution correction is to normalize, standardize, or otherwise process the feature vectors so that their distribution in the feature space is more uniform or closer to some desired distribution, which improves the accuracy of comparison and matching between feature vectors. In step S1522, the corrected semantic matching feature vectors are input to a classifier that determines whether the matching degree of each feature vector exceeds a predetermined threshold. The classifier can be trained with a machine-learning algorithm; by learning from labeled sample data, it can judge whether a feature vector satisfies the matching condition. The classification result may be binary (match/no match) or a probability value indicating the confidence of the match. In step S1523, the regions of interest corresponding to the corrected semantic matching feature vectors whose matching degree exceeds the predetermined threshold are taken as positioning regions; these represent regions matching the image positioning requirement description and may be regarded as candidate positioning results. In step S1524, the plurality of positioning areas is taken as the final positioning result. The positioning areas are screened and determined according to the matching degree and threshold in the preceding steps, represent possible positions matching the image positioning requirement description, and can provide multiple candidate positions for further analysis and decision-making. Through the combination of these steps, a positioning result meeting the image positioning requirement description can be determined via correction, classification, and threshold judgment of the semantic matching feature vectors.
This approach may increase the accuracy and robustness of the positioning while providing multiple possible positioning areas for selection and further processing.
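The threshold-screening logic of steps S1522 to S1524 can be sketched as follows; the score values, region names, and the threshold of 0.5 are hypothetical placeholders:

```python
def select_positioning_regions(match_scores, regions, threshold=0.5):
    """Steps S1523-S1524: keep each region of interest whose match
    probability exceeds the predetermined threshold; the survivors
    together form the positioning result."""
    return [r for s, r in zip(match_scores, regions) if s > threshold]

# Hypothetical classifier outputs (match probabilities) for four regions.
scores = [0.91, 0.12, 0.73, 0.40]
regions = ["roi_0", "roi_1", "roi_2", "roi_3"]

positioning_result = select_positioning_regions(scores, regions)
print(positioning_result)  # ['roi_0', 'roi_2']
```

Returning all regions above threshold, rather than only the single best match, is what yields the "plurality of positioning areas" available for further selection and processing.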
In the above technical solution, the image positioning requirement description feature vector expresses the encoded text semantic features of the image positioning requirement description, while the plurality of region of interest description feature vectors express the image semantic features of the respective regions of interest. When the projection layer is used to fuse the image positioning requirement description feature vector with each region of interest description feature vector, the cross-modal semantic feature differences between them may cause mapping sparsity of the semantic features in the shared dimension, thereby degrading the expressive effect of the resulting semantic matching feature vectors. It is therefore desirable to optimize the feature projection mapping based on the feature expression significance and criticality of both the image positioning requirement description feature vector and the region of interest description feature vector, so as to improve the expressive effect of the plurality of semantic matching feature vectors. Based on this, the applicant of the present application corrects the image localization requirement description feature vector and each region of interest description feature vector.
Accordingly, in step S1521, performing feature distribution correction on the plurality of semantic matching feature vectors to obtain a plurality of corrected semantic matching feature vectors includes: calculating a plurality of correction feature vectors from the image positioning requirement description feature vector and each region of interest description feature vector of the plurality of region of interest description feature vectors according to the following correction formula; wherein the correction formula is:

V_c = α·√(V_1max^(-1) ⊙ V_1) ⊖ β·√(V_2max^(-1) ⊙ V_2)

wherein V_1 is the image positioning requirement description feature vector, V_2 is each region of interest description feature vector of the plurality of region of interest description feature vectors, √(·) denotes the position-wise square root of a feature vector, V_1max^(-1) and V_2max^(-1) are the reciprocals of the maximum feature values of V_1 and V_2, respectively, α and β are weight hyperparameters, ⊙ denotes position-wise multiplication, ⊖ denotes vector subtraction, and V_c is each of the plurality of correction feature vectors; and respectively fusing the plurality of semantic matching feature vectors and the plurality of correction feature vectors to obtain the plurality of corrected semantic matching feature vectors.
Here, the pre-segmented local groups of the feature value set are obtained through the square-root values of the respective feature values of the image positioning requirement description feature vector and the region of interest description feature vector, and the key maximum-value features of the two vectors are regressed from them. In this way, the position-wise saliency distribution of the feature values can be promoted based on the concept of farthest-point sampling, and sparse correspondence control between the feature vectors is performed through the key features with saliency distribution, thereby restoring the original manifold geometry of the correction feature vector V_c relative to the image positioning requirement description feature vector and the region of interest description feature vector. Fusing the correction feature vector V_c with the semantic matching feature vector therefore improves the expressive effect of the semantic matching feature vector, and in turn the accuracy of the classification results obtained by the classifier.
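As one hedged reading of the correction step, the combination of operations below (position-wise square root, scaling by the reciprocal of the maximum feature value, and weighted vector subtraction) is inferred from the symbol descriptions only, since the original formula image is not reproduced in the text; the dimension and the weight values are illustrative assumptions:

```python
import numpy as np

def correction_vector(v1, v2, alpha=0.5, beta=0.5):
    """Hedged sketch of V_c = alpha*sqrt(V1max^-1 (.) V1)
    (-) beta*sqrt(V2max^-1 (.) V2); assumes the two vectors share a
    dimension (e.g. after projection) and have non-negative entries
    so that the position-wise square root is defined."""
    v1, v2 = np.abs(v1), np.abs(v2)          # guard the square root
    a = np.sqrt(v1 * (1.0 / v1.max()))       # sqrt(V1max^-1 (.) V1)
    b = np.sqrt(v2 * (1.0 / v2.max()))       # sqrt(V2max^-1 (.) V2)
    return alpha * a - beta * b              # weighted subtraction

rng = np.random.default_rng(1)
v_c = correction_vector(rng.random(64), rng.random(64))
print(v_c.shape)  # (64,)
```

Scaling each vector by the reciprocal of its maximum value normalizes both to the unit range before the square root, which is consistent with the stated goal of binding the two feature vectors to a common saliency scale.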
Further, in step S1522, passing the plurality of corrected semantic matching feature vectors through a classifier to obtain a plurality of classification results, where each classification result is used to indicate whether the matching degree exceeds a predetermined threshold, including: performing full-connection coding on the corrected semantic matching feature vectors by using a full-connection layer of the classifier to obtain a plurality of coding classification feature vectors; and respectively inputting the plurality of coding classification feature vectors into a Softmax classification function of the classifier to obtain a plurality of classification results.
It should be appreciated that the role of the classifier is to learn classification rules from given, labeled training data and then classify (or predict) unknown data. Logistic regression, SVM, and similar methods are commonly used to solve binary classification problems. For multi-class classification, logistic regression or SVM can also be used, but multiple binary classifiers must then be composed into a multi-class classifier, which is error-prone and inefficient; the commonly used multi-class method is the Softmax classification function.
It should be noted that full-connection encoding (Fully Connected Encoding) refers to the process of encoding input data through a fully connected layer. Fully connected layers are a common layer type in neural networks, in which each neuron is connected to all neurons of the previous layer. In full-connection encoding, each feature of the input data is connected to each neuron in the fully connected layer, and the neuron's output is computed through a combination of weights and biases. The function of full-connection encoding is to transform the input data into a higher-level representation so as to extract richer features from the data. Complex relationships and patterns in the input data can be captured through the nonlinear transformations and parameter learning of the fully connected layer. By inputting the corrected semantic matching feature vector into the fully connected layer for encoding, the feature vector can be converted into a coding classification feature vector with greater expressive capacity. After the coding classification feature vectors pass through a Softmax classification function, a plurality of classification results can be obtained, each indicating whether the matching degree exceeds a predetermined threshold. By combining full-connection encoding with the classifier, more accurate classification and determination of semantic matches can be made.
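A minimal sketch of full-connection encoding followed by a Softmax classifier, assuming random (untrained) weights and illustrative dimensions; in practice the weights would be learned from labeled samples as described above:

```python
import numpy as np

def softmax(z):
    """Numerically stable Softmax over class logits."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(v, W_fc, b_fc, W_out, b_out):
    """Full-connection encoding with a ReLU activation, then Softmax
    over the two classes (match / no match), as in step S1522."""
    h = np.maximum(0.0, W_fc @ v + b_fc)   # coding classification feature vector
    return softmax(W_out @ h + b_out)      # class probabilities

rng = np.random.default_rng(2)
D, H = 64, 32                              # assumed dimensions
W_fc, b_fc = rng.standard_normal((H, D)) * 0.1, np.zeros(H)
W_out, b_out = rng.standard_normal((2, H)) * 0.1, np.zeros(2)

probs = classify(rng.standard_normal(D), W_fc, b_fc, W_out, b_out)
print(probs.shape)  # (2,)
```

The Softmax output sums to one, so the "match" probability can be compared directly against the predetermined threshold of step S1523.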
In summary, the method for positioning a natural resource remote sensing mapping image according to the embodiment of the application has been explained. According to the semantic information contained in the image positioning requirement description input by a user, image areas matching that description can be retrieved from massive remote sensing image data, thereby realizing intelligent positioning.
Fig. 6 is a block diagram of a natural resource remote sensing mapping image positioning system 100 according to an embodiment of the application. As shown in fig. 6, a system 100 for positioning a remote-sensing mapping image of natural resources according to an embodiment of the present application includes: the remote sensing image data acquisition module 110 is configured to acquire remote sensing image data; the interesting feature description module 120 is configured to perform interesting region-based feature description on the remote sensing image data to obtain a plurality of interesting region description feature vectors; an image positioning requirement description acquiring module 130, configured to acquire an image positioning requirement description; the semantic coding module 140 is configured to perform semantic coding on the image positioning requirement description to obtain an image positioning requirement description feature vector; and a positioning result analysis module 150, configured to determine a positioning result based on the plurality of region of interest description feature vectors and the image positioning requirement description feature vector.
In one example, in the above-mentioned natural resource remote sensing mapping image positioning system 100, the interesting feature description module 120 includes: the region of interest extraction unit is used for extracting a plurality of regions of interest from the remote sensing image data; and a feature extraction description unit for performing feature extraction and feature description on the plurality of regions of interest by using a deep learning network model to obtain a plurality of region-of-interest description feature vectors.
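The decomposition of the feature-of-interest description module 120 into an extraction unit and a description unit can be sketched structurally in plain Python; the class name, stub functions, and toy "feature vector" are assumptions for illustration only, standing in for the deep-learning components:

```python
# Structural sketch of module 120: a region-of-interest extraction unit
# feeding a feature extraction/description unit.
class FeatureOfInterestDescriptionModule:
    def __init__(self, extract_rois, describe):
        self.extract_rois = extract_rois   # region-of-interest extraction unit
        self.describe = describe           # feature extraction/description unit

    def __call__(self, image_data):
        # One description feature vector per extracted region of interest.
        return [self.describe(roi) for roi in self.extract_rois(image_data)]

# Stub stand-ins for the deep-learning components (assumptions):
rois_of = lambda img: img["rois"]          # would segment real imagery
descriptor = lambda roi: [len(roi)]        # would run the CNN descriptor

module_120 = FeatureOfInterestDescriptionModule(rois_of, descriptor)
vectors = module_120({"rois": ["abc", "de"]})
print(vectors)  # [[3], [2]]
```

Keeping extraction and description as separately injected callables mirrors the two-unit split in the system claim and lets either stage be swapped (e.g. for a different CNN descriptor) without touching the module.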
Here, it will be understood by those skilled in the art that the specific functions and operations of the respective modules in the above-described natural resource remote sensing survey image positioning system 100 have been described in detail in the above description of the natural resource remote sensing survey image positioning method with reference to fig. 1 to 5, and thus, repetitive descriptions thereof will be omitted.
As described above, the system 100 for positioning a remote-sensing mapping image of natural resources according to the embodiment of the present application may be implemented in various wireless terminals, such as a server on which a natural resource remote sensing mapping image positioning algorithm is deployed. In one example, the natural resource remote sensing mapping image positioning system 100 according to embodiments of the present application may be integrated into a wireless terminal as a software module and/or hardware module. For example, the system 100 may be a software module in the operating system of the wireless terminal, or may be an application developed for the wireless terminal; of course, the system 100 can also be one of the many hardware modules of the wireless terminal.
Alternatively, in another example, the natural resource remote sensing mapping image positioning system 100 and the wireless terminal may be separate devices, and the system 100 may be connected to the wireless terminal via a wired and/or wireless network and communicate interactive information in an agreed data format.
Fig. 7 is an application scenario diagram of a positioning method for remote sensing mapping images of natural resources according to an embodiment of the present application. As shown in fig. 7, in this application scenario, first, remote sensing image data (for example, D1 illustrated in fig. 7) and an image positioning requirement description (for example, D2 illustrated in fig. 7) are acquired, and then, the remote sensing image data and the image positioning requirement description are input to a server (for example, S illustrated in fig. 7) where a natural resource remote sensing mapping image positioning algorithm is deployed, wherein the server can process the remote sensing image data and the image positioning requirement description using the natural resource remote sensing mapping image positioning algorithm to obtain a plurality of classification results for indicating whether the matching degree exceeds a predetermined threshold.
According to another aspect of the present application there is also provided a non-volatile computer readable storage medium having stored thereon computer readable instructions which when executed by a computer can perform a method as described above.
Program portions of the technology may be considered to be "products" or "articles of manufacture" in the form of executable code and/or associated data, embodied in or carried by a computer-readable medium. A tangible, persistent storage medium may include any memory or storage used by a computer, processor, or similar device or related module, such as various semiconductor memories, tape drives, or disk drives capable of providing storage functionality for software.
All or a portion of the software may sometimes communicate over a network, such as the internet or other communication network. Such communication may load software from one computer device or processor to another. Unless limited to a tangible "storage" medium, other terms used herein to refer to a computer or machine "readable medium" mean any medium that participates in the execution of any instructions by a processor.
Furthermore, those skilled in the art will appreciate that the various aspects of the application are illustrated and described in the context of a number of patentable categories or circumstances, including any novel and useful procedures, machines, products, or materials, or any novel and useful modifications thereof. Accordingly, aspects of the application may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.) or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," module, "" engine, "" unit, "" component, "or" system. Furthermore, aspects of the application may take the form of a computer product, comprising computer-readable program code, embodied in one or more computer-readable media.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The foregoing is illustrative of the present application and is not to be construed as limiting thereof. Although a few exemplary embodiments of this application have been described, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of this application. Accordingly, all such modifications are intended to be included within the scope of this application as defined in the following claims. It is to be understood that the foregoing is illustrative of the present application and is not to be construed as limited to the specific embodiments disclosed, and that modifications to the disclosed embodiments, as well as other embodiments, are intended to be included within the scope of the appended claims. The application is defined by the claims and their equivalents.
Claims (8)
1. A positioning method for a natural resource remote sensing mapping image is characterized by comprising the following steps:
Acquiring remote sensing image data;
performing feature description based on the region of interest on the remote sensing image data to obtain a plurality of region of interest description feature vectors;
acquiring an image positioning requirement description;
Carrying out semantic coding on the image positioning requirement description to obtain an image positioning requirement description feature vector; and
Determining a positioning result based on the plurality of region of interest describing feature vectors and the image positioning demand describing feature vector;
determining a positioning result based on the plurality of region of interest description feature vectors and the image positioning demand description feature vector, comprising:
fusing the image positioning demand description feature vector and each region of interest description feature vector in the plurality of region of interest description feature vectors to obtain a plurality of semantic matching feature vectors;
determining the positioning result conforming to the image positioning requirement description based on the plurality of semantic matching feature vectors;
determining the positioning result that meets the image positioning requirement description based on the plurality of semantic matching feature vectors comprises:
performing feature distribution correction on the plurality of semantic matching feature vectors to obtain a plurality of corrected semantic matching feature vectors;
passing the plurality of corrected semantic matching feature vectors through a classifier to obtain a plurality of classification results, wherein each classification result is used for indicating whether the matching degree exceeds a preset threshold value;
taking each region of interest corresponding to each corrected semantic matching feature vector with the matching degree exceeding a preset threshold value as a positioning region to obtain a plurality of positioning regions;
taking the plurality of positioning areas as the positioning result;
Performing feature distribution correction on the plurality of semantic matching feature vectors to obtain a plurality of corrected semantic matching feature vectors includes: calculating a plurality of correction feature vectors from the image positioning requirement description feature vector and each region of interest description feature vector of the plurality of region of interest description feature vectors according to the following correction formula; wherein the correction formula is:

V_c = α·√(V_1max^(-1) ⊙ V_1) ⊖ β·√(V_2max^(-1) ⊙ V_2)

wherein V_1 is the image positioning requirement description feature vector, V_2 is each region of interest description feature vector of the plurality of region of interest description feature vectors, √(·) denotes the position-wise square root of a feature vector, V_1max^(-1) and V_2max^(-1) are the reciprocals of the maximum feature values of V_1 and V_2, respectively, α and β are weight hyperparameters, ⊙ denotes position-wise multiplication, ⊖ denotes vector subtraction, and V_c is each of the plurality of correction feature vectors; and respectively fusing the plurality of semantic matching feature vectors and the plurality of correction feature vectors to obtain the plurality of corrected semantic matching feature vectors.
2. The method of claim 1, wherein performing region-of-interest-based feature description on the remote sensing image data to obtain a plurality of region-of-interest-descriptive feature vectors, comprises:
Extracting a plurality of regions of interest from the remote sensing image data; and
And carrying out feature extraction and feature description on the plurality of regions of interest by using a deep learning network model to obtain a plurality of region of interest description feature vectors.
3. The natural resource remote sensing mapping image positioning method according to claim 2, wherein the deep learning network model is a feature descriptor based on a convolutional neural network model;
The feature descriptor based on the convolutional neural network model comprises an input layer, a convolutional layer, an activation layer, a pooling layer and an output layer.
4. The method of claim 3, wherein performing feature extraction and feature description on the plurality of regions of interest using a deep learning network model to obtain the plurality of region of interest description feature vectors, comprises:
and respectively passing the plurality of regions of interest through the feature descriptors based on the convolutional neural network model to obtain the plurality of region of interest description feature vectors.
5. The method of claim 4, wherein fusing the image positioning demand description feature vector and each of the plurality of region of interest description feature vectors to obtain a plurality of semantic matching feature vectors, comprises:
And respectively fusing the image positioning requirement description feature vector and each region of interest description feature vector in the plurality of region of interest description feature vectors by using a projection layer to obtain the plurality of semantic matching feature vectors.
6. The method of claim 5, wherein using a projection layer to respectively fuse the image positioning demand description feature vector and each of the plurality of region of interest description feature vectors to obtain the plurality of semantically matched feature vectors, comprises:
Respectively fusing the image positioning demand description feature vector and each of the plurality of interest region description feature vectors by using the following projection formula to obtain a plurality of semantic matching feature vectors;
wherein the projection formula is:

V_f = Proj([V_1; V_2])

wherein V_f is the semantic matching feature vector, V_1 is the image positioning requirement description feature vector, V_2 is each region of interest description feature vector, [·;·] denotes the cascade (concatenation) of vectors, and Proj(·) denotes the projection mapping of the vector.
7. A natural resource remote sensing mapping image positioning system for performing the natural resource remote sensing mapping image positioning method as set forth in claim 1, comprising:
the remote sensing image data acquisition module is used for acquiring remote sensing image data;
the interesting feature description module is used for carrying out interesting region-based feature description on the remote sensing image data so as to obtain a plurality of interesting region description feature vectors;
the image positioning requirement description acquisition module is used for acquiring image positioning requirement description;
the semantic coding module is used for carrying out semantic coding on the image positioning requirement description to obtain an image positioning requirement description feature vector; and
And the positioning result analysis module is used for determining a positioning result based on the plurality of region-of-interest description feature vectors and the image positioning requirement description feature vector.
8. The natural resource remote sensing mapping image localization system of claim 7, wherein the feature of interest description module comprises:
The region of interest extraction unit is used for extracting a plurality of regions of interest from the remote sensing image data; and
And the feature extraction and description unit is used for carrying out feature extraction and feature description on the multiple regions of interest by using a deep learning network model so as to obtain multiple region of interest description feature vectors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311723277.2A CN117705059B (en) | 2023-12-14 | 2023-12-14 | Positioning method and system for remote sensing mapping image of natural resource |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311723277.2A CN117705059B (en) | 2023-12-14 | 2023-12-14 | Positioning method and system for remote sensing mapping image of natural resource |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117705059A CN117705059A (en) | 2024-03-15 |
CN117705059B true CN117705059B (en) | 2024-09-17 |
Family
ID=90149303
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311723277.2A Active CN117705059B (en) | 2023-12-14 | 2023-12-14 | Positioning method and system for remote sensing mapping image of natural resource |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117705059B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118230175B (en) * | 2024-05-23 | 2024-08-13 | 济南市勘察测绘研究院 | Real estate mapping data processing method and system based on artificial intelligence |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115630236A (en) * | 2022-10-19 | 2023-01-20 | 感知天下(北京)信息科技有限公司 | Global fast retrieval positioning method of passive remote sensing image, storage medium and equipment |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102073748B (en) * | 2011-03-08 | 2012-07-25 | 武汉大学 | Visual keyword based remote sensing image semantic searching method |
CN107563438B (en) * | 2017-08-31 | 2019-08-30 | 西南交通大学 | A kind of multi-modal Remote Sensing Images Matching Method and system of fast robust |
US20200401617A1 (en) * | 2019-06-21 | 2020-12-24 | White Raven Ltd | Visual positioning system |
CN112766199B (en) * | 2021-01-26 | 2022-04-29 | 武汉大学 | Hyperspectral image classification method based on self-adaptive multi-scale feature extraction model |
CN114972737B (en) * | 2022-06-08 | 2024-03-15 | 湖南大学 | Remote sensing image target detection system and method based on prototype contrast learning |
CN117218201A (en) * | 2023-10-11 | 2023-12-12 | 中国人民解放军战略支援部队信息工程大学 | Unmanned aerial vehicle image positioning precision improving method and system under GNSS refusing condition |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115630236A (en) * | 2022-10-19 | 2023-01-20 | 感知天下(北京)信息科技有限公司 | Global fast retrieval positioning method of passive remote sensing image, storage medium and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN117705059A (en) | 2024-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110837846B (en) | Image recognition model construction method, image recognition method and device | |
CN110569901A (en) | Channel selection-based countermeasure elimination weak supervision target detection method | |
CN104504366A (en) | System and method for smiling face recognition based on optical flow features | |
CN117705059B (en) | Positioning method and system for remote sensing mapping image of natural resource | |
JP6892606B2 (en) | Positioning device, position identification method and computer program | |
CN109800815B (en) | Training method, wheat recognition method and training system based on random forest model | |
CN114511735A (en) | Hyperspectral image classification method and system of cascade empty spectral feature fusion and kernel extreme learning machine | |
CN117351659B (en) | Hydrogeological disaster monitoring device and monitoring method | |
Hoang | Classification of asphalt pavement cracks using Laplacian pyramid‐based image processing and a hybrid computational approach | |
CN115359248A (en) | Robot navigation obstacle avoidance method and system based on meta-learning | |
CN114036326B (en) | Image retrieval and classification method, system, terminal and storage medium | |
Kokilambal | Intelligent content based image retrieval model using adadelta optimized residual network | |
WO2024078112A1 (en) | Method for intelligent recognition of ship outfitting items, and computer device | |
Cui et al. | Global context dependencies aware network for efficient semantic segmentation of fine-resolution remoted sensing images | |
CN116704378A (en) | Homeland mapping data classification method based on self-growing convolution neural network | |
CN115994242A (en) | Image retrieval method, device, equipment and storage medium | |
CN115953584A (en) | End-to-end target detection method and system with learnable sparsity | |
CN108154107A (en) | A kind of method of the scene type of determining remote sensing images ownership | |
Tang et al. | A segmentation map difference-based domain adaptive change detection method | |
Bi et al. | CASA-Net: a context-aware correlation convolutional network for scale-adaptive crack detection | |
Jun et al. | Two-view correspondence learning via complex information extraction | |
CN110942179A (en) | Automatic driving route planning method and device and vehicle | |
CN118298213B (en) | Small sample image classification method based on text prompt weighted aggregation | |
Anandan et al. | Prediction Of Soil Texture Using Convolution Neural Network with Enhanced Regression Model | |
CN117078604B (en) | Unmanned laboratory intelligent management method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||