CN116246161A - Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge - Google Patents


Info

Publication number
CN116246161A
Authority
CN
China
Prior art keywords: target, image, remote sensing, identified, sensing image
Prior art date
Legal status: Pending
Application number
CN202211559725.5A
Other languages
Chinese (zh)
Inventor
赵理君
胡昌苗
李宏益
霍连志
唐娉
Current Assignee
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority to CN202211559725.5A
Publication of CN116246161A
Legal status: Pending (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/10: Terrestrial scenes
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/54: Extraction of image or video features relating to texture
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75: Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/751: Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07: Target detection

Abstract

The invention provides a method and a device for identifying the target fine type of a remote sensing image under the guidance of domain knowledge. The method comprises the following steps: acquiring an original remote sensing image to be identified, and performing multi-dimensional feature extraction on the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image; performing feature extraction and fusion with a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image; identifying an initial category of the target to be identified based on the fused feature image; and acquiring a target knowledge template base corresponding to the initial category, and identifying the fine type of the target to be identified according to the target knowledge template base. Multi-dimensional features are extracted from the original remote sensing image and fused, target identification is performed on the fused features, and the preliminary identification result is refined and reconfirmed with prior knowledge, so that the identification accuracy of the target is improved and the requirement for high-precision identification of the fine target type is met.

Description

Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge
Technical Field
The invention relates to the technical field of remote sensing image recognition, in particular to a method and a device for recognizing a target fine type of a remote sensing image under the guidance of domain knowledge.
Background
Object detection and recognition has become one of the important tasks of high-resolution remote sensing image interpretation; it refers to the process of detecting objects of categories of interest in a remote sensing image, where the information to be detected and recognized includes the position and category of each object. With the rapid development of artificial intelligence technology, target detection and recognition has shifted from automation towards intelligence, giving rise to target recognition methods based on deep learning. Such methods do not require the expert knowledge and experience needed for hand-crafted feature design; they automatically learn how to extract features from data, gradually combining low-level features into mid-level (more abstract and semantically meaningful) features, obtain feature representations with good generalization, strong robustness and high discriminability, and can provide an effective framework for target extraction from images.
Although deep-learning-based target recognition methods perform remarkably well on many open-source object detection datasets, for remote sensing images the recognition accuracy remains low because of factors such as large imaging scenes, low target occurrence frequency, pronounced scale effects, large differences in observation angle, large intra-class differences and high inter-class similarity, and it cannot fully meet the requirements of current practical applications. In this situation, detecting and identifying targets purely from image features cannot meet practical needs; in particular, for the recognition of more refined types (such as specific models), more external knowledge input and prior constraints need to be considered to improve the accuracy, rationality and usability of the final result, and a method for fine-type recognition of remote sensing image targets under the guidance of domain knowledge needs to be studied.
Disclosure of Invention
The invention provides a method and a device for identifying the target fine type of a remote sensing image under the guidance of domain knowledge, which are used for overcoming the defect in the prior art that target identification accuracy based on remote sensing images is insufficient and cannot meet high-precision detection requirements.
The invention provides a remote sensing image target fine type identification method under the guidance of domain knowledge, which comprises the following steps:
acquiring an original remote sensing image to be identified, and extracting multidimensional features of the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image;
performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
identifying an initial category of a target to be identified in the original remote sensing image based on the fusion characteristic image;
and acquiring a target knowledge template base corresponding to the initial category, and identifying the fine type of the target to be identified according to the target knowledge template base.
In one embodiment, the step of performing feature extraction and fusion of the multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image includes:
According to a sliding window with a preset size and a preset sliding step length, carrying out overlapped grid division processing on the original remote sensing image to obtain a plurality of grid images corresponding to the original remote sensing image; the edge feature image comprises edge feature subgraphs corresponding to the grid images, and the texture feature image comprises texture feature subgraphs corresponding to the grid images;
inputting a target grid image, an edge feature subgraph corresponding to the target grid image and a texture feature subgraph corresponding to the target grid image into a pre-trained multi-stream convolutional neural network model, and carrying out feature extraction and fusion of the multi-stream convolutional neural network on the target grid image, the edge feature subgraph corresponding to the target grid image and the texture feature subgraph corresponding to the target grid image by utilizing the multi-stream convolutional neural network model to obtain a fused feature image of the original remote sensing image; the target grid image is any one of the plurality of grid images;
the multi-stream convolutional neural network model comprises a plurality of convolutional neural networks with shared weights and an attention mechanism, wherein the plurality of convolutional neural networks comprise a first convolutional neural network, a second convolutional neural network and a third convolutional neural network;
the input of the first convolutional neural network is the grid image, the input of the second convolutional neural network is the edge feature subgraph, and the input of the third convolutional neural network is the texture feature subgraph; the attention mechanism is used for carrying out feature fusion on the outputs of the convolutional neural networks.
In one embodiment, before the feature extraction and fusion of the multi-stream convolutional neural network are performed on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image, the method further includes:
generating a fine classification system according to each preset category; acquiring an initial remote sensing image under the fine classification system; the initial remote sensing image comprises classification targets under each preset category;
acquiring annotation information of the initial remote sensing image, performing overlapped clipping processing on the initial remote sensing image based on the annotation information, and determining a sample image containing the classification target from the clipping image;
constructing a first sample data set based on the sample image, and performing iterative training on a preset basic multi-stream convolutional neural network model by using the first sample data set;
Constructing a target template image corresponding to the fine classification system in the target knowledge template base based on the image blocks corresponding to the classification targets in the sample images under the preset categories;
performing binarization processing on the average value image of the image block corresponding to the classification target in the sample image under each preset category to construct a target template binarization image corresponding to the fine classification system in the target knowledge template base; the target knowledge template library comprises target template images and target template binarization images under the preset categories, and attribute information and global geographic space distribution range information of classification targets corresponding to the target template images and the target template binarization images;
performing rotation augmentation processing on a target image block in the target template image according to a preset angle interval to obtain a plurality of angle images corresponding to the target template image; the target image block is an image block in the target template image whose classification recognition result obtained through the multi-stream convolutional neural network model is inconsistent with the true value;
constructing a second sample dataset based on the target template image and the angle image;
And performing iterative training on a preset basic fine classification model by using the second sample data set.
In one embodiment, after the identifying the fine type of the target to be identified according to the target knowledge template base, the method further includes:
converting the target pixel coordinates in each grid image into pixel coordinates in the original remote sensing image; the target pixel coordinates are pixel coordinates corresponding to the target to be identified in the grid image;
acquiring projection parameter information of the original remote sensing image, and determining geospatial position information of the target to be identified according to the projection parameter information and the pixel coordinates;
determining a global geospatial distribution range of the target to be identified based on the target knowledge template library, and determining whether the geospatial position of the target to be identified is within the global geospatial distribution range according to the geospatial position information and the global geospatial distribution range;
if yes, performing de-duplication stitching processing on each grid image based on a preset non-maximum suppression algorithm.
In one embodiment, the step of identifying the initial category of the object to be identified in the original remote sensing image based on the fused feature image includes:
determining an enclosing rectangle of the target to be identified in the original remote sensing image and a long-direction included angle of the enclosing rectangle according to the fused feature image; the long-direction included angle is the angle between the long side of the enclosing rectangle and the calibration direction;
performing rotation and cropping on the image block covered by the enclosing rectangle according to the long-direction included angle and the calibration direction to obtain a target image containing the target to be identified;
inputting the target image into a pre-trained target fine classification model to obtain a class prediction value output by the target fine classification model; the category predicted value is a probability value that a target to be identified in the original remote sensing image is a preset category;
and sorting the category predicted values in a descending order, and selecting a preset number of categories as initial categories of the targets to be identified according to the sorting order.
In one embodiment, the initial category includes a plurality of categories, and the step of identifying the fine type of the target to be identified based on the target knowledge template library includes:
acquiring a priori target in the target knowledge template library;
determining the size mean deviation degree of the target to be identified and the prior target according to the spatial resolution of the original remote sensing image;
Screening the initial category according to the size mean deviation degree, and determining a first target category of the target to be identified;
performing binarization processing on the target image to obtain a target binarization image corresponding to the target image;
calculating the similarity between the target to be identified and the prior target in the first template binarization image based on the target binarization image; the first template binarization image is a template binarization image corresponding to the first target category in the target knowledge template base;
and determining the fine type of the target to be identified according to the similarity.
In one embodiment, the step of calculating the similarity of the object to be identified and the prior object in the first template binarized image based on the object binarized image comprises:
based on a difference hash algorithm, respectively calculating a target hash value of the target binarized image and a template hash value of the first template binarized image;
calculating the Hamming distance between the target hash value and the template hash value;
and determining the similarity between the target to be identified and the prior target in the first template binarized image according to the Hamming distance.
The invention also provides a device for identifying the target fine type of the remote sensing image under the guidance of domain knowledge, which comprises:
the characteristic extraction module is used for acquiring an original remote sensing image to be identified, and extracting multidimensional characteristics of the original remote sensing image to obtain an edge characteristic image and a texture characteristic image of the original remote sensing image;
the feature fusion module is used for carrying out feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
the first identification module is used for identifying the initial category of the target to be identified in the original remote sensing image based on the fusion characteristic image;
the second recognition module is used for obtaining a target knowledge template base corresponding to the initial category and recognizing the fine type of the target to be recognized according to the target knowledge template base.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the program, implements the steps of any one of the above methods for identifying the target fine type of a remote sensing image under the guidance of domain knowledge.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any one of the above methods for identifying the target fine type of a remote sensing image under the guidance of domain knowledge.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of any one of the above methods for identifying the target fine type of a remote sensing image under the guidance of domain knowledge.
According to the method and the device for identifying the target fine type of the remote sensing image under the guidance of domain knowledge, the edge feature image and the texture feature image of the original remote sensing image are obtained by performing multi-dimensional feature extraction on the original remote sensing image; feature extraction and fusion with a multi-stream convolutional neural network are performed on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image; the initial category of the target to be identified in the original remote sensing image is identified based on the fused feature image; and a target knowledge template base corresponding to the initial category is acquired, and the fine type of the target to be identified is identified according to the acquired target knowledge template base. Multi-dimensional features such as edge features and texture features are extracted from the original remote sensing image and fused, preliminary identification of the target is performed on the fused features, and the preliminary identification result is refined and reconfirmed with the prior knowledge in the target knowledge template base, so that the identification accuracy of the target is improved and the requirement for high-precision identification of the fine target type is met.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a remote sensing image target fine type recognition method under the guidance of domain knowledge;
FIG. 2 is a second flow chart of the method for identifying the target fine type of the remote sensing image under the guidance of domain knowledge;
FIG. 3 is a schematic structural diagram of a remote sensing image target fine type recognition device under the guidance of domain knowledge;
fig. 4 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The method and the device for identifying the target fine type of the remote sensing image under the guidance of domain knowledge are described below with reference to fig. 1 to 4.
The method for identifying the target fine type of a remote sensing image under the guidance of domain knowledge is used for identifying targets of interest in remote sensing images. Existing target identification methods are based on deep learning models and perform identification purely by means of features extracted from the image. In the method provided by the embodiments of the invention, feature information of different dimensions of the remote sensing image is extracted and used as separate model inputs, and after the model produces its output the result is further confirmed and corrected by means of target prior knowledge, so that the accuracy and rationality of target fine type recognition are improved.
Specifically, referring to fig. 1, fig. 1 is a flow chart of a method for identifying a target fine type of a remote sensing image under guidance of domain knowledge provided by the embodiment of the present invention, based on fig. 1, the method for identifying a target fine type of a remote sensing image under guidance of domain knowledge provided by the embodiment of the present invention includes:
step 100, obtaining an original remote sensing image to be identified, and extracting multidimensional features of the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image;
when target identification is carried out, an original remote sensing image to be identified is firstly obtained, and multidimensional feature extraction is carried out on the obtained original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image. The acquired original remote sensing image is a high-resolution remote sensing image; the edge feature image comprises edge features of the original remote sensing image, the texture feature image comprises texture features of the original remote sensing image, and the edge features and the texture features are obtained by extracting features of the original remote sensing image from different dimensions based on different feature extraction modes.
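The specific edge and texture operators are not fixed by the description. As one illustrative sketch (not the claimed method), Sobel gradient magnitude and a local binary pattern can serve as the edge feature image and the texture feature image; the function names and parameters below are assumptions.

    # Illustrative sketch only: Sobel edges and LBP texture stand in for the
    # unspecified edge/texture operators (assumed, not prescribed by the text).
    import cv2
    import numpy as np
    from skimage.feature import local_binary_pattern

    def extract_multidim_features(image_path):
        """Return (gray, edge_feature_image, texture_feature_image) for one scene."""
        gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        # Edge feature image: gradient magnitude from Sobel filters.
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
        edge = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
        # Texture feature image: uniform LBP codes rescaled to 8 bits.
        lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
        texture = np.uint8(255 * lbp / max(lbp.max(), 1))
        return gray, edge, texture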
Step 200, performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
For feature fusion, the intermediate feature outputs of the multi-stream convolutional neural network for the original remote sensing image, the edge feature image and the texture feature image are combined based on an attention mechanism; that is, the output features of the intermediate network layers corresponding to the original remote sensing image, the edge feature image and the texture feature image are fused through the attention mechanism to obtain the fused feature image corresponding to the original remote sensing image.
Step 300, identifying an initial category of a target to be identified in the original remote sensing image based on the fusion characteristic image;
The initial category of the target to be identified in the original remote sensing image is identified based on the fused feature image. A target to be identified in the remote sensing image is a target of interest; there may be one or more targets to be identified, and when there are multiple targets their categories may be the same or different. Likewise, the initial category of a target to be identified may include one or more categories.
Step 400, obtaining a target knowledge template base corresponding to the initial category, and identifying the fine type of the target to be identified according to the target knowledge template base.
Based on the identified initial category of the target to be identified, a target knowledge template base corresponding to the initial category is obtained, the target knowledge template base is priori knowledge of the target to be identified, the target knowledge template base contains information such as fine category, size and the like of the target to be identified, and the initial category of the target to be identified can be further confirmed according to the target knowledge template base.
It can be understood that when classifying objects, the objects can be classified into categories with different precision, such as major categories, minor categories, and the like, according to the difference between different objects. The finer the classification of the targets, the fewer the distinguishing features among the targets of different categories, and the initial category of the target to be identified is further confirmed based on priori knowledge in the target knowledge template base, so that the fine type of the target to be identified can be identified, and the classification precision of the target to be identified is improved.
In this embodiment, the edge feature image and the texture feature image of the original remote sensing image are obtained by performing multi-dimensional feature extraction on the original remote sensing image; performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image; identifying an initial category of a target to be identified in an original remote sensing image based on the fusion characteristic image; and acquiring a target knowledge template base corresponding to the initial category, and identifying the target to be identified according to the acquired target knowledge template base. The method comprises the steps of extracting edge features and texture features of an original remote sensing image, learning and fusing features of a multi-stream convolutional neural network, primarily identifying a target based on the fused features, refining and reconfirming a primary identification result by combining prior knowledge of a knowledge template base, so that accuracy of target fine type identification is improved, and high-precision detection requirements of fine types of targets to be identified are met.
Further, in step 300, the identifying the initial category of the target to be identified in the original remote sensing image based on the fused feature image specifically includes:
step 301, determining an enclosing rectangle of the target to be identified in the original remote sensing image and a long-direction included angle of the enclosing rectangle according to the fused feature image; the long-direction included angle is the angle between the long side of the enclosing rectangle and the calibration direction;
step 302, performing rotation and cropping on the image block covered by the enclosing rectangle according to the long-direction included angle and the calibration direction to obtain a target image containing the target to be identified;
step 303, inputting the target image into a pre-trained target fine classification model to obtain a class prediction value output by the target fine classification model; the category predicted value is a probability value that a target to be identified in the original remote sensing image is a preset category;
and step 304, sorting the category predicted values in a descending order, and selecting a preset number of categories as initial categories of the targets to be identified according to the sorting order.
When identifying the initial category of the target to be identified in the original remote sensing image, the enclosing rectangle of the target to be identified and the long-direction included angle of the enclosing rectangle are determined according to the bounding box and category information of the target position detected in the fused feature image. The long-direction included angle of the enclosing rectangle is the angle between its long side and the calibration direction; the calibration direction is a set geographic direction, such as due north. The long-direction included angle may be measured clockwise or counterclockwise from the calibration direction, which is not specifically limited here.
The enclosing rectangle is subjected to direction normalization and cropping according to its long-direction included angle to obtain a target image containing the target to be identified. Specifically, according to the long-direction included angle and the calibration direction, the enclosing rectangle is rotated so that its long side points in the calibration direction, which normalizes its direction, and the rotated enclosing rectangle is then cropped to obtain the target image containing the target to be identified.
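For illustration only, the following sketch shows one way to implement this direction normalization and cropping with OpenCV; the rotated-rectangle parameters and angle convention are assumptions and not part of the claimed method.

    # Sketch: rotate the scene so the long side of the enclosing rectangle points in
    # the calibration direction, then crop the target image (conventions assumed).
    import cv2

    def crop_normalized_target(image, center_xy, size_wh, long_angle_deg):
        """center_xy, size_wh: centre and (width, height) of the enclosing rectangle in pixels;
        long_angle_deg: angle between the rectangle's long side and the calibration direction."""
        h_img, w_img = image.shape[:2]
        # Rotate the whole image about the rectangle centre by the long-direction angle.
        rot = cv2.getRotationMatrix2D(center_xy, long_angle_deg, 1.0)
        rotated = cv2.warpAffine(image, rot, (w_img, h_img))
        # After rotation the rectangle is axis-aligned; crop it around the same centre.
        w, h = int(size_wh[0]), int(size_wh[1])
        cx, cy = int(center_xy[0]), int(center_xy[1])
        x0, y0 = max(cx - w // 2, 0), max(cy - h // 2, 0)
        return rotated[y0:y0 + h, x0:x0 + w]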
The target image is input into the trained target fine classification model to obtain the class prediction values output by the model, where each class prediction value is the probability that the target to be identified belongs to a preset category. That is, the target fine classification model predicts, for each of the plurality of preset categories, the probability that the target to be identified belongs to that category. The class prediction values of the target to be identified are sorted in descending order, and a preset number of categories are selected in that order as the initial categories of the target to be identified; for example, the K categories with the highest prediction values are selected. When there are multiple targets to be identified, the top K categories are selected separately for each target according to its class prediction values; these K categories are the categories that best match the target to be identified.
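As a small illustration of this descending sort and top-K selection (the value of K and the probability source are placeholders, not fixed by the text):

    # Sketch: keep the K preset categories with the highest predicted probability.
    import numpy as np

    def top_k_initial_categories(class_probs, class_names, k=3):
        """class_probs[i] is the predicted probability that the target belongs to class_names[i]."""
        order = np.argsort(class_probs)[::-1][:k]   # indices of the K largest values, descending
        return [(class_names[i], float(class_probs[i])) for i in order]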
Further, based on the K initial categories of the target to be identified, obtaining target priori knowledge in a target knowledge template base corresponding to the K initial categories, in step 400, identifying a fine type of the target to be identified according to the target knowledge template base, including:
step 401, acquiring a priori targets in the target knowledge template base;
step 402, determining the size mean deviation degree of the target to be identified and the prior target according to the spatial resolution of the original remote sensing image;
step 403, screening the initial category according to the size mean deviation, and determining a first target category of the target to be identified;
step 404, performing binarization processing on the target image to obtain a target binarized image corresponding to the target image;
step 405, calculating the similarity between the target to be identified and the prior target in the first template binary image based on the target binary image; the first template binarization image is a template binarization image corresponding to the first target category in the target knowledge template base;
and step 406, determining the fine type of the target to be identified according to the similarity.
When identifying the fine category of the target to be identified, the prior targets in the target knowledge template base are first obtained, where the prior target represents the prior knowledge of the targets of each initial category. The size mean deviation between the target to be identified and the prior target is determined according to the spatial resolution of the original remote sensing image, the initial categories are screened according to this deviation, and the first target category of the target to be identified is determined. Specifically, based on the target knowledge template base, the category consistency of the K initial categories with the corresponding prior knowledge is confirmed in turn: for the current initial category, using the prior target described in the target knowledge template base and the spatial resolution of the original remote sensing image, the mean deviation between the size of the target to be identified in the current target image (the size including perimeter, aspect ratio and the like) and the size of the prior target is calculated. If the size mean deviation exceeds a preset threshold T1, the current initial category is considered inconsistent with the prior knowledge under that category, and the next initial category is judged; this is repeated until the consistency judgment of the Kth initial category is completed. If all K initial categories are inconsistent, the target to be identified in the target image is considered not to belong to any of the initial categories, and the current identification result of the original remote sensing image is deleted; if some initial category is consistent with the prior knowledge of that category, that category is determined to be the first target category of the target to be identified.
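The text does not give an explicit formula for the size mean deviation; the sketch below assumes it is the mean relative difference of size attributes after converting pixel measurements to metres with the image's spatial resolution, and the threshold value is a placeholder.

    # Sketch of the category-consistency screening; the deviation formula and
    # threshold T1 are assumptions, not the exact definitions used here.
    def size_mean_deviation(target_px, prior_m, gsd_m):
        """target_px: measured size attributes in pixels (e.g. perimeter, length, width);
        prior_m: prior target attributes in metres; gsd_m: spatial resolution in m/pixel."""
        devs = []
        for key, prior_value in prior_m.items():
            measured = target_px[key] * gsd_m      # convert pixel measurement to metres
            devs.append(abs(measured - prior_value) / prior_value)
        return sum(devs) / len(devs)

    def screen_initial_categories(target_px, priors_by_category, gsd_m, t1=0.3):
        """Return the first of the K initial categories whose prior size matches within T1."""
        for category, prior_m in priors_by_category.items():   # iterate in top-K order
            if size_mean_deviation(target_px, prior_m, gsd_m) <= t1:
                return category
        return None   # no consistent category: discard the current recognition result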
After screening the K initial categories, if a first target category meeting the conditions exists, performing binarization processing on a target image containing a target to be identified, and obtaining a target binarized image corresponding to the target image.
Calculating the similarity between the target to be identified and the prior target in the first template binarization image based on the target binarization image and the template binarization image, wherein the target binarization image contains the binarization image of the target to be identified, and the template binarization image contains the binarization image of the prior target; the first template binarization image is a target template binarization image screened from a plurality of target knowledge template bases corresponding to K initial categories based on screening results of the K initial categories. Specifically, the similarity between the target to be identified and the prior target is calculated based on a binary image of the target to be identified in the target binary image and a binary image of the prior target in the template binary image. And according to the similarity between the target to be identified and the prior target, performing similarity matching between the target to be identified and the prior target in the binarized image of the first template, and further determining the fine category of the target to be identified. Specifically, if the similarity between the target to be identified and the prior target in the first template binary image exceeds a preset similarity threshold, the category of the prior target, namely the category corresponding to the first template binary image, is the fine category of the target to be identified; otherwise, if the similarity between the object to be identified and the prior object in the first template binarized image does not exceed the preset similarity threshold, the category of the object to be identified is not matched with the category of the prior object in the first template binarized image.
Further, in step 405, when calculating the similarity between the object to be identified and the prior object in the binarized image of the first template, the method specifically includes:
step 4051, calculating, based on a difference hash algorithm, a target hash value of the target binarized image and a template hash value of the first template binarized image respectively;
step 4052, calculating the hamming distance between the target hash value and the template hash value;
step 4053, determining the similarity between the target to be identified and the prior target in the binarized image of the first template according to the hamming distance.
When calculating the similarity between the target to be identified and the prior target in the first template binarization image, firstly, calculating the target hash value of the target binarization image and the template hash value of the template binarization image respectively based on a difference hash algorithm, calculating the Hamming distance between the target hash value and the template hash value according to the calculated target hash value and the template hash value, and determining the similarity between the target to be identified and the prior target in the first template image according to the calculated Hamming distance. Specifically, if the hamming distance is smaller than a preset threshold T2 (T2 is a positive number), the similarity between the target to be identified and the prior target in the binarized image of the first template meets the condition, the current judgment type result is correct, the first target type is the fine type of the target to be identified, and otherwise, the current identification result is deleted.
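A minimal difference-hash sketch consistent with the above description follows; the 8 x 9 resize and the value of the threshold T2 are common choices assumed here rather than values taken from the text.

    # Sketch: difference hash of a binarized image plus Hamming-distance matching
    # (hash size and threshold T2 are assumed values).
    import cv2

    def dhash(binary_image, hash_size=8):
        resized = cv2.resize(binary_image, (hash_size + 1, hash_size))
        diff = resized[:, 1:] > resized[:, :-1]          # compare horizontally adjacent pixels
        return sum(1 << i for i, bit in enumerate(diff.flatten()) if bit)

    def hamming_distance(hash_a, hash_b):
        return bin(hash_a ^ hash_b).count("1")

    def matches_template(target_bin, template_bin, t2=10):
        """True if the target and first-template binarized images are similar enough."""
        return hamming_distance(dhash(target_bin), dhash(template_bin)) < t2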
In this embodiment, the prior targets in the knowledge template library are used to screen the initial category of the target to be identified, the similarity matching is performed between the target to be identified and the prior targets, and the category of the target to be identified is further confirmed according to the similarity between the target to be identified and the prior targets, so that the accuracy of identifying the fine type of the target to be identified can be improved.
Referring to another flow chart of a remote sensing image target fine type recognition method under the guidance of domain knowledge shown in fig. 2, in a preferred embodiment, recognition of an initial category and a fine category of a target to be recognized is respectively implemented based on different detection models and a target knowledge template library constructed in advance, wherein the target knowledge template library comprises target template images and target template binarized images under each category. Specifically, before detecting and identifying the target to be identified, a corresponding sample data set needs to be constructed for model training, and in this embodiment, the position bounding box of the target to be identified is obtained based on the detection of the pre-trained multi-stream convolutional neural network model, and the initial category of the target to be identified is obtained based on the identification of the pre-trained fine classification model of the target.
In step 200, the edge feature image and the texture feature image corresponding to the original remote sensing image each comprise a plurality of feature subgraphs. Performing feature extraction and fusion of the multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain the fused feature image of the original remote sensing image specifically includes:
step 201, performing overlapped grid division processing on the original remote sensing image according to a sliding window with a preset size and a preset sliding step length to obtain a plurality of grid images corresponding to the original remote sensing image; the edge feature image comprises edge feature subgraphs corresponding to the grid images, and the texture feature image comprises texture feature subgraphs corresponding to the grid images;
step 202, inputting a target grid image, an edge feature sub-image corresponding to the target grid image and a texture feature sub-image corresponding to the target grid image into a multi-stream convolutional neural network model which is pre-trained, and carrying out feature extraction and fusion of the multi-stream convolutional neural network on the target grid image, the edge feature sub-image corresponding to the target grid image and the texture feature sub-image corresponding to the target grid image by utilizing the multi-stream convolutional neural network model to obtain a fused feature image of the original remote sensing image; the target grid image is any one of the plurality of grid images;
The multi-stream convolutional neural network model comprises a plurality of convolutional neural networks with shared weights and an attention mechanism, wherein the plurality of convolutional neural networks comprise a first convolutional neural network, a second convolutional neural network and a third convolutional neural network;
the input of the first convolutional neural network is the grid image, the input of the second convolutional neural network is the edge feature subgraph, and the input of the third convolutional neural network is the texture feature subgraph; the attention mechanism is used for carrying out feature fusion on the outputs of the convolutional neural networks.
The acquired high-resolution original remote sensing image is divided into a plurality of grid images; the edge feature image of the original remote sensing image comprises the edge feature subgraphs corresponding to all grid images, and correspondingly the texture feature image comprises the texture feature subgraphs corresponding to all grid images. The edge and texture features may be extracted either before or after grid division: multi-dimensional feature extraction may be performed on the original remote sensing image first to obtain its edge feature image and texture feature image, which are then divided into grids in the same way as the original remote sensing image to obtain the edge feature subgraph and texture feature subgraph corresponding to each grid image; or the original remote sensing image may be divided into grids first and multi-dimensional feature extraction then performed on each grid image to obtain its edge feature subgraph and texture feature subgraph. The order of multi-dimensional feature extraction and grid division is not limited.
In fig. 2, the original remote sensing image is first divided into a plurality of grid images, and edge feature extraction and texture feature extraction are performed on each grid image to obtain the corresponding edge feature subgraph and texture feature subgraph. Each grid image, together with its edge feature subgraph and texture feature subgraph, is input as the target grid image into the multi-stream convolutional neural network model, which performs feature self-learning on the three inputs. When the original remote sensing image is divided into grids, overlapping grid division is performed with a sliding window of preset size G x G and a preset sliding step g x g, the step being smaller than the window so that adjacent grids overlap; edge features and texture features are extracted from each resulting grid image; each grid image and its corresponding edge feature image and texture feature image are input into the trained multi-stream convolutional neural network model, which performs feature self-learning and feature fusion on them; and the position bounding box and category of the target to be identified are predicted from the fused feature image.
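For illustration, a sketch of such overlapping grid division follows; the window and step values are placeholders.

    # Sketch: overlapping grid division with an assumed window size and step (step < window).
    def overlapping_grids(image, window=1024, step=768):
        """image: numpy array (H, W[, C]). Yield (x0, y0, grid_image) tiles covering the scene;
        border tiles may be smaller than the window (padding is omitted for brevity)."""
        h, w = image.shape[:2]
        for y0 in range(0, max(h - window, 0) + 1, step):
            for x0 in range(0, max(w - window, 0) + 1, step):
                yield x0, y0, image[y0:y0 + window, x0:x0 + window]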
Further, in fig. 2, the multi-stream convolutional neural network model contains a weight-sharing target detection backbone network and an attention mechanism. The weight-sharing backbone is formed by a plurality of convolutional neural networks whose inputs differ, so that each processes different data. Specifically, the plurality of convolutional neural networks include a first, a second and a third convolutional neural network; the input of the first is a target grid image of the original remote sensing image, the input of the second is the edge feature subgraph of the target grid image, and the input of the third is the texture feature subgraph of the target grid image. For any grid image obtained by grid division, the weight-sharing target detection backbone network performs multi-stream feature extraction on the grid image, its edge feature subgraph and its texture feature subgraph to obtain the corresponding feature image 1, feature image 2 and feature image 3. Feature image 1, feature image 2 and feature image 3 are then fused based on the attention mechanism to obtain the corresponding fused feature image.
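A minimal PyTorch sketch of a weight-shared, three-stream backbone with attention-based fusion is given below; the backbone depth, channel counts and the specific channel-attention block are assumptions and do not reproduce the exact architecture described here.

    # Sketch (PyTorch): one backbone shared by the three streams, fused with a simple
    # channel-attention block; architectural details are assumed.
    import torch
    import torch.nn as nn

    class MultiStreamFusion(nn.Module):
        def __init__(self, channels=64):
            super().__init__()
            # Weight sharing: the same backbone instance processes all three streams.
            self.backbone = nn.Sequential(
                nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            )
            # Channel attention over the concatenated stream features.
            self.attention = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(3 * channels, 3 * channels, 1),
                nn.Sigmoid(),
            )
            self.fuse = nn.Conv2d(3 * channels, channels, 1)

        def forward(self, grid_img, edge_sub, texture_sub):
            feats = [self.backbone(x) for x in (grid_img, edge_sub, texture_sub)]
            stacked = torch.cat(feats, dim=1)              # feature images 1, 2 and 3
            weighted = stacked * self.attention(stacked)   # attention-weighted fusion
            return self.fuse(weighted)                     # fused feature image

    # Usage: fused = MultiStreamFusion()(grid, edge, texture), each input of shape (N, 1, H, W).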
The position bounding box and category of the target to be identified in the grid image are predicted from the fused feature image, and the image block covered by the position bounding box is direction-normalized and cropped to obtain a target image containing the target to be identified. The position bounding box is the enclosing rectangle of the target to be identified.
In the schematic diagram of the target fine type recognition flow of the remote sensing image under the guidance of the domain knowledge shown in fig. 2, the method includes a model training stage, specifically, first, a sample data set and a target knowledge template library are constructed, and model training is performed by using the constructed sample data set, and before step 200, the method may further include:
step 001, generating a fine classification system according to each preset category; acquiring an initial remote sensing image under the fine classification system; the initial remote sensing image comprises classification targets under each preset category;
step 002, obtaining the labeling information of the initial remote sensing image, performing overlapping clipping processing on the initial remote sensing image based on the labeling information, and determining a sample image containing the classification target from the clipping image;
step 003, constructing a first sample data set based on the sample image, and performing iterative training on a preset basic multiflow convolutional neural network model by using the first sample data set;
Step 004, constructing a target template image corresponding to the fine classification system in the target knowledge template base based on the image blocks corresponding to the classification targets in the sample images under the preset categories;
step 005, performing binarization processing on the average value image of the image block corresponding to the classification target in the sample image under each preset category to construct a target template binarization image in a target knowledge template base corresponding to the fine classification system; the target knowledge template library comprises target template images and target template binarization images under the preset categories, and attribute information and global geographic space distribution range information of classification targets corresponding to the target template images and the target template binarization images;
step 006, performing rotation augmentation processing on a target image block in the target template image according to a preset angle interval to obtain a plurality of angle images corresponding to the target template image; the target image block is an image block in the target template image whose classification recognition result obtained through the multi-stream convolutional neural network model is inconsistent with the true value;
Step 007, constructing a second sample data set based on the target template image and the angle image;
and 008, performing iterative training on a preset basic fine classification model by using the second sample data set.
When the sample data set is constructed, a fine classification system S is first generated according to the preset categories of interest, and initial remote sensing images of each category under the fine classification system S are acquired. The initial remote sensing images have meter-level or sub-meter-level high resolution and contain the classification targets under each preset category, where a classification target is an object of interest that is to be detected and identified. The annotation information of the initial remote sensing image is then acquired; the annotation labels the category of each classification target, records the four corner coordinates of the classification target in order from head to tail and from left to right, assigns them the fine category, and stores the annotation result in the storage format required for model training. Overlapping cropping is performed on the initial remote sensing image based on the annotation information, sample images containing classification targets are determined from the cropped images, and a first sample data set is constructed based on these sample images. The constructed first sample data set is used to iteratively train the preset basic multi-stream convolutional neural network model.
Further, based on the sample images, a target knowledge template base Z and a second sample data set, i.e. a sample data set TS for training a target fine classification model, are established, respectively. When the target knowledge template base Z is constructed, a target image block and a corresponding binarization image block corresponding to each fine category are generated according to the fine classification system S, and a target template image and a target template binarization image under the fine category are constructed. In the binary image corresponding to each fine category, the pixel value of the region corresponding to the classification target is a first pixel value, for example, 0, and the pixel value of the background region outside the classification target is a second pixel value, for example, 255, wherein the original image corresponding to the binary image is a mean image of the image block of the region covered by the classification target in the sample image. And for the target priori knowledge in the target knowledge template base corresponding to any fine category, the target priori knowledge comprises attribute information and spatial distribution range information of the classification target in the fine category, and the attribute information of the classification target comprises the name, the length, the width and the like of the classification target.
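The binarization operator for the mean image is not specified here; the sketch below assumes Otsu thresholding and a common template size, and follows the 0 (target) / 255 (background) convention described above.

    # Sketch: binarized template from the per-category mean image block (template size
    # and Otsu thresholding are assumptions; the 0/255 convention follows the description).
    import cv2
    import numpy as np

    def build_template(image_blocks, size=(128, 128)):
        """image_blocks: grayscale target image blocks of one fine category."""
        resized = [cv2.resize(block, size) for block in image_blocks]
        mean_img = np.mean(np.stack(resized), axis=0).astype(np.uint8)
        _, binary = cv2.threshold(mean_img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # Per the description the target region should be 0 and the background 255;
        # invert if the (assumed central) target region came out bright.
        if binary[size[1] // 4: 3 * size[1] // 4, size[0] // 4: 3 * size[0] // 4].mean() > 127:
            binary = 255 - binary
        return mean_img, binary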
The sample data set TS is composed of two parts. One part is the image blocks (carrying the target fine category information) of the regions covered by the position bounding boxes of the classification targets in the sample images of the first sample data set, namely the target template images in the target knowledge template base Z. The other part is rotation-transformed versions of those image blocks in the first sample data set whose classification results from the multi-stream convolutional neural network model M are inconsistent with the true values. That is, when the sample data set TS is constructed, only the target image blocks whose detection and identification results from the multi-stream convolutional neural network model M are inconsistent with the true values are rotated at a fixed angle interval δ (in this embodiment, δ takes the value 5), giving a plurality of corresponding angle images; these angle images are added to the target template images of the target knowledge template base Z to obtain the second sample data set, i.e. the sample data set TS. The true value of a target image block is determined from the labeling information of the sample image, and the sample data set TS is used to iteratively train a preset basic fine classification model to obtain the target fine classification model.
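By way of illustration only, a minimal Python sketch of the rotation augmentation follows; it assumes the image blocks are numpy arrays, interprets the interval δ in degrees, and rotates around the block centre with OpenCV, which is one possible realization rather than the method prescribed by the embodiment.

import cv2
import numpy as np

def rotate_augment(block, delta=5):
    """Rotate a misclassified target image block at every multiple of delta degrees.

    Returns the list of angle images used to expand the sample data set TS.
    """
    h, w = block.shape[:2]
    centre = (w / 2.0, h / 2.0)
    angle_images = []
    for angle in range(delta, 360, delta):
        m = cv2.getRotationMatrix2D(centre, angle, 1.0)
        rotated = cv2.warpAffine(block, m, (w, h), borderMode=cv2.BORDER_REFLECT)
        angle_images.append(rotated)
    return angle_images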
Further, with reference to fig. 2, the target image is input into the pre-trained target fine classification model, which outputs the K initial categories with the highest predicted matching degree for the target image; category consistency judgment and spatial consistency judgment are then respectively performed on the K initial categories based on the target template images in the target knowledge template base. The category consistency judgment screens the K initial categories and selects the first target category of the target to be identified according to the size mean deviation degree between the target to be identified and the prior target. The spatial consistency judgment determines the geographic spatial position of the target to be identified from the original remote sensing image and further confirms the target to be identified based on the spatial distribution range information in the target knowledge template base. Specifically, step 400 further includes:
step 501, converting the coordinates of the target pixel in each grid image into the coordinates of the pixel in the original remote sensing image; the target pixel coordinates are pixel coordinates corresponding to the target to be identified in the grid image;
step 502, obtaining projection parameter information of the original remote sensing image, and determining geospatial position information of the target to be identified according to the projection parameter information and the pixel coordinates;
step 503, determining global geospatial distribution range information of the target to be identified based on the target knowledge template base, and determining whether the geospatial position of the target to be identified is within the global geospatial distribution range according to the geospatial position information and the global geospatial distribution range information;
and step 504, if yes, performing de-duplication stitching processing on each grid image based on a preset non-maximum suppression algorithm.
It should be noted that any target image input into the target fine classification model is obtained from a target grid image: the K initial categories are identified from the target image corresponding to that grid image, and the fine category of the target to be identified is identified from the target image and the target template binarized images, giving the identification result for that grid image. The identification results of all grid images are then aggregated to obtain the final identification result for the target to be identified in the original remote sensing image. When the spatial consistency is determined, the target pixel coordinates in each grid image, i.e. the pixel coordinates corresponding to the target to be identified in that grid image, are converted into pixel coordinates in the original remote sensing image. The projection parameter information of the original remote sensing image is acquired, and the geographic spatial position information of the target to be identified is determined from the projection parameter information and those pixel coordinates. The global geographic spatial distribution range information of the target to be identified is determined from the target knowledge template base, and whether the geographic spatial position of the target to be identified lies within the global geographic spatial distribution range is determined from the geographic spatial position information and the global geographic spatial distribution range information. If so, de-duplication stitching is performed on the grid images based on a preset non-maximum suppression algorithm to remove the overlap regions introduced by the grid division, and the final identification result of the target to be identified is obtained; the identification result comprises the longitude and latitude coordinate set of the position bounding rectangle of the target to be identified and its fine category information. The de-duplication stitching of the grid images is performed after the fine category of the target to be identified has been recognized, so that the result can finally be used for output and display.
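By way of illustration only, a minimal Python sketch of the non-maximum-suppression de-duplication follows; it assumes each detection has been mapped to an axis-aligned box in original-image pixel coordinates with a confidence score, and the box format and IoU threshold are assumptions of this sketch.

import numpy as np

def nms_dedup(boxes, scores, iou_thresh=0.5):
    """Keep one detection per target among overlapping grid-image results.

    boxes  : N x 4 array of [x1, y1, x2, y2] in original-image pixel coordinates
    scores : N confidence values from the fine classification model
    """
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = order[1:][iou <= iou_thresh]   # drop duplicates of the kept detection
    return keep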
Further, in the coordinate conversion step, the target pixel coordinates in each grid image are converted into pixel coordinates in the original remote sensing image. Specifically, based on the detection and identification results of all grid images, the pixel coordinates of the target to be identified in each grid image are converted into pixel coordinates in the original remote sensing image according to the position of the grid image in the original remote sensing image, using the following conversion formula:
X=Xstart+Xpos
Y=Ystart+Ypos
X and Y are the converted pixel coordinates in the original remote sensing image, Xstart and Ystart are the pixel coordinates in the original remote sensing image of the origin point set for the current grid image, and Xpos and Ypos are the pixel coordinates of the position bounding box of any target to be identified within the current grid image. The geographic coordinates of a characteristic point of the target to be identified (the target centre point is used in this embodiment) are then calculated from the pixel coordinates of the position bounding box of the target to be identified and the projection parameter information of the original remote sensing image, giving the geographic spatial position information of the target to be identified, including its longitude and latitude.
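By way of illustration only, a minimal Python sketch of the conversion from grid-image pixel coordinates to geographic coordinates follows. It assumes the original remote sensing image carries an affine geotransform readable with the rasterio library; rasterio and the helper names are assumptions of this sketch, and the resulting map coordinates are longitude and latitude only when the image is stored in a geographic coordinate system.

import rasterio  # assumed available for reading the geotransform of the original image

def grid_to_geo(raster_path, x_start, y_start, x_pos, y_pos):
    """Map a target pixel in a grid image to map coordinates of the original image.

    (x_start, y_start) : pixel coordinates of the grid image's origin in the original image
    (x_pos, y_pos)     : pixel coordinates of the target (e.g. its centre point) inside the grid image
    """
    x = x_start + x_pos          # X = Xstart + Xpos
    y = y_start + y_pos          # Y = Ystart + Ypos
    with rasterio.open(raster_path) as src:
        # affine geotransform: pixel -> map coordinates (lon/lat when the CRS is geographic)
        map_x, map_y = src.transform * (x, y)
    return map_x, map_y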
In this embodiment, target identification is performed on each grid image obtained by the grid division, which addresses the large imaging scene of remote sensing images during target identification. The category consistency judgment performed after the initial category of the target to be identified is identified and the spatial consistency judgment performed after its fine category is identified further improve the identification accuracy of the target while ensuring the validity and rationality of the identification.
The device for identifying the target fine type of a remote sensing image under the guidance of domain knowledge provided by the invention is described below; the device described below and the method for identifying the target fine type of a remote sensing image under the guidance of domain knowledge described above may be referred to correspondingly with each other.
Referring to fig. 3, a device for identifying a target fine type of a remote sensing image under guidance of domain knowledge provided by an embodiment of the present invention includes:
the feature extraction module 10 is configured to obtain an original remote sensing image to be identified, and perform multi-dimensional feature extraction on the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image;
the feature fusion module 20 is configured to perform feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image, so as to obtain a fused feature image of the original remote sensing image;
a first identifying module 30, configured to identify an initial category of an object to be identified in the original remote sensing image based on the fused feature image;
the second identifying module 40 is configured to obtain a target knowledge template base corresponding to the initial category, and identify the fine type of the target to be identified according to the target knowledge template base.
In one embodiment, the feature fusion module 20 is further configured to:
according to a sliding window with a preset size and a preset sliding step length, carrying out overlapped grid division processing on the original remote sensing image to obtain a plurality of grid images corresponding to the original remote sensing image; the edge feature image comprises edge feature subgraphs corresponding to the grid images, and the texture feature image comprises texture feature subgraphs corresponding to the grid images;
inputting a target grid image, the edge feature subgraph corresponding to the target grid image and the texture feature subgraph corresponding to the target grid image into a pre-trained multi-stream convolutional neural network model, and performing feature extraction and fusion of the multi-stream convolutional neural network on the target grid image, the corresponding edge feature subgraph and the corresponding texture feature subgraph by using the multi-stream convolutional neural network model to obtain the fused feature image of the original remote sensing image, wherein the target grid image is any one of the grid images;
the multi-stream convolutional neural network model comprises a plurality of convolutional neural networks with shared weights and an attention mechanism, wherein the plurality of convolutional neural networks comprise a first convolutional neural network, a second convolutional neural network and a third convolutional neural network;
The input of the first convolution neural network is the grid image, the input of the second convolution neural network is the edge feature subgraph, and the input of the third convolution neural network is the texture feature subgraph; the attention mechanism is used for carrying out feature fusion on the outputs of the convolutional neural networks.
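By way of illustration only, a minimal PyTorch sketch of a three-stream network of the kind described above follows. It assumes one convolutional backbone reused for the three inputs (weight sharing), a simple per-stream channel-attention weighting for fusion, and that the edge and texture subgraphs are stacked or replicated to the same channel count as the grid image; the backbone depth, channel counts and attention form are assumptions of this sketch rather than details fixed by the embodiment.

import torch
import torch.nn as nn

class MultiStreamNet(nn.Module):
    """Three input streams (grid image, edge subgraph, texture subgraph) with a shared backbone."""

    def __init__(self, in_channels=3, feat_channels=64, num_classes=10):
        super().__init__()
        # One backbone instance reused for all three streams => shared weights
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, feat_channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Attention over the three streams' feature maps (one weight per stream)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(3 * feat_channels, 3), nn.Softmax(dim=1),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(feat_channels, num_classes))

    def forward(self, grid_img, edge_img, texture_img):
        # edge_img and texture_img are assumed to have the same channel count as grid_img
        f1, f2, f3 = self.backbone(grid_img), self.backbone(edge_img), self.backbone(texture_img)
        w = self.attn(torch.cat([f1, f2, f3], dim=1))            # per-stream attention weights
        fused = w[:, 0, None, None, None] * f1 + w[:, 1, None, None, None] * f2 \
              + w[:, 2, None, None, None] * f3                    # fused feature image
        return fused, self.head(fused)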
In one embodiment, the remote sensing image target fine type recognition device under the guidance of the domain knowledge further comprises a model training module, configured to:
generating a fine classification system according to each preset category; acquiring an initial remote sensing image under the fine classification system; the initial remote sensing image comprises classification targets under each preset category;
acquiring annotation information of the initial remote sensing image, performing overlapped clipping processing on the initial remote sensing image based on the annotation information, and determining a sample image containing the classification target from the clipping image;
constructing a first sample data set based on the sample image, and performing iterative training on a preset basic multi-stream convolutional neural network model by using the first sample data set;
constructing a target template image corresponding to the fine classification system in the target knowledge template base based on the image blocks corresponding to the classification targets in the sample images under the preset categories;
Performing binarization processing on the average value image of the image block corresponding to the classification target in the sample image under each preset category to construct a target template binarization image corresponding to the fine classification system in the target knowledge template base; the target knowledge template library comprises target template images and target template binarization images under the preset categories, and attribute information and global geographic space distribution range information of classification targets corresponding to the target template images and the target template binarization images;
performing rotation augmentation processing on a target image block in the target template image according to a preset angle interval to obtain a plurality of angle images corresponding to the target template image; the target image block is an image block in the target template image whose classification recognition result obtained through the multi-stream convolutional neural network model is inconsistent with the true value;
constructing a second sample dataset based on the target template image and the angle image;
and performing iterative training on a preset basic fine classification model by using the second sample data set.
In one embodiment, the remote sensing image target fine type recognition device under the guidance of the domain knowledge further comprises a third recognition module, configured to:
converting the target pixel coordinates in each grid image into pixel coordinates in the original remote sensing image; the target pixel coordinates are pixel coordinates corresponding to the target to be identified in the grid image;
acquiring projection parameter information of the original remote sensing image, and determining geospatial position information of the target to be identified according to the projection parameter information and the pixel coordinates;
determining a global geospatial distribution range of the target to be identified based on the target knowledge template library, and determining whether the geospatial position of the target to be identified is within the global geospatial distribution range according to the geospatial position information and the global geospatial distribution range;
if yes, performing de-duplication stitching processing on each grid image based on a preset non-maximum suppression algorithm.
In one embodiment, the first identification module 30 is further configured to:
determining an enclosing rectangle of the target to be identified in the original remote sensing image and a long-direction included angle of the enclosing rectangle according to the fusion characteristic image; the long-direction included angle is the included angle between the long side of the enclosing rectangle and the calibration direction;
performing rotated cropping on the image block covered by the enclosing rectangle according to the long-direction included angle and the calibration direction to obtain a target image containing the target to be identified;
inputting the target image into a pre-trained target fine classification model to obtain a class prediction value output by the target fine classification model; the category predicted value is a probability value that a target to be identified in the original remote sensing image is a preset category;
and sorting the category predicted values in a descending order, and selecting a preset number of categories as initial categories of the targets to be identified according to the sorting order.
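By way of illustration only, a minimal Python sketch of the rotated cropping and the selection of the preset number of initial categories follows; it assumes the enclosing rectangle is given as an OpenCV-style rotated rectangle (centre, size, angle) and that the fine classification model returns one probability per preset category, and the angle convention, K value and helper names are assumptions of this sketch.

import cv2
import numpy as np

def rotated_crop(image, centre, size, angle_deg):
    """Rotate the image so the rectangle's long side aligns with the calibration direction,
    then crop the now axis-aligned patch covering the target."""
    m = cv2.getRotationMatrix2D(centre, angle_deg, 1.0)
    h, w = image.shape[:2]
    upright = cv2.warpAffine(image, m, (w, h))
    return cv2.getRectSubPix(upright, (int(size[0]), int(size[1])), centre)

def top_k_categories(probabilities, k=3):
    """Sort the category prediction values in descending order and keep the first K."""
    order = np.argsort(probabilities)[::-1]
    return [(int(i), float(probabilities[i])) for i in order[:k]]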
In one embodiment, the initial category includes a plurality of categories, and the second identification module 40 is further configured to:
acquiring a priori target in the target knowledge template library;
determining the size mean deviation degree of the target to be identified and the prior target according to the spatial resolution of the original remote sensing image;
screening the initial category according to the size mean deviation degree, and determining a first target category of the target to be identified;
performing binarization processing on the target image to obtain a target binarization image corresponding to the target image;
calculating the similarity between the target to be identified and the prior target in the first template binarization image based on the target binarization image; the first template binarization image is a template binarization image corresponding to the first target category in the target knowledge template base;
and determining the fine type of the target to be identified according to the similarity.
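By way of illustration only, a minimal Python sketch of the size-based screening follows; it assumes the target's bounding-box length and width in pixels are converted to metres with the spatial resolution and compared against the prior length and width stored in the target knowledge template base, and the deviation measure (mean of the relative length and width deviations) and the tolerance are assumptions of this sketch.

def size_mean_deviation(box_px, resolution_m, prior):
    """Mean relative deviation between the measured target size and the prior target size.

    box_px       : (length_px, width_px) of the target's bounding box in the image
    resolution_m : spatial resolution of the original remote sensing image, metres per pixel
    prior        : dict with prior 'length' and 'width' in metres from the knowledge template base
    """
    length_m = box_px[0] * resolution_m
    width_m = box_px[1] * resolution_m
    dev_l = abs(length_m - prior["length"]) / prior["length"]
    dev_w = abs(width_m - prior["width"]) / prior["width"]
    return (dev_l + dev_w) / 2.0

def screen_categories(box_px, resolution_m, priors_by_category, initial_categories, tol=0.3):
    """Keep the initial categories whose prior size matches the target; tol is an assumed cut-off."""
    scored = [(c, size_mean_deviation(box_px, resolution_m, priors_by_category[c]))
              for c in initial_categories]
    kept = [c for c, d in scored if d <= tol]
    return kept or [min(scored, key=lambda cd: cd[1])[0]]   # fall back to the closest category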
In one embodiment, the second identifying module 40 is further configured to:
based on a difference hash algorithm, respectively calculating a target hash value of the target binarized image and a template hash value of the first template binarized image;
calculating the Hamming distance between the target hash value and the template hash value;
and determining the similarity between the target to be identified and the prior target in the first template binarized image according to the Hamming distance.
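By way of illustration only, a minimal Python sketch of the difference-hash comparison follows; it assumes an 8x8 dHash computed on a grayscale thumbnail, and the hash size and the mapping from Hamming distance to a similarity score are assumptions of this sketch.

import cv2
import numpy as np

def dhash(image, hash_size=8):
    """Difference hash: compare each pixel with its right neighbour on a small grayscale thumbnail."""
    gray = image if image.ndim == 2 else cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (hash_size + 1, hash_size))
    diff = small[:, 1:] > small[:, :-1]
    return diff.flatten()                       # boolean vector of hash_size * hash_size bits

def hamming_similarity(hash_a, hash_b):
    """Hamming distance between two hashes, mapped to a similarity in [0, 1]."""
    distance = int(np.count_nonzero(hash_a != hash_b))
    return 1.0 - distance / hash_a.size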
Fig. 4 illustrates a physical schematic diagram of an electronic device. As shown in fig. 4, the electronic device may include: processor 410, communication interface (Communications Interface) 420, memory 430 and communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform a method for identifying the target fine type of a remote sensing image under the guidance of domain knowledge, the method comprising:
Acquiring an original remote sensing image to be identified, and extracting multidimensional features of the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image;
performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
identifying an initial category of a target to be identified in the original remote sensing image based on the fusion characteristic image;
and acquiring a target knowledge template base corresponding to the initial category, and identifying the fine type of the target to be identified according to the target knowledge template base.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art or in part, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, where the computer program product includes a computer program, where the computer program can be stored on a non-transitory computer readable storage medium, and when the computer program is executed by a processor, the computer can perform a method for identifying a fine type of a remote sensing image target under guidance of domain knowledge provided by the above methods, where the method includes:
acquiring an original remote sensing image to be identified, and extracting multidimensional features of the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image;
performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
identifying an initial category of a target to be identified in the original remote sensing image based on the fusion characteristic image;
and acquiring a target knowledge template base corresponding to the initial category, and identifying the fine type of the target to be identified according to the target knowledge template base.
In yet another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for identifying the target fine type of a remote sensing image under the guidance of domain knowledge provided by the above methods, the method comprising:
Acquiring an original remote sensing image to be identified, and extracting multidimensional features of the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image;
performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
identifying an initial category of a target to be identified in the original remote sensing image based on the fusion characteristic image;
and acquiring a target knowledge template base corresponding to the initial category, and identifying the fine type of the target to be identified according to the target knowledge template base.
The apparatus embodiments described above are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for identifying the target fine type of a remote sensing image under the guidance of domain knowledge, characterized by comprising the following steps:
acquiring an original remote sensing image to be identified, and extracting multidimensional features of the original remote sensing image to obtain an edge feature image and a texture feature image of the original remote sensing image;
performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
identifying an initial category of a target to be identified in the original remote sensing image based on the fusion characteristic image;
and acquiring a target knowledge template base corresponding to the initial category, and identifying the fine type of the target to be identified according to the target knowledge template base.
2. The method for identifying the target fine type of the remote sensing image under the guidance of domain knowledge according to claim 1, wherein the step of performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image comprises the following steps:
According to a sliding window with a preset size and a preset sliding step length, carrying out overlapped grid division processing on the original remote sensing image to obtain a plurality of grid images corresponding to the original remote sensing image; the edge feature image comprises edge feature subgraphs corresponding to the grid images, and the texture feature image comprises texture feature subgraphs corresponding to the grid images;
inputting a target grid image, an edge feature subgraph corresponding to the target grid image and a texture feature subgraph corresponding to the target grid image into a pre-trained multi-stream convolutional neural network model, and carrying out feature extraction and fusion of the multi-stream convolutional neural network on the target grid image, the edge feature subgraph corresponding to the target grid image and the texture feature subgraph corresponding to the target grid image by utilizing the multi-stream convolutional neural network model to obtain a fused feature image of the original remote sensing image; the target grid image is any one of the plurality of grid images;
the multi-stream convolutional neural network model comprises a plurality of convolutional neural networks with shared weights and an attention mechanism, wherein the plurality of convolutional neural networks comprise a first convolutional neural network, a second convolutional neural network and a third convolutional neural network;
The input of the first convolution neural network is the grid image, the input of the second convolution neural network is the edge feature subgraph, and the input of the third convolution neural network is the texture feature subgraph; the attention mechanism is used for carrying out feature fusion on the outputs of the convolutional neural networks.
3. The method for identifying the target fine type of the remote sensing image under the guidance of domain knowledge according to claim 2, wherein before the step of performing feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain the fused feature image of the original remote sensing image, the method further comprises:
generating a fine classification system according to each preset category; acquiring an initial remote sensing image under the fine classification system; the initial remote sensing image comprises classification targets under each preset category;
acquiring annotation information of the initial remote sensing image, performing overlapped clipping processing on the initial remote sensing image based on the annotation information, and determining a sample image containing the classification target from the clipping image;
constructing a first sample data set based on the sample image, and performing iterative training on a preset basic multi-stream convolutional neural network model by using the first sample data set;
Constructing a target template image corresponding to the fine classification system in the target knowledge template base based on the image blocks corresponding to the classification targets in the sample images under the preset categories;
performing binarization processing on the average value image of the image block corresponding to the classification target in the sample image under each preset category to construct a target template binarization image corresponding to the fine classification system in the target knowledge template base; the target knowledge template library comprises target template images and target template binarization images under the preset categories, and attribute information and global geographic space distribution range information of classification targets corresponding to the target template images and the target template binarization images;
performing rotation augmentation processing on a target image block in the target template image according to a preset angle interval to obtain a plurality of angle images corresponding to the target template image; the target image block is an image block in the target template image whose classification recognition result obtained through the multi-stream convolutional neural network model is inconsistent with the true value;
constructing a second sample dataset based on the target template image and the angle image;
And performing iterative training on a preset basic fine classification model by using the second sample data set.
4. The method for identifying a fine type of a remote sensing image target under the guidance of domain knowledge according to claim 3, wherein after the fine type of the target to be identified is identified according to the target knowledge template base, the method further comprises:
converting the target pixel coordinates in each grid image into pixel coordinates in the original remote sensing image; the target pixel coordinates are pixel coordinates corresponding to the target to be identified in the grid image;
acquiring projection parameter information of the original remote sensing image, and determining geospatial position information of the target to be identified according to the projection parameter information and the pixel coordinates;
determining a global geospatial distribution range of the target to be identified based on the target knowledge template library, and determining whether the geospatial position of the target to be identified is within the global geospatial distribution range according to the geospatial position information and the global geospatial distribution range;
if yes, performing de-duplication stitching processing on each grid image based on a preset non-maximum suppression algorithm.
5. The method for identifying the fine type of the target in the remote sensing image under the guidance of domain knowledge according to claim 1, wherein the step of identifying the initial category of the target to be identified in the original remote sensing image based on the fused feature image comprises the following steps:
determining an enclosing rectangle of the target to be identified in the original remote sensing image and a long-direction included angle of the enclosing rectangle according to the fusion characteristic image; the long-direction included angle is the included angle between the long side of the enclosing rectangle and the calibration direction;
performing rotated cropping on the image block covered by the enclosing rectangle according to the long-direction included angle and the calibration direction to obtain a target image containing the target to be identified;
inputting the target image into a pre-trained target fine classification model to obtain a class prediction value output by the target fine classification model; the category predicted value is a probability value that a target to be identified in the original remote sensing image is a preset category;
and sorting the category predicted values in a descending order, and selecting a preset number of categories as initial categories of the targets to be identified according to the sorting order.
6. The domain knowledge guided remote sensing image target fine type recognition method according to claim 5, wherein the initial category includes a plurality of categories, and the step of identifying the fine type of the target to be identified according to the target knowledge template base comprises:
Acquiring a priori target in the target knowledge template library;
determining the size mean deviation degree of the target to be identified and the prior target according to the spatial resolution of the original remote sensing image;
screening the initial category according to the size mean deviation degree, and determining a first target category of the target to be identified;
performing binarization processing on the target image to obtain a target binarization image corresponding to the target image;
calculating the similarity between the target to be identified and the prior target in the first template binarization image based on the target binarization image; the first template binarization image is a template binarization image corresponding to the first target category in the target knowledge template base;
and determining the fine type of the target to be identified according to the similarity.
7. The domain knowledge guided remote sensing image target fine type recognition method according to claim 6, wherein the step of calculating the similarity of the target to be recognized and a priori target in the first template binarized image based on the target binarized image comprises:
based on a difference hash algorithm, respectively calculating a target hash value of the target binarized image and a template hash value of the first template binarized image;
Calculating the Hamming distance between the target hash value and the template hash value;
and determining the similarity between the target to be identified and the prior target in the first template binarized image according to the Hamming distance.
8. A device for identifying the target fine type of a remote sensing image under the guidance of domain knowledge, characterized by comprising:
the characteristic extraction module is used for acquiring an original remote sensing image to be identified, and extracting multidimensional characteristics of the original remote sensing image to obtain an edge characteristic image and a texture characteristic image of the original remote sensing image;
the feature fusion module is used for carrying out feature extraction and fusion of a multi-stream convolutional neural network on the original remote sensing image, the edge feature image and the texture feature image to obtain a fused feature image of the original remote sensing image;
the first identification module is used for identifying the initial category of the target to be identified in the original remote sensing image based on the fusion characteristic image;
the second recognition module is used for acquiring a target knowledge template base corresponding to the initial category and recognizing the fine type of the target to be recognized according to the target knowledge template base.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method for identifying the target fine type of a remote sensing image under the guidance of domain knowledge according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the domain knowledge guided remote sensing image target fine type recognition method of any one of claims 1 to 7.
CN202211559725.5A 2022-12-06 2022-12-06 Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge Pending CN116246161A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211559725.5A CN116246161A (en) 2022-12-06 2022-12-06 Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211559725.5A CN116246161A (en) 2022-12-06 2022-12-06 Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge

Publications (1)

Publication Number Publication Date
CN116246161A true CN116246161A (en) 2023-06-09

Family

ID=86623164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211559725.5A Pending CN116246161A (en) 2022-12-06 2022-12-06 Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge

Country Status (1)

Country Link
CN (1) CN116246161A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117095299A (en) * 2023-10-18 2023-11-21 浙江省测绘科学技术研究院 Grain crop extraction method, system, equipment and medium for crushing cultivation area
CN117095299B (en) * 2023-10-18 2024-01-26 浙江省测绘科学技术研究院 Grain crop extraction method, system, equipment and medium for crushing cultivation area

Similar Documents

Publication Publication Date Title
US10762376B2 (en) Method and apparatus for detecting text
CN111476284B (en) Image recognition model training and image recognition method and device and electronic equipment
CN111161311A (en) Visual multi-target tracking method and device based on deep learning
US9129191B2 (en) Semantic object selection
WO2018108129A1 (en) Method and apparatus for use in identifying object type, and electronic device
US9129192B2 (en) Semantic object proposal generation and validation
CN109446889B (en) Object tracking method and device based on twin matching network
KR101896357B1 (en) Method, device and program for detecting an object
CN109343920B (en) Image processing method and device, equipment and storage medium thereof
Zhang et al. Semantic classification of heterogeneous urban scenes using intrascene feature similarity and interscene semantic dependency
CN113111716B (en) Remote sensing image semiautomatic labeling method and device based on deep learning
CN112836625A (en) Face living body detection method and device and electronic equipment
CN111008576A (en) Pedestrian detection and model training and updating method, device and readable storage medium thereof
CN111680753A (en) Data labeling method and device, electronic equipment and storage medium
JP2019185787A (en) Remote determination of containers in geographical region
CN116246161A (en) Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge
US20200160098A1 (en) Human-Assisted Machine Learning Through Geometric Manipulation and Refinement
CN112418256A (en) Classification, model training and information searching method, system and equipment
CN112560925A (en) Complex scene target detection data set construction method and system
Koc-San et al. A model-based approach for automatic building database updating from high-resolution space imagery
CN116823884A (en) Multi-target tracking method, system, computer equipment and storage medium
CN112241736A (en) Text detection method and device
CN113139540B (en) Backboard detection method and equipment
CN114913330A (en) Point cloud component segmentation method and device, electronic equipment and storage medium
Wang et al. Oil tank detection via target-driven learning saliency model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination