CN111860207B - Multi-scale remote sensing image ground object classification method, system, device and medium - Google Patents


Info

Publication number
CN111860207B
CN111860207B (application CN202010606564.5A)
Authority
CN
China
Prior art keywords
image
image blocks
classification result
remote sensing
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010606564.5A
Other languages
Chinese (zh)
Other versions
CN111860207A (en)
Inventor
张鹏
Current Assignee
Sun Yat-sen University, Shenzhen
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202010606564.5A
Publication of CN111860207A
Application granted
Publication of CN111860207B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a multi-scale remote sensing image ground object classification method, system, device and medium, wherein the method comprises the following steps: based on different preset scales, performing image blocking on the acquired remote sensing image to obtain an initial image block set; segmenting the image blocks under each scale in the initial image block set based on a preset semantic segmentation model to obtain classification results at different scales; splicing the classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set under different scales; and finally merging the initial ground object classification result graph sets under different scales based on a voting strategy to obtain a target ground object classification result graph. The application can eliminate discontinuous linear seams between adjacent image blocks, is highly practical, and can be widely applied in the technical field of image processing.

Description

Multi-scale remote sensing image ground object classification method, system, device and medium
Technical Field
The application relates to the technical field of image processing, in particular to a method, a system, a device and a medium for classifying ground features based on multi-scale remote sensing images.
Background
In the processing of aerospace or aerial remote sensing images, the classification and extraction of ground features is an important task. The classification process is typically as follows: first, the spectral and spatial information of the various ground objects in the remote sensing image is analyzed; then, suitable image features that reflect the spectral and spatial information of the ground objects are selected; these image features are extracted at each pixel of the image, and the ground object category of each pixel is judged from the feature values; finally, the per-pixel classification results are post-processed according to the application requirements to obtain the ground feature classification result of the whole remote sensing image.
Remote sensing image ground object classification methods mainly include pixel-based methods, neighborhood-based methods, object-based methods, and so on. In recent years, with the rapid development of deep learning, semantic segmentation based on deep convolutional neural networks has gradually been applied to ground object classification of remote sensing images, achieving better results than traditional methods. Typical semantic segmentation models include FCN, SegNet, U-Net, DeepLab, etc.
For convenience of training and use, the input of a semantic segmentation model is usually an original image of fixed size, and the output is a semantic annotation image of the same size, in which the semantic category of each pixel is marked with a distinct pixel value. The input/output image size of the model can be neither too large nor too small; common sizes are 128×128, 256×256, 512×512, and the like.
In remote sensing image ground object classification, the input data is an aerospace or aerial remote sensing image whose size is usually large, often reaching tens of thousands by tens of thousands of pixels, and the output data is a ground feature distribution map of the same size as the input. The semantic segmentation model therefore cannot be applied directly to the whole remote sensing image, and the most common approach is: first, divide the input remote sensing image into small image blocks of a fixed size suited to the semantic segmentation model; then, perform semantic segmentation on each image block to obtain its corresponding ground feature distribution map; finally, combine the distribution maps of all the image blocks to obtain the ground feature distribution map of the whole remote sensing image.
However, since the ground feature classification result of an image block is influenced by all pixels within that block, it is difficult to keep the classification results of two adjacent image blocks consistent at their junction, which shows up as a very obvious discontinuous linear seam.
Disclosure of Invention
In view of the above, the embodiments of the present application provide a method, a system, a device, and a medium for classifying ground objects based on multi-scale remote sensing images, which can eliminate discontinuous linear seams between adjacent image blocks.
The first aspect of the application provides a multi-scale remote sensing image ground object classification method, which comprises the following steps:
based on different preset scales, performing image blocking on the obtained remote sensing image to obtain an initial image block set;
segmenting the image blocks under each scale in the initial image block set based on a preset semantic segmentation model to obtain classification results of different scales;
splicing the classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set under different scales;
and combining the initial ground object classification result graph sets under different scales based on a voting strategy to obtain a target ground object classification result graph.
In some embodiments, the step of performing image blocking on the obtained remote sensing image based on different preset scales to obtain an initial image block set includes:
acquiring a remote sensing image and configuring different scales;
based on a preset dividing sequence, dividing the remote sensing image into image blocks under different scales according to configured scales to obtain an initial image block set.
In some embodiments, when the obtained remote sensing image is partitioned into blocks, an image block at the remote sensing image boundary is shifted a number of pixels toward the interior of the image so that it does not extend beyond the boundary of the remote sensing image.
In some embodiments, the step of segmenting the image blocks under each scale in the initial image block set based on the preset semantic segmentation model to obtain classification results of different scales includes:
inputting an image block whose size equals the input image size required by the semantic segmentation model directly into the model, and outputting a classification result of the same size as the input image block;
enlarging an image block smaller than the required image size before inputting it into the semantic segmentation model, and reducing the obtained classification result back to the size of the image block before enlargement;
reducing an image block larger than the required image size before inputting it into the semantic segmentation model, and enlarging the obtained classification result back to the size of the image block before reduction.
In some embodiments, the step of stitching the classification results corresponding to the image blocks under the same scale to obtain the initial feature classification result graph set under different scales includes:
splicing the classification results corresponding to the image blocks under the same scale according to the arrangement sequence of the image blocks to obtain an initial ground object classification result graph set under different scales;
if one pixel corresponds to a plurality of image blocks, calculating the distance between the pixel and the centers of the plurality of image blocks, and taking the pixel value corresponding to the image block closest to the pixel as the value of the pixel.
In some embodiments, the step of merging the initial feature classification result graph sets under different scales to obtain the target feature classification result graph based on the voting strategy includes:
three semantic values under different scales corresponding to each pixel value in the ground object classification result diagram are obtained;
if any two semantic values corresponding to one pixel value are the same, the semantic value is given to the pixel;
if all three semantic values corresponding to one pixel value are different, the semantic value of the middle scale of the three semantic values is assigned to the pixel.
The second aspect of the application provides a multi-scale remote sensing image ground object classification system, which comprises:
the blocking module is used for carrying out image blocking on the acquired remote sensing images based on different preset scales to obtain an initial image block set;
the segmentation module is used for segmenting the image blocks under each scale in the initial image block set based on a preset semantic segmentation model to obtain classification results of different scales;
the splicing module is used for splicing the classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set under different scales;
and the merging module is used for merging the initial ground object classification result graph sets under different scales based on a voting strategy to obtain a target ground object classification result graph.
In some embodiments, the segmentation module comprises:
the input unit is used for inputting an image block whose size equals the input image size required by the semantic segmentation model directly into the model and outputting a classification result of the same size as the input image block;
the first scaling unit is used for enlarging an image block smaller than the required image size before inputting it into the semantic segmentation model, and reducing the obtained classification result back to the size of the image block before enlargement;
the second scaling unit is used for reducing an image block larger than the required image size before inputting it into the semantic segmentation model, and enlarging the obtained classification result back to the size of the image block before reduction;
the splice module includes:
the splicing unit is used for splicing the classification results corresponding to the image blocks under the same scale according to the arrangement sequence of the image blocks to obtain an initial ground object classification result graph set under different scales;
a first replacing unit, configured to, if a pixel corresponds to a plurality of image blocks, calculate the distance between the pixel and the centers of those image blocks and take the pixel value corresponding to the closest image block as the value of the pixel;
the merging module comprises:
the acquisition unit is used for acquiring three semantic values under different scales corresponding to each pixel value in the ground object classification result diagram;
a second replacing unit, configured to assign that semantic value to the pixel if any two of the semantic values corresponding to the pixel value are the same;
and the third replacing unit is used for giving the semantic value of the middle scale of the three semantic values to the pixel if the three semantic values corresponding to the pixel value are all different.
A third aspect of the application provides an apparatus comprising a processor and a memory;
the memory is used for storing programs;
the processor is configured to perform the method according to the first aspect of the application according to the program.
A fourth aspect of the present application provides a storage medium storing a program for execution by a processor to perform the method of the first aspect of the present application.
According to the embodiment of the application, firstly, based on different preset scales, image blocking is performed on the acquired remote sensing image to obtain an initial image block set; the image blocks under each scale in the initial image block set are then segmented based on a preset semantic segmentation model to obtain classification results at different scales; the classification results corresponding to the image blocks under the same scale are then spliced to obtain an initial ground object classification result graph set under different scales; finally, the initial ground object classification result graph sets under different scales are merged based on a voting strategy to obtain the target ground object classification result graph. The application can eliminate discontinuous linear seams between adjacent image blocks and is highly practical.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating the overall steps of an embodiment of the present application;
FIG. 2 is an effect diagram of remote sensing image cloud classification by the method according to the embodiment of the application.
Detailed Description
The application is further explained and illustrated below with reference to the drawings and the specific embodiments of the present specification. The step numbers in the embodiments of the present application are set for convenience of illustration only and do not limit the order of the steps in any way; the execution order of the steps in each embodiment can be adaptively adjusted as understood by those skilled in the art.
Referring to fig. 1, the method of the embodiment of the present application includes the steps of:
s1, performing image segmentation on an acquired remote sensing image based on different preset scales to obtain an initial image block set;
step S1 of the present embodiment includes S11-S12:
s11, acquiring remote sensing images and configuring different scales;
s12, dividing the remote sensing image into image blocks under different scales according to the configured scales based on a preset dividing sequence to obtain an initial image block set.
Specifically, in this embodiment, when image segmentation is performed on an obtained remote sensing image, a plurality of pixel distances are moved to the inside of the remote sensing image for an image block at a remote sensing image boundary, so that the image block at the remote sensing image boundary does not exceed the remote sensing image boundary.
In this embodiment, a large-size remote sensing image is input and partitioned into image blocks at three scales, 192×192, 256×256 and 320×320, starting from the upper-left corner of the image and proceeding from left to right and from top to bottom. If an image block does not reach the rightmost or bottommost boundary of the whole image, there is no overlap between image blocks; otherwise, the image block is moved to the left or upward by a certain number of pixels, so that it overlaps its left or upper neighbor to some extent.
After blocking, three groups of image blocks at different scales are obtained, with sizes 192×192, 256×256 and 320×320 respectively; the number of blocks in each group depends on the size of the input remote sensing image.
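The blocking rule above (non-overlapping tiles except at the right and bottom boundaries, where a tile is shifted inward) can be sketched as follows. This is an illustrative NumPy sketch, not the patent's own code; `tile_image` is a hypothetical helper name:

```python
import numpy as np

def tile_image(image, block):
    """Partition an (H, W, C) image into block x block tiles, left to right
    and top to bottom.  A tile that would cross the right or bottom boundary
    is shifted inward so it stays inside the image, overlapping its neighbor.
    Assumes the image is at least `block` pixels in each dimension."""
    h, w = image.shape[:2]
    tiles, origins = [], []
    for y in range(0, h, block):
        for x in range(0, w, block):
            y0 = min(y, h - block)   # shift boundary tiles inward
            x0 = min(x, w - block)
            tiles.append(image[y0:y0 + block, x0:x0 + block])
            origins.append((y0, x0))
    return tiles, origins

# Tile one image at the embodiment's three scales.
image = np.zeros((1000, 1000, 3), dtype=np.uint8)
per_scale = {s: tile_image(image, s) for s in (192, 256, 320)}
```

For a 1000×1000 input this yields 6×6 tiles at 192×192 and 4×4 tiles at each of 256×256 and 320×320; only the boundary tiles overlap their neighbors.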
S2, segmenting the image blocks under each scale in the initial image block set based on a preset semantic segmentation model to obtain classification results of different scales;
step S2 of the present embodiment includes S21 to S23:
s21, directly inputting an image block which is equal to the image size standard required by the semantic segmentation model into the semantic segmentation model, and outputting a classification result which is the same as the input image block in size;
s22, after the image blocks smaller than the image size standard are amplified, inputting the semantic segmentation model, and reducing the obtained classification result to the size which is the same as the size of the image blocks before the amplification treatment;
s23, carrying out reduction processing on the image blocks larger than the image size standard, inputting the semantic segmentation model, and amplifying the obtained classification result to the size which is the same as the size of the image blocks before the reduction processing.
The embodiment processes all image blocks at various scales based on the trained semantic segmentation model. If the size of the image block is consistent with the image size required by the semantic segmentation model, directly inputting the image block into the model, and outputting a classification result with the same size as the image block; if the size of the image block is smaller than the image size required by the semantic segmentation model, the image block is amplified and then input into the model, and the output classification result is reduced to be the same as the image block size; if the size of the image block is larger than the image size required by the semantic segmentation model, the image block is reduced and then input into the model, and the output classification result is enlarged to be the same as the image block size.
After the semantic segmentation is finished, three groups of classification results with different scales are obtained, and the sizes and the number of the three groups of classification results are consistent with those of the image blocks with the corresponding scales.
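The resize-segment-resize logic of S21–S23 can be sketched as below. This is an illustrative sketch, not the patent's code: `resize_nn`, `segment_block` and the toy model are assumed names, and nearest-neighbor resizing is used for the output because a categorical label map must not be interpolated into non-existent class values:

```python
import numpy as np

def resize_nn(img, size):
    """Nearest-neighbor resize of an (H, W[, C]) array to (size, size)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows[:, None], cols]

def segment_block(block, model, model_size=256):
    """Resize a block to the model's required input size, segment it, and
    resize the resulting label map back to the block's original size."""
    h = block.shape[0]
    if h == model_size:                 # S21: sizes match, feed directly
        return model(block)
    labels = model(resize_nn(block, model_size))   # S22/S23: scale in...
    return resize_nn(labels, h)                    # ...and scale result back

# Toy stand-in for a trained segmentation model: thresholds mean intensity.
toy_model = lambda x: (x.mean(axis=-1) > 127).astype(np.uint8)

out = segment_block(np.zeros((320, 320, 3), dtype=np.uint8), toy_model)
```

The same function covers all three cases: a 192×192 block is enlarged, a 320×320 block is reduced, and a 256×256 block passes straight through.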
S3, splicing classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set under different scales;
specifically, step S3 of the present embodiment includes S31-S32:
s31, splicing classification results corresponding to the image blocks under the same scale according to the arrangement sequence of the image blocks to obtain an initial ground object classification result graph set under different scales;
and S32, if one pixel corresponds to a plurality of image blocks, calculating the distance between the pixel and the centers of the plurality of image blocks, and taking the pixel value corresponding to the image block closest to the pixel as the value of the pixel.
According to the embodiment, according to the arrangement sequence of the image blocks, the classification results corresponding to the image blocks under the same scale are spliced together to form a ground object classification result diagram with a large size under the scale. If a certain pixel corresponds to a plurality of image blocks, the distance between the pixel and the centers of the image blocks is calculated, and the corresponding pixel value of the image block with the closest distance is taken as the pixel value.
After the same-scale splicing is finished, three ground object classification result graphs under different scales are obtained, each the same size as the input large-size remote sensing image.
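The same-scale splicing with the nearest-center overlap rule can be sketched as follows (an illustrative NumPy sketch; `stitch` and its argument layout are assumptions, not the patent's code):

```python
import numpy as np

def stitch(results, origins, block, out_shape):
    """Paste per-block label maps into a full-size result map.  Where blocks
    overlap (shifted boundary blocks), each pixel keeps the label from the
    block whose center is nearest, per the patent's overlap rule."""
    out = np.zeros(out_shape, dtype=np.uint8)
    best = np.full(out_shape, np.inf)    # distance to owning block's center
    yy, xx = np.mgrid[0:block, 0:block]
    c = (block - 1) / 2.0
    dist = np.hypot(yy - c, xx - c)      # per-pixel distance to block center
    for labels, (y0, x0) in zip(results, origins):
        win = (slice(y0, y0 + block), slice(x0, x0 + block))
        closer = dist < best[win]        # pixels this block now owns
        out[win][closer] = labels[closer]
        best[win][closer] = dist[closer]
    return out

# Two horizontally overlapping 4x4 blocks inside a 4x6 map.
a = np.full((4, 4), 1, dtype=np.uint8)
b = np.full((4, 4), 2, dtype=np.uint8)
merged = stitch([a, b], [(0, 0), (0, 2)], 4, (4, 6))
```

In the overlap (columns 2–3), column 2 is nearer the center of the left block and column 3 nearer the center of the right block, so each row of `merged` reads 1, 1, 1, 2, 2, 2.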
And S4, combining the initial ground object classification result graph sets under different scales based on a voting strategy to obtain a target ground object classification result graph.
Step S4 of the present embodiment includes S41 to S43:
s41, acquiring three semantic values under different scales corresponding to each pixel value in the ground object classification result diagram;
s42, if any two semantic values corresponding to one pixel value are the same, assigning the semantic value to the pixel;
s43, if all three semantic values corresponding to one pixel value are different, assigning the semantic value of the middle scale of the three semantic values to the pixel.
In the embodiment, the voting strategy is utilized to combine the ground object classification result graphs under a plurality of scales to form a ground object classification result graph. Specifically, each pixel value of the output result graph corresponds to three semantic values under different scales, and if any two of the semantic values are the same, the semantic value is assigned to the pixel; if all three semantic values are not identical, the intermediate-scale semantic value is assigned to the pixel. And after the multi-scale voting is finished, an output ground object classification result graph is obtained, and the size of the ground object classification result graph is the same as that of the input remote sensing image.
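The voting rule above reduces to a single pixel-wise expression: if the smallest- and largest-scale labels agree, that label has at least two of the three votes; otherwise the middle-scale label either belongs to the remaining majority or is the prescribed fallback when all three differ. A minimal sketch (assuming NumPy; `vote_merge` is an illustrative name):

```python
import numpy as np

def vote_merge(sm, md, lg):
    """Pixel-wise vote over three same-size label maps, one per scale."""
    # sm == lg  -> that label has >= 2 of 3 votes, take it;
    # otherwise -> md either matches sm or lg (carrying the majority) or,
    #              when all three differ, is the middle-scale fallback.
    return np.where(sm == lg, sm, md)

sm = np.array([[0, 1], [2, 0]])
md = np.array([[0, 1], [1, 1]])
lg = np.array([[0, 2], [1, 2]])
merged = vote_merge(sm, md, lg)   # [[0, 1], [1, 1]]
```

At pixel (0, 0) the small and large scales agree on 0; at (0, 1) and (1, 0) the middle scale carries the two-vote majority; at (1, 1) all three differ and the middle scale is taken.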
The following specifically describes the specific implementation procedure of the present application:
the processing process of the image block joint based on the multi-scale remote sensing image ground object classification comprises four steps of image blocking, semantic segmentation, same-scale splicing and multi-scale voting.
In the image blocking stage, the following operations are performed: 1) reading the input large-size remote sensing image; 2) partitioning it into image blocks at scale 1 (192×192); 3) partitioning it into image blocks at scale 2 (256×256); 4) partitioning it into image blocks at scale 3 (320×320).
In the semantic segmentation stage, the following operations are performed: 1) performing semantic segmentation on each image block of scale 1 to obtain a classification result (192×192); 2) performing semantic segmentation on each image block of scale 2 to obtain a classification result (256×256); 3) performing semantic segmentation on each image block of scale 3 to obtain a classification result (320×320).
In the same-scale splicing stage, the following operations are performed: 1) Splicing the classification results corresponding to the image blocks under the scale 1; 2) Splicing the classification results corresponding to the image blocks under the scale 2; 3) And splicing the classification results corresponding to the image blocks under the scale 3.
In the multi-scale voting phase, the following operations are performed: 1) Combining the classification result graphs of the scale 1, the scale 2 and the scale 3 by utilizing a voting strategy to form a ground object classification result graph with the same size as the input remote sensing image; 2) And outputting a ground object classification result graph.
Fig. 2 is a processing effect diagram of the method of the present application. The image block joint processing method is used for remote sensing image cloud classification. In fig. 2, reference numeral 201 is an input remote sensing image, and reference numeral 202 is a cloud classification result of a conventional method of not processing an image block seam; reference numeral 203 is the cloud classification result of the method of the present application. In 202 and 203 of fig. 2, light gray is a cloudless area, dark gray is a thin cloud area, and white is a thick cloud area. As can be seen by comparing the local region 2021 of the conventional method cloud classification result 202 with the local region 2031 of the inventive method cloud classification result 203, there is a distinct linear seam in the conventional method cloud classification result 202, whereas there is substantially no image block seam in the inventive method cloud classification result 203.
Corresponding to the method of fig. 1, the embodiment of the application provides a multi-scale remote sensing image ground object classification system, which comprises:
the blocking module is used for carrying out image blocking on the acquired remote sensing images based on different preset scales to obtain an initial image block set;
the segmentation module is used for segmenting the image blocks under each scale in the initial image block set based on a preset semantic segmentation model to obtain classification results of different scales;
the splicing module is used for splicing the classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set under different scales;
and the merging module is used for merging the initial ground object classification result graph sets under different scales based on a voting strategy to obtain a target ground object classification result graph.
In some embodiments, the segmentation module comprises:
the input unit is used for directly inputting the image blocks with the image size standard which is equal to the requirement of the semantic segmentation model into the semantic segmentation model and outputting classification results with the same size as the input image blocks;
the first scaling unit is used for amplifying the image blocks smaller than the image size standard, inputting the semantic segmentation model, and reducing the obtained classification result to the size which is the same as the size of the image blocks before the amplifying treatment;
the second scaling unit is used for inputting the semantic segmentation model after the image blocks larger than the image size standard are subjected to the reduction processing, and amplifying the obtained classification result to the size which is the same as the size of the image blocks before the reduction processing;
the splice module includes:
the splicing unit is used for splicing the classification results corresponding to the image blocks under the same scale according to the arrangement sequence of the image blocks to obtain an initial ground object classification result graph set under different scales;
a first replacing unit, configured to, if a pixel corresponds to a plurality of image blocks, calculate the distance between the pixel and the centers of those image blocks and take the pixel value corresponding to the closest image block as the value of the pixel;
the merging module comprises:
the acquisition unit is used for acquiring three semantic values under different scales corresponding to each pixel value in the ground object classification result diagram;
a second replacing unit, configured to assign that semantic value to the pixel if any two of the semantic values corresponding to the pixel value are the same;
a third replacing unit, configured to assign the semantic value of the middle scale of the three semantic values to the pixel if all three semantic values corresponding to the pixel value are different.
Corresponding to the method of fig. 1, an embodiment of the present application provides an apparatus, including a processor and a memory;
the memory is used for storing programs;
the processor is configured to perform the method described in fig. 1 according to the program.
Corresponding to the method of fig. 1, an embodiment of the present application provides a storage medium storing a program that is executed by a processor to perform the method as described in fig. 1.
In summary, the method adopts a multi-scale strategy: it first divides the remote sensing image into blocks at several scales, then performs semantic segmentation for ground object classification at each scale, and finally merges the per-scale classification results with a voting strategy to obtain the ground object classification result for the whole remote sensing image. This avoids the block-seam artifacts that arise when a single scale is used.
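The blocking step summarized above can be sketched as follows; this is a simplified illustration assuming square blocks, and the inward shift of blocks that would cross the image boundary mirrors the boundary handling described for the blocking step:

```python
def tile(height, width, block, stride=None):
    """Enumerate (top, left) positions of block x block tiles covering an
    image.  Tiles that would cross the right/bottom edge are shifted
    inward so that every tile lies entirely inside the image."""
    stride = stride or block
    tops = sorted({min(t, height - block) for t in range(0, height, stride)})
    lefts = sorted({min(l, width - block) for l in range(0, width, stride)})
    return [(t, l) for t in tops for l in lefts]

def multi_scale_tiles(height, width, block_sizes):
    """Blocking under several preset scales: one tile list per scale."""
    return {b: tile(height, width, b) for b in block_sizes}
```

For a 5x5 image and 2x2 blocks, the last row and column of tiles are shifted inward to start at offset 3, so the image is fully covered with slight overlap rather than leaving a partial tile at the boundary.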
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, or the part of it that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or some of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, the steps may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the application, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present application has been described in detail, the present application is not limited to the embodiments described above, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present application, and these equivalent modifications and substitutions are intended to be included in the scope of the present application as defined in the appended claims.

Claims (6)

1. A multi-scale remote sensing image ground object classification method, characterized by comprising the following steps:
based on different preset scales, performing image blocking on the obtained remote sensing image to obtain an initial image block set;
segmenting the image blocks under each scale in the initial image block set based on a preset semantic segmentation model to obtain classification results at different scales;
splicing the classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set under different scales;
combining the initial ground object classification result graph sets under different scales based on a voting strategy to obtain a target ground object classification result graph;
the step of segmenting the image blocks under each scale in the initial image block set based on the preset semantic segmentation model to obtain classification results at different scales comprises the following steps:
directly inputting image blocks that match the image size required by the semantic segmentation model into the model, and outputting classification results of the same size as the input blocks;
enlarging image blocks smaller than the required image size, inputting them into the semantic segmentation model, and shrinking the obtained classification results back to the size of the blocks before enlargement;
shrinking image blocks larger than the required image size, inputting them into the semantic segmentation model, and enlarging the obtained classification results back to the size of the blocks before reduction;
the step of splicing the classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set at different scales comprises the following steps:
splicing the classification results of the image blocks under the same scale, in the order in which the blocks are arranged, to obtain an initial ground object classification result graph set at different scales;
if one pixel corresponds to a plurality of image blocks, calculating the distance between the pixel and the center of each of those image blocks, and taking the value from the image block whose center is closest as the value of the pixel;
the step of merging the initial ground object classification result graph sets at different scales based on the voting strategy to obtain a target ground object classification result graph comprises the following steps:
acquiring, for each pixel of the ground object classification result graph, the three semantic values obtained at the different scales;
if any two of the three semantic values for one pixel are the same, assigning that semantic value to the pixel;
if all three semantic values for one pixel differ, assigning the semantic value of the intermediate scale to the pixel.
2. The multi-scale remote sensing image ground object classification method according to claim 1, wherein the step of performing image blocking on the acquired remote sensing image based on different preset scales to obtain an initial image block set comprises the following steps:
acquiring a remote sensing image and configuring different scales;
based on a preset dividing sequence, dividing the remote sensing image into image blocks at the configured scales to obtain an initial image block set.
3. The multi-scale remote sensing image ground object classification method according to claim 2, wherein, when the acquired remote sensing image is divided into blocks, image blocks at the boundary of the remote sensing image are shifted inward by a number of pixels, so that no image block extends beyond the boundary of the remote sensing image.
4. A multi-scale remote sensing image ground object classification system, characterized by comprising:
the blocking module is used for carrying out image blocking on the acquired remote sensing images based on different preset scales to obtain an initial image block set;
the segmentation module is used for segmenting the image blocks under each scale in the initial image block set based on a preset semantic segmentation model to obtain classification results of different scales;
the splicing module is used for splicing the classification results corresponding to the image blocks under the same scale to obtain an initial ground object classification result graph set under different scales;
the merging module is used for merging the initial ground object classification result graph sets under different scales based on a voting strategy to obtain a target ground object classification result graph;
the segmentation module comprises:
the input unit is used for directly inputting image blocks that match the image size required by the semantic segmentation model into the model and outputting classification results of the same size as the input blocks;
the first scaling unit is used for enlarging image blocks smaller than the required image size, inputting the enlarged blocks into the semantic segmentation model, and shrinking the resulting classification result back to the size of the image block before enlargement;
the second scaling unit is used for shrinking image blocks larger than the required image size, inputting the shrunken blocks into the semantic segmentation model, and enlarging the resulting classification result back to the size of the image block before reduction;
the splice module includes:
the splicing unit is used for splicing the classification results of the image blocks under the same scale, in the order in which the blocks are arranged, to obtain an initial ground object classification result graph set at different scales;
a first replacing unit, configured to handle any pixel covered by several image blocks by calculating the distance from the pixel to the center of each of those blocks and taking the value from the block whose center is closest as the value of the pixel;
the merging module comprises:
the acquisition unit is used for acquiring, for each pixel of the ground object classification result graph, the three semantic values obtained at the different scales;
a second replacing unit, configured to assign a semantic value to the pixel if any two of the three semantic values for that pixel are the same;
and a third replacing unit, configured to assign the semantic value of the intermediate scale to the pixel if all three semantic values for that pixel differ.
5. An apparatus comprising a processor and a memory;
the memory is used for storing programs;
the processor is configured to perform the method of any of claims 1-3 according to the program.
6. A storage medium storing a program for execution by a processor to perform the method of any one of claims 1-3.
CN202010606564.5A 2020-06-29 2020-06-29 Multi-scale remote sensing image ground object classification method, system, device and medium Active CN111860207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010606564.5A CN111860207B (en) 2020-06-29 2020-06-29 Multi-scale remote sensing image ground object classification method, system, device and medium

Publications (2)

Publication Number Publication Date
CN111860207A CN111860207A (en) 2020-10-30
CN111860207B true CN111860207B (en) 2023-10-24

Family

ID=72988224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010606564.5A Active CN111860207B (en) 2020-06-29 2020-06-29 Multi-scale remote sensing image ground object classification method, system, device and medium

Country Status (1)

Country Link
CN (1) CN111860207B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112857268B (en) 2021-01-21 2023-10-31 北京百度网讯科技有限公司 Object area measuring method, device, electronic equipment and storage medium
CN113807428B (en) * 2021-09-14 2022-09-20 清华大学 Reconstruction method, system and device of classification model probability label and storage medium
CN113792742A (en) * 2021-09-17 2021-12-14 北京百度网讯科技有限公司 Semantic segmentation method of remote sensing image and training method of semantic segmentation model
CN117036832B (en) * 2023-10-09 2024-01-05 之江实验室 Image classification method, device and medium based on random multi-scale blocking

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335966A (en) * 2015-10-14 2016-02-17 南京信息工程大学 Multi-scale remote-sensing image segmentation method based on local homogeneity index
CN106991382A (en) * 2017-03-13 2017-07-28 南京信息工程大学 A kind of remote sensing scene classification method
CN110119744A (en) * 2019-05-07 2019-08-13 上海交通大学 The remote sensing image semantic segmentation method of multi-scale division constraint

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
High-resolution image urban land-use classification based on multi-scale fusion; Feng Xin; Du Shihong; Zhang Fangli; Wang Song; Geography and Geo-Information Science (Issue 03); full text *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231208

Address after: 518107 Room 501, building 3, Herun Jiayuan, Huaxia Road, Guangming Street, Guangming New District, Shenzhen City, Guangdong Province

Patentee after: Sun Yat sen University.Shenzhen

Patentee after: SUN YAT-SEN University

Address before: 510275 No. 135 West Xingang Road, Guangzhou, Guangdong, Haizhuqu District

Patentee before: SUN YAT-SEN University
