CN115908437A - Model training method, image segmentation method, terminal device and computer medium

Info

Publication number: CN115908437A
Application number: CN202211387197.XA
Authority: CN (China)
Prior art keywords: image, training, images, remote sensing, segmentation
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 周军
Current Assignee: Ping An Bank Co Ltd
Original Assignee: Ping An Bank Co Ltd
Application filed by Ping An Bank Co Ltd; priority claimed to CN202211387197.XA; published as CN115908437A.

Abstract

The model training method performs filtering processing on the original training image set to amplify its contrast, and performs resampling processing on the set to increase the number of minority-class samples, thereby improving the training effect of the segmentation model. In the image segmentation method, the predictions of a first segmentation model and a second segmentation model for the remote sensing image to be detected are summed with preset weights, improving the accuracy of image segmentation.

Description

Model training method, image segmentation method, terminal device and computer medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a model training method, an image segmentation method, a terminal device, and a computer medium.
Background
Remote sensing refers to remotely surveying a target with sensors carried by aircraft, chiefly in the form of remote imaging. With the continuous improvement of remote sensing and deep learning technologies and the continuous development of hardware, applications of remote sensing images have become closely tied to everyday life. Remote sensing image interpretation is widely applied in geographic condition surveys, land resource surveys, urban construction, agricultural production, forest protection and other fields, and has gradually become an important way for people to understand the Earth's surface.
Remote sensing images are widely used in agriculture, enabling functions such as crop yield estimation, crop area statistics and crop health monitoring. Existing methods for calculating crop-region areas from remote sensing images generally use a segmentation model to separate the pixels of different crop regions and then count the area of each region. However, existing segmentation methods generally segment only RGB three-channel remote sensing images, so the segmentation accuracy is low and the segmented crop regions are inaccurate.
Disclosure of Invention
In order to solve the technical problem, the application provides a model training method, an image segmentation method, a terminal device and a computer medium.
In order to solve the above problem, the present application provides a first technical solution: a model training method is provided, which is applied to a segmentation model and comprises the following steps: obtaining a remote sensing image, cutting the remote sensing image into a plurality of images with preset sizes, and taking the images with the preset sizes as an original training image set; filtering the original training image set; resampling the original training image set after filtering processing, and adding the images after resampling processing into the training image set; training the segmentation model based on the training image set.
The original training image set comprises a first near-infrared channel image and a three-channel image; the step of filtering the original training image set includes: performing Fourier transform on the first near-infrared channel image to obtain a frequency domain image of a near-infrared channel; filtering the frequency domain image to filter low-frequency information of the frequency domain image; performing inverse Fourier transform on the filtered frequency domain image to obtain a second near-infrared channel image; and overlapping the second near-infrared channel image and the three-channel image, and adding the overlapped image into the original training image set.
Wherein, the step of resampling the original training image set after the filtering process and adding the resampled image into the training image set comprises: cutting out a first image from the remote sensing image, wherein the size of the first image is located in a preset pixel area; and scaling the first image to a preset size so as to add the scaled first image into the training image set.
Wherein, the step of resampling the original training image set after the filtering process and adding the resampled image into the training image set comprises: cutting a plurality of first images from the remote sensing image, wherein the size of the first images is located in a preset pixel area, and the first images are used as a first training image set; acquiring a plurality of second images from the original training image set to serve as a second training image set; selecting a preset number of images from the first training image set and the second training image set, and performing turning processing and/or cutting processing on the selected images; and splicing the processed images of the preset number, and zooming the spliced images to the preset size so as to add the zoomed images into a training image set.
Wherein the preset number comprises 4 and/or 6, and the number ratio of the first training image set to the second training image set is 4:6.
Wherein the preset number comprises 4 and 6; after the step of stitching the processed images of the preset number and scaling the stitched images to the preset size to add the scaled images to the training image set, the model training method further includes: when the preset number is 4, scaling the stitched images to the preset size and using the images as a third training image set; when the preset number is 6, scaling the stitched images to the preset size and using the images as a fourth training image set; the number ratio of the first training image set, the third training image set, and the fourth training image set is 4:2:1.
Wherein the step of training the segmentation model based on the training image set comprises: acquiring segmentation labels of the training image set; acquiring smoothing weights based on the ratio of the number of training images of each segmentation label to the total number of the training image set; calculating a loss function for each of the segmentation labels based on the smoothing weights to train the segmentation model with the loss function.
In order to solve the above problem, the present application provides a second technical solution: an image segmentation method is provided, applied to a segmentation model obtained with the training method described above, wherein the segmentation model includes a first segmentation model and a second segmentation model, the pixel size of the training image set of the first segmentation model is a first size, the pixel size of the training image set of the second segmentation model is a second size, and the second size is at least twice the first size; the image segmentation method comprises the following steps: inputting the remote sensing image to be detected into the first segmentation model to obtain a first prediction result of the remote sensing image to be detected; inputting the remote sensing image to be detected into the second segmentation model to obtain a second prediction result of the remote sensing image to be detected, wherein the first prediction result and the second prediction result are predictions for the same segmented region of the remote sensing image to be detected; and summing the first prediction result and the second prediction result with preset weights to obtain a classification result of the remote sensing image to be detected.
In order to solve the above problems, the present application provides a third technical solution: a terminal device is provided, comprising a processor and a memory connected to the processor, wherein the memory stores program data, and the processor retrieves the program data stored in the memory to perform any of the methods described above.
In order to solve the above problem, the present application provides a fourth technical solution: there is provided a computer readable storage medium storing program instructions which are executed to implement the method as described above.
The model training method performs filtering processing on the original training image set to amplify its contrast, and performs resampling processing on the set to increase the number of minority-class samples, thereby improving the training effect of the segmentation model. In the image segmentation method, the predictions of the first segmentation model and the second segmentation model for the remote sensing image to be detected are summed with preset weights, improving the accuracy of image segmentation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts. Wherein:
FIG. 1 is a schematic flow chart diagram of a first embodiment of a model training method provided herein;
FIG. 2 is a schematic diagram illustrating the operation of one embodiment of a remotely sensed image provided herein;
FIG. 3 is a schematic flow chart diagram of a second embodiment of a model training method provided herein;
FIG. 4 is a schematic flow chart diagram of a third embodiment of a model training method provided herein;
FIG. 5 is a schematic diagram illustrating the operation of one embodiment of a resampling process as provided herein;
FIG. 6 is a flowchart illustrating a first embodiment of an image segmentation method provided herein;
FIG. 7 is an operational schematic diagram of one embodiment of model integration provided herein;
FIG. 8 is a schematic diagram illustrating the operation of one embodiment of the morphological processing provided herein;
FIG. 9 is a flowchart illustrating a second embodiment of an image segmentation method provided by the present application;
FIG. 10 is a schematic diagram illustrating the operation of one embodiment of image segmentation provided herein;
FIG. 11 is a block diagram of an embodiment of a terminal device provided herein;
FIG. 12 is a block diagram of an embodiment of a computer-readable storage medium provided herein.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any inventive step based on the embodiments in the present application, are within the scope of protection of the present application.
It should be noted that, if directional indications (such as up, down, left, right, front and back, etc.) are involved in the embodiments of the present application, the directional indications are only used to explain the relative positional relationships, motion situations and the like between components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change accordingly.
In addition, descriptions such as "first" and "second" in the embodiments of the present application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated; thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. The technical solutions of the various embodiments may be combined with each other, provided a person skilled in the art can realize the combination; when technical solutions are contradictory or cannot be realized, the combination should be considered not to exist and falls outside the protection scope of the present application.
In the agricultural field, remote sensing images have unique advantages: tasks such as crop yield estimation, crop area statistics and crop health monitoring can be carried out well with remote sensing images. Combining remote sensing images with artificial intelligence (AI) can greatly empower agriculture and further accelerate the development of AI-assisted agriculture.
Existing methods for calculating crop-region areas from remote sensing images usually adopt an instance segmentation network model: a segmentation network can delineate the various irregular crop regions, so the pixels of different crop regions are separated and the area of each region can be counted. These methods typically operate on RGB (red, green, blue) three-channel remote sensing images. Methods that do use near-infrared remote sensing images treat the near-infrared image merely as an additional plain channel. Because crop types in agricultural scenes are unevenly distributed, class imbalance in the data is severe; existing methods rely only on the model and the labeled data and do not carefully analyze the characteristics of crop remote sensing images, so the quality of the segmented crop regions is mediocre.
In view of the above, the present application provides a model training method, an image segmentation method, a terminal device and a computer medium, all applied to a segmentation model: the model training method trains the segmentation model, and the image segmentation method segments remote sensing images with the trained model and calculates crop-region areas. At the data level, the method makes full use of the near-infrared channel to improve segmentation accuracy; at the data processing stage, resampling reduces data imbalance; at the model stage, multi-model fusion and test-time augmentation (TTA) improve prediction accuracy. Together these greatly improve the accuracy of crop-area statistics from agricultural remote sensing images.
The model training method and the image segmentation method can be applied to terminal equipment in the field of computer technology. Specifically, the segmentation model may be used to segment and predict remote sensing images and to calculate crop-region areas from the prediction results; the calculated areas can serve business scenarios such as e-commerce, e-payment, securities, e-banking, tax, credit cards, online shopping and insurance, for example to assist a bank in transacting agricultural loans. The use of the terminal device is not specifically limited here.
The terminal equipment of the present application can be a server, or a system in which a server and a local terminal cooperate. Accordingly, the terminal equipment may include various components, such as units, sub-units, modules and sub-modules, which are all disposed in the server, or disposed partly in the server and partly in the local terminal.
Further, the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules, for example, software or software modules for providing distributed servers, or as a single software or software module, and is not limited herein. In some possible implementations, the model training method and/or the image segmentation method of the embodiments of the present application may be implemented by a processor calling computer-readable instructions stored in a memory.
Referring to fig. 1-2, fig. 1 is a schematic flow chart of a first embodiment of a model training method provided by the present application, and fig. 2 is a schematic operation diagram of an embodiment of a remote sensing image provided by the present application. As shown in fig. 1, the model training method proposed in this embodiment is applied to a segmentation model, and includes the following specific steps:
step S11: and acquiring a remote sensing image, cutting the remote sensing image into a plurality of images with preset sizes, and taking the images with the preset sizes as an original training image set.
Specifically, the segmentation model is a model for segmenting an image: it predicts which object or category each pixel of the input image belongs to. The segmentation model includes, but is not limited to, a Transformer segmentation network built on an encoder-decoder architecture and attention mechanisms; it may also be a detection model based on a visual recognition algorithm. The application framework and principle of the segmentation model are not specifically limited here.
A remote sensing image is an image of a target captured by various sensors; the segmentation model can analyze, reason and judge from the feature information of the recognition targets provided by the remote sensing image, so as to recognize the target or phenomenon. Remote sensing images can be obtained through remote sensing photography, data entry, network queries and the like. A remote sensing image may be a multi-channel image; for example, multiple bands or spectral segments may be captured by remote sensing technology, and the remote sensing image may include an RGB three-channel image and a near-infrared channel image.
Since a remote sensing image is usually used for remote surveying and covers a large area, in this embodiment the remote sensing image is cut into a plurality of original training images of a preset size to form the original training image set, and category prediction is then realized by semantic segmentation of every pixel of each original training image. The preset size includes, but is not limited to, regular sizes such as 512 × 512, 1024 × 1024, 1536 × 1536, 2048 × 2048 and 2248 × 2248; the choice of preset size may be determined by the original size of the remote sensing image, the segmentation purpose, the type of segmentation model and the like, and is not specifically limited here.
As shown in fig. 2, when the remote sensing image is cut, a dashed frame of the preset size may be generated on the remote sensing image, and the image is cut at the positions of the frame. Cutting may start from any corner (upper left, upper right, lower left, lower right, etc.); when the cut reaches the opposite side and an edge region of the remote sensing image does not meet the preset size, that edge region may be pixel-padded so that the image cut from it still meets the preset size. The original training image set thus covers all regions of the remote sensing image, ensuring the accuracy of the segmentation model's results.
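As an illustration of this tiling-with-padding step, the sketch below is an assumption for exposition rather than the patent's code; NumPy, the zero-padding choice and the function name are all illustrative:

```python
import numpy as np

def tile_image(image: np.ndarray, tile: int = 512) -> list[np.ndarray]:
    """Cut a remote sensing image (H, W, C) into tile x tile patches,
    zero-padding the bottom/right edge regions so that every patch
    meets the preset size (step S11)."""
    h, w, _ = image.shape
    pad_h = (tile - h % tile) % tile
    pad_w = (tile - w % tile) % tile
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    return [padded[y:y + tile, x:x + tile]
            for y in range(0, padded.shape[0], tile)
            for x in range(0, padded.shape[1], tile)]
```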
Step S12: performing filtering processing on the original training image set.
To improve the segmentation model's ability to capture the gradient-change information of the original training images, filtering processing may be performed on the original training image set before segmentation prediction; the filtering processing includes, but is not limited to, low-frequency filtering, high-pass filtering and the like.
Step S13: performing resampling processing on the filtered original training image set, and adding the resampled images to the training image set.
After filtering, the low-frequency information of the images in the original training image set has been filtered out, making the segmentation model more sensitive to gradient-change information. Segmentation models serve different purposes; when the remote sensing image contains many objects or categories and the sample sizes of those categories differ greatly, the filtered original training image set may be resampled to improve the model's training effect on the minority samples. Specifically, the resampling processing includes random-angle flipping, random-size cropping, Gaussian blur, normalization and the like; resampling increases the number of training images of minority categories to reduce class imbalance in the remote sensing data.
Step S14: training the segmentation model based on the training image set.
After the filtering processing and the resampling processing, the training image set is obtained and the segmentation model is trained on it. The training image set comprises the original training images and the training images added after resampling.
Thus, in this embodiment of the application, the model training method acquires a remote sensing image, cuts it into a plurality of images of a preset size, and takes the preset-size images as the original training image set; performs filtering processing on the original training image set; performs resampling processing on the filtered set and adds the resampled images to the training image set; and trains the segmentation model on the training image set. By filtering the original training image set, its contrast is amplified; by resampling it, the number of minority-class samples is increased; together these improve the training effect of the segmentation model.
In one embodiment, the original training image set comprises a first near-infrared channel image and a three-channel image, so the set carries both near-infrared and visible-band information. The images are rich in color and in geological and surface-environment information, letting the segmentation model detect objects and categories from richer image information and improving the accuracy of model training.
Specifically, please refer to fig. 3, wherein fig. 3 is a schematic flowchart of a second embodiment of the model training method provided in the present application. As shown in fig. 3, in the present embodiment, the step S12 further includes the following steps:
step S21: and performing Fourier transform on the first near-infrared channel image to obtain a frequency domain image of the near-infrared channel.
Specifically, the first near-infrared channel image is a remote sensing image captured in the 0.76 to 0.90 micrometer near-infrared band, and the three-channel image is a remote sensing image loaded and rendered through the red, green and blue channels. The original training image set can thus be understood as containing a near-infrared-band channel image plus a composite image in which three visible bands are rendered through the red, green and blue channels, i.e., four remote sensing bands in total.
The near-infrared band is located in a high-reflection area of the plant, and a large amount of plant information can be reflected by using the near-infrared band to perform remote sensing image acquisition, so that the near-infrared channel map can be used for identifying and classifying the plant. In the agricultural field, for example, when calculating the agricultural crop area of remote sensing data by a segmentation model, the detection accuracy of the segmentation model can be increased by using a near infrared channel image. Further, when the original training image set is subjected to filtering processing, in order to further enlarge the effective region of the near-infrared channel image, fast Fourier Transform (FFT) may be performed on the first near-infrared channel image of the original training image set to convert the first near-infrared channel image into a Fourier frequency domain image, where the specific formula is as follows:
F(ω) = ∫_{-∞}^{+∞} f(t)·e^{-iωt} dt
where F(ω) is the Fourier transform (image function) of f(t), and f(t) is the original function of F(ω).
Step S22: performing filtering processing on the frequency domain image to filter out the low-frequency information of the frequency domain image.
After the frequency domain image of the first near-infrared channel image is obtained, high-pass filtering is performed on it to filter out its low-frequency information. Specifically, the frequency domain image may be filtered by a first-order high-pass filter with the following formula:
Y(n)=αX(n)+(1-α)Y(n-1);
where α is the filter coefficient, X(n) is the current sampling value, Y(n-1) is the previous filter output, and Y(n) is the current filter output. This first-order filtering method weights the current sample together with the previous output to obtain the effective filtered value, so that the output feeds back into the input.
Further, to ensure that the parameters of the first-order high-pass filter suit crops, when the remote sensing images are used to train the segmentation model, a support vector machine (SVM) with a Gaussian kernel can be trained synchronously to learn the filter parameters, or the segmentation model can be fused with the Gaussian-kernel SVM, so that the filter parameters are updated at the same time as the segmentation model is updated from the loss function, improving the reliability of the filtering processing.
Step S23: performing inverse Fourier transform on the filtered frequency domain image to obtain a second near-infrared channel image.
After the frequency domain image has been filtered, inverse Fourier transform is performed on the filtered frequency domain image to convert it back into a near-infrared channel image, yielding the second near-infrared channel image. It can be understood that the first near-infrared channel image is the unfiltered image and the second near-infrared channel image is the filtered one. The formula for the inverse Fourier transform is as follows:
f(t) = (1/2π) ∫_{-∞}^{+∞} F(ω)·e^{iωt} dω
where F(ω) is the Fourier transform (image function) of f(t), and f(t) is the original function of F(ω).
Step S24: superimposing the second near-infrared channel image and the three-channel image, and adding the superimposed image to the original training image set.
After the inverse Fourier transform, the frequency domain image of the near-infrared channel is converted back into a spatial-domain image, namely the second near-infrared channel image. Because the second near-infrared channel image has been filtered, its low-frequency information is removed and the information with obvious gradient change is retained, so its contrast is increased and the plant information it contains is easier for the segmentation model to analyze during detection.
After the second near-infrared channel image is obtained, it is superimposed with the original three-channel image, and the superimposed image is added to the original training image set. Specifically, the superimposing manner may be sequential, that is, appending the second near-infrared channel image after the channels of the three-channel image; for example, if the channel order of the three-channel image is R, G, B and the near-infrared channel is denoted N, the superimposed image is a four-channel RGBN image.
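The sketch below walks through steps S21-S24 with NumPy. It is an assumption rather than the patent's implementation: an ideal radial mask with a cutoff parameter stands in for the first-order high-pass filter, whose coefficients the patent leaves to be learned.

```python
import numpy as np

def highpass_nir(nir: np.ndarray, cutoff: float = 0.05) -> np.ndarray:
    """Steps S21-S23: Fourier-transform the near-infrared channel, filter
    out its low-frequency information, then inverse-transform back to the
    spatial domain to obtain the second near-infrared channel image."""
    spectrum = np.fft.fftshift(np.fft.fft2(nir))   # frequency domain image, DC at center
    h, w = nir.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)      # distance from the DC component
    spectrum[radius < cutoff * min(h, w)] = 0      # illustrative hard high-pass mask
    return np.abs(np.fft.ifft2(np.fft.ifftshift(spectrum)))

def stack_rgbn(rgb: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Step S24: superimpose the filtered NIR channel after the RGB
    channels, yielding a four-channel RGBN image."""
    return np.dstack([rgb, highpass_nir(nir)])
```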
Therefore, in the embodiment of the application, the low-frequency filtering processing is carried out on the near-infrared channel image, so that the effective information and the effective area of the near-infrared channel image are further amplified, and the accuracy of model segmentation is improved.
In an embodiment, step S13 further includes the steps of: cutting a first image from the remote sensing image, wherein the size of the first image is located in a preset pixel area; and scaling the first image to a preset size so as to add the scaled first image into the training image set.
Specifically, in this embodiment, the resampling randomly crops an arbitrary area of the remote sensing image to obtain the first image, whose size lies within the preset pixel area. The specific range of the preset pixel area is related to the preset size of the original training image set, the training duration, the training requirements and other factors; for example, when the size of the original training image set is 512 × 512, the preset pixel area may be set to the 384-1536 pixel range, and a larger range may also be used, which is not limited here.
After the randomly cropped first image is obtained, the first image is scaled to the preset size, and the scaled first image is added to the training image set to increase the number of training images.
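A minimal sketch of this random-crop-and-rescale resampling, assuming OpenCV for the scaling; the 384-1536 crop range follows the example above, and the code assumes the remote sensing image is larger than the lower bound on each side:

```python
import random
import numpy as np
import cv2

def random_crop_resample(remote: np.ndarray, out_size: int = 512,
                         lo: int = 384, hi: int = 1536) -> np.ndarray:
    """Crop a random square region whose side lies in the preset pixel
    area [lo, hi], then scale it to the preset size out_size."""
    h, w = remote.shape[:2]
    side = random.randint(lo, min(hi, h, w))    # assumes h, w >= lo
    y = random.randint(0, h - side)
    x = random.randint(0, w - side)
    crop = remote[y:y + side, x:x + side]
    return cv2.resize(crop, (out_size, out_size), interpolation=cv2.INTER_LINEAR)
```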
In an embodiment, please refer to fig. 4-5, fig. 4 is a flowchart illustrating a third embodiment of the model training method provided by the present application, and fig. 5 is an operation diagram illustrating an embodiment of the resampling process provided by the present application. As shown in fig. 4, in the present embodiment, step S13 further includes the following steps:
step S31: cutting out a plurality of first images from the remote sensing image, wherein the size of the first images is located in a preset pixel area, and the first images are used as a first training image set.
Similarly to the resampling of the previous embodiment, random cropping is performed on arbitrary areas of the remote sensing image to cut out first images whose sizes lie within the preset pixel area; the first images cropped from the remote sensing image are used as the first training image set.
Step S32: acquiring a plurality of second images from the original training image set as a second training image set.
A plurality of second images are acquired from the images in the original training image set; the second images are remote sensing image areas cut at the preset size, and the plurality of second images are used as the second training image set.
Step S33: selecting a preset number of images from the first training image set and the second training image set, and performing flipping and/or cropping on the selected images.
The preset number of images are selected from the first training image set and the second training image set, and it is understood that the selected images may be all first images, all second images, or at least one first image and at least one second image.
The selected images are flipped and/or cropped. For example, in one embodiment, the selected image may be flipped at a random angle between 0° and 360°; in another embodiment, the selected image may be cropped to a random size, where the cropped size should be greater than 0 and smaller than or equal to the maximum of the first image's preset pixel area, or smaller than or equal to the preset size; in other embodiments, the selected image may first be flipped at a random angle and then cropped to a random size, or first cropped to a random size and then flipped at a random angle.
The random-angle flipping and/or random-size cropping enhance the training images, helping the segmentation model extract and learn the relevant features of the training images in a way that is unaffected by position, illumination and the like, and improving the accuracy of the segmentation model.
Step S34: stitching the preset number of processed images, and scaling the stitched image to the preset size so as to add the scaled image to the training image set.
After the selected images have been flipped and/or cropped, the preset number of processed images are stitched together. There are many stitching arrangements; for example, when the preset number is 4, the combined positions are upper-left, upper-right, lower-left and lower-right, giving 4! = 24 possible arrangements. Any stitching arrangement can therefore be selected at random to stitch the preset number of processed images. After image stitching, the scaled image is added to the training image set.
Further, after the selected images are cropped and/or flipped and before the preset number of images are stitched, the processed images may each be scaled to the preset size. After stitching, the stitched image is larger than the preset size by a factor determined by the preset number, so to keep the size of the training image set fixed, the stitched image must be scaled again down to the preset size. For example, when the preset size is 512 × 512 and the preset number is 4, each processed image is scaled to 512 × 512, the stitched image is 1024 × 1024, and it is then rescaled to 512 × 512 and added to the training image set.
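The sketch below illustrates the mosaic-style resampling of steps S33-S34 for a preset number of 4 (a 2 × 2 grid). It is an assumption, not the patent's code: vertical/horizontal flips stand in for random-angle flipping, and the tile arrangement is randomized simply by the order of random.sample.

```python
import random
import numpy as np
import cv2

def mosaic_resample(first_set: list, second_set: list,
                    out_size: int = 512) -> np.ndarray:
    """Select four images from the two training sets, flip and/or crop
    each, scale each to the preset size, stitch them into a 2x2 grid,
    then scale the stitched image back to the preset size."""
    chosen = random.sample(first_set + second_set, 4)
    tiles = []
    for img in chosen:
        if random.random() < 0.5:                          # random flip
            img = cv2.flip(img, random.choice([-1, 0, 1]))
        h, w = img.shape[:2]
        side = random.randint(max(1, min(h, w) // 2), min(h, w))  # random-size crop
        y, x = random.randint(0, h - side), random.randint(0, w - side)
        tiles.append(cv2.resize(img[y:y + side, x:x + side], (out_size, out_size)))
    stitched = np.vstack([np.hstack(tiles[:2]), np.hstack(tiles[2:])])
    return cv2.resize(stitched, (out_size, out_size))      # back to preset size
```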
It can be understood that in this embodiment the resampling includes both cropping arbitrary regions of the remote sensing image and stitching a preset number of first and second images; both the first training image set and the stitched-and-scaled images can be used as training images for the segmentation model. Resampling the original training image set significantly increases the number of minority-class samples and thereby improves the training effect of the segmentation model.
Optionally, the preset number includes 4 and/or 6; the number ratio of the first training image set to the second training image set is 4:6.
Specifically, when the preset number is 4, 4 images are randomly selected from the first training image set and the second training image set and stitched to form a new training image; when the preset number is 6, 6 images are randomly selected from the first and second training image sets and stitched to form a new training image.
In order to control the training degree of the segmentation model on the simple samples and the complex samples, the number ratio of the first training image set to the second training image set may be limited, that is, the weight ratio of the first image to the second image is set to 4:6.
Further, the preset number may include both 4 and 6. After step S34, the model training method further includes: when the preset number is 4, scaling the stitched image to the preset size and using the result as a third training image set; when the preset number is 6, scaling the stitched image to the preset size and using the result as a fourth training image set; the number ratio of the first training image set, the third training image set and the fourth training image set is 4:2:1.
Specifically, when the preset number includes both 4 and 6, resampling is performed both with a preset number of 4 and with a preset number of 6. The training images obtained by resampling are stitched from multiple images, so their sample features are complex; to preserve sample authenticity and control how much the segmentation model trains on simple versus complex samples, the number ratio of the first, third and fourth training image sets can be limited, and in this embodiment it can be 4:2:1.
It can be understood that the training image sets used for training the segmentation model include the original training image set cut from the remote sensing image at the preset size, the first training image set cropped arbitrarily from the remote sensing image, the third training image set obtained by resampling with a preset number of 4, and the fourth training image set obtained by resampling with a preset number of 6.
In one embodiment, step S14 further includes the steps of: acquiring the segmentation labels of the training image set; obtaining a smoothing weight based on the ratio of the number of training images of each segmentation label to the total number of images in the training image set; and calculating a loss function for each segmentation label based on the smoothing weights, so as to train the segmentation model with the loss function.
Specifically, remote sensing images are generally large; when they contain many objects or categories and the sample sizes of those categories differ greatly, the samples become imbalanced. For example, when the segmentation model is used to calculate crop-region areas, the images containing soybean may account for 0.9 of the total while the images containing wheat account for only 0.01, a 90-fold difference. To improve the segmentation model's training effect on the minority samples, this multiple difference can be corrected by setting smoothing weights.
Specifically, the smoothing weight is calculated from the ratio of the number of training images of each segmentation label to the total number of training images, with the following formula:
[The smoothing-weight formula appears in the original only as an image (BDA0003930448420000141); it expresses each label's weight in terms of the natural base e, the number of categories m and the ratios S_i.]
where m is the number of crop categories to be segmented (such as wheat, corn, soybean and rice), e is the natural base, and S_i is the ratio of the number of training images containing the i-th crop to the total number of training images.
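As a hedged illustration, the sketch below feeds per-class weights derived from the ratios S_i into a PyTorch cross-entropy loss. Because the exact weight formula is reproduced only as an image, the exp(1 − S_i)-normalized form here is merely an assumption consistent with the variables named above.

```python
import torch
import torch.nn as nn

def smoothing_weights(ratios: torch.Tensor) -> torch.Tensor:
    """ratios[i] = S_i, the share of training images containing crop i.
    Illustrative weight: exponentially up-weight rare classes, then
    normalize over the m categories (the patent's exact formula is
    given only as an image)."""
    w = torch.exp(1.0 - ratios)
    return w / w.sum()

# Usage sketch: m = 4 crop categories with imbalanced ratios.
ratios = torch.tensor([0.90, 0.05, 0.04, 0.01])
criterion = nn.CrossEntropyLoss(weight=smoothing_weights(ratios))
# logits: (batch, m, H, W); labels: (batch, H, W) -> per-pixel weighted loss.
```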
In an embodiment where the original training image set includes a first near-infrared channel image and a three-channel image, the training image set is additionally normalized. For example, the three-channel image is normalized following the ImageNet practice, with per-channel means [111.45638763, 113.8965259, 112.22587782] and variances [25.93080005, 25.29681979, 26.92847348]; the first near-infrared channel image is normalized using statistics computed over the full data set, with mean 118.30 and variance 30.93.
In an embodiment, the model training method may use the AdamW optimizer, or the two optimization algorithms AdaGrad and RMSProp, to obtain the model parameters of the segmentation network model; the learning rate of the AdamW optimizer may be set to 0.00006.
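A short sketch of these two training details in PyTorch. Treating the stated variance values as per-channel standard deviations, the RGBN channel order and the tensor layout are assumptions; the statistics and the 0.00006 learning rate come from the embodiments above:

```python
import torch

RGB_MEAN = torch.tensor([111.45638763, 113.8965259, 112.22587782])
RGB_STD = torch.tensor([25.93080005, 25.29681979, 26.92847348])
NIR_MEAN, NIR_STD = 118.30, 30.93

def normalize_rgbn(x: torch.Tensor) -> torch.Tensor:
    """Normalize a (4, H, W) RGBN tensor: the RGB channels with the
    stated statistics, the NIR channel with the full-data statistics."""
    mean = torch.cat([RGB_MEAN, torch.tensor([NIR_MEAN])]).view(4, 1, 1)
    std = torch.cat([RGB_STD, torch.tensor([NIR_STD])]).view(4, 1, 1)
    return (x - mean) / std

# Optimizer setup with the stated learning rate, given some `model`:
# optimizer = torch.optim.AdamW(model.parameters(), lr=0.00006)
```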
The present application further proposes an image segmentation method, which is applied to the segmentation model obtained by training according to any of the above embodiments, where the segmentation model includes a first segmentation model and a second segmentation model, the pixel size of the training image set of the first segmentation model is a first size, the pixel size of the training image set of the second segmentation model is a second size, and the second size is at least twice the first size.
Referring to fig. 6-7, fig. 6 is a flowchart illustrating a first embodiment of an image segmentation method provided by the present application, and fig. 7 is an operation diagram illustrating an embodiment of model integration provided by the present application. As shown in fig. 6, in the embodiment of the present application, an image segmentation method includes:
step S41: and inputting the remote sensing image to be detected into the first segmentation model to obtain a first prediction result of the remote sensing image to be detected.
The first segmentation model and the second segmentation model are both segmentation models trained with any of the above embodiments; the pixel size of the training image set of the second segmentation model is at least twice that of the first segmentation model, and the second size may be an integer multiple of the first size such as 2, 3 or 4 times. For example, the first size is 512 × 512 and the second size is 1536 × 1536, or the first size is 1024 × 1024 and the second size is 2048 × 2048; the first size, the second size and their multiple are not specifically limited here. It can be understood that the first segmentation model is a small-size segmentation model, the second segmentation model is a large-size segmentation model, and the two may use segmentation networks of the same architecture.
Step S42: inputting the remote sensing image to be detected into the second segmentation model to obtain a second prediction result of the remote sensing image to be detected, wherein the first prediction result and the second prediction result are predictions for the same segmented region of the remote sensing image to be detected.
Specifically, as shown in fig. 7, the first prediction result is obtained by the first segmentation model cutting the remote sensing image to be detected into a plurality of first cut images at the first size and analyzing them, and the second prediction result is obtained by the second segmentation model cutting the image into a plurality of second cut images at the second size and analyzing them. When the second size is twice the first size, for a given segmented region of the remote sensing image to be detected the second segmentation model analyzes one second cut image while the first segmentation model analyzes the four first cut images lying in the same region, so the first prediction result and the second prediction result are prediction results for the same segmented region.
Step S43: summing the first prediction result and the second prediction result with preset weights to obtain a classification result of the remote sensing image to be detected.
After the first prediction result and the second prediction result for the same segmented region of the remote sensing image to be detected are obtained, they are summed with the preset weights to obtain the classification result of the remote sensing image to be detected. An exemplary summation formula is as follows:
result=0.6*B5+0.4*B2
where B5 is the second prediction result from the second segmentation model, B2 is the first prediction result from the first segmentation model, and result is the classification result. With this formula, the second segmentation model contributes 60% of the classification result and the first segmentation model contributes 40%.
It can be understood that when the first and second segmentation models analyze the same segmented region, each prediction result is a confidence that the crops in that region belong to a certain object or category; after the first and second prediction results are summed with the preset weights, the fused prediction confidence is obtained, and the classification result of the remote sensing image to be detected follows from it.
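A sketch of this dual-model fusion; the per-pixel probability maps, their (H, W, num_classes) layout and the function name are assumptions, while the 0.6/0.4 weights follow the example formula:

```python
import numpy as np

def fuse_predictions(prob_small: np.ndarray, prob_large: np.ndarray,
                     w_large: float = 0.6, w_small: float = 0.4) -> np.ndarray:
    """Weighted sum of per-pixel class confidences from the small-size
    (first) and large-size (second) segmentation models over the same
    segmented region, then argmax for the per-pixel classification."""
    assert prob_small.shape == prob_large.shape  # (H, W, num_classes)
    result = w_large * prob_large + w_small * prob_small
    return result.argmax(axis=-1)
```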
In the embodiment, the remote sensing image to be detected is analyzed by using the large-size segmentation model and the small-size segmentation model respectively, and the final fused result is obtained after the analysis results are weighted, so that the accuracy of image segmentation is further improved while the image segmentation speed is ensured.
In an embodiment, after the classification result of the remote sensing image to be detected for a certain segmentation region is obtained, the classification results of all segmentation regions of the whole remote sensing image to be detected can be counted to obtain a final remote sensing classification map.
In an embodiment, after step S43, the remote sensing image segmentation method further includes: and carrying out morphological processing on the remote sensing image to be detected based on the classification result to obtain a remote sensing classification image.
Referring to fig. 8, fig. 8 is a schematic operation diagram of an embodiment of the morphological processing provided by the present application. As shown in fig. 8, a segmentation prediction graph can be obtained according to the classification results of all the segmented regions, and because the segmentation prediction graph is obtained by combining the prediction results of a plurality of segmented regions on the remote sensing image to be detected, a plurality of holes or small connected regions exist in the segmentation prediction graph, which affects the visual effect of the segmentation prediction graph.
Therefore, the holes can be filled and the small connected regions removed by morphological processing of the segmentation prediction map. Specific morphological processing methods include, but are not limited to, erosion, dilation and opening/closing operations. For example, in one embodiment, the segmentation prediction map may be dilated, enlarging the brighter objects and shrinking the darker ones, so that areas within closed regions smaller than a specified pixel size are filled; for example, areas smaller than 3 × 3 pixels within a closed region can be filled. In other embodiments, the segmentation prediction map can be eroded, taking the minimum value of the neighborhood at each position as that position's output gray value, to erase connected regions smaller than a specified size; for example, connected regions smaller than 4 × 4 pixels can be erased.
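A minimal sketch with OpenCV; the 3 × 3 and 4 × 4 kernels follow the pixel sizes in the examples above, and using a closing-then-opening sequence (rather than bare dilation and erosion) is an illustrative choice:

```python
import numpy as np
import cv2

def postprocess_mask(mask: np.ndarray) -> np.ndarray:
    """Morphological post-processing of the segmentation prediction map:
    closing fills holes smaller than 3x3, opening erases connected
    regions smaller than 4x4."""
    fill_kernel = np.ones((3, 3), np.uint8)
    erase_kernel = np.ones((4, 4), np.uint8)
    closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, fill_kernel)
    return cv2.morphologyEx(closed, cv2.MORPH_OPEN, erase_kernel)
```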
As shown in fig. 8, after morphological processing of the segmentation prediction map, its holes and small connected regions are removed, so that regions of the same category in the remote sensing classification map are contiguous, the visual effect is better, and the user experience is improved.
Referring to fig. 9-10, fig. 9 is a flowchart illustrating a second embodiment of the image segmentation method provided by the present application, and fig. 10 is an operation diagram illustrating an embodiment of image segmentation provided by the present application.
As shown in fig. 9, in an embodiment, step S41 further includes the following steps:
step S51: and inputting the remote sensing image to be detected into the first segmentation model so as to segment the remote sensing image to be detected into a plurality of images to be detected with first sizes.
Specifically, when the remote sensing image to be detected is cut, a dashed frame of the first size can be generated on it and the image cut at the frame's positions. Cutting may start from any corner (upper left, upper right, lower left, lower right, etc.); when an edge region of the image to be detected is reached and the region to be cut does not meet the first size, that region can be pixel-padded so that the image cut from it meets the first size, ensuring the accuracy of the segmentation model's results.
Step S52: analyzing the image to be detected, a first flipped image of the image to be detected, and a second flipped image of the image to be detected, respectively.
After the plurality of first-size images to be detected are obtained, each image to be detected is analyzed, and its vertically flipped copy (the first flipped image) and horizontally flipped copy (the second flipped image) are analyzed as well, so that each image to be detected is analyzed three times.
It can be understood that the first flipped image and the second flipped image are obtained by flipping the image to be detected in different directions; transforming the image to be detected in this way achieves the effect of test-time augmentation (TTA).
Step S53: arithmetically averaging the analysis result of the image to be detected, the analysis result of the first flipped image and the analysis result of the second flipped image, and taking the arithmetic mean as the first prediction result of the image to be detected.
The analysis result of the image to be detected, the analysis result of the first flipped image and the analysis result of the second flipped image are obtained; the three results are arithmetically averaged, and the average of the three analyses is taken as the final first prediction result of the image to be detected.
In this embodiment, because the image to be detected undergoes two flip transformations and the average of the two transformed results and the original result is taken as the final prediction, the results are smoothed and generalization improves, further raising the accuracy of the first segmentation model.
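A sketch of this three-way test-time augmentation, assuming the model returns per-pixel class probabilities as an (H, W, num_classes) array and that flipped predictions are flipped back before averaging (a step the patent leaves implicit):

```python
import numpy as np

def tta_predict(model, image: np.ndarray) -> np.ndarray:
    """Steps S52-S53: analyze the image, its up-down flip and its
    left-right flip, then take the arithmetic mean of the three results."""
    p0 = model(image)                               # original image
    p1 = model(image[::-1, :].copy())[::-1, :]      # first flipped image, un-flipped
    p2 = model(image[:, ::-1].copy())[:, ::-1]      # second flipped image, un-flipped
    return (p0 + p1 + p2) / 3.0
```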
In an embodiment, the second segmentation model processes the remote sensing image to be detected in a manner similar to steps S51 to S53; the cutting size of the second segmentation model (the second size) is at least twice the cutting size of the first segmentation model (the first size). The case where the second size is twice the first size is shown in fig. 10.
Referring to fig. 11, fig. 11 is a schematic frame diagram of a terminal device according to an embodiment of the present disclosure. As shown in fig. 11, the terminal device 100 includes a processor 101 and a memory 102 connected to the processor 101, wherein the memory 102 stores program data, and the processor 101 retrieves the program data stored in the memory 102 to execute all the methods described above.
Optionally, in an embodiment, the processor 101 is configured to execute program data to implement the following model training method: obtaining a remote sensing image, cutting the remote sensing image into a plurality of images with preset sizes, and taking the images with the preset sizes as an original training image set; filtering the original training image set; resampling the original training image set after filtering processing, and adding the images after resampling processing into the training image set; the segmentation model is trained based on a training image set.
In another embodiment, the processor 101 is configured to execute program data to implement an image segmentation method as follows: inputting the remote sensing image to be detected into a first segmentation model to obtain a first prediction result of the remote sensing image to be detected; inputting the remote sensing image to be detected into a second segmentation model to obtain a second prediction result of the remote sensing image to be detected, wherein the first prediction result and the second prediction result are prediction results of the same segmentation region relative to the remote sensing image to be detected; and adding the first prediction result and the second prediction result according to a preset weight to obtain a classification result of the remote sensing image to be detected.
The processor 101 may also be referred to as a Central Processing Unit (CPU). The processor 101 may be an electronic chip having signal processing capabilities. The processor 101 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The storage 102 may be a memory bank, a TF card, etc., and can store all information in the terminal device 100, including the input raw data, computer programs, intermediate results and final results, all kept in the storage 102, which stores and retrieves information at the locations specified by the processor 101. With the storage 102, the terminal device 100 has a memory function that ensures normal operation. The storage 102 of the terminal device 100 may be classified by purpose into main storage (internal storage) and auxiliary storage (external storage), that is, into internal and external storage. External storage is usually a magnetic medium, an optical disc or the like, and can hold information for a long time; internal storage refers to the storage components on the mainboard, which hold the data and programs currently being executed, are used only for temporary storage, and lose their contents when the power is turned off or cut.
Referring to fig. 12, fig. 12 is a block diagram of an embodiment of a computer-readable storage medium provided in the present application. As shown in fig. 12, the computer-readable storage medium 110 stores program instructions 111 which, when executed, implement the methods described above.
If the functional units in the embodiments of the present application are integrated into a unit that is implemented as a software functional unit and sold or used as an independent product, the unit may be stored in the computer-readable storage medium 110. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product: the computer-readable storage medium 110 includes a number of instructions in the program instructions 111 for causing a computer device (which may be a personal computer, a system server, a network device, or the like), an electronic device (for example, an MP3 or MP4 player, a mobile terminal such as a mobile phone, a tablet computer, or a wearable device, or a desktop computer), or a processor to execute all or part of the steps of the methods of the embodiments of the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media 110 (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It is to be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by program instructions. These program instructions 111 may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, executed via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These program instructions 111 may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium 110 produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These program instructions 111 may also be loaded onto a computer or other programmable data processing apparatus, causing a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions executed on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Any process or method description in the flowcharts, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art.
The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device (such as a personal computer, server, network device, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions).
The above description is only of embodiments of the present application and is not intended to limit the scope of the present application. Any equivalent structural or process transformation made using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present application.

Claims (12)

1. A model training method, applied to a segmentation model, comprising the following steps:
obtaining a remote sensing image, cutting the remote sensing image into a plurality of images of a preset size, and taking the images of the preset size as an original training image set;
filtering the original training image set;
resampling the filtered original training image set, and adding the resampled images into the training image set;
training the segmentation model based on the training image set.
2. The model training method of claim 1, wherein the original training image set comprises a first near-infrared channel image and a three-channel image; the step of filtering the original training image set includes:
performing Fourier transform on the first near-infrared channel image to obtain a frequency domain image of a near-infrared channel;
filtering the frequency domain image to filter low-frequency information of the frequency domain image;
performing inverse Fourier transform on the filtered frequency domain image to obtain a second near-infrared channel image;
and superimposing the second near-infrared channel image and the three-channel image, and adding the superimposed image into the original training image set.
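By way of illustration only, this filtering step could be realized with NumPy as below; the circular cutoff and its radius are assumed choices, since the claim does not specify the filter shape.

```python
import numpy as np

def highpass_filter_nir(nir: np.ndarray, cutoff: int = 8) -> np.ndarray:
    """High-pass filter a single-channel near-infrared image: Fourier
    transform, zero out a centred low-frequency disc, inverse transform."""
    freq = np.fft.fftshift(np.fft.fft2(nir))            # frequency domain image
    h, w = nir.shape
    yy, xx = np.ogrid[:h, :w]
    low_freq = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= cutoff ** 2
    freq[low_freq] = 0                                  # filter low-frequency information
    filtered = np.fft.ifft2(np.fft.ifftshift(freq))     # inverse Fourier transform
    return np.real(filtered)                            # second NIR channel image

# Superimpose the filtered NIR channel on the RGB three-channel image.
rgb = np.zeros((256, 256, 3), dtype=np.float32)
nir = np.zeros((256, 256), dtype=np.float32)
four_channel = np.dstack([rgb, highpass_filter_nir(nir)])   # (256, 256, 4)
```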
3. The model training method according to claim 1, wherein the step of resampling the filtered original training image set and adding the resampled image to the training image set comprises:
cutting out a first image from the remote sensing image, wherein the size of the first image lies within a preset pixel range;
and scaling the first image to the preset size, so as to add the scaled first image into the training image set.
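A minimal sketch of this crop-and-scale resampling, assuming OpenCV; the crop coordinates and the 256-pixel preset size are illustrative values only.

```python
import cv2
import numpy as np

def resample_crop(remote_sensing_image: np.ndarray,
                  top: int, left: int, crop_h: int, crop_w: int,
                  preset_size: int = 256) -> np.ndarray:
    """Cut a first image out of the remote sensing image and scale it to
    the preset size so it can be added to the training image set."""
    first_image = remote_sensing_image[top:top + crop_h, left:left + crop_w]
    return cv2.resize(first_image, (preset_size, preset_size),
                      interpolation=cv2.INTER_LINEAR)

# Example: a 300 x 400 crop, assumed to lie within the preset pixel range,
# scaled to 256 x 256 before joining the training set.
image = np.zeros((2000, 2000, 3), dtype=np.uint8)
patch = resample_crop(image, top=100, left=100, crop_h=300, crop_w=400)
```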
4. The model training method according to claim 1, wherein the step of resampling the filtered original training image set and adding the resampled image to the training image set comprises:
cutting a plurality of first images from the remote sensing image, wherein the sizes of the first images lie within a preset pixel range, and taking the first images as a first training image set;
acquiring a plurality of second images from the original training image set to serve as a second training image set;
selecting a preset number of images from the first training image set and the second training image set, and performing flipping processing and/or cropping processing on the selected images;
and stitching the preset number of processed images, and scaling the stitched image to the preset size, so as to add the scaled image into the training image set.
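This stitching step resembles mosaic-style augmentation. Below is a minimal sketch for the four-image case, assuming the selected images are square and equally sized; the horizontal-flip choice and the 2 x 2 layout are illustrative assumptions, not details from the claim.

```python
import random
import cv2
import numpy as np

def mosaic4(images: list, preset_size: int = 256) -> np.ndarray:
    """Randomly flip four equally sized square images, stitch them into a
    2 x 2 mosaic, and scale the mosaic back to the preset size."""
    assert len(images) == 4
    processed = [cv2.flip(img, 1) if random.random() < 0.5 else img
                 for img in images]                          # flipping processing
    top = np.hstack(processed[:2])
    bottom = np.hstack(processed[2:])
    stitched = np.vstack([top, bottom])                      # stitching
    return cv2.resize(stitched, (preset_size, preset_size))  # scale to preset size
```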
5. The model training method according to claim 4, wherein the preset number comprises 4 and/or 6, and the ratio of the number of images in the first training image set to the number of images in the second training image set is 4:6.
6. The model training method of claim 5, wherein the preset number comprises 4 and 6; after the step of stitching the preset number of processed images and scaling the stitched image to the preset size to add the scaled image into the training image set, the model training method further comprises:
when the preset number is 4, scaling the stitched image to the preset size and taking it as a third training image set;
when the preset number is 6, scaling the stitched image to the preset size and taking it as a fourth training image set;
wherein the ratio of the numbers of images in the first training image set, the third training image set, and the fourth training image set is 4:2:1.
7. The model training method of claim 1, wherein the step of training the segmentation model based on the training image set comprises:
acquiring segmentation labels of the training image set;
acquiring a smooth weight based on the ratio of the number of training images with each segmentation label to the total number of images in the training image set;
and calculating a loss function for each of the segmentation labels based on the smooth weights, so as to train the segmentation model with the loss function.
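One plausible reading of the smooth weight, sketched below with PyTorch: each label is weighted by a smoothed inverse of its share of the training set, so small classes contribute more to the loss. The square-root smoothing and the normalization are assumed choices, not specified by the claim.

```python
import numpy as np
import torch
import torch.nn as nn

def smooth_class_weights(label_counts: np.ndarray) -> torch.Tensor:
    """Weight each segmentation label by a smoothed inverse of its share
    of the training set, so small classes are not drowned out."""
    ratios = label_counts / label_counts.sum()          # per-label share
    weights = 1.0 / np.sqrt(ratios)                     # smoothed inverse frequency
    weights = weights / weights.sum() * len(weights)    # normalise around 1
    return torch.tensor(weights, dtype=torch.float32)

# Example: four crop classes with very unbalanced pixel counts.
counts = np.array([900_000, 60_000, 30_000, 10_000], dtype=np.float64)
criterion = nn.CrossEntropyLoss(weight=smooth_class_weights(counts))
```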
8. An image segmentation method, applied to a segmentation model obtained by the model training method according to any one of claims 1 to 7, wherein the segmentation model comprises a first segmentation model and a second segmentation model, the pixel size of the training image set of the first segmentation model is a first size, the pixel size of the training image set of the second segmentation model is a second size, and the second size is at least twice the first size;
the image segmentation method comprises the following steps:
inputting the remote sensing image to be detected into the first segmentation model to obtain a first prediction result of the remote sensing image to be detected;
inputting the remote sensing image to be detected into the second segmentation model to obtain a second prediction result of the remote sensing image to be detected, wherein the first prediction result and the second prediction result are prediction results for the same segmentation region of the remote sensing image to be detected;
and summing the first prediction result and the second prediction result according to preset weights to obtain a classification result of the remote sensing image to be detected.
9. The image segmentation method according to claim 8, wherein after the step of summing the first prediction result and the second prediction result according to preset weights to obtain the classification result of the remote sensing image to be detected, the image segmentation method further comprises:
and performing morphological processing on the remote sensing image to be detected based on the classification result to obtain a remote sensing classification map.
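By way of illustration, the morphological processing could be an opening followed by a closing applied per class mask, which removes isolated misclassified pixels and fills small holes; the OpenCV calls and the kernel size below are illustrative assumptions, as the claim does not name the operations.

```python
import cv2
import numpy as np

def clean_classification(class_map: np.ndarray, kernel_size: int = 5) -> np.ndarray:
    """Apply morphological opening then closing to each class mask of the
    per-pixel classification result, yielding a cleaner classification map."""
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    cleaned = class_map.copy()
    for cls in np.unique(class_map):
        mask = (class_map == cls).astype(np.uint8)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove speckles
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
        cleaned[mask.astype(bool)] = cls
    return cleaned
```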
10. The image segmentation method according to claim 8, wherein the step of inputting the remote sensing image to be detected into the first segmentation model to obtain the first prediction result of the remote sensing image to be detected comprises:
inputting the remote sensing image to be detected into the first segmentation model, so as to segment the remote sensing image to be detected into a plurality of images to be detected of the first size;
analyzing each image to be detected, a first flipped image of the image to be detected, and a second flipped image of the image to be detected, respectively;
and taking the arithmetic mean of the analysis result of the image to be detected, the analysis result of the first flipped image, and the analysis result of the second flipped image as the first prediction result of the image to be detected.
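This is a form of test-time augmentation. A minimal sketch follows, assuming the first flipped image is a horizontal flip and the second a vertical flip (the claim does not name the axes), and that `model(patch)` returns a prediction map spatially aligned with its input.

```python
import numpy as np

def tta_predict(model, patch: np.ndarray) -> np.ndarray:
    """Average the model's predictions over the original patch and its two
    flipped copies, flipping each prediction back before averaging."""
    pred = model(patch)
    pred_h = np.flip(model(np.flip(patch, axis=1)), axis=1)  # first flipped image
    pred_v = np.flip(model(np.flip(patch, axis=0)), axis=0)  # second flipped image
    return (pred + pred_h + pred_v) / 3.0                    # arithmetic mean
```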
11. A terminal device, comprising a processor and a memory coupled to the processor, wherein the memory stores program data, and the processor executes the program data stored in the memory to perform the method of any one of claims 1-10.
12. A computer-readable storage medium, characterized in that program instructions are stored therein, the program instructions, when executed, implementing the method according to any one of claims 1-10.
CN202211387197.XA 2022-11-07 2022-11-07 Model training method, image segmentation method, terminal device and computer medium Pending CN115908437A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211387197.XA CN115908437A (en) 2022-11-07 2022-11-07 Model training method, image segmentation method, terminal device and computer medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211387197.XA CN115908437A (en) 2022-11-07 2022-11-07 Model training method, image segmentation method, terminal device and computer medium

Publications (1)

Publication Number Publication Date
CN115908437A true CN115908437A (en) 2023-04-04

Family

ID=86470283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211387197.XA Pending CN115908437A (en) 2022-11-07 2022-11-07 Model training method, image segmentation method, terminal device and computer medium

Country Status (1)

Country Link
CN (1) CN115908437A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination