CN107818555B - Multi-dictionary remote sensing image space-time fusion method based on maximum posterior
- Publication number: CN107818555B (application CN201711022705.3A)
- Authority: CN (China)
- Legal status: Active
Classifications
- G06T5/50—Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
- G06F18/24—Classification techniques
- G06T3/4053—Super resolution, i.e. output image resolution higher than sensor resolution
- G06T3/4061—Super resolution by injecting details from a different spectral band
- G06T2207/10032—Satellite or aerial image; Remote sensing
- G06T2207/10036—Multispectral image; Hyperspectral image
- G06T2207/20081—Training; Learning
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention relates to a multi-dictionary remote sensing image space-time fusion method based on maximum posterior. First, the high- and low-resolution difference images are roughly classified, image blocks are extracted for each class, and a training sample matrix is formed per class, from which a multi-class set of high-low resolution dictionaries is trained. Multi-dictionary learning accounts for the fact that different landforms in the image have different shapes and textures, so each trained dictionary is more targeted and better captures the differences between landforms. Before solving the sparse coefficients, a maximum a posteriori model selects the dictionary group: the maximum posterior probability is computed from a likelihood function linking region pixels to each dictionary and a prior function over the dictionary assignments of neighbouring pixels, and each pixel of the low-resolution difference input image is thereby assigned to a dictionary group. Each group of pixels is sparsely coded under the low-resolution dictionary of its group to obtain sparse representation coefficients, which are then multiplied by the corresponding high-resolution dictionary to obtain the high-resolution difference image.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a multi-dictionary remote sensing image space-time fusion method based on maximum posterior.
Background
Due to hardware technology and budget constraints, it is difficult to obtain remote sensing images with both high spatial resolution and high temporal coverage. For example, the Moderate Resolution Imaging Spectroradiometer (MODIS) can observe the same region every day, giving high temporal resolution, but its spatial resolution ranges from 250 to 1000 meters; this is too low to resolve detailed features of the target area, so the surface coverage and ecosystem changes of a given region cannot be monitored well. On the other hand, remote sensing images with higher spatial resolution, in the 10-30 meter range, can be obtained from SPOT and from land-observation satellites such as the United States Landsat. Such images are generally suitable for land use, surface mapping and the prediction of coverage changes, but the revisit interval of these high-spatial-resolution satellites is typically half a month to a month, and the long interval, combined with the effects of severe weather and climate conditions, makes this type of satellite unable to detect rapid surface changes caused by seasonal variation or disturbances from human activity. In order to better observe ground-object changes and analyze surface dynamics, space-time fusion techniques have been developed; they fuse existing images rich in temporal information with images rich in spatial information to obtain high-spatial-resolution images at short time intervals.
The Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) is a classical space-time fusion model, and many improved models based on it have since been proposed. STARFM predicts a central pixel value from spectrally similar neighbouring pixels, weighted by distance, spectral similarity and time difference, which greatly improves the precision of the fusion result. Later, an image super-resolution method based on sparse representation was proposed: applying sparse representation theory, the mapping between high- and low-resolution images is obtained by learning a high-resolution dictionary and a low-resolution dictionary, thereby realizing image super-resolution. Sparse representation was then applied to remote sensing image space-time fusion, yielding the sparse-representation-based space-time fusion model (SPSTFM): the change in reflectance is predicted by jointly training a dictionary between the low-spatial-resolution, high-temporal-coverage difference images and the high-spatial-resolution, low-temporal-coverage difference images, which makes the model robust both to gradual geophysical changes and to surface-coverage-type changes.
However, the above models have the following problems: the STARFM model does not fully exploit the available data and lacks an effective means of handling areas that change over short periods, since it assumes that the surface-coverage types and their proportions remain constant during the observation period; models based on sparse representation learn and reconstruct with a single unified dictionary, so the reconstruction quality is constrained by how well that one dictionary fits all regions.
Disclosure of Invention
The aim of the invention is, through multi-dictionary learning, to train more targeted dictionaries for the different shapes and textures of different landforms in an image, so as to better capture the differences between landforms, and to introduce a Bayesian framework so that prior information is fully used to select the dictionary during image generation. On the basis of the sparse-representation space-time fusion model, the invention provides a multi-dictionary space-time fusion model based on the maximum posterior: multiple dictionaries are learned for the different types of regions in the image; the dictionary-selection problem for the input image is treated as a maximum-a-posteriori problem, in which a likelihood function between region pixels and dictionaries and a prior function over the dictionary assignments of neighbouring pixels are constructed under a Bayesian framework, and the dictionary choice for each region is obtained by optimization; finally, a high-resolution image is generated from the earlier and later high-resolution remote sensing images and the simultaneous low-resolution image under the classical sparse-representation space-time fusion framework.
The technical scheme of the invention is a multi-dictionary remote sensing image space-time fusion method based on maximum posterior, which comprises the following steps: step 1, classifying training images, comprising the following substeps,
step 1.1, for any two times t_{1+2n} and t_1, difference the high-resolution images, and likewise difference the low-resolution images at t_{1+2n} and t_1, obtaining a high-resolution difference image and a low-resolution difference image respectively, where n = 1, 2, 3;
step 1.2, carrying out simple rough classification on the high-resolution differential image and the low-resolution differential image respectively to obtain a plurality of groups of high-resolution and low-resolution training sample images of different classes;
step 2, learning a multi-class dictionary, comprising the following substeps,
step 2.1, according to the different categories obtained in the step 1, partitioning each category of training sample images, and stacking each image block into columns to form a training sample matrix;
2.2, putting the high-resolution and low-resolution training sample matrixes of the same category together for joint training;
2.3, respectively training high-low resolution dictionary pairs belonging to each class of training sample matrix to obtain dictionaries of different groups;
step 3, difference the low-resolution image at the time t_{1+n} to be reconstructed and the low-resolution image at time t_{1+2n} to obtain an input image, and select a dictionary group pixel by pixel for the input image, comprising the following substeps,
step 3.1, sampling the low-resolution training sample image of each category obtained in the step 1 to obtain the likelihood probability of any pixel point under each category;
step 3.2, obtaining the prior probability of the pixel point according to the classification condition of the surrounding points of the input image pixel point;
step 3.3, traversing each pixel point in the input image, and selecting a pixel point dictionary group for each pixel point by adopting a maximum posterior probability method;
step 4, reconstruct the high-resolution image at time t_{1+n}, comprising the following substeps,
step 4.1, partitioning the input image according to the selection result of the dictionary group, putting the image blocks of the same group together and stacking the image blocks into columns to form an input matrix, and finally generating the input matrix with the same number as the dictionary group;
step 4.2, aiming at each input matrix, the low-resolution dictionary obtained in the step 2.3 is utilized, and an OMP method is adopted to obtain a corresponding sparse representation coefficient;
4.3, multiplying the high-resolution dictionary of each group with the sparse representation coefficient of the corresponding group to obtain each group image block of the reconstructed image; then, overlapping each group of image blocks, and obtaining a reconstructed difference image by adopting an average value in an overlapping area;
step 4.4, add the reconstructed difference image to, or subtract it from, the known high-resolution image at time t_{1+2n} (according to the differencing order) to obtain the reconstructed high-resolution image L21;
step 5, difference the low-resolution image at time t_{1+n} and the low-resolution image at time t_1 as a new input image, repeat steps 3-4 to obtain a reconstructed high-resolution image L22, and average the two images L21 and L22 to obtain the final reconstruction result.
Moreover, in step 2.3 the high-low resolution dictionary pair of each class is trained from that class's training sample matrix by the K-SVD method, implemented as follows,
let the training samples be x = {x_1, x_2, …, x_N}, with each x_i ∈ R^M; sparse representation assumes that these signals can be linearly represented by a few atoms of an overcomplete dictionary matrix, namely:
x=Dα (1)
overcomplete dictionary
D = {d_1, d_2, …, d_N} ∈ R^{M×N}, M < N (2)
Coefficient of sparseness
α = (α_1, α_2, …, α_N)^T ∈ R^N (3)
where M and N are the rows and columns of the dictionary matrix respectively; each column of the dictionary matrix is called a dictionary atom d, there are N atoms in total, and each atom has dimension M; α is the sparse representation coefficient, most of whose entries are 0 with only a few non-zero; if the number of non-zero entries is K, and K < M, then α is called K-sparse;
as can be seen from the principles of sparse representation and sparse coding, each class's high- and low-resolution dictionary pair is obtained by optimizing formula (4),

{D*, α*} = argmin_{D,α} ||Z − Dα||_2^2 + λ||α||_1 (4)

where λ is a regularization parameter balancing the sparsity of the representation against the reconstruction error; ||α||_1 is the l1 norm, the sum of absolute values; ||Z − Dα||_2 is the l2 norm, the modulus in the usual sense; Z = [Y; X], where Y is the high-resolution training sample matrix and X is the low-resolution training sample matrix, both normalized; D = [D_l; D_h], with D_l and D_h the low-resolution and high-resolution dictionaries respectively; and D*, α* denote the dictionary and sparse representation coefficients obtained by optimizing the right-hand side;
the high-low resolution dictionary pair D_l, D_h is obtained by optimizing the formula with the K-SVD method, with the following specific steps:
a. inputting a training sample matrix Z, wherein the atom number of the dictionary is N, and the iteration number is J;
b. initializing a dictionary, randomly selecting N columns from the training sample matrix Z as the initial value of the dictionary;
c. solving a sparse representation coefficient α according to the initialized dictionary by using an Orthogonal Matching Pursuit (OMP) algorithm;
d. updating the kth column of the dictionary:
① find the set of samples whose sparse codes use the kth atom;
② compute the representation error matrix after removing the kth atom's contribution, E_k = Z − Σ_{j≠k} d_j α^j_T, where d_j is the jth column of the dictionary and α^j_T is the jth row of the coefficient matrix α;
③ keep only the columns of E_k at the non-zero positions of the kth row of α, forming the restricted error matrix E_k^R;
④ perform a singular value decomposition of E_k^R, thereby updating the kth column of the dictionary and the corresponding sparse coefficients;
e. repeat step d until the number of iterations is met, obtaining the final high-low resolution dictionary pair.
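The column update in step d can be sketched in a few lines of numpy. This is a minimal illustration, not the patent's implementation: the function name `ksvd_update` is invented, and one full sweep over all atoms is shown (in practice it alternates with OMP sparse coding).

```python
import numpy as np

def ksvd_update(Z, D, A):
    """One K-SVD sweep: update each dictionary column d_k (and the matching
    row of the coefficient matrix A) via a rank-1 SVD of the representation
    error restricted to the samples that actually use atom k."""
    for k in range(D.shape[1]):
        users = np.nonzero(A[k, :])[0]        # samples whose code uses atom k
        if users.size == 0:
            continue
        A[k, users] = 0.0                     # remove atom k's contribution
        E_k = Z[:, users] - D @ A[:, users]   # restricted error matrix E_k^R
        U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
        D[:, k] = U[:, 0]                     # new unit-norm atom
        A[k, users] = s[0] * Vt[0, :]         # updated coefficients
    return D, A
```

Because the rank-1 SVD is the best rank-1 approximation of the restricted error, each column update can only decrease (or keep) the overall reconstruction error for the fixed sparse supports.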
Moreover, in step 3.3 the dictionary-group class of each pixel is selected by the maximum posterior probability method, implemented as follows,
the pixel value of a certain pixel point in the known input image is xmnUnder the condition, m and n are coordinates, the posterior probability under the assumption of each dictionary group is calculated through a Bayes formula (5), the maximum posterior probability is taken as the final dictionary group of the pixel point, wherein the Bayes formula is as follows:
in the formula xmnPixel value representing a pixel point, ciRepresenting an ith class dictionary; p (x)mn|ci) Represented in class i dictionary ciMiddle pixel value is xmnThe probability of (2) is called likelihood probability; p (c)i) Is the prior probability of the class i dictionary; p (x)mn) Representing a pixel value of xmnA priori of P (c)i|xmn) Is a posterior probability, i.e. the pixel value at this point is known to be xmnUnder the condition (2), the point belongs to the dictionary class ciThe probability of (d);
due to the denominator P (x)mn) Is a constant independent of the dictionary group, i.e. P (x) is a constant independent of the dictionary group, i.e. no matter the pixel belongs to any dictionary groupmn) Is always kept constant, so that there is no influence in calculating the maximum a posteriori probability, P (x)mn) Without taking part in the calculation, the following formula can be converted,
wherein, I represents the category of the dictionary, for an input image I, the image size is MxN, and pixel values x of all pixel points in the image are taken outmn(M is 1,2, 1.. multidot.M; N is 1, 2.. multidot.N), obtaining the dictionary type I through formula (6), and combining the same pixel points of I together to form the image block I of the ith dictionaryi。
Furthermore, the simple coarse classification described in step 1.2 is a binary classification.
Compared with the prior art, the invention has the following advantages and beneficial effects: the proposed multi-dictionary learning method accounts for the different shapes and textures of different landforms in the image, so the trained dictionaries are targeted and better capture the differences between landforms, and the maximum-posterior method under the Bayesian framework selects better dictionaries for the different regions of the input image, thereby improving the quality of the reconstructed high-resolution image.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
FIG. 2 is a diagram illustrating a Bayesian maximum a posteriori classification scheme in accordance with an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is explained in detail in the following by combining the drawings and the embodiment.
The algorithm of the invention introduces multiple dictionaries and a maximum posterior. After the high- and low-resolution difference images are obtained, they are roughly classified, image blocks are extracted for each class, and a training sample matrix is formed per class, so that a multi-class high-low resolution dictionary can be trained. Multi-dictionary learning accounts for the different shapes and textures of different landforms in the image, so each trained dictionary is more targeted and better captures the differences between landforms. Before solving the sparse coefficients, a maximum a posteriori model selects the dictionary group: the maximum posterior probability is computed from a likelihood function linking region pixels to each dictionary and a prior function over the dictionary assignments of neighbouring pixels, so that each pixel of the low-resolution difference input image is assigned to a dictionary group. Each group of pixels is sparsely coded under the low-resolution dictionary of its group to obtain sparse representation coefficients, which are multiplied by the corresponding high-resolution dictionary to obtain the high-resolution difference image. Because a Bayesian framework is used to compute the maximum posterior probability, the prior information is used more fully and the reconstruction is closer to the real image.
Referring to fig. 1, the flow chart of the embodiment of the present invention includes the following 3 steps:
Step one: training the multi-class dictionaries
(1) Difference the Landsat images Y3 and Y1 at times t3 and t1 to obtain L31, and difference the MODIS images X3 and X1 at times t3 and t1 to obtain M31. The interval between the two times is not fixed and may span several intermediate times; the images at the two end times are mainly used to reconstruct the images at the intermediate times.
(2) Perform a simple rough classification of L31 and M31, for example by first binarizing and then screening out regions whose area exceeds a threshold, obtaining binary maps of the lake, the lake-degradation track and the woodland. Using these binary maps, separate three groups of training sample images (lake, lake-degradation track, woodland) from the difference images.
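A toy version of such a rough classification, using simple value thresholds on the difference image in place of the binarisation-plus-area screening described above; the function name `coarse_classify` and the thresholds are invented for illustration.

```python
import numpy as np

def coarse_classify(diff, thresholds=(-0.05, 0.05)):
    """Crude per-pixel classification of a difference image into three
    classes by value range (strong decrease / little change / strong
    increase).  The patent only requires a 'simple rough classification',
    e.g. binarisation followed by an area screen; this is a stand-in."""
    lo, hi = thresholds
    labels = np.ones(diff.shape, dtype=int)      # class 1: little change
    labels[diff <= lo] = 0                       # class 0: strong decrease
    labels[diff >= hi] = 2                       # class 2: strong increase
    return labels
```

Each label value then indexes one training-sample group, and later one dictionary pair.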
(3) Partition each class of training sample image into overlapping blocks, stack each image block into a column to form a training sample matrix, and put the high- and low-resolution sample matrices together for joint training, obtaining several pairs of high-low resolution dictionaries. The K-SVD method is as follows,
for an image, it is divided into image blocks, each of which is represented by { x after being stacked in columns1,x2,…,xN},xiIs of the formula RN. Sparse representation it is assumed that these signals can be linearly represented by several atoms in an overcomplete dictionary matrix, namely:
x=Dα (1)
overcomplete dictionary
D = {d_1, d_2, …, d_N} ∈ R^{M×N}, M < N (2)
Coefficient of sparseness
α = (α_1, α_2, …, α_N)^T ∈ R^N (3)
where M and N are the rows and columns of the dictionary matrix respectively; each column of the dictionary matrix is called a dictionary atom d, there are N atoms in total, and each atom has dimension M; α is the sparse representation coefficient, most of whose entries are 0 with only a few non-zero; if the number of non-zero entries is K, and K < M, then α is K-sparse. The dictionary pairs are obtained by optimizing formula (4),

{D*, α*} = argmin_{D,α} ||Z − Dα||_2^2 + λ||α||_1 (4)

where λ is a regularization parameter balancing the sparsity of the representation against the reconstruction error; ||α||_1 is the l1 norm, the sum of absolute values; ||Z − Dα||_2 is the l2 norm, the modulus in the usual sense; Z = [Y; X]; D = [D_l; D_h], with D_l and D_h the low-resolution and high-resolution dictionaries respectively; and D*, α* denote the dictionary and sparse representation coefficients obtained by optimizing the right-hand side. Y is the high-resolution training matrix formed by dividing the high-resolution difference image into blocks and stacking each block into a column; similarly, X is the low-resolution training matrix formed by stacking the low-resolution difference image blocks into columns. The high- and low-resolution training matrices are combined so that, during training, the high- and low-resolution sparse representation coefficients of image blocks at the same position are identical. The training matrices are normalized to account for reflectance differences between frequency bands and between the high- and low-resolution images; X and Y denote the normalized training matrices.
The K-SVD method is used to optimize formula (4) and obtain the high-low resolution dictionaries D_l, D_h; the specific steps are as follows: a) input the training sample matrix Z, the number of dictionary atoms N, and the number of iterations J;
b) initialize the dictionary, randomly selecting N columns from the training sample matrix Z as the initial value of the dictionary;
c) solving a sparse representation coefficient α according to the initialized dictionary by using an Orthogonal Matching Pursuit (OMP) algorithm;
for a given overcomplete dictionary, the sparse representation coefficient α is not unique; to make α as sparse as possible, the solution with the fewest non-zero values must be found, and the problem becomes:
min ||α||_0  s.t.  x = Dα (7)
where ||α||_0 is the l0 norm, the number of non-zero values; the number of atoms N in D is greater than the dimension M of the signal x, i.e. M < N, which guarantees the overcompleteness of the dictionary.
Solving for the sparse representation coefficients can be converted into solving an l1-norm problem; here the Orthogonal Matching Pursuit (OMP) algorithm is used. Its main idea is: from the dictionary matrix D, select the atom (i.e. column) that best matches the sample matrix Z, construct a sparse approximation, and compute the signal residual; then repeatedly select the atom that best matches the current residual and iterate, so that Z is represented by a linear combination of the selected atoms plus a final residual. When the residual is within a negligible range, Z is the linear combination of those atoms. At each iteration the coefficients are re-fitted so that the residual is orthogonal to all selected atoms, which makes every iteration optimal, and the residual shrinks as the iterations proceed.
d) Updating the kth column of the dictionary:
① find the set of samples whose sparse codes use the kth atom;
② compute the representation error matrix after removing the kth atom's contribution, E_k = Z − Σ_{j≠k} d_j α^j_T, where d_j is the jth column of the dictionary and α^j_T is the jth row of the coefficient matrix α;
③ keep only the columns of E_k at the non-zero positions of the kth row of α, forming the restricted error matrix E_k^R;
④ perform a singular value decomposition of E_k^R, thereby updating the kth column of the dictionary and the corresponding sparse coefficients;
e) repeat step d) until the number of iterations is met, obtaining the final dictionary.
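The OMP idea described in step c) — greedy atom selection followed by a least-squares re-fit, which keeps the residual orthogonal to the chosen atoms — can be sketched compactly; the function name `omp` and the fixed sparsity `k` (assumed at least 1) are illustrative assumptions.

```python
import numpy as np

def omp(D, z, k):
    """Orthogonal Matching Pursuit sketch: pick the atom most correlated
    with the residual, re-fit all chosen atoms by least squares (the
    'orthogonalisation'), and repeat k times."""
    residual, support = z.copy(), []
    alpha = np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))   # best-matching atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], z, rcond=None)
        residual = z - D[:, support] @ coef          # orthogonal to support
    alpha[support] = coef
    return alpha
```

For an orthonormal dictionary the greedy choice is exact, which makes the behaviour easy to verify on a toy signal.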
(4) Input the low-resolution image X2 at the time t2 to be reconstructed, and difference it with the low-resolution image X3 at time t3 to obtain the input image X32; at the intermediate time, only a low-resolution image is known.
Step two: selecting each pixel's dictionary group by the maximum posterior probability method
The main idea of obtaining the likelihood function is to perform a rough classification on a series of low-resolution difference images, then perform pixel sampling on each category respectively, count the gray values of the pixels, draw a probability density curve of each dictionary category, and obtain the probability of each gray value under dictionaries of different groups according to the probability density curve. For example, the rough classification is roughly classified into three categories according to lakes, lake degradation tracks and woodlands, and the corresponding dictionaries are also classified into the three categories.
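The likelihood estimation just described can be sketched with a normalised histogram standing in for the probability density curve; the helper name `make_likelihood` and its bin settings are illustrative assumptions.

```python
import numpy as np

def make_likelihood(samples, bins=32, value_range=(-1.0, 1.0)):
    """Estimate P(x | c_i) for one class from sampled pixel values of that
    class, using a normalised histogram as a discrete probability-density
    curve.  Returns a callable mapping a pixel value to its density."""
    hist, edges = np.histogram(samples, bins=bins, range=value_range,
                               density=True)
    def likelihood(v):
        # locate the bin containing v, clipping values at the range edges
        idx = np.clip(np.searchsorted(edges, v, side="right") - 1, 0, bins - 1)
        return hist[idx]
    return likelihood
```

One such callable per rough class (e.g. lake, lake-degradation track, woodland) gives the likelihood term used later in the Bayes formula.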
The prior probability is determined mainly from the dictionary classes of the surrounding points. For example, when the surrounding points all have the same classification, the probability that the middle point shares that classification is high and may be set to 0.8, while the probability that it falls in any other class is low, say 0.2; when the grouping results of the surrounding points disagree, the middle point is taken to be equally likely to belong to any class. This prior function is deliberately simple: the surrounding points could be divided into more cases, and more points than the four neighbours (up, down, left, right) could participate in determining the prior, which would make the prior probability more accurate and the classification result better.
In the embodiment, the maximum posterior probability method is applied to X32 to select the dictionary-group class of each pixel: the probabilities of the whole image under the different grouping hypotheses are traversed, and the grouping with the maximum probability is taken as the final dictionary-group selection; this is implemented as follows,
and the dictionary group of the pixel points is selected by adopting a Bayes framework, and a Bayes formula is as follows:
in the formula xmnPixel value, C, representing a pixel pointiRepresenting the ith class dictionary. P (x)mn|ci) Represented in class i dictionary ciMiddle pixel value is xmnIs called likelihood probability, P (c)i) Is the prior probability of the class i dictionary. The prior probability reflects the probability that the intermediate point is assigned to the dictionary set based on the grouping results of the surrounding points. P (x)mn) Representing a pixel value of xmnA priori of P (c)i|xmn) Is a posterior probability, i.e. the pixel value at this point is known to be xmnUnder the condition (2), the point belongs to the dictionary class ciThe probability of (c).
The selection of the dictionary category of the pixel points adopts the Maximum A Posteriori (MAP) principle, which means that the known pixel value is xmnUnder the condition (3), the posterior probability under each dictionary group hypothesis is calculated by the formula (5), and the hypothesis with the highest probability is taken as the final group. Denominator P (x)mn) Is a constant independent of the dictionary class, i.e. P (x) regardless of the dictionary class to which the point belongsmn) The value of (2) is always kept unchanged, so that the maximum posterior probability is calculated without any influence and can not participate in the calculation.
I represents the category of the dictionary, which can be determined by the maximum posterior probability, for an input image I, the image size is M multiplied by N, and the pixel value x of all pixel points in the image is takenmn(M is 1,2, 1.. multidot.M; N is 1, 2.. multidot.N), obtaining the dictionary type I through formula (6), and combining the same pixel points of I together to form the image block I of the ith dictionaryi。
Referring to fig. 2, a schematic diagram of determining a pixel's group by the maximum posterior probability in the embodiment of the present invention: each pixel in the image carries a likelihood probability, a pixel value, a dictionary group and a prior probability, so the image can be seen as having four layers. Suppose the group of the middle point is unknown while the groups of the surrounding points are known. The first step obtains the likelihood probability of the middle point from its pixel value; the second step obtains its prior probability from the dictionary groups of the surrounding points; the third step multiplies the likelihood probability by the prior probability to obtain the posterior probability of the middle point, and the group with the maximum posterior probability is taken as its grouping result.
Step three: high-resolution image reconstruction
Partition the input image X32 according to the dictionary-group selection result: pixels with the same class i are combined to form the image blocks I_i of the ith dictionary; the image blocks of each group are put together and stacked into columns to form an input matrix, finally generating as many input matrices as there are dictionary groups. For each input matrix, the corresponding low-resolution dictionary is used and the corresponding sparse representation coefficients are obtained by OMP.
Multiply the high-resolution dictionary D_h of each group by that group's sparse representation coefficients to obtain the reconstructed image blocks of each group; then overlap the image blocks, taking the mean in overlapping regions, to obtain the reconstructed difference image Y32. Suppose the sparse representation coefficient of the input matrix X32 solved by the OMP algorithm is α; the high-resolution difference image Y32 can then be expressed as:

Y32 = D_h * α (8)

Since the difference image Y32 approximates L3 − L2, subtracting Y32 from the known high-resolution image L3 yields the reconstructed high-resolution image L21.
In the same manner, the difference image X12 between X1 and X2 is used as the input image to reconstruct a high-resolution image L22; L21 and L22 are then added and averaged to obtain the final reconstruction result.
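The reconstruction of eq. (8) can be sketched in numpy as below; the `omp` helper and the toy dictionaries are illustrative stand-ins for the learned group dictionary pair (an orthonormal low-resolution dictionary is used so that the sparse code is recovered exactly), not the patent's actual K-SVD dictionaries:

```python
import numpy as np

# Sketch of eq. (8): sparse codes alpha come from OMP against the
# low-resolution dictionary Dl, then the high-resolution difference
# patch is Dh @ alpha.

def omp(D, x, n_nonzero):
    """Naive orthogonal matching pursuit: greedy atom selection plus a
    least-squares refit of the coefficients on the chosen support."""
    residual, support = x.copy(), []
    alpha = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    alpha[support] = coef
    return alpha

rng = np.random.default_rng(0)
Dl, _ = np.linalg.qr(rng.standard_normal((32, 32)))  # toy low-res dictionary
Dh = rng.standard_normal((64, 32))                   # toy high-res dictionary

true_alpha = np.zeros(32)
true_alpha[[2, 7, 19]] = [1.5, -0.8, 0.6]            # a 3-sparse code
x32 = Dl @ true_alpha                                # low-res input patch

alpha = omp(Dl, x32, n_nonzero=3)
y32 = Dh @ alpha                                     # high-res patch, eq. (8)
print(np.allclose(alpha, true_alpha))                # → True
```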
The specific embodiments described herein are merely illustrative of the spirit of the invention. Those skilled in the art may make various modifications, additions or substitutions to the described embodiments without departing from the spirit of the invention or the scope defined in the appended claims.
Claims (4)
1. A multi-dictionary remote sensing image space-time fusion method based on maximum posterior is characterized by comprising the following steps:
step 1, classifying training images, comprising the following substeps,
step 1.1, differencing the high-resolution images of any two interval times t1+2n and t1, and differencing the low-resolution images of the same two interval times t1+2n and t1, to obtain a high-resolution difference image and a low-resolution difference image respectively, where n = 1, 2, 3;
step 1.2, carrying out simple rough classification on the high-resolution differential image and the low-resolution differential image respectively to obtain a plurality of groups of high-resolution and low-resolution training sample images of different classes;
step 2, learning a multi-class dictionary, comprising the following substeps,
step 2.1, according to the different categories obtained in the step 1, partitioning each category of training sample images, and stacking each image block into columns to form a training sample matrix;
2.2, putting the high-resolution and low-resolution training sample matrixes of the same category together for joint training;
2.3, respectively training high-low resolution dictionary pairs belonging to each class of training sample matrix to obtain dictionaries of different groups;
step 3, differencing the low-resolution image of time t1+n to be reconstructed with the low-resolution image of time t1+2n to obtain an input image, and selecting a dictionary group pixel by pixel for the input image, comprising the following substeps,
step 3.1, sampling the low-resolution training sample image of each category obtained in the step 1 to obtain the likelihood probability of any pixel point under each category;
step 3.2, obtaining the prior probability of the pixel point according to the classification condition of the surrounding points of the input image pixel point;
step 3.3, traversing each pixel point in the input image, and selecting a pixel point dictionary group for each pixel point by adopting a maximum posterior probability method;
step 4, reconstructing the high-resolution image of time t1+n, comprising the following substeps,
step 4.1, partitioning the input image according to the selection result of the dictionary group, putting the image blocks of the same group together and stacking the image blocks into columns to form an input matrix, and finally generating the input matrix with the same number as the dictionary group;
step 4.2, aiming at each input matrix, the low-resolution dictionary obtained in the step 2.3 is utilized, and an OMP method is adopted to obtain a corresponding sparse representation coefficient;
4.3, multiplying the high-resolution dictionary of each group with the sparse representation coefficient of the corresponding group to obtain each group image block of the reconstructed image; then, overlapping each group of image blocks, and obtaining a reconstructed difference image by adopting an average value in an overlapping area;
step 4.4, adding the reconstructed difference image to, or subtracting it from, the known high-resolution image of time t1+2n to obtain a reconstructed high-resolution image L21;
step 5, differencing the low-resolution image of time t1+n with the low-resolution image of time t1 to form the input image, repeating steps 3-4 to obtain a reconstructed high-resolution image L22, and adding L21 and L22 and averaging to obtain the final reconstruction result.
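The overlap handling of step 4.3 can be sketched as follows; the patch size, positions and values are illustrative choices, not from the patent:

```python
import numpy as np

# Reconstructed patches are pasted back at their positions, and pixels
# covered by more than one patch take the mean of the overlapping values.

def assemble(patches, positions, shape, p):
    """Average overlapping p-by-p patches back into an image of `shape`."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for patch, (r, c) in zip(patches, positions):
        acc[r:r + p, c:c + p] += patch
        cnt[r:r + p, c:c + p] += 1
    return acc / np.maximum(cnt, 1)   # guard against uncovered pixels

# Two overlapping 2x2 patches on a 2x3 image: the middle column is
# covered twice and averages to 2.
patches = [np.full((2, 2), 1.0), np.full((2, 2), 3.0)]
positions = [(0, 0), (0, 1)]
img = assemble(patches, positions, (2, 3), 2)
print(img)   # → [[1. 2. 3.] [1. 2. 3.]]
```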
2. The maximum-a-posteriori-based multi-dictionary remote sensing image space-time fusion method according to claim 1, characterized in that: in step 2.3, the K-SVD method is adopted to train, for each class of training sample matrix, the high-low resolution dictionary pair belonging to that class, as follows,
Let the training sample matrix be x = {x1, x2, …, xN}, with xi ∈ R^M. Sparse representation assumes that these signals can be linearly represented by several atoms in an overcomplete dictionary matrix, namely:

x = Dα (1)

with the overcomplete dictionary

D = {d1, d2, …, dN} ∈ R^(M×N) (2)

and the sparse coefficients

α = {α1, α2, …, αN}^T ∈ R^N (3)

where M and N are the rows and columns of the dictionary matrix respectively, with M < N; each column of the dictionary matrix is called a dictionary atom d, there are N dictionary atoms in total, and the dimension of each atom is M;
As can be seen from the principle of sparse representation and sparse coding, each class's high-low resolution dictionary pair is obtained by optimizing formula (4):

(D*, α*) = argmin_{D,α} ||Z − Dα||₂² + λ||α||₁ (4)

where λ is a regularization parameter used to balance the sparsity of the sparse representation against the reconstruction error; ||α||₁ is the l1 norm, representing the sum of absolute values; ||Z − Dα||₂ is the l2 norm; Z = [Y; X], where Y is the high-resolution training sample matrix and X is the low-resolution training sample matrix, both normalized; D = [Dl; Dh], where Dl and Dh are the low-resolution and high-resolution dictionaries respectively; and D*, α* denote the dictionary and sparse representation coefficients obtained by optimizing the right-hand side of the equation;
The K-SVD method is adopted to optimize formula (4) and obtain the high-low resolution dictionary pair Dl, Dh, with the following specific steps:
a. inputting a training sample matrix Z, wherein the atom number of the dictionary is N, and the iteration number is J;
b. initializing the dictionary by randomly selecting N columns from the training sample matrix Z as the initial dictionary atoms;
c. solving the sparse representation coefficients α for the initialized dictionary using the orthogonal matching pursuit (OMP) algorithm;
d. updating the kth column of the dictionary:
① finding the set of training samples whose sparse representation uses the kth dictionary atom;
② calculating the representation error matrix after removing the kth column of the dictionary, Ek = Z − Σ_{j≠k} dj·αj, where dj is the jth column of the dictionary and αj is the jth row of the coefficient matrix;
③ retaining only those columns of Ek corresponding to the non-zero positions of the kth coefficient row, forming the restricted error matrix Ek^R;
④ performing singular value decomposition on Ek^R, thereby updating the kth column of the dictionary and the corresponding sparse coefficients;
e. and d, repeating the step d until the iteration times are met, and obtaining the final high-low resolution dictionary pair.
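Steps a-e above can be sketched in numpy as below; for brevity the sparse-coding stage uses 1-sparse codes (one best-matching atom per sample) in place of full OMP, and the matrix sizes are illustrative:

```python
import numpy as np

# Compact K-SVD sketch: initialise from random training columns, then
# alternate sparse coding and rank-1 SVD updates of each atom.

def ksvd(Z, n_atoms, n_iter, rng):
    # b. initialise the dictionary with random training columns, unit norm
    D = Z[:, rng.choice(Z.shape[1], size=n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        # c. sparse coding: each sample keeps only its best-matching atom
        corr = D.T @ Z
        idx = np.argmax(np.abs(corr), axis=0)
        A = np.zeros((n_atoms, Z.shape[1]))
        A[idx, np.arange(Z.shape[1])] = corr[idx, np.arange(Z.shape[1])]
        # d. atom-by-atom update via SVD of the restricted error matrix
        for k in range(n_atoms):
            users = np.where(A[k] != 0)[0]        # ① samples using atom k
            if users.size == 0:
                continue
            # ②-③ error of those samples with atom k's contribution removed
            Ek = Z[:, users] - D @ A[:, users] + np.outer(D[:, k], A[k, users])
            # ④ rank-1 SVD update of atom k and its coefficients
            U, s, Vt = np.linalg.svd(Ek, full_matrices=False)
            D[:, k] = U[:, 0]
            A[k, users] = s[0] * Vt[0]
    return D, A

rng = np.random.default_rng(1)
Z = rng.standard_normal((8, 100))                 # toy training matrix
D, A = ksvd(Z, n_atoms=4, n_iter=5, rng=rng)
print(D.shape, np.allclose(np.linalg.norm(D, axis=0), 1.0))
```

In the patent's setting Z would be the stacked [Y; X] training matrix of one class, so the returned D splits into the coupled Dl/Dh pair.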
3. The maximum-a-posteriori-based multi-dictionary remote sensing image space-time fusion method according to claim 1 or 2, characterized in that: in step 3.3, each pixel point selects the category of its dictionary group by the maximum posterior probability method, implemented as follows,
given that the pixel value of a certain pixel point in the input image is xmn, where m and n are coordinates, the posterior probability under each dictionary-group hypothesis is calculated by the Bayes formula (5), and the group with the maximum posterior probability is taken as the final dictionary group of the pixel point. The Bayes formula is:

P(ci|xmn) = P(xmn|ci)·P(ci) / P(xmn) (5)

where xmn denotes the pixel value of the pixel point and ci denotes the ith class dictionary; P(xmn|ci) is the probability that a pixel value in the ith class dictionary ci equals xmn, called the likelihood probability; P(ci) is the prior probability of the ith class dictionary; P(xmn) is the prior of the pixel value xmn; and P(ci|xmn) is the posterior probability, i.e. the probability that the point belongs to dictionary class ci given that its pixel value is xmn;
when calculating the maximum posterior probability, P(xmn) does not participate in the calculation, so formula (5) can be converted into the following formula:

I = argmax_i P(xmn|ci)·P(ci) (6)

where I denotes the selected dictionary category. For an input image of size M×N, the pixel values xmn of all pixel points are taken, with m = 1, 2, …, M and n = 1, 2, …, N; the dictionary category I of each point is obtained through formula (6), and the pixel points sharing the same I are combined together to form the image blocks Ii of the Ith dictionary.
4. The maximum a posteriori based spatio-temporal fusion method for multi-dictionary remote sensing images according to claim 1, characterized in that: the simple coarse classification described in step 1.2 is a binary classification.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711022705.3A CN107818555B (en) | 2017-10-27 | 2017-10-27 | Multi-dictionary remote sensing image space-time fusion method based on maximum posterior |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107818555A CN107818555A (en) | 2018-03-20 |
CN107818555B true CN107818555B (en) | 2020-03-10 |
Family
ID=61604143
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711022705.3A Active CN107818555B (en) | 2017-10-27 | 2017-10-27 | Multi-dictionary remote sensing image space-time fusion method based on maximum posterior |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107818555B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108932710B (en) * | 2018-07-10 | 2021-11-12 | 武汉商学院 | Remote sensing space-time information fusion method |
CN108921116B (en) * | 2018-07-10 | 2021-09-28 | 武汉商学院 | Time-varying information extraction method for remote sensing image |
CN109410165B (en) * | 2018-11-14 | 2022-02-11 | 太原理工大学 | Multispectral remote sensing image fusion method based on classification learning |
CN111242848B (en) * | 2020-01-14 | 2022-03-04 | 武汉大学 | Binocular camera image suture line splicing method and system based on regional feature registration |
CN112183595B (en) * | 2020-09-18 | 2023-08-04 | 中国科学院空天信息创新研究院 | Space-time remote sensing image fusion method based on grouping compressed sensing |
CN113160100A (en) * | 2021-04-02 | 2021-07-23 | 深圳市规划国土房产信息中心(深圳市空间地理信息中心) | Fusion method, fusion device and medium based on spectral information image |
CN113449683B (en) * | 2021-07-15 | 2022-12-27 | 江南大学 | High-frequency ultrasonic sparse denoising method and system based on K-SVD training local dictionary |
CN116958717B (en) * | 2023-09-20 | 2023-12-12 | 山东省地质测绘院 | Intelligent geological big data cleaning method based on machine learning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102360498A (en) * | 2011-10-27 | 2012-02-22 | 江苏省邮电规划设计院有限责任公司 | Reconstruction method for image super-resolution |
CN104103052A (en) * | 2013-04-11 | 2014-10-15 | 北京大学 | Sparse representation-based image super-resolution reconstruction method |
CN104778671A (en) * | 2015-04-21 | 2015-07-15 | 重庆大学 | Image super-resolution method based on SAE and sparse representation |
CN105374060A (en) * | 2015-10-15 | 2016-03-02 | 浙江大学 | PET image reconstruction method based on structural dictionary constraint |
CN106227015A (en) * | 2016-07-11 | 2016-12-14 | 中国科学院深圳先进技术研究院 | A kind of hologram image super-resolution reconstruction method and system based on compressive sensing theory |
CN106919952A (en) * | 2017-02-23 | 2017-07-04 | 西北工业大学 | EO-1 hyperion Anomaly target detection method based on structure rarefaction representation and internal cluster filter |
Non-Patent Citations (1)
Title |
---|
"基于组稀疏表示的自然图像超分辨率算法研究";李祥灿;《中国优秀硕士学位论文全文数据库 信息科技辑》;20140715(第7期);I138-785页 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107818555B (en) | Multi-dictionary remote sensing image space-time fusion method based on maximum posterior | |
Postels et al. | Sampling-free epistemic uncertainty estimation using approximated variance propagation | |
CN109035142B (en) | Satellite image super-resolution method combining countermeasure network with aerial image prior | |
CN110109060B (en) | Radar radiation source signal sorting method and system based on deep learning network | |
CN113128134B (en) | Mining area ecological environment evolution driving factor weight quantitative analysis method | |
CN111369487B (en) | Hyperspectral and multispectral image fusion method, system and medium | |
CN103488968B (en) | The mixed pixel material of remote sensing images constitutes decomposer and the method for becoming more meticulous | |
CN107730482B (en) | Sparse fusion method based on regional energy and variance | |
CN111461146B (en) | Change detection method based on sparse cross reconstruction | |
CN107491793B (en) | Polarized SAR image classification method based on sparse scattering complete convolution | |
CN113327231B (en) | Hyperspectral abnormal target detection method and system based on space-spectrum combination | |
CN106845343B (en) | Automatic detection method for optical remote sensing image offshore platform | |
CN110490894B (en) | Video foreground and background separation method based on improved low-rank sparse decomposition | |
CN109671019B (en) | Remote sensing image sub-pixel mapping method based on multi-objective optimization algorithm and sparse expression | |
CN114723631A (en) | Image denoising method, system and device based on depth context prior and multi-scale reconstruction sub-network | |
CN106097290A (en) | SAR image change detection based on NMF image co-registration | |
CN110956601A (en) | Infrared image fusion method and device based on multi-sensor mode coefficients and computer readable storage medium | |
CN113421198B (en) | Hyperspectral image denoising method based on subspace non-local low-rank tensor decomposition | |
Wang et al. | SAR images super-resolution via cartoon-texture image decomposition and jointly optimized regressors | |
CN112200752B (en) | Multi-frame image deblurring system and method based on ER network | |
Yufeng et al. | Research on SAR image change detection algorithm based on hybrid genetic FCM and image registration | |
CN113222206A (en) | Traffic state prediction method based on ResLS-C deep learning combination | |
CN113989612A (en) | Remote sensing image target detection method based on attention and generation countermeasure network | |
Vyshnevyi | Two–Stage Segmentation of SAR Images Distorted by Additive Noise with Uncorrelated Samples | |
CN111899226A (en) | Hyperspectral image target prior optimization method based on multitask sparse learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||