CN116503245A - Multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures


Info

Publication number
CN116503245A
Authority
CN
China
Prior art keywords
resolution
image
layer
dimensional
resolution image
Prior art date
Legal status
Pending
Application number
CN202310270935.0A
Other languages
Chinese (zh)
Inventor
胡维强
徐长贵
李洋冰
马立涛
李盼盼
柳雪青
刘成
刘再振
张波
王威
李晨晨
乔方
陈建奇
Current Assignee
CNOOC Energy Technology and Services Ltd
Original Assignee
CNOOC Energy Technology and Services Ltd
Priority date
Filing date
Publication date
Application filed by CNOOC Energy Technology and Services Ltd


Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 3/00: Geometric image transformations in the plane of the image
            • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
              • G06T 3/4053: Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
          • G06T 17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
          • G06T 5/00: Image enhancement or restoration
            • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
          • G06T 2207/00: Indexing scheme for image analysis or image enhancement
            • G06T 2207/20: Special algorithmic details
              • G06T 2207/20081: Training; Learning
              • G06T 2207/20212: Image combination
                • G06T 2207/20221: Image fusion; Image merging
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
          • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
            • Y02A 90/30: Assessment of water resources

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures, comprising the following steps: acquiring a dataset comprising low-resolution and high-resolution images; preprocessing the data, namely aligning and fusing feature points of the multi-scale images and producing a time-series dataset; inputting the time-series dataset into an EDVR model for sequence-image super-resolution reconstruction training, obtaining the mapping relation between the fused high-resolution images and the original low-resolution images; according to this mapping relation, reconstructing unfused low-resolution images into high-resolution images with the EDVR model, computing the corresponding structural similarity and peak signal-to-noise ratio indexes, and testing the model; and performing three-dimensional reconstruction of the two-dimensional slices with Avizo software to obtain a three-dimensional plug-shaped coal rock core enriched with pore detail information. Based on deep learning, the invention can rapidly and accurately obtain high-resolution images with a large field of view, reduces cost and increases efficiency, requires no large amount of manual operation, and reduces manual error.

Description

Multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures
Technical Field
The invention belongs to the technical field of coal rock image processing, and particularly relates to a multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures.
Background
In core image analysis, a low-resolution core image displays a larger field of view and has better global representativeness, but cannot accurately characterize small-scale information. A high-resolution core image can accurately represent small-scale core information, but can generally display only a small field of view.
Currently, there are two existing methods for acquiring core images:
1. and acquiring a high-resolution image of the core by adopting Computer Tomography (CT). Due to the insufficient resolution of CT images, small pores of micrometer-scale size cannot be fully characterized. In order to obtain a clear image of small-sized pore structures on the order of micrometers and below, rock can only be cut into samples of a few millimeters or even smaller, which to some extent results in a lack of representativeness of the samples.
2. Acquiring high-resolution two-dimensional core images by optical microscope imaging of core slices, scanning electron microscope imaging of rock samples, and similar means. These existing methods for acquiring and displaying large-field-of-view high-resolution images are costly, have low imaging efficiency and speed, place high technical demands on operators, and are not suitable for popularization and application.
Disclosure of Invention
The invention aims to provide a multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures, in particular a method for reconstructing low-resolution images of two-dimensional slices of a plug-shaped coal rock core into high-resolution images. Based on deep learning, the method can quickly and accurately obtain high-resolution images displaying a large field of view, requires no large amount of manual operation, and reduces manual error.
In order to solve the above technical problems, the invention adopts the following technical scheme: a multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures comprises the following steps,
S1: acquiring a dataset comprising low-resolution and high-resolution images;
S2: preprocessing the data, namely aligning and fusing feature points of the multi-scale images and producing a time-series dataset;
S3: inputting the time-series dataset into an EDVR model for sequence-image super-resolution reconstruction training to obtain the mapping relation between the fused high-resolution images and the original low-resolution images;
S4: according to the mapping relation, reconstructing unfused low-resolution images into high-resolution images with the EDVR model, computing the corresponding structural similarity and peak signal-to-noise ratio indexes, and performing model testing;
S5: performing three-dimensional reconstruction of the two-dimensional slices with Avizo software to obtain a three-dimensional plug-shaped coal rock core enriched with pore detail information.
Further, step S1 comprises the following steps,
S11: obtaining low-resolution images with a pixel size of 9 μm and high-resolution images with a pixel size of 0.3 μm by CT scanning;
S12: the ratio of the number of high-resolution two-dimensional slice images to the number of low-resolution two-dimensional slice images is 1:3.
Further, step S2 comprises the following steps,
S21: upsampling the low-resolution images according to the optimal registration ratio while retaining the image information at the original low resolution;
S22: performing SIFT feature point alignment between the upsampled low-resolution images and the high-resolution images;
S23: fusing and reconstructing the aligned high-resolution image information into the low-resolution images to serve as the label set, and producing the time-series dataset from the retained original low-resolution image samples.
Further, step S3 comprises the following steps,
S31: producing the time-series dataset with a sliding window and inputting it into the model;
S32: performing inter-frame alignment with pyramid cascading deformable convolution layers;
S33: performing TSA fusion using the temporal relationships between frames and the spatial relationships within frames, assigning pixel-level aggregation weights to each frame;
S34: feeding the feature block F, fused with spatial and temporal attention, into the SR network for reconstruction, reconstructing high-resolution two-dimensional core slice support-frame images;
S35: increasing the feature block size through an upsampling layer using sub-pixel convolution to generate an HR-level image, connecting the predicted image with the upsampled two-dimensional core slice reference-frame image obtained from the input, and outputting the super-resolution image of the predicted image.
Further, step S32 comprises the following steps,
S321: outputting, through convolution, the respective feature blocks of the two-dimensional core slice sequence images within the same time window as the first-level feature information; obtaining the downsampled feature information of each level using convolutions with stride 2, the image size decaying by a factor of 2 per level;
S322: starting from the top level L3, concatenating and fusing the level's two-dimensional core slice reference-frame and support-frame feature blocks to output the level's offset, then applying a deformable convolution to the offset and the level's two-dimensional core slice support-frame feature blocks to output the level's aligned feature blocks;
S323: upsampling the offset generated at the third level and the third level's aligned feature map by bilinear interpolation and passing them to the second level, whose offset thus derives not only from that level's 2 input feature blocks but also from the third level's offset; likewise, the aligned feature blocks output at a level derive not only from that level's deformable convolution output but also from the third level's aligned feature blocks, and so on until the aligned two-dimensional core slice support-frame feature blocks are output;
S324: outside the pyramid structure, concatenating and fusing the first level's two-dimensional core slice reference-frame feature blocks with the first level's aligned feature information to output an offset, then applying a deformable convolution with this offset to the first level's aligned feature information to output the final aligned version of the two-dimensional core slice support-frame feature blocks.
Further, step S33 comprises the following steps,
S331: applying convolution to the input two-dimensional core slice frame images to learn an embedding space, and computing in this embedding space the feature similarity between the input two-dimensional core slice support-frame images and the two-dimensional core slice reference-frame image, obtaining feature blocks fused with temporal attention;
S332: putting the temporally attended feature blocks into a pyramid-shaped spatial attention structure, first applying two stride-2 convolution downsamplings, then outputting, from the top level downward through upsampling, addition and dot-multiplication operations, the feature block combined with spatial attention fusion.
Further, the reconstruction method in S34 is bicubic interpolation.
Furthermore, the invention also provides an apparatus for carrying out the above data processing method.
Furthermore, the invention also provides an apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above data processing method when executing the computer program.
Furthermore, the invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above data processing method.
The invention has the advantages and positive effects that:
the invention is based on deep learning, can rapidly and accurately obtain the high-resolution image with large visual field, and compared with the traditional method, the invention has the advantages of cost reduction and synergy, does not need a large amount of manual operation, and reduces manual errors.
Drawings
FIG. 1 is an overall flow chart of an embodiment of the present invention.
Fig. 2 is a schematic diagram of coal rock sample core sampling according to an embodiment of the present invention.
Fig. 3 is a network architecture diagram of an EDVR model in accordance with an embodiment of the present invention.
FIG. 4 is a block diagram of a PCD network in accordance with an embodiment of the present invention.
Detailed Description
The following describes the embodiments of the present invention clearly and completely with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that the terms "mounted," "connected," and "coupled" should be construed broadly unless otherwise explicitly specified and defined; the specific meaning of these terms in the present invention will be understood by those of ordinary skill in the art.
Embodiments of the invention are further described below with reference to the accompanying drawings:
As shown in Fig. 1, the multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures comprises the following steps,
S1: a dataset comprising low-resolution and high-resolution images is acquired. Specifically, S1 includes the following steps.
S11: low-resolution images with a pixel size of 9 μm and high-resolution images with a pixel size of 0.3 μm are acquired by CT scanning.
S12: the ratio of the number of high-resolution two-dimensional slice images to the number of low-resolution two-dimensional slice images is 1:3.
Specifically, in this embodiment, CT scanning is applied to a plug-shaped coal rock sample core to obtain core sequence slices, generating a number of low-resolution two-dimensional core slice images with a pixel size of 9 μm. A sub-sample is then taken inside the coal rock sample core and CT-scanned, generating a number of high-resolution two-dimensional slice images with a pixel size of 0.3 μm. The number of sub-sample slices is smaller than that of the plug-shaped coal rock sample: the ratio of sub-sample slices to plug-sample slices is N:3N, where N is a natural number; in this embodiment the ratio is 1:3. The sampling of the coal rock sample core is illustrated in Fig. 2, where A denotes the sub-sample inside the coal rock sample core and B denotes the plug-shaped coal rock sample.
In digital core image alignment, because the plug sample and the sub-sample have different resolutions, direct image alignment cannot accurately determine the position of the sub-sample within the plug sample. A method is therefore needed that accurately locates the sub-sample in the plug sample while still performing accurate image alignment across different image resolutions. Because SIFT features are invariant under image scaling, rotation, translation and the like, the SIFT algorithm is used for image feature point alignment and image information fusion.
S2: and (3) preprocessing data, namely aligning and fusing characteristic points of the multi-scale images, and manufacturing a time sequence data set. Specifically, S2 comprises the following steps,
s21: and (3) up-sampling the low-resolution two-dimensional core slice image to 3um in resolution according to the optimal registration proportion, and retaining the image information under the original low resolution.
S22: and (3) aligning SIFT feature points of the up-sampled low-resolution two-dimensional core slice image and the high-resolution two-dimensional core slice image, and then fusing and reconstructing the aligned image information with the resolution of 0.3um in the image with the resolution of 3 um.
S23: and taking the reserved original corresponding area of the image as a model input, fusing and reconstructing the aligned high-resolution image information in the low-resolution image as a tag set, and manufacturing a time sequence data set according to the reserved original low-resolution image sample.
The time-series dataset is generated as follows: using a sliding-window strategy, 3 images are taken sequentially from front to back in scanning order to form 1 sequence, i.e. one sample. For example, images 1 to 3 form the first sequence, images 2 to 4 the second, images 3 to 5 the third, and so on until images N-2 to N form the last sequence, giving N-2 samples in total and completing the generation of the time-series dataset. A short code sketch of this construction follows.
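The sliding-window construction can be sketched in a few lines of Python; this is a toy illustration in which `slices` is assumed to be the ordered list of slice images.

```python
# A toy sketch of the sliding-window sequence construction described above.
def make_sequences(slices):
    """Group N ordered slices into N-2 overlapping 3-frame samples:
    (1..3), (2..4), ..., (N-2..N)."""
    return [slices[i:i + 3] for i in range(len(slices) - 2)]

# Example: 5 slices yield 3 samples: [s1,s2,s3], [s2,s3,s4], [s3,s4,s5].
```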
Because CT slice data have a certain sequential relationship, which describes the continuity and spatial extension of pores and fractures, and because the generated high-resolution images should likewise be continuous and complete once this relationship is learned, the invention adopts an EDVR model capable of super-resolution reconstruction of sequence images, fully exploiting the information between adjacent frames to generate super-resolution images. The model structure is shown in Fig. 3.
S3: and inputting the time sequence data set into an EDVR model for training of super-resolution reconstruction of the sequence image, and obtaining the mapping relation between the fused high-resolution image and the original low-resolution image. Specifically, S3 comprises the following steps,
s31: a time series data set is produced by sliding a window and is input into a model.
The specific process of model input is as follows: firstly, sequentially taking 3 images from front to back according to a scanning sequence by a sliding window strategy to form 1 sequence, namely one sample. For example, taking 1 st to 3 rd as the first sequence, 2 nd to 4 th as the second sequence, 3 rd to 5 th as the third sequence, until N-2 th to N th as the last sequence, forming N-2 samples in total, and finishing the generation of the time sequence data set.
And continuously inputting 3 frames of images, namely three images of each sample, respectively marking the images as t-1 frames, t frames and t+1 frames, wherein the t frames are reference frames, namely frames (super-division objects) of the model needing up-sampling, and the t-1 frames and the t+1 frames are support frames. The input frame image is subjected to a deblurring module consisting of a plurality of residual blocks and a convolution layer with the step length of 2, so that the image characteristics are extracted, and the subsequent inter-frame alignment quality is improved.
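The following PyTorch sketch illustrates such a feature-extraction / pre-deblur head. The channel count, the number of residual blocks and the single stride-2 convolution are illustrative assumptions, not the exact EDVR configuration.

```python
# A minimal sketch of the pre-deblur / feature-extraction head; channel
# count, residual-block count and the single stride-2 conv are assumptions.
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(c, c, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)          # residual connection

class PreDeblur(nn.Module):
    """Applied to each of the 3 input frames before PCD alignment."""
    def __init__(self, c=64):
        super().__init__()
        self.head = nn.Conv2d(1, c, 3, padding=1)            # grayscale slice in
        self.down = nn.Conv2d(c, c, 3, stride=2, padding=1)  # stride-2 conv
        self.res = nn.Sequential(*[ResBlock(c) for _ in range(5)])

    def forward(self, x):                # x: (B, 1, H, W), one frame
        return self.res(self.down(self.head(x)))
```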
S32: and by pyramid cascading deformable convolution layers, namely PCD inter-frame alignment, the accuracy of the inter-frame alignment is improved, so that assistance is provided for subsequent fusion SR reconstruction.
Where PCD inter-frame alignment is a coarse to fine, top-down process. The module is based on a Deformable Convolutional Network (DCN), and adopts a plurality of deformable convolutional networks and convolutional networks to form a pyramid structure in a cascading way, wherein the pyramid layer number is 3 in the embodiment, and the PCD network structure is shown in figure 4.
Specifically, S32 includes the steps of,
S321: the two-dimensional core slice sequence images within the same time window are input, in this example one reference frame and two support frames, and their respective feature blocks are output through convolution as the first-level feature information F^1_{t+i}, i ∈ [-1, 1], denoting the 3 frame images, where F^1_t is the reference frame and the others are support frames. The downsampled feature information F^s_{t+i} of each level is obtained using convolutions with stride 2, where s ∈ [1, S] and S = 3, i.e. 3 levels of feature information are generated, their image size decaying by a factor of 2 per level.
S322: starting from the top level L3, the level's feature blocks of the two-dimensional core slice reference-frame and support-frame images are concatenated and fused to output the level's offset; a deformable convolution of this offset with the level's two-dimensional core slice support-frame feature blocks then outputs the level's aligned feature blocks (F^a_{t+i})^3, where the superscript 3 denotes level 3 of the pyramid.
S323: the offset generated at the third level and the third level's aligned feature map (F^a_{t+i})^3 are upsampled by bilinear interpolation and passed to the next, second level, so that the second level's offset derives not only from that level's 2 input feature blocks but also from the third level's offset; likewise, the aligned feature blocks output at that level derive not only from that level's deformable convolution output but also from the third level's aligned feature blocks. The cascade then continues to the first level according to formulas (1) and (2), until the aligned two-dimensional core slice support-frame feature blocks are output:

O^s_{t+i} = h([F^s_{t+i}, F^s_t, (O^{s+1}_{t+i})↑2]) (1)

(F^a_{t+i})^s = g([DConv(F^s_{t+i}, O^s_{t+i}), ((F^a_{t+i})^{s+1})↑2]) (2)

where (·)↑2 denotes 2-fold upsampling by bilinear interpolation; DConv(·) denotes a deformable convolution; g(·) and h(·) denote ordinary convolutions; and [·] denotes concatenation (fusion).
S324: outside the pyramid structure, the first level's two-dimensional core slice reference-frame feature block F^1_t is concatenated and fused with the first level's aligned feature information (F^a_{t+i})^1 to output an offset; a deformable convolution of this offset with (F^a_{t+i})^1 then outputs the final aligned version F^a_{t+i} of the two-dimensional core slice support-frame feature block (F_{t+i})^1. This further refines the aligned feature block. An illustrative code sketch of the pyramid cascade follows.
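As an illustration, the pyramid cascade of formulas (1) and (2) can be sketched with torchvision's DeformConv2d. This is a simplified sketch under assumed settings (64 channels, 3×3 kernels, a single offset group, plain rather than modulated deformable convolutions); the class and variable names are hypothetical, and the real EDVR module is considerably larger.

```python
# A simplified sketch of the PCD cascade in formulas (1)-(2).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d

class PCDAlign(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        # Offset-prediction convs: the top level sees [support, reference];
        # lower levels additionally see the upsampled offset from above (eq. 1).
        self.off = nn.ModuleList([
            nn.Conv2d(c * 2 + (0 if lvl == 2 else 18), 18, 3, padding=1)
            for lvl in range(3)])
        self.dcn = nn.ModuleList([DeformConv2d(c, c, 3, padding=1)
                                  for _ in range(3)])
        # Fusion convs g(.) combining the DConv output with the upsampled
        # aligned features from the level above (eq. 2).
        self.fuse = nn.ModuleList([nn.Conv2d(c * 2, c, 3, padding=1)
                                   for _ in range(2)])

    def forward(self, sup, ref):
        # sup, ref: lists [level1, level2, level3] of support/reference
        # features, level s having 1/2**(s-1) of the full spatial size.
        offset, aligned = None, None
        for lvl in (2, 1, 0):                     # top level L3 down to L1
            feat = torch.cat([sup[lvl], ref[lvl]], dim=1)
            if offset is not None:                # eq. (1): cascade offsets
                up_off = 2 * F.interpolate(offset, scale_factor=2,
                                           mode='bilinear', align_corners=False)
                feat = torch.cat([feat, up_off], dim=1)
            offset = self.off[lvl](feat)          # 18 = 2*3*3 offset channels
            cur = self.dcn[lvl](sup[lvl], offset)
            if aligned is not None:               # eq. (2): cascade features
                up_al = F.interpolate(aligned, scale_factor=2,
                                      mode='bilinear', align_corners=False)
                cur = self.fuse[lvl](torch.cat([cur, up_al], dim=1))
            aligned = cur
        return aligned          # aligned support-frame features at full size
```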
To fully exploit the temporal relationships between frames and the spatial relationships within frames, the fusion module introduces attention mechanisms in both time and space. The TSA fusion module assigns pixel-level aggregation weights to each frame, improving the effectiveness and efficiency of fusion.
S33: TSA fusion is performed using the temporal relationships between frames and the spatial relationships within frames; this fusion assigns pixel-level aggregation weights to each frame, improving the effectiveness and efficiency of fusion. Specifically, S33 comprises the following steps,
S331: convolution is applied to the input two-dimensional core slice frame images to learn an embedding space, and the feature similarity between the input two-dimensional core slice support-frame images and the two-dimensional core slice reference-frame image is computed in this embedding space, giving feature blocks fused with temporal attention.
The specific calculation is as follows: the pixel-level temporal attention weight of each frame is obtained by dot-multiplying its embedding with the reference-frame embedding and applying a sigmoid; the resulting weights are dot-multiplied with the 3 input consecutive frames, and the result is passed through further convolution to obtain the temporally attended fused feature block F_fusion, as follows:

h_{t+i} = sigmoid(θ(F_{t+i}) ⊙ φ(F_t)) (3)

F̃_{t+i} = F_{t+i} ⊙ h_{t+i} (4)

F_fusion = Conv([F̃_{t-1}, F̃_t, F̃_{t+1}]) (5)

where sigmoid(·) squashes the weights into (0, 1), improving training stability; ⊙ denotes the dot product, i.e. element-wise multiplication; [·,·,·] denotes concatenation (concat); Conv(·) denotes a convolution; θ(F_{t+i}) is the embedded support-frame feature information and φ(F_t) the embedded reference-frame feature information, the two having the same size.
S332: the temporally attended feature block F_fusion is put into a pyramid-shaped spatial attention structure. First, two successive stride-2 convolution downsamplings are applied to obtain F_0 and F_1:

F_0 = Conv(F_fusion), F_1 = Conv(F_0) (6)

then, working from the top level downward through upsampling, addition and dot-multiplication operations, a spatial attention map F_2 of the same size as F_fusion is obtained:

F_2 = ((F_1)↑ + F_0)↑ (7)

and finally the feature block combined with spatial attention fusion, F, is output by element-wise multiplication of F_fusion and F_2, where (·)↑ denotes upsampling and F_2 has the same size as F_fusion.
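A condensed PyTorch sketch of the TSA fusion of formulas (3) to (7) is given below. The channel counts, the single-convolution embeddings and the exact pyramid operations are simplifying assumptions, and the class name is hypothetical.

```python
# A condensed sketch of TSA fusion, formulas (3)-(7).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TSAFusion(nn.Module):
    def __init__(self, c=64, n_frames=3):
        super().__init__()
        self.emb_ref = nn.Conv2d(c, c, 3, padding=1)   # phi(F_t)
        self.emb_sup = nn.Conv2d(c, c, 3, padding=1)   # theta(F_{t+i})
        self.fusion = nn.Conv2d(c * n_frames, c, 1)    # Conv(...) in eq. (5)
        self.down0 = nn.Conv2d(c, c, 3, stride=2, padding=1)  # -> F_0
        self.down1 = nn.Conv2d(c, c, 3, stride=2, padding=1)  # -> F_1

    def forward(self, feats):                 # feats: (B, N, C, H, W), N = 3
        b, n, c, h, w = feats.shape
        ref = self.emb_ref(feats[:, n // 2])  # centre frame is the reference
        weighted = []
        for i in range(n):
            emb = self.emb_sup(feats[:, i])
            # eqs. (3)-(4): pixel-level temporal weight = sigmoid(theta . phi)
            attn = torch.sigmoid(torch.sum(emb * ref, dim=1, keepdim=True))
            weighted.append(feats[:, i] * attn)
        fused = self.fusion(torch.cat(weighted, dim=1))   # eq. (5): F_fusion

        f0 = self.down0(fused)                # eq. (6): successive downsampling
        f1 = self.down1(f0)
        up1 = F.interpolate(f1, size=f0.shape[-2:],       # eq. (7), inner step
                            mode='bilinear', align_corners=False)
        f2 = F.interpolate(up1 + f0, size=(h, w),         # eq. (7), outer step
                           mode='bilinear', align_corners=False)
        return fused * f2     # element-wise multiplication of F_fusion and F_2
```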
S34: the feature block F, fused with spatial and temporal attention, is fed into the SR network for reconstruction, and high-resolution two-dimensional core slice support-frame images are reconstructed; the reconstruction method is bicubic interpolation.
S35: the feature block size is increased through an upsampling layer using sub-pixel convolution, generating a high-definition HR-level image; the predicted image is connected with the upsampled two-dimensional core slice reference-frame image obtained from the input, and the super-resolution image of the predicted image is output. This connection acts as a regularization term, forcing the network to learn the residual information, i.e. how to reconstruct a higher-quality reference-frame image. The aim is to make the super-resolved result approximate the reference-frame image by optimizing the MSE loss function. By training on the N-2 samples, the optimal mapping relation between low-resolution two-dimensional core slice images and super-resolution images is learned.
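The sub-pixel upsampling head and the global residual skip of step S35 can be sketched as follows; the ×4 scale factor and the single-channel (grayscale) output are illustrative assumptions.

```python
# A sketch of the sub-pixel upsampling head and global residual skip in S35.
import torch.nn as nn
import torch.nn.functional as F

class SubPixelUpsampler(nn.Module):
    def __init__(self, c=64, out_ch=1, scale=4):
        super().__init__()
        # Sub-pixel convolution: produce out_ch * scale**2 channels, then let
        # PixelShuffle rearrange them into a scale-times-larger image.
        self.conv = nn.Conv2d(c, out_ch * scale ** 2, 3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)
        self.scale = scale

    def forward(self, feat, lr_ref):
        # feat: fused feature block F (B, c, H, W)
        # lr_ref: low-resolution reference frame (B, out_ch, H, W)
        residual = self.shuffle(self.conv(feat))
        base = F.interpolate(lr_ref, scale_factor=self.scale,
                             mode='bilinear', align_corners=False)
        # The skip connection makes the network learn only residual detail.
        return base + residual
```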
S4: according to the mapping relation, unfused low-resolution images are reconstructed into high-resolution images with the EDVR model, the corresponding structural similarity and peak signal-to-noise ratio indexes are computed, and the model is tested.
Specifically, after the N-2 samples in the label set are trained in sequence, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) attained by the model are very high, 38 and 0.97 respectively, indicating that the super-resolution image generation works well and that the model has successfully learned the mapping relation from low-resolution to high-resolution images. Therefore, 2N sequences, i.e. 2N samples, are produced from the 2N+2 coal rock sample core two-dimensional slice images and put into the trained EDVR model in sequence, obtaining 2N high-resolution images with a larger field of view.
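The two test indexes can be computed, for example, with scikit-image; the array names are placeholders and 8-bit grayscale slices are assumed.

```python
# Computing the two test indexes with scikit-image.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_slice(sr_img, hr_img):
    """Compare a reconstructed slice against its ground-truth label slice."""
    psnr = peak_signal_noise_ratio(hr_img, sr_img, data_range=255)
    ssim = structural_similarity(hr_img, sr_img, data_range=255)
    return psnr, ssim
```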
S5: three-dimensional reconstruction of the two-dimensional slices is performed with Avizo software, obtaining a three-dimensional plug-shaped coal rock core enriched with pore detail information.
In conclusion, based on deep learning, the invention can quickly and accurately obtain high-resolution images with a large field of view; compared with traditional methods it reduces cost and increases efficiency, requires no large amount of manual operation, and reduces manual error.
The foregoing describes one embodiment of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by the present invention.

Claims (10)

1. A multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures, characterized by comprising the following steps,
S1: acquiring a dataset comprising low-resolution and high-resolution images;
S2: preprocessing the data, namely aligning and fusing feature points of the multi-scale images and producing a time-series dataset;
S3: inputting the time-series dataset into an EDVR model for sequence-image super-resolution reconstruction training to obtain the mapping relation between the fused high-resolution images and the original low-resolution images;
S4: according to the mapping relation, reconstructing unfused low-resolution images into high-resolution images with the EDVR model, computing the corresponding structural similarity and peak signal-to-noise ratio indexes, and performing model testing;
S5: performing three-dimensional reconstruction of the two-dimensional slices with Avizo software to obtain a three-dimensional plug-shaped coal rock core enriched with pore detail information.
2. The multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures according to claim 1, characterized in that step S1 comprises the following steps,
S11: obtaining low-resolution images with a pixel size of 9 μm and high-resolution images with a pixel size of 0.3 μm by CT scanning;
S12: the ratio of the number of high-resolution two-dimensional slice images to the number of low-resolution two-dimensional slice images being 1:3.
3. The multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures according to claim 1 or 2, characterized in that step S2 comprises the following steps,
S21: upsampling the low-resolution images according to the optimal registration ratio while retaining the image information at the original low resolution;
S22: performing SIFT feature point alignment between the upsampled low-resolution images and the high-resolution images;
S23: fusing and reconstructing the aligned high-resolution image information into the low-resolution images to serve as the label set, and producing the time-series dataset from the retained original low-resolution image samples.
4. The multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures according to claim 1 or 2, characterized in that step S3 comprises the following steps,
S31: producing the time-series dataset with a sliding window and inputting it into the model;
S32: performing inter-frame alignment with pyramid cascading deformable convolution layers;
S33: performing TSA fusion using the temporal relationships between frames and the spatial relationships within frames, assigning pixel-level aggregation weights to each frame;
S34: feeding the feature block F, fused with spatial and temporal attention, into the SR network for reconstruction, reconstructing high-resolution two-dimensional core slice support-frame images;
S35: increasing the feature block size through an upsampling layer using sub-pixel convolution to generate an HR-level image, connecting the predicted image with the upsampled two-dimensional core slice reference-frame image obtained from the input, and outputting the super-resolution image of the predicted image.
5. The multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures according to claim 4, characterized in that step S32 comprises the following steps,
S321: outputting, through convolution, the respective feature blocks of the two-dimensional core slice sequence images within the same time window as the first-level feature information; obtaining the downsampled feature information of each level using convolutions with stride 2, the image size decaying by a factor of 2 per level;
S322: starting from the top level L3, concatenating and fusing the level's two-dimensional core slice reference-frame and support-frame feature blocks to output the level's offset, then applying a deformable convolution to the offset and the level's two-dimensional core slice support-frame feature blocks to output the level's aligned feature blocks;
S323: upsampling the offset generated at the third level and the third level's aligned feature map by bilinear interpolation and passing them to the second level, whose offset thus derives not only from that level's 2 input feature blocks but also from the third level's offset; likewise, the aligned feature blocks output at a level derive not only from that level's deformable convolution output but also from the third level's aligned feature blocks, and so on until the aligned two-dimensional core slice support-frame feature blocks are output;
S324: outside the pyramid structure, concatenating and fusing the first level's two-dimensional core slice reference-frame feature blocks with the first level's aligned feature information to output an offset, then applying a deformable convolution with this offset to the first level's aligned feature information to output the final aligned version of the two-dimensional core slice support-frame feature blocks.
6. The multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures according to claim 4, characterized in that step S33 comprises the following steps,
S331: applying convolution to the input two-dimensional core slice frame images to learn an embedding space, and computing in this embedding space the feature similarity between the input two-dimensional core slice support-frame images and the two-dimensional core slice reference-frame image, obtaining feature blocks fused with temporal attention;
S332: putting the temporally attended feature blocks into a pyramid-shaped spatial attention structure, first applying two stride-2 convolution downsamplings, then outputting, from the top level downward through upsampling, addition and dot-multiplication operations, the feature block combined with spatial attention fusion.
7. The multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures according to claim 4, characterized in that the reconstruction method in S34 is bicubic interpolation.
8. An apparatus, characterized in that it carries out the data processing method according to any one of claims 1 to 7.
9. An apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the data processing method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the data processing method according to any one of claims 1 to 7.
CN202310270935.0A (priority 2023-03-16, filed 2023-03-16): Multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures. Status: Pending. Publication: CN116503245A.

Priority Applications (1)

Application Number: CN202310270935.0A
Priority Date / Filing Date: 2023-03-16 / 2023-03-16
Title: Multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures

Applications Claiming Priority (1)

Application Number: CN202310270935.0A
Priority Date / Filing Date: 2023-03-16 / 2023-03-16
Title: Multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures

Publications (1)

Publication Number: CN116503245A
Publication Date: 2023-07-28

Family

Family ID: 87329269

Family Applications (1)

Application Number: CN202310270935.0A
Title: Multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures

Country Status (1)

CN: CN116503245A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
CN117058446A * (priority 2023-08-03, published 2023-11-14): Intelligent identification description method, system and storage medium for drilling core characteristics
CN117058446B * (priority 2023-08-03, published 2024-04-23): Intelligent identification description method, system and storage medium for drilling core characteristics


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination