WO2024120224A1

WO2024120224A1 - Image processing method, apparatus, electronic device and storage medium

Info

Publication number: WO2024120224A1
Application number: PCT/CN2023/134026
Authority: WO
Inventors: 李楠宇; 陈日清; 余坤璋; 徐宏; 苏晨晖
Original assignee: 杭州堃博生物科技有限公司
Priority date: 2022-12-05
Filing date: 2023-11-24
Publication date: 2024-06-13
Also published as: CN118154422A

Abstract

The present disclosure relates to the technical field of image processing, and provides an image processing method, an apparatus, an electronic device and a storage medium. The method comprises: inputting any image block of a low-resolution original image into a preset constraint model for constraint computation, so as to obtain a corresponding super-resolution image feature vector; then inputting the super-resolution image feature vector corresponding to said any image block and position information of said any image block in the original image into a Transformer model for processing, such that the Transformer model can achieve better convergence during a training process and does not need a large amount of data for training; and finally obtaining a super-resolution image output by the Transformer model and corresponding to said any image block. Therefore, the efficiency of converting low-resolution original images into super-resolution images is remarkably improved.

Description

Image processing method, device, electronic device and storage medium

Technical Field

The present invention relates to the field of image processing technology, and in particular to an image processing method, device, electronic equipment and storage medium.

Background technique

At present, the demand for high-resolution images in video surveillance, medical diagnosis and remote sensing applications is increasingly urgent. Using image acquisition equipment to directly obtain high-resolution images has problems such as high cost, long time consumption and high radiation dose, which makes it difficult to meet the demand for high-resolution images. The super-resolution reconstruction method based on machine learning needs to rely on the training library for training, resulting in low reconstruction efficiency.

Summary of the invention

In view of the above, it is necessary to propose an image processing method, device, electronic device and computer-readable storage medium, which can quickly reconstruct a low-resolution image into a high-resolution image.

A first aspect of the present invention provides an image processing method, the method comprising:

Obtain any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

Obtaining position information of any image block in the original image;

The super-resolution image feature vector and position information corresponding to any image block are input into the Transformer model for processing to obtain a super-resolution image corresponding to any image block output by the Transformer model.

According to an optional implementation manner of the present invention, acquiring any image block corresponding to the low-resolution original image includes:

Determining an image mask matching any one of the image blocks;

The arbitrary image block is obtained by calculating the image mask and the original image.

According to an optional implementation of the present invention, inputting any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:

Inputting any one of the image blocks into the preset constraint model to obtain a constraint function;

The constraint function is dually decomposed, and the constraint function obtained by the dual decomposition is iteratively calculated based on the alternating direction multiplier method to obtain the super-resolution image feature vector.

According to an optional implementation manner of the present invention, obtaining the position information of any image block in the original image includes:

Obtaining a specified point in any one of the image blocks;

Obtaining the position coordinates of each of the specified points in the original image;

Normalizing the position coordinates to obtain normalized coordinates;

The position information is generated according to the normalized coordinates.

According to an optional embodiment of the present invention, inputting the super-resolved image feature vector and position information corresponding to any image block into a Transformer model for processing to obtain a super-resolved image corresponding to any image block output by the Transformer model includes:

Generate a combined feature vector according to the super-resolved image feature vector corresponding to any one of the image blocks and the position information;

Input each of the combined feature vectors into a corresponding Transformer model for processing to obtain a target feature vector;

A super-resolution image of any image block is generated according to the target feature vector.

According to an optional embodiment of the present invention, the Transformer model includes: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer and an output layer, and each of the combined feature vectors is input into the corresponding Transformer model for processing to obtain the target feature vector including:

Normalizing the combined feature vector by the first normalization layer to obtain a first normalized feature vector;

Processing the first normalized feature vector through the multi-head attention layer to obtain an attention feature vector;

Performing a residual connection on the first normalized feature vector and the attention feature vector to obtain a residual feature vector;

Normalizing the residual feature vector by the second normalization layer to obtain a second normalized feature vector;

Performing a full connection calculation on the second normalized feature vector through the fully connected layer to obtain a connected feature vector;

Perform residual connection on the connection feature vector and the residual feature vector to obtain the target feature vector.

According to an optional embodiment of the present invention, the method further includes:

Acquire the super-resolution image of each image block contained in the original image;

The acquired super-resolution images are spliced according to the position information corresponding to each of the image blocks to obtain a spliced target image, wherein the size of the target image matches the size of the original image.

A second aspect of the present invention provides an image processing device, the device comprising:

A constraint calculation module, used to obtain any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

A position acquisition module, used to acquire the position information of any image block in the original image;

The model processing module is used to input the super-resolution image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolution image corresponding to any image block output by the Transformer model.

A third aspect of the present invention provides a computer device, comprising a processor and a memory, wherein the processor is configured to implement the image processing method when executing a computer program stored in the memory.

A fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, and the computer program implements the image processing method when executed by a processor.

The present invention obtains any image block corresponding to the low-resolution original image, and inputs any image block into a preset constraint model for constraint calculation, obtains the corresponding super-resolution image feature vector, obtains the position information of any image block in the original image, inputs the super-resolution image feature vector and position information corresponding to any image block into the Transformer model for processing, and obtains the super-resolution image corresponding to any image block output by the Transformer model. In the present invention, the preset constraint model is first used to convert any image block of the low-resolution image into the corresponding super-resolution image feature vector, and then the super-resolution image feature vector and position information are input into the Transformer model for processing, so that the Transformer model does not need to process the low-resolution image, and can make the Transformer model better converge during the training process, without the need for a large amount of data training, and significantly improve the efficiency of converting the low-resolution original image into the super-resolution image. At the same time, the combination of the preset constraint model and the Transformer model can be used to process the detail information in the image, and the quality of the super-resolution image obtained can be improved.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief introduction will be given below to the drawings required for use in the embodiments or the description of the prior art. Obviously, the drawings described below are some embodiments of the present application. For ordinary technicians in this field, other drawings can be obtained based on these drawings without paying any creative labor.

FIG1 is a flow chart of an image processing method provided in an embodiment of the present application;

FIG2 is a schematic diagram of a plurality of segmented images obtained by segmenting an image according to an embodiment of the present application;

FIG3 is a schematic diagram of an ADMM and a Transformer provided in an embodiment of the present application to realize reconstruction of a low-resolution CT image into a high-resolution CT image;

FIG4 is a schematic diagram of a CT image reconstructed by an ADMM algorithm provided in an embodiment of the present application;

FIG5 is a schematic diagram of a high-resolution CT image obtained by combining the ADMM algorithm with the Transformer reconstruction according to an embodiment of the present application;

FIG6 is a schematic diagram of the structure of an image processing device provided in an embodiment of the present application;

FIG. 7 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.

Specific embodiments

In order to more clearly understand the above-mentioned objects, features and advantages of the present invention, the following is a detailed description of the present invention in conjunction with the accompanying drawings and specific embodiments. The present invention is described in detail. In the absence of conflict, the embodiments of the present invention and the features in the embodiments can be combined with each other.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as those generally understood by those skilled in the art of the present invention. The terms used in the specification of the present invention herein are only for the purpose of describing embodiments in an optional embodiment and are not intended to limit the present invention.

The image processing method provided by the embodiment of the present invention is executed by an electronic device, and accordingly, the image processing device runs in the electronic device.

The embodiments of the present invention can standardize the symptoms based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.

The basic technologies of artificial intelligence generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, mechatronics, etc. Artificial intelligence software technologies mainly include computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.

Embodiment 1

Fig. 1 is a flow chart of an image processing method provided by Embodiment 1 of the present invention. The image processing method specifically includes the following steps. According to different requirements, the order of the steps in the flow chart can be changed, and some steps can be omitted.

S11, obtaining any image block corresponding to the low-resolution original image, and inputting the any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector.

The original image refers to a low-resolution digital image. The electronic device can collect the original image through its own camera, or receive the original image sent by other devices.

When the present invention is applied to a digital medical scenario, the original image may be a digital medical image. The original image may be obtained from a digital medical database, which may be a digital library storing patient cases in a hospital, or a networked database of multiple hospitals, which is not limited by the present invention.

For ease of understanding, the following embodiments are described by taking the original image as a CT image acquired by a bronchoscope as an example.

In an optional implementation, the acquiring any image block corresponding to the low-resolution original image includes:

Determining an image mask matching any one of the image blocks;

After acquiring the low-resolution original image, the electronic device can divide the original image into blocks to obtain multiple block images. The electronic device divides the original image into blocks according to the size of the original image to obtain multiple block images of the same size. Exemplarily, assuming that the original image is 64*64, the original image can be divided into 4 block images of the same size, and the block images are 32*32. After obtaining the multiple block images, the electronic device obtains the position of each block image in the original image, and sets an image mask according to the position of each block image in the original image. The block images correspond to the image masks one by one, and the size of each image mask is consistent with the size of the original image. As shown in Figure 2, the electronic device divides the original image into blocks to obtain 4 block images.

The image mask is a preset binary image consisting of 0 and 1. The electronic device can set the size of the image mask according to the size of the original image. For example, if the size of the original image is W*H, the size of the image mask can be set to W*H.

The electronic device can use each image mask to multiply the original image. Specifically, each pixel in the original image is ANDed with each corresponding pixel in the image mask to obtain an image of the region of interest. The pixel values in the region of interest remain unchanged, while the pixel values outside the region of interest are all 0, thereby obtaining the required image block.

The above optional implementation, by setting a plurality of different image masks and performing calculations with the original image, can shield certain areas on the original image so that they do not participate in the processing. That is, when a certain image block is processed, other image blocks do not participate in the processing.

In an optional implementation, inputting any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:

Alternating Direction Method of Multipliers (ADMM) is an important method for solving separable convex optimization problems with fast processing speed and good convergence performance.

Next, the principle of the alternating direction multiplier method is explained, assuming that the low-resolution CT image is modeled as follows:

In formula (1), Y is the CT value of each voxel point in the low-resolution CT image, B is the fuzzy operator, is the downsampling function, X is the CT value of each voxel point in the high-resolution CT image, Z is the transform domain of X, D is the transform domain function, λ is the parameter constraining the L1 norm, α is the Lagrangian parameter, ρ is the penalty parameter, x represents each column/row vector in X, and z represents each column/row vector in Z.

The above formula (1) is transformed into three sub-problems through dual decomposition:

The ADMM algorithm is used to solve the above three sub-problems as follows:

in,

The ADMM algorithm is based on the separability of the objective function. When dealing with large-scale data, it decomposes the variables of the original problem into three sub-problems and solves them alternately. The optimal solution for x can be calculated, thereby simplifying the calculation. x, y, z correspond to each column/row vector in X, Y, Z.

The mathematical a priori of CT super-resolution is relatively strong. The above formula (1) is used as the preset constraint model, and any image block is input into the preset constraint model to obtain a constraint function. The constraint function is then dually decomposed, and the constraint function obtained by dual decomposition is iteratively calculated based on the alternating direction multiplier method to finally obtain the super-resolution image feature vector. The solution process is relatively simple, avoiding the use of deep learning/neural network methods to achieve the transformation between low-resolution CT images and high-resolution CT images, avoiding complex solution processes, and no need to use a large amount of data for training.

S12: Obtain position information of any image block in the original image.

The position information of any image block in the original image can be represented by position coordinates. For example, the position coordinates of the four vertices of any image block in the original image can be obtained, and the obtained position coordinates of the four vertices are used as the position information of the any image block in the original image. For another example, the position coordinates of the geometric center point of any image block in the original image can be obtained, and the obtained position coordinates of the geometric center point are used as the position information of the any image block in the original image. The present invention does not impose any restrictions.

Obtaining a specified point in any one of the image blocks;

Normalizing the position coordinates to obtain normalized coordinates;

The position information is generated according to the normalized coordinates.

The designated point is a pixel point pre-designated in any image block, and may be the geometric center point of any image block, or the vertex of the upper left corner, or the vertex of the lower right corner, or any other point of any image block.

Since the position coordinates of a specified point in any image block in the original image are quite different, it is necessary to Normalization is performed so that the position coordinates of each specified point are unified within a data range.

In specific implementation, assuming that the position coordinates of each specified point are marked as (X, Y), the maximum and minimum horizontal coordinate values of all the horizontal coordinate values in the position coordinates are obtained, and the horizontal coordinate value of each position coordinate is normalized according to the maximum and minimum horizontal coordinate values.

The calculation formula for normalizing the horizontal axis value is as follows:
x'＝(x-X_min)/(X_max-X_min) (4)

In formula (4), X_min is the minimum abscissa value of all position coordinates, X_max is the maximum abscissa value of all position coordinates, and x' is the normalized abscissa.

The maximum ordinate value and the minimum ordinate value of the ordinate values in all position coordinates are obtained, and the ordinate of each position coordinate is normalized according to the maximum ordinate value and the minimum ordinate value.

The calculation formula for normalizing the vertical coordinate value is as follows:
y'＝(y-Y_min)/(Y_max-Y_min) (5)

In formula (5), Y_min is the minimum ordinate value of all position coordinates, Y_max is the maximum ordinate value of all position coordinates, and y' is the normalized ordinate.

Normalized coordinates are denoted by (x', y').

S13, inputting the super-resolution image feature vector and position information corresponding to any image block into the Transformer model for processing, and obtaining the super-resolution image corresponding to any image block output by the Transformer model.

The Transformer model in this embodiment may include: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer and an output layer.

In an optional embodiment, the step of inputting the super-resolved image feature vector and position information corresponding to any image block into a Transformer model for processing to obtain a super-resolved image corresponding to any image block output by the Transformer model comprises:

The combined feature vector is a feature vector obtained by concatenating the super-resolved image feature vector and the corresponding position information. For example, the combined feature vector is recorded as (position information, super-resolved image feature vector).

In an optional implementation, inputting each of the combined feature vectors into a corresponding Transformer model for processing to obtain a target feature vector comprises:

In an optional embodiment, the method further comprises:

The acquired super-resolution images are spliced according to the position information corresponding to each image block to obtain a spliced target image.

One image block corresponds to one super-resolution image, and multiple super-resolution images corresponding to multiple image blocks are spliced according to the position information of the image blocks in the original image, and the obtained image is called a target image. The size of the target image matches the size of the original image.

In the above optional implementation, by splicing multiple super-resolution images according to the position information of corresponding image blocks, the resolution of the super-resolution image is higher than the resolution of the original image, thereby achieving super-resolution processing of the low-resolution original image.

The present invention combines the ADMM algorithm with the Transformer to realize the low-resolution original The reconstruction of the original image has a simple reconstruction process and does not require a large amount of data training. By inputting the reconstructed image into the Transformer model, a high-resolution image with better effect is obtained. That is, the mathematical prior and the Transformer are unified, and the reconstruction of the low-resolution image is quickly realized. The high-resolution image obtained has better effect.

The following specifically describes the process of reconstructing a low-resolution CT image into a high-resolution CT image using the method of the present invention in conjunction with FIG. 3 .

First, the low-resolution CT image is divided into multiple block images, and the positional relationship of each block image in the low-resolution CT image is recorded. Assume that as shown in Figure 2, the low-resolution CT image is divided into 4 block images (①②③④), a mask image 1-Mask is set for block image ①, a mask image 2-Mask is set for block image ②, a mask image 3-Mask is set for block image ③, and a mask image 4-Mask is set for block image ④. Among them, mask images 1-Mask, 2-Mask, 3-Mask and 4-Mask are all binary images composed of 0 and 1, and they are all the same size as the low-resolution CT image. The position of the target pixel with a pixel value of 1 in mask image 1-Mask is consistent with the position of the corresponding block image ① in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 1-Mask except the target pixel are 0. The position of the target pixel with a pixel value of 1 in mask image 2-Mask is consistent with the position of the corresponding block image ② in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 2-Mask except the target pixel are 0. The position of the pixel with a target pixel value of 1 in mask image 3-Mask is consistent with the position of the corresponding block image ③ in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 3-Mask except the target pixel are 0. The position of the pixel with a target pixel value of 1 in mask image 4-Mask is consistent with the position of the corresponding block image ④ in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 4-Mask except the target pixel are 0.

Then, block image ① is processed through the processing path (101→102→103→104→105); block image ② is processed through the processing path (201→202→203→204→205). Similarly, block images ③ and ④ are processed using similar processing paths. FIG3 only describes the processing path (101→102→103→104→105) and the processing path (201→202→203→204→205).

At 101, the low-resolution CT image is calculated through the mask image 1-Mask to obtain the image block corresponding to the block image ①. The image block corresponding to the block image ① is equivalent to covering up the block images ②③④ in FIG. 2 .

At 102, the ADMM algorithm is used to perform K iterations of calculation on the image block obtained after the processing at 101:

The ADMM algorithm can be used to process each small block in the image block obtained through 101, so that after being processed by 102, high-resolution image blocks are output through 102, and these high-resolution image blocks correspond to each small block contained in the image block obtained through 101.

At 103, the position information of the recorded block image ① can be obtained, and the position information can be encoded to obtain a position information encoding result. The position information encoding result is a position vector obtained by encoding the position information into a vector form.

At 104, the high-resolution image blocks obtained by processing 102 and the position information encoding results obtained by processing 103 are concatenated to obtain a vector sequence (i.e., a combined feature vector) consisting of high-resolution image blocks containing position information encoding results, and the combined feature vector is input into the Transformer model.

At 105, the combined feature vector is normalized by the first normalization layer of the Transformer model to obtain a first normalized feature vector; the first normalized feature vector is processed by the multi-head attention layer to obtain an attention feature vector; the first normalized feature vector and the attention feature vector are residually connected to obtain a residual feature vector; the residual feature vector is normalized by the second normalization layer to obtain a second normalized feature vector; the second normalized feature vector is fully connected by the fully connected layer to obtain a connection feature vector; the connection feature vector and the residual feature vector are residually connected to obtain a target feature vector. After being processed by the Transformer model, the key information contained in the image block can be highlighted, the noise information contained in the image block can be removed, and the quality of the high-resolution CT image subsequently obtained can be improved.

It should be understood that at 201, the low-resolution CT image is calculated through the mask image 2-Mask to obtain the image block corresponding to the block image ②, and the image block corresponding to the block image ② is equivalent to covering up the block images ①③④ in Figure 2. It should be understood that each image block obtained is consistent with the size of the low-resolution CT image.

Finally, at 106, the target feature vector obtained through the processing path (101→102→103→104→105) and the target feature vector obtained through the processing path (201→202→203→204→205), as well as the target feature vectors obtained through other similar processing paths, are spliced according to the position information encoding result to obtain a complete high-resolution CT image.

The Transformer model requires a large amount of data training, and the training process is relatively complex and difficult to converge. Therefore, the Transformer model cannot usually be used directly to convert low-resolution CT images into high-resolution CT images.

The present invention processes any image block in the low-resolution CT image by adding the ADMM algorithm to obtain the corresponding super-resolution image feature vector, and inputs the super-resolution image feature vector and the corresponding position information into the Transformer model for processing, and obtains the super-resolution image corresponding to the any image block output by the Transformer model. Since the ADMM algorithm has a mathematical prior, the corresponding super-resolution image feature vector and the value of the relevant parameters can be better solved. Then, the super-resolution image feature vector corresponding to the low-resolution CT image can be obtained through the ADMM algorithm. Then, the subsequent Transformer model only needs to continue processing the obtained super-resolution image feature vector and position information, and does not need to process the initial low-resolution CT image, so that it can converge quickly and improve processing efficiency. At the same time, since the present invention expands the traditional algorithm ADMM into a differentiable form and combines it with the Transformer, it is first encoded and decoded by the ADMM algorithm, the mathematical prior is given to the Transformer, and then further decoded by the Transformer model, the combination of mathematical prior and deep learning is realized, so that the solution is more accurate and faster, and the effect of reconstructing the high-resolution CT image is better.

In addition, compared with the high-resolution image reconstructed directly by using the ADMM algorithm, that is, the CT image directly obtained by splicing image blocks such as the processing path (101→102→103→104) or the processing path (201→202→203→204) as shown in Figure 3, as shown in Figure 4, the high-resolution image reconstructed by using the ADMM algorithm in combination with the Transformer model in an embodiment of the present invention, that is, the high-resolution CT image obtained by processing through the processing path (101→102→103→104→105) and the processing path (201→202→203→204→205) as shown in Figure 3, as shown in Figure 5, since the Transformer can capture long-distance dependencies and obtain better details, the image quality of Figure 5 is higher and clearer than that of Figure 4, and Figure 5 contains more detail information.

The present invention can obtain high-resolution CT images without using CT super-resolution scanning, thereby improving the resolution of CT images while reducing the risk of damaging the health of patients during the scanning process.

Embodiment 2

FIG. 6 is a structural diagram of an image processing device provided in Embodiment 2 of the present invention.

In some embodiments, the image processing device 60 may include a plurality of functional modules composed of computer program segments. The computer program of each program segment in the image processing device 60 may be stored in a memory of an electronic device and executed by at least one processor to perform (see FIG. 1 for details) image processing functions.

In this embodiment, the image processing device 60 can be divided into multiple functional modules according to the functions it performs. The functional modules may include: a constraint calculation module 601, a position acquisition module 602, a model processing module 603 and an image fusion module 604. The module referred to in the present invention refers to a series of computer program segments that can be executed by at least one processor and can complete fixed functions, which are stored in a memory. In this embodiment, the functions of each module will be described in detail in subsequent embodiments.

The constraint calculation module 601 is used to obtain any image block corresponding to the low-resolution original image, and input the any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector.

Determining an image mask matching any one of the image blocks;

The ADMM algorithm is used to solve the above three sub-problems as follows:

in,

ADMM algorithm is based on the separability of the objective function. When dealing with large-scale data, it decomposes the variables of the original problem into three sub-problems and solves them alternately. It can calculate the optimal solution of x, thereby simplifying the calculation. x, y, and z correspond to X, Y, and Z. Each column/row vector of .

The position acquisition module 602 is used to acquire the position information of any image block in the original image.

Obtaining a specified point in any one of the image blocks;

Normalizing the position coordinates to obtain normalized coordinates;

The position information is generated according to the normalized coordinates.

Since the position coordinates of a designated point in any image block in the original image are quite different, it is necessary to normalize the position coordinates so as to unify the position coordinates of each designated point into one data range.

The calculation formula for normalizing the vertical coordinate value is as follows:
y'＝(y-Y_min)/(Y_max-Y_min) (4)

In formula (4), Y_min is the minimum ordinate value of all position coordinates, Y_max is the maximum ordinate value of all position coordinates, and y' is the normalized ordinate.

Normalized coordinates are denoted by (x', y').

The model processing module 603 is used to input the super-resolved image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolved image corresponding to any image block output by the Transformer model.

The image fusion module 604 is used to obtain the super-resolution image of each image block contained in the original image; and to splice the super-resolution images obtained according to the position information corresponding to each image block to obtain a spliced target image.

The present invention combines the ADMM algorithm with the Transformer and realizes the reconstruction of the low-resolution original image through the ADMM algorithm. The reconstruction process is simple and does not require a large amount of data training. By inputting the reconstructed image into the Transformer model, a high-resolution image with better effect is obtained. That is, the mathematical prior and the Transformer are unified, and the reconstruction of the low-resolution image is realized quickly, and the obtained high-resolution image has better effect.

In addition, compared with the high-resolution image reconstructed directly by the ADMM algorithm, that is, the CT image directly obtained by splicing image blocks such as the processing path (101→102→103→104) or the processing path (201→202→203→204) as shown in FIG3, as shown in FIG4, the high-resolution image reconstructed by the ADMM algorithm combined with the Transformer model in an embodiment of the present invention, that is, the high-resolution CT image obtained by processing through the processing path (101→102→103→104→105) and the processing path (201→202→203→204→205) as shown in FIG3, as shown in FIG5, since the Transformer can capture long-distance dependencies and obtain better details, the image quality of FIG5 is better than that of FIG4. Higher, clearer, and Figure 5 contains more detail information.

The present invention can obtain high-resolution CT images without the need for CT super-resolution scanning, thereby improving the resolution of CT images while reducing the risk of damaging the health of patients during the scanning process.

Embodiment 3

This embodiment provides a computer-readable storage medium, on which a computer program is stored. When the computer program is executed by a processor, the steps in the above-mentioned image processing method embodiment are implemented, such as S11-S13 shown in FIG1 :

S11, obtaining any image block corresponding to the low-resolution original image, and inputting the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

S12, obtaining position information of any image block in the original image;

Alternatively, when the computer program is executed by a processor, the functions of each module/unit in the above-mentioned device embodiment are implemented, such as modules 601-604 in FIG. 6:

The constraint calculation module 601 is used to obtain any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

The position acquisition module 602 is used to acquire the position information of any image block in the original image;

The model processing module 603 is used to input the super-resolved image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolved image corresponding to any image block output by the Transformer model;

The image fusion module 604 is used to obtain the super-resolution image of each image block contained in the original image; according to the position information corresponding to each image block, the super-resolution image is spliced to obtain a spliced target image, and the size of the target image matches the size of the original image.

Embodiment 4

7 is a schematic diagram of the structure of an electronic device provided in Embodiment 3 of the present invention. In a preferred embodiment of the present invention, the electronic device 70 includes a memory 701 , at least one processor 702 , at least one communication bus 703 and a transceiver 704 .

Those skilled in the art should understand that the structure of the electronic device shown in FIG. 7 does not constitute a limitation of the embodiments of the present invention, and may be either a bus structure or a star structure. The electronic device 70 may also include more or less other hardware or software than shown in the figure, or a different component arrangement.

In some embodiments, the electronic device 70 is a device that can automatically perform numerical calculations and/or information processing according to pre-set or stored instructions, and its hardware includes but is not limited to microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices. The electronic device 70 may also include client devices, which include but are not limited to any electronic product that can interact with a client through a keyboard, mouse, remote control, touchpad, or voice-controlled device, such as a personal computer, tablet computer, smart phone, digital camera, etc.

The electronic device 70 is only an example. Other existing or future electronic products that are suitable for the present invention should also be included in the protection scope of the present invention and are included here by reference.

In some embodiments, the memory 701 stores a computer program, and when the computer program is executed by the at least one processor 702, all or part of the steps in the image processing method are implemented. The memory 701 includes a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), a one-time programmable read-only memory (OTPROM), an electronically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.

Furthermore, the computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application required for at least one function, etc.; the data storage area may store data created according to the use of the blockchain node, etc.

The blockchain referred to in the present invention is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanism, encryption algorithm, etc. Blockchain is essentially a decentralized database, a string of data blocks generated by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and generate the next block. Blockchain can include the blockchain underlying platform, platform product service layer, and application service layer.

In some embodiments, the at least one processor 702 is the control core (Control Unit) of the electronic device 70, and uses various interfaces and lines to connect each component of the entire electronic device 70, and executes various functions and processes data of the electronic device 70 by running or executing the program or module stored in the memory 701, and calling the data stored in the memory 701. For example, when the at least one processor 702 executes the computer program stored in the memory, it implements all or part of the steps of the image processing method described in the embodiment of the present invention; or implements all or part of the functions of the image processing device. The at least one processor 702 can be composed of an integrated circuit, for example, it can be composed of a single packaged integrated circuit, or it can be composed of multiple integrated circuits with the same function or different functions, including one or more central processing units (CPU), microprocessors, digital processing chips, graphics processors, and various control chips.

In some embodiments, the at least one communication bus 703 is configured to implement connection and communication between the memory 701 and the at least one processor 702, etc.

Although not shown, the electronic device 70 may also include a power source (such as a battery) for supplying power to each component. Preferably, the power source may be logically connected to the at least one processor 702 through a power management device, so that the power management device can manage charging, discharging, and power consumption management. The power source may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and other arbitrary components. The electronic device 70 may also include a variety of sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.

The above-mentioned integrated unit implemented in the form of a software function module can be stored in a computer-readable storage medium. The above-mentioned software function module is stored in a storage medium and includes a number of instructions for enabling a computer device (which can be a personal computer, electronic device, or network device, etc.) or a processor to execute a part of the method described in each embodiment of the present invention.

In the several embodiments provided by the present invention, it should be understood that the disclosed devices and methods can be implemented in other ways. For example, the device embodiments described above are only illustrative, for example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.

The modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, and may be located in one place or distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional module in each embodiment of the present invention may be integrated into one processing unit, each unit may exist physically separately, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional modules.

It is obvious to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the present invention can be implemented in other specific forms without departing from the spirit or essential features of the present invention. Therefore, from any point of view, the embodiments should be regarded as exemplary and non-restrictive, and the scope of the present invention is defined by the appended claims rather than the above description, and it is intended that all changes falling within the meaning and scope of the equivalent elements of the claims are included in the present invention. Any figure mark in the claims should not be regarded as limiting the claims involved. In addition, it is obvious that the word "including" does not exclude other units or, and the singular does not exclude the plural. Multiple units or devices stated in the specification may also be implemented by one unit or device through software or hardware. The words first, second, etc. are used to indicate names, and do not indicate any particular order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit the present invention. The embodiments describe the present invention in detail. Those skilled in the art should understand that the technical solutions of the present invention may be modified or replaced by equivalents without departing from the spirit and scope of the technical solutions of the present invention.

Claims

An image processing method, wherein the method comprises:

Acquire any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

Obtaining position information of any image block in the original image;

The super-resolution image feature vector and position information corresponding to any image block are input into the Transformer model for processing to obtain a super-resolution image corresponding to any image block output by the Transformer model.
The image processing method according to claim 1, wherein the acquiring any image block corresponding to the low-resolution original image comprises:

Determining an image mask matching any one of the image blocks;

The arbitrary image block is obtained by calculating the image mask and the original image.
The image processing method according to claim 1, wherein the step of inputting any one of the image blocks into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:

Inputting any one of the image blocks into the preset constraint model to obtain a constraint function;

The constraint function is dually decomposed, and the constraint function obtained by the dual decomposition is iteratively calculated based on the alternating direction multiplier method to obtain the super-resolution image feature vector.
The image processing method according to claim 1, wherein the step of obtaining the position information of any image block in the original image comprises:

Obtaining a specified point in any one of the image blocks;

Obtaining the position coordinates of each of the specified points in the original image;

Normalizing the position coordinates to obtain normalized coordinates;

The position information is generated according to the normalized coordinates.
The image processing method according to claim 4, wherein the step of inputting the super-resolved image feature vector and position information corresponding to any image block into a Transformer model for processing, and obtaining a super-resolved image corresponding to any image block output by the Transformer model comprises:

Generate a combined feature vector according to the super-resolved image feature vector corresponding to any one of the image blocks and the position information;

Input each of the combined feature vectors into a corresponding Transformer model for processing to obtain a target feature vector;

A super-resolution image of any image block is generated according to the target feature vector.
The image processing method according to claim 5, wherein the Transformer model comprises: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer, and an output layer, and the step of inputting each of the combined feature vectors into the corresponding Transformer model for processing to obtain the target feature vector comprises:

Normalizing the combined feature vector by the first normalization layer to obtain a first normalized feature vector;

Processing the first normalized feature vector through the multi-head attention layer to obtain an attention feature vector;

Performing a residual connection on the first normalized feature vector and the attention feature vector to obtain a residual feature vector;

Normalizing the residual feature vector by the second normalization layer to obtain a second normalized feature vector;

Performing a full connection calculation on the second normalized feature vector through the fully connected layer to obtain a connected feature vector;

Perform residual connection on the connection feature vector and the residual feature vector to obtain the target feature vector.
The image processing method according to any one of claims 1 to 6, wherein the method further comprises:

Acquire the super-resolution image of each image block contained in the original image;

The acquired super-resolution images are spliced according to the position information corresponding to each of the image blocks to obtain a spliced target image, wherein the size of the target image matches the size of the original image.
An image processing device, characterized in that the device comprises:

A constraint calculation module, used to obtain any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

A position acquisition module, used to acquire the position information of any image block in the original image;

The model processing module is used to input the super-resolution image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolution image corresponding to any image block output by the Transformer model.
A computer device, wherein the computer device comprises a processor and a memory, and the processor is used to implement the image processing method according to any one of claims 1 to 7 when executing a computer program stored in the memory.
A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the image processing method according to any one of claims 1 to 7 is implemented.