WO2024120224A1 - Image processing method, apparatus, electronic device and storage medium - Google Patents


Info

Publication number
WO2024120224A1
WO2024120224A1 (PCT/CN2023/134026)
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature vector
resolution
super
block
Prior art date
Application number
PCT/CN2023/134026
Other languages
French (fr)
Chinese (zh)
Inventor
李楠宇
陈日清
余坤璋
徐宏
苏晨晖
Original Assignee
杭州堃博生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州堃博生物科技有限公司
Publication of WO2024120224A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 5/00 Image enhancement or restoration
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N 3/08 Learning methods
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Definitions

  • the present invention relates to the field of image processing technology, and in particular to an image processing method, device, electronic equipment and storage medium.
  • a first aspect of the present invention provides an image processing method, the method comprising:
  • the super-resolution image feature vector and position information corresponding to any image block are input into the Transformer model for processing to obtain a super-resolution image corresponding to any image block output by the Transformer model.
  • acquiring any image block corresponding to the low-resolution original image includes:
  • the arbitrary image block is obtained by calculating the image mask and the original image.
  • inputting any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:
  • the constraint function is dually decomposed, and the constraint function obtained by the dual decomposition is iteratively calculated based on the alternating direction multiplier method to obtain the super-resolution image feature vector.
  • obtaining the position information of any image block in the original image includes:
  • the position information is generated according to the normalized coordinates.
  • inputting the super-resolved image feature vector and position information corresponding to any image block into a Transformer model for processing to obtain a super-resolved image corresponding to any image block output by the Transformer model includes:
  • a super-resolution image of any image block is generated according to the target feature vector.
  • the Transformer model includes: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer and an output layer, and each of the combined feature vectors is input into the corresponding Transformer model for processing to obtain the target feature vector including:
  • the method further includes:
  • the acquired super-resolution images are spliced according to the position information corresponding to each of the image blocks to obtain a spliced target image, wherein the size of the target image matches the size of the original image.
  • a second aspect of the present invention provides an image processing device, the device comprising:
  • a constraint calculation module used to obtain any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;
  • a position acquisition module used to acquire the position information of any image block in the original image;
  • the model processing module is used to input the super-resolution image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolution image corresponding to any image block output by the Transformer model.
  • a third aspect of the present invention provides a computer device, comprising a processor and a memory, wherein the processor is configured to implement the image processing method when executing a computer program stored in the memory.
  • a fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, and the computer program implements the image processing method when executed by a processor.
  • the present invention obtains any image block corresponding to the low-resolution original image and inputs it into a preset constraint model for constraint calculation to obtain the corresponding super-resolution image feature vector; it then obtains the position information of the image block in the original image, and inputs the super-resolution image feature vector and the position information into the Transformer model for processing to obtain the super-resolution image corresponding to that image block output by the Transformer model.
  • the preset constraint model is first used to convert any image block of the low-resolution image into the corresponding super-resolution image feature vector, and the feature vector and position information are then input into the Transformer model for processing. The Transformer model therefore never has to process the low-resolution image itself, converges better during training, does not require a large amount of training data, and the efficiency of converting the low-resolution original image into a super-resolution image is significantly improved.
  • the combination of the preset constraint model and the Transformer model can be used to process the detail information in the image, and the quality of the super-resolution image obtained can be improved.
  • FIG1 is a flow chart of an image processing method provided in an embodiment of the present application.
  • FIG2 is a schematic diagram of a plurality of segmented images obtained by segmenting an image according to an embodiment of the present application
  • FIG3 is a schematic diagram of an ADMM and a Transformer provided in an embodiment of the present application to realize reconstruction of a low-resolution CT image into a high-resolution CT image;
  • FIG4 is a schematic diagram of a CT image reconstructed by an ADMM algorithm provided in an embodiment of the present application.
  • FIG5 is a schematic diagram of a high-resolution CT image obtained by combining the ADMM algorithm with the Transformer reconstruction according to an embodiment of the present application;
  • FIG6 is a schematic diagram of the structure of an image processing device provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • the image processing method provided by the embodiment of the present invention is executed by an electronic device, and accordingly, the image processing device runs in the electronic device.
  • AI is short for Artificial Intelligence.
  • the basic technologies of artificial intelligence generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, mechatronics, etc.
  • Artificial intelligence software technologies mainly include computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • Fig. 1 is a flow chart of an image processing method provided by Embodiment 1 of the present invention.
  • the image processing method specifically includes the following steps. According to different requirements, the order of the steps in the flow chart can be changed, and some steps can be omitted.
  • the original image refers to a low-resolution digital image.
  • the electronic device can collect the original image through its own camera, or receive the original image sent by other devices.
  • the original image may be a digital medical image.
  • the original image may be obtained from a digital medical database, which may be a digital library storing patient cases in a hospital, or a networked database of multiple hospitals, which is not limited by the present invention.
  • the acquiring any image block corresponding to the low-resolution original image includes:
  • the arbitrary image block is obtained by calculating the image mask and the original image.
  • the electronic device can divide the original image into blocks to obtain multiple block images.
  • the electronic device divides the original image into blocks according to the size of the original image to obtain multiple block images of the same size. For example, assuming the original image is 64×64, it can be divided into 4 block images of the same size, each 32×32.
  • the electronic device obtains the position of each block image in the original image, and sets an image mask according to the position of each block image in the original image.
  • the block images correspond to the image masks one by one, and the size of each image mask is consistent with the size of the original image. As shown in Figure 2, the electronic device divides the original image into blocks to obtain 4 block images.
  • the image mask is a preset binary image consisting of 0 and 1.
  • the electronic device can set the size of the image mask according to the size of the original image. For example, if the size of the original image is W*H, the size of the image mask can be set to W*H.
  • the electronic device can use each image mask to multiply the original image. Specifically, each pixel in the original image is ANDed with each corresponding pixel in the image mask to obtain an image of the region of interest. The pixel values in the region of interest remain unchanged, while the pixel values outside the region of interest are all 0, thereby obtaining the required image block.
  • the above optional implementation by setting a plurality of different image masks and performing calculations with the original image, can shield certain areas on the original image so that they do not participate in the processing. That is, when a certain image block is processed, other image blocks do not participate in the processing.
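The mask-based block extraction described above can be sketched in a few lines of NumPy. The function names and the uniform-grid block layout are illustrative assumptions, not part of the patent text; the pixel-wise multiplication is equivalent to the AND operation for a 0/1 mask.

```python
import numpy as np

def make_block_masks(h, w, block_h, block_w):
    """Build one binary (0/1) mask per block; each mask has the same size as the image."""
    masks = []
    for top in range(0, h, block_h):
        for left in range(0, w, block_w):
            mask = np.zeros((h, w), dtype=np.uint8)
            mask[top:top + block_h, left:left + block_w] = 1
            masks.append(mask)
    return masks

def extract_block(image, mask):
    """Pixel-wise multiply (AND for a binary mask): pixels in the region of
    interest keep their values, everything else becomes 0."""
    return image * mask

image = np.arange(64 * 64, dtype=np.uint16).reshape(64, 64)  # toy 64x64 "image"
masks = make_block_masks(64, 64, 32, 32)                     # 4 masks for 4 blocks
blocks = [extract_block(image, m) for m in masks]            # each block image is 64x64
```

Note that each extracted image block stays the same size as the original image, matching the patent's statement that the masked-out blocks simply do not participate in processing.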
  • inputting any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:
  • the constraint function is dually decomposed, and the constraint function obtained by the dual decomposition is iteratively calculated based on the alternating direction multiplier method to obtain the super-resolution image feature vector.
  • ADMM is short for Alternating Direction Method of Multipliers.
  • Y is the CT value of each voxel point in the low-resolution CT image
  • B is the blur operator
  • X is the CT value of each voxel point in the high-resolution CT image
  • Z is the transform domain of X
  • D is the transform domain function
  • the model also involves a parameter constraining the L1 norm, a Lagrangian parameter, and a penalty parameter
  • x represents each column/row vector in X
  • z represents each column/row vector in Z.
  • the ADMM algorithm is used to solve the above three sub-problems as follows:
  • the ADMM algorithm is based on the separability of the objective function. When dealing with large-scale data, it decomposes the variables of the original problem into three sub-problems and solves them alternately. The optimal solution for x can be calculated, thereby simplifying the calculation. x, y, z correspond to each column/row vector in X, Y, Z.
  • CT super-resolution has a relatively strong mathematical prior.
  • the above formula (1) is used as the preset constraint model, and any image block is input into the preset constraint model to obtain a constraint function.
  • the constraint function is then dually decomposed, and the constraint function obtained by dual decomposition is iteratively calculated based on the alternating direction multiplier method to finally obtain the super-resolution image feature vector.
  • the solution process is relatively simple, avoiding the use of deep learning/neural network methods to achieve the transformation between low-resolution CT images and high-resolution CT images, avoiding complex solution processes, and no need to use a large amount of data for training.
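As a hedged illustration of the alternating-direction iteration described above, the sketch below solves a small instance of the objective (1/2)·||Bx − y||² + λ·||z||₁ subject to z = x, taking the transform D as the identity for simplicity. The variable names, parameter values, and the toy sparse signal are all illustrative; the actual constraint model of formula (1) in the patent may differ in its choice of D and parameters.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm (the z-subproblem has a closed form).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_sr(y, B, lam=0.1, rho=1.0, iters=200):
    """Minimize (1/2)*||B x - y||^2 + lam*||z||_1 subject to z = x via ADMM."""
    n = B.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                       # scaled dual variable (multiplier)
    A = B.T @ B + rho * np.eye(n)         # x-update system matrix, precomputed
    Bty = B.T @ y
    for _ in range(iters):
        x = np.linalg.solve(A, Bty + rho * (z - u))  # x-subproblem (least squares)
        z = soft_threshold(x + u, lam / rho)         # z-subproblem (shrinkage)
        u = u + x - z                                # dual ascent on the multiplier
    return x

rng = np.random.default_rng(0)
B = rng.normal(size=(30, 20))                        # stand-in for the blur operator
x_true = np.zeros(20)
x_true[[2, 7, 13]] = [1.0, -2.0, 1.5]                # sparse signal in the transform domain
y = B @ x_true                                       # "low-resolution" observation
x_hat = admm_sr(y, B, lam=0.05)                      # recovered signal
```

Each pass of the loop performs one round of the three alternating sub-problem solves, which is the structural point the patent relies on: each sub-problem is cheap, so no deep-network training is needed at this stage.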
  • the position information of any image block in the original image can be represented by position coordinates.
  • the position coordinates of the four vertices of any image block in the original image can be obtained, and these four vertex coordinates used as the position information of that image block in the original image.
  • alternatively, the position coordinates of the geometric center point of any image block in the original image can be obtained and used as the position information of that image block in the original image.
  • the present invention does not impose any restrictions.
  • obtaining the position information of any image block in the original image includes:
  • the position information is generated according to the normalized coordinates.
  • the designated point is a pixel point pre-designated in any image block, and may be the geometric center point of any image block, or the vertex of the upper left corner, or the vertex of the lower right corner, or any other point of any image block.
  • the abscissa of each position coordinate is normalized as x' = (x - X_min) / (X_max - X_min), where X_min is the minimum abscissa value of all position coordinates, X_max is the maximum abscissa value, and x' is the normalized abscissa.
  • similarly, the maximum ordinate value and the minimum ordinate value of all position coordinates are obtained, and the ordinate of each position coordinate is normalized as y' = (y - Y_min) / (Y_max - Y_min), where Y_min is the minimum ordinate value, Y_max is the maximum ordinate value, and y' is the normalized ordinate.
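The min-max normalization of block coordinates can be sketched as follows. The four block-center coordinates used here are just an example for a 64×64 image split into four 32×32 blocks; the choice of designated point is up to the implementation, as the text notes.

```python
def normalize_positions(coords):
    """Min-max normalize a list of (x, y) block coordinates into [0, 1] x [0, 1]."""
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return [((x - x_min) / (x_max - x_min), (y - y_min) / (y_max - y_min))
            for x, y in coords]

# Geometric centers of four 32x32 blocks in a 64x64 image.
centers = [(16, 16), (48, 16), (16, 48), (48, 48)]
normalized = normalize_positions(centers)  # maps onto the corners of the unit square
```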
  • the Transformer model in this embodiment may include: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer and an output layer.
  • the step of inputting the super-resolved image feature vector and position information corresponding to any image block into a Transformer model for processing to obtain a super-resolved image corresponding to any image block output by the Transformer model comprises:
  • a super-resolution image of any image block is generated according to the target feature vector.
  • the combined feature vector is a feature vector obtained by concatenating the super-resolved image feature vector and the corresponding position information.
  • the combined feature vector is recorded as (position information, super-resolved image feature vector).
  • inputting each of the combined feature vectors into a corresponding Transformer model for processing to obtain a target feature vector comprises:
  • the method further comprises:
  • the acquired super-resolution images are spliced according to the position information corresponding to each image block to obtain a spliced target image.
  • One image block corresponds to one super-resolution image, and multiple super-resolution images corresponding to multiple image blocks are spliced according to the position information of the image blocks in the original image, and the obtained image is called a target image.
  • the size of the target image matches the size of the original image.
  • the resolution of the super-resolution image is higher than the resolution of the original image, thereby achieving super-resolution processing of the low-resolution original image.
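A minimal sketch of the splicing step, assuming each block's recorded position information is its (top, left) corner in the target image; the function name, block sizes, and position convention are illustrative assumptions rather than the patent's exact scheme.

```python
import numpy as np

def stitch_blocks(blocks, positions, block_h, block_w, out_h, out_w):
    """Paste each super-resolved block back at its recorded (top, left) position,
    producing a target image whose size matches the (upscaled) original."""
    target = np.zeros((out_h, out_w), dtype=blocks[0].dtype)
    for block, (top, left) in zip(blocks, positions):
        target[top:top + block_h, left:left + block_w] = block
    return target

# Four 64x64 super-resolved blocks assembled into a 128x128 target image.
blocks = [np.full((64, 64), i, dtype=np.uint8) for i in range(4)]
positions = [(0, 0), (0, 64), (64, 0), (64, 64)]
target = stitch_blocks(blocks, positions, 64, 64, 128, 128)
```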
  • the present invention combines the ADMM algorithm with the Transformer to realize the reconstruction of the low-resolution original image; the reconstruction process is simple and does not require a large amount of training data.
  • by inputting the reconstructed image into the Transformer model, a higher-quality high-resolution image is obtained. That is, the mathematical prior and the Transformer are unified, and the reconstruction of the low-resolution image is realized quickly.
  • the high-resolution image obtained is of better quality.
  • the low-resolution CT image is divided into multiple block images, and the positional relationship of each block image in the low-resolution CT image is recorded.
  • the low-resolution CT image is divided into 4 block images (1, 2, 3 and 4)
  • a mask image 1-Mask is set for block image 1
  • a mask image 2-Mask is set for block image 2
  • a mask image 3-Mask is set for block image 3
  • a mask image 4-Mask is set for block image 4.
  • mask images 1-Mask, 2-Mask, 3-Mask and 4-Mask are all binary images composed of 0 and 1, and they are all the same size as the low-resolution CT image.
  • the position of the target pixel with a pixel value of 1 in mask image 1-Mask is consistent with the position of the corresponding block image 1 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 1-Mask except the target pixel are 0.
  • the position of the target pixel with a pixel value of 1 in mask image 2-Mask is consistent with the position of the corresponding block image 2 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 2-Mask except the target pixel are 0.
  • the position of the target pixel with a pixel value of 1 in mask image 3-Mask is consistent with the position of the corresponding block image 3 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 3-Mask except the target pixel are 0.
  • the position of the target pixel with a pixel value of 1 in mask image 4-Mask is consistent with the position of the corresponding block image 4 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 4-Mask except the target pixel are 0.
  • block image 1 is processed through the processing path (101 → 102 → 103 → 104 → 105); block image 2 is processed through the processing path (201 → 202 → 203 → 204 → 205).
  • block images 3 and 4 are processed using similar processing paths.
  • FIG. 3 only describes the processing path (101 → 102 → 103 → 104 → 105) and the processing path (201 → 202 → 203 → 204 → 205).
  • the low-resolution CT image is calculated through the mask image 1-Mask to obtain the image block corresponding to the block image 1.
  • the image block corresponding to block image 1 is equivalent to covering up block images 2, 3 and 4 in FIG. 2.
  • the ADMM algorithm is used to perform K iterations of calculation on the image block obtained after the processing at 101:
  • the ADMM algorithm processes each small block in the image block obtained at 101, so that after the processing at 102, high-resolution image blocks are output, and these high-resolution image blocks correspond to the small blocks contained in the image block obtained at 101.
  • the position information of the recorded block image 1 can be obtained, and the position information can be encoded to obtain a position information encoding result.
  • the position information encoding result is a position vector obtained by encoding the position information into a vector form.
  • the high-resolution image blocks obtained by processing 102 and the position information encoding results obtained by processing 103 are concatenated to obtain a vector sequence (i.e., a combined feature vector) consisting of high-resolution image blocks containing position information encoding results, and the combined feature vector is input into the Transformer model.
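The concatenation order described above, (position information, super-resolved image feature vector), can be shown with a toy example; the vector contents here are placeholders, not real feature values.

```python
import numpy as np

def combine(position_encoding, feature_vector):
    """Concatenate as (position information, super-resolved image feature vector)."""
    return np.concatenate([position_encoding, feature_vector])

pos = np.array([0.0, 1.0])          # normalized (x', y') position of the block
feat = np.arange(4, dtype=float)    # stand-in for the super-resolved feature vector
combined = combine(pos, feat)       # shape (6,): position first, features after
```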
  • the combined feature vector is normalized by the first normalization layer of the Transformer model to obtain a first normalized feature vector; the first normalized feature vector is processed by the multi-head attention layer to obtain an attention feature vector; the first normalized feature vector and the attention feature vector are residually connected to obtain a residual feature vector; the residual feature vector is normalized by the second normalization layer to obtain a second normalized feature vector; the second normalized feature vector is fully connected by the fully connected layer to obtain a connection feature vector; the connection feature vector and the residual feature vector are residually connected to obtain a target feature vector.
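The data flow of this Transformer block can be sketched in NumPy. This is a simplified illustration under stated assumptions: the attention heads use identity Q/K/V projections, the fully connected layer is a two-layer ReLU MLP, and all weights are random; the patent does not specify these details.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each vector to zero mean and unit variance over its features.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def multi_head_attention(x, n_heads):
    """Self-attention with identity Q/K/V projections, just to show the data flow."""
    seq, d = x.shape
    hd = d // n_heads
    out = np.zeros_like(x)
    for h in range(n_heads):
        q = k = v = x[:, h * hd:(h + 1) * hd]
        attn = softmax(q @ k.T / np.sqrt(hd))
        out[:, h * hd:(h + 1) * hd] = attn @ v
    return out

def transformer_block(x, W1, W2, n_heads=2):
    """The block described above: norm -> MHA -> residual -> norm -> FC -> residual."""
    n1 = layer_norm(x)                     # first normalization layer
    a = multi_head_attention(n1, n_heads)  # multi-head attention layer
    r = n1 + a                             # first residual connection
    n2 = layer_norm(r)                     # second normalization layer
    f = np.maximum(n2 @ W1, 0.0) @ W2      # fully connected layer (ReLU MLP)
    return f + r                           # second residual connection -> target vector

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                # 4 combined feature vectors of width 8
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 8))
target = transformer_block(x, W1, W2)
```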
  • the key information contained in the image block can be highlighted, the noise information contained in the image block can be removed, and the quality of the high-resolution CT image subsequently obtained can be improved.
  • the low-resolution CT image is calculated through the mask image 2-Mask to obtain the image block corresponding to block image 2, and the image block corresponding to block image 2 is equivalent to covering up block images 1, 3 and 4 in FIG. 2. It should be understood that each image block obtained is consistent with the size of the low-resolution CT image.
  • the target feature vector obtained through the processing path (101 → 102 → 103 → 104 → 105), the target feature vector obtained through the processing path (201 → 202 → 203 → 204 → 205), and the target feature vectors obtained through other similar processing paths are spliced according to the position information encoding results to obtain a complete high-resolution CT image.
  • the Transformer model requires a large amount of data training, and the training process is relatively complex and difficult to converge. Therefore, the Transformer model cannot usually be used directly to convert low-resolution CT images into high-resolution CT images.
  • the present invention processes any image block in the low-resolution CT image with the ADMM algorithm to obtain the corresponding super-resolution image feature vector, and inputs the feature vector and the corresponding position information into the Transformer model for processing to obtain the super-resolution image corresponding to that image block. Since the ADMM algorithm has a mathematical prior, the super-resolution image feature vector and the values of the relevant parameters can be solved well; the super-resolution image feature vector corresponding to the low-resolution CT image can therefore be obtained through the ADMM algorithm.
  • the subsequent Transformer model only needs to continue processing the obtained super-resolution image feature vector and position information, and does not need to process the initial low-resolution CT image, so that it can converge quickly and improve processing efficiency.
  • the present invention expands the traditional ADMM algorithm into a differentiable form and combines it with the Transformer: the input is first encoded and decoded by the ADMM algorithm, which gives the mathematical prior to the Transformer, and is then further decoded by the Transformer model. This combination of mathematical prior and deep learning makes the solution more accurate and faster, and the reconstructed high-resolution CT image is of better quality.
  • FIG. 5 shows the high-resolution image reconstructed by combining the ADMM algorithm with the Transformer model in an embodiment of the present invention, i.e., the high-resolution CT image obtained through the processing path (101 → 102 → 103 → 104 → 105) and the processing path (201 → 202 → 203 → 204 → 205) shown in FIG. 3. Since the Transformer can capture long-distance dependencies and recover better details, FIG. 5 is clearer and of higher image quality than FIG. 4, and contains more detail information.
  • the present invention can obtain high-resolution CT images without using CT super-resolution scanning, thereby improving the resolution of CT images while reducing the risk of damaging the health of patients during the scanning process.
  • FIG. 6 is a structural diagram of an image processing device provided in Embodiment 2 of the present invention.
  • the image processing device 60 may include a plurality of functional modules composed of computer program segments.
  • the computer program of each program segment in the image processing device 60 may be stored in a memory of an electronic device and executed by at least one processor to perform the image processing functions (see FIG. 1 for details).
  • the image processing device 60 can be divided into multiple functional modules according to the functions it performs.
  • the functional modules may include: a constraint calculation module 601, a position acquisition module 602, a model processing module 603 and an image fusion module 604.
  • a module referred to in the present invention is a series of computer program segments that are stored in a memory, can be executed by at least one processor, and complete fixed functions. In this embodiment, the functions of each module will be described in detail in subsequent embodiments.
  • the constraint calculation module 601 is used to obtain any image block corresponding to the low-resolution original image, and input the any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector.
  • the original image refers to a low-resolution digital image.
  • the electronic device can collect the original image through its own camera, or receive the original image sent by other devices.
  • the original image may be a digital medical image.
  • the original image may be obtained from a digital medical database, which may be a digital library storing patient cases in a hospital, or a networked database of multiple hospitals, which is not limited by the present invention.
  • the acquiring any image block corresponding to the low-resolution original image includes:
  • the arbitrary image block is obtained by calculating the image mask and the original image.
  • the electronic device can divide the original image into blocks to obtain multiple block images.
  • the electronic device divides the original image into blocks according to the size of the original image to obtain multiple block images of the same size. For example, assuming the original image is 64×64, it can be divided into 4 block images of the same size, each 32×32.
  • the electronic device obtains the position of each block image in the original image, and sets an image mask according to the position of each block image in the original image.
  • the block images correspond to the image masks one by one, and the size of each image mask is consistent with the size of the original image. As shown in Figure 2, the electronic device divides the original image into blocks to obtain 4 block images.
  • the image mask is a preset binary image consisting of 0 and 1.
  • the electronic device can set the size of the image mask according to the size of the original image. For example, if the size of the original image is W*H, the size of the image mask can be set to W*H.
  • the electronic device can use each image mask to multiply the original image. Specifically, each pixel in the original image is ANDed with each corresponding pixel in the image mask to obtain an image of the region of interest. The pixel values in the region of interest remain unchanged, while the pixel values outside the region of interest are all 0, thereby obtaining the required image block.
  • the above optional implementation by setting a plurality of different image masks and performing calculations with the original image, can shield certain areas on the original image so that they do not participate in the processing. That is, when a certain image block is processed, other image blocks do not participate in the processing.
  • inputting any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:
  • the constraint function is dually decomposed, and the constraint function obtained by the dual decomposition is iteratively calculated based on the alternating direction multiplier method to obtain the super-resolution image feature vector.
  • ADMM is short for Alternating Direction Method of Multipliers.
  • Y is the CT value of each voxel point in the low-resolution CT image
  • B is the blur operator
  • X is the CT value of each voxel point in the high-resolution CT image
  • Z is the transform domain of X
  • D is the transform domain function
  • the model also involves a parameter constraining the L1 norm, a Lagrangian parameter, and a penalty parameter
  • x represents each column/row vector in X
  • z represents each column/row vector in Z.
  • the ADMM algorithm is used to solve the above three sub-problems as follows:
  • the ADMM algorithm is based on the separability of the objective function. When dealing with large-scale data, it decomposes the variables of the original problem into three sub-problems and solves them alternately, so the optimal solution for x can be calculated, thereby simplifying the calculation. x, y and z correspond to each column/row vector in X, Y and Z.
  • the mathematical a priori of CT super-resolution is relatively strong.
  • the above formula (1) is used as the preset constraint model, and any image block input into the preset constraint model yields a constraint function.
  • the constraint function is then dually decomposed, and the decomposed constraint function is iteratively calculated based on the alternating direction method of multipliers to finally obtain the super-resolution image feature vector.
  • the solution process is relatively simple: it avoids using deep-learning/neural-network methods to convert between low-resolution and high-resolution CT images, avoids a complex solving procedure, and requires no large amount of training data.
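The extracted text elides formula (1) and its Greek symbols, so the sketch below assumes the standard L1-regularized formulation min_X ||BX - Y||^2 + lam*||DX||_1 with the splitting Z = DX; the names `lam` (the L1 parameter) and `rho` (the penalty parameter), and the toy operators, are assumptions rather than the patent's exact model:

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of the L1 norm
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_sr(B, D, y, lam=0.1, rho=1.0, iters=50):
    """Solve min_x ||Bx - y||^2 + lam * ||Dx||_1 by splitting z = Dx
    and alternating over the x-, z- and dual (u-) sub-problems."""
    x = np.zeros(B.shape[1])
    z = np.zeros(D.shape[0])
    u = np.zeros(D.shape[0])
    A = B.T @ B + rho * D.T @ D  # fixed normal-equation matrix
    for _ in range(iters):
        x = np.linalg.solve(A, B.T @ y + rho * D.T @ (z - u))  # x-sub-problem
        z = soft_threshold(D @ x + u, lam / rho)               # z-sub-problem
        u = u + D @ x - z                                      # dual update
    return x

# toy example: B = identity (no blur), D = first-difference operator
B = np.eye(4)
D = (np.eye(4) - np.roll(np.eye(4), -1, axis=1))[:3]
y = np.array([1.0, 1.0, 2.0, 2.0])
x = admm_sr(B, D, y, lam=0.01, rho=1.0, iters=100)
```

Each pass alternates the x-, z- and dual sub-problems, mirroring the three sub-problems solved alternately in the text.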
  • the position acquisition module 602 is used to acquire the position information of any image block in the original image.
  • the position information of any image block in the original image can be represented by position coordinates.
  • the position coordinates of the four vertices of any image block in the original image can be obtained, and these four vertex coordinates are used as the position information of that image block in the original image.
  • the position coordinates of the geometric center point of any image block in the original image can be obtained, and this center-point coordinate is used as the position information of that image block in the original image.
  • the present invention does not impose any restrictions on this.
  • obtaining the position information of any image block in the original image includes:
  • the position information is generated according to the normalized coordinates.
  • the designated point is a pixel pre-designated in any image block, and may be the geometric center point, the upper-left vertex, the lower-right vertex, or any other point of that image block.
  • X_min is the minimum abscissa value of all position coordinates
  • X_max is the maximum abscissa value of all position coordinates
  • x' is the normalized abscissa.
  • the maximum ordinate value and the minimum ordinate value of the ordinate values in all position coordinates are obtained, and the ordinate of each position coordinate is normalized according to the maximum ordinate value and the minimum ordinate value.
  • Y_min is the minimum ordinate value of all position coordinates
  • Y_max is the maximum ordinate value of all position coordinates
  • y' is the normalized ordinate.
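The extracted text names X_min/X_max and Y_min/Y_max but elides the formulas themselves; the sketch below assumes the usual min-max form x' = (x - X_min) / (X_max - X_min) and the analogous formula for the ordinate, with hypothetical block-center coordinates:

```python
def normalize_coords(points):
    """Min-max normalize a list of (x, y) position coordinates to [0, 1]."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return [((x - x_min) / (x_max - x_min), (y - y_min) / (y_max - y_min))
            for x, y in points]

# hypothetical geometric-center coordinates of four image blocks
centers = [(64, 64), (192, 64), (64, 192), (192, 192)]
norm = normalize_coords(centers)
```

The normalized coordinates are then used to generate the position information fed to the Transformer model.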
  • the model processing module 603 is used to input the super-resolved image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolved image corresponding to any image block output by the Transformer model.
  • the Transformer model in this embodiment may include: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer and an output layer.
  • the step of inputting the super-resolved image feature vector and position information corresponding to any image block into a Transformer model for processing to obtain a super-resolved image corresponding to any image block output by the Transformer model comprises:
  • a super-resolution image of any image block is generated according to the target feature vector.
  • the combined feature vector is a feature vector obtained by concatenating the super-resolved image feature vector and the corresponding position information.
  • the combined feature vector is recorded as (position information, super-resolved image feature vector).
  • inputting each of the combined feature vectors into a corresponding Transformer model for processing to obtain a target feature vector comprises:
  • the image fusion module 604 is used to obtain the super-resolution image of each image block contained in the original image; and to splice the super-resolution images obtained according to the position information corresponding to each image block to obtain a spliced target image.
  • One image block corresponds to one super-resolution image; the multiple super-resolution images corresponding to the multiple image blocks are spliced according to the position information of the image blocks in the original image, and the resulting image is called the target image.
  • the size of the target image matches the size of the original image.
  • the resolution of the super-resolution image is higher than the resolution of the original image, thereby achieving super-resolution processing of the low-resolution original image.
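The splicing of per-block super-resolution images back into a full target image can be sketched as follows; the block sizes, positions and pixel values are hypothetical, and in practice the blocks would come from the Transformer output while the positions come from the recorded position information:

```python
import numpy as np

def splice_blocks(blocks, positions, block_shape, target_shape):
    """Paste each super-resolution block back at its recorded position
    so the spliced target image matches the original image's layout."""
    target = np.zeros(target_shape, dtype=blocks[0].dtype)
    h, w = block_shape
    for block, (r, c) in zip(blocks, positions):
        target[r:r + h, c:c + w] = block
    return target

# four 2x2 blocks spliced into a 4x4 target image
blocks = [np.full((2, 2), v) for v in (1, 2, 3, 4)]
positions = [(0, 0), (0, 2), (2, 0), (2, 2)]  # top-left corner of each block
target = splice_blocks(blocks, positions, (2, 2), (4, 4))
```

The spliced `target` has the same size as the original image, as required.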
  • the present invention combines the ADMM algorithm with the Transformer and realizes the reconstruction of the low-resolution original image through the ADMM algorithm.
  • the reconstruction process is simple and does not require a large amount of data training.
  • By inputting the reconstructed image into the Transformer model, a high-resolution image of better quality is obtained. That is, the mathematical prior and the Transformer are unified, the reconstruction of the low-resolution image is realized quickly, and the resulting high-resolution image is of better quality.
  • the low-resolution CT image is divided into multiple block images, and the positional relationship of each block image in the low-resolution CT image is recorded.
  • the low-resolution CT image is divided into 4 block images (1, 2, 3 and 4)
  • a mask image 1-Mask is set for block image 1
  • a mask image 2-Mask is set for block image 2
  • a mask image 3-Mask is set for block image 3
  • a mask image 4-Mask is set for block image 4.
  • mask images 1-Mask, 2-Mask, 3-Mask and 4-Mask are all binary images composed of 0 and 1, and they are all the same size as the low-resolution CT image.
  • the position of the target pixel with a pixel value of 1 in mask image 1-Mask is consistent with the position of the corresponding block image 1 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 1-Mask except the target pixel are 0.
  • the position of the target pixel with a pixel value of 1 in mask image 2-Mask is consistent with the position of the corresponding block image 2 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 2-Mask except the target pixel are 0.
  • the position of the target pixel with a pixel value of 1 in mask image 3-Mask is consistent with the position of the corresponding block image 3 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 3-Mask except the target pixel are 0.
  • the position of the target pixel with a pixel value of 1 in mask image 4-Mask is consistent with the position of the corresponding block image 4 in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 4-Mask except the target pixel are 0.
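The construction of the four mask images 1-Mask through 4-Mask described above can be sketched as follows; the 2x2 quadrant layout is assumed from the four-way partition of FIG. 2:

```python
import numpy as np

def quadrant_masks(h, w):
    """Build the four binary mask images, each the same size as the
    low-resolution image, with 1s exactly where the corresponding
    quadrant block sits and 0s everywhere else."""
    spans = [((0, h // 2), (0, w // 2)),   # 1-Mask: top-left block
             ((0, h // 2), (w // 2, w)),   # 2-Mask: top-right block
             ((h // 2, h), (0, w // 2)),   # 3-Mask: bottom-left block
             ((h // 2, h), (w // 2, w))]   # 4-Mask: bottom-right block
    masks = []
    for (r0, r1), (c0, c1) in spans:
        m = np.zeros((h, w), dtype=np.uint8)
        m[r0:r1, c0:c1] = 1
        masks.append(m)
    return masks

masks = quadrant_masks(4, 4)
```

Since every pixel belongs to exactly one block, the four masks sum to an all-ones image.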
  • block image 1 is processed through the processing path (101→102→103→104→105); block image 2 is processed through the processing path (201→202→203→204→205).
  • block images 3 and 4 are processed using similar processing paths.
  • FIG. 3 only depicts the processing path (101→102→103→104→105) and the processing path (201→202→203→204→205).
  • the low-resolution CT image is calculated through the mask image 1-Mask to obtain the image block corresponding to the block image 1.
  • the image block corresponding to block image 1 is equivalent to covering up block images 2, 3 and 4 in FIG. 2.
  • the ADMM algorithm is used to perform K iterations of calculation on the image block obtained after the processing at 101:
  • the ADMM algorithm can be used to process each small block in the image block obtained at 101, so that after the processing at 102, high-resolution image blocks are output, corresponding to each small block contained in the image block obtained at 101.
  • the position information of the recorded block image 1 can be obtained, and the position information can be encoded to obtain a position information encoding result.
  • the position information encoding result is a position vector obtained by encoding the position information into a vector form.
  • the high-resolution image blocks obtained at 102 and the position-information encoding result obtained at 103 are concatenated to obtain a vector sequence (i.e., a combined feature vector) consisting of high-resolution image blocks carrying the position-information encoding result, and the combined feature vector is input into the Transformer model.
  • the combined feature vector is normalized by the first normalization layer of the Transformer model to obtain a first normalized feature vector; the first normalized feature vector is processed by the multi-head attention layer to obtain an attention feature vector; the first normalized feature vector and the attention feature vector are residually connected to obtain a residual feature vector; the residual feature vector is normalized by the second normalization layer to obtain a second normalized feature vector; the second normalized feature vector is fully connected by the fully connected layer to obtain a connection feature vector; the connection feature vector and the residual feature vector are residually connected to obtain a target feature vector.
  • the key information contained in the image block can be highlighted, the noise information contained in the image block can be removed, and the quality of the high-resolution CT image subsequently obtained can be improved.
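The layer sequence described above (first normalization, multi-head attention, residual connection, second normalization, fully connected layer, residual connection) can be sketched in NumPy. The dimensions, ReLU activation and random weights are illustrative assumptions, not the patent's actual parameters; note that, following the text, the first residual connects the normalized vector (rather than the raw input) with the attention output:

```python
import numpy as np

rng = np.random.default_rng(0)
d, heads, seq = 8, 2, 4  # hypothetical model width, head count, sequence length

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo):
    hd = d // heads
    outs = []
    for h in range(heads):  # scaled dot-product attention per head
        q, k, v = x @ Wq[h], x @ Wk[h], x @ Wv[h]
        outs.append(softmax(q @ k.T / np.sqrt(hd)) @ v)
    return np.concatenate(outs, axis=-1) @ Wo

def transformer_block(x, params):
    Wq, Wk, Wv, Wo, W1, W2 = params
    n1 = layer_norm(x)                        # first normalization layer
    attn = multi_head_attention(n1, Wq, Wk, Wv, Wo)
    res = n1 + attn                           # residual over the normalized vector, per the text
    n2 = layer_norm(res)                      # second normalization layer
    fc = np.maximum(n2 @ W1, 0) @ W2          # fully connected layer (ReLU assumed)
    return fc + res                           # final residual -> target feature vector

hd = d // heads
params = (rng.normal(size=(heads, d, hd)), rng.normal(size=(heads, d, hd)),
          rng.normal(size=(heads, d, hd)), rng.normal(size=(d, d)),
          rng.normal(size=(d, d)), rng.normal(size=(d, d)))
x = rng.normal(size=(seq, d))  # combined feature vectors (position info + image features)
y = transformer_block(x, params)
```

The output `y` is the target feature vector sequence from which the super-resolution image of the block is generated.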
  • the low-resolution CT image is computed against the mask image 2-Mask to obtain the image block corresponding to block image 2; this image block is equivalent to covering up block images 1, 3 and 4 in FIG. 2. It should be understood that each obtained image block is the same size as the low-resolution CT image.
  • the target feature vector obtained through the processing path (101→102→103→104→105), the target feature vector obtained through the processing path (201→202→203→204→205), and the target feature vectors obtained through other similar processing paths are spliced according to the position-information encoding results to obtain a complete high-resolution CT image.
  • the Transformer model requires a large amount of data training, and the training process is relatively complex and difficult to converge. Therefore, the Transformer model cannot usually be used directly to convert low-resolution CT images into high-resolution CT images.
  • the present invention processes any image block in the low-resolution CT image by introducing the ADMM algorithm to obtain the corresponding super-resolution image feature vector, and inputs the super-resolution image feature vector and the corresponding position information into the Transformer model for processing to obtain the super-resolution image corresponding to that image block output by the Transformer model. Since the ADMM algorithm carries a mathematical prior, the corresponding super-resolution image feature vector and the values of the relevant parameters can be solved more effectively; the super-resolution image feature vector corresponding to the low-resolution CT image can thus be obtained through the ADMM algorithm.
  • the subsequent Transformer model only needs to continue processing the obtained super-resolution image feature vector and position information, and does not need to process the initial low-resolution CT image, so that it can converge quickly and improve processing efficiency.
  • the present invention expands the traditional ADMM algorithm into a differentiable form and combines it with the Transformer: the input is first encoded and decoded by the ADMM algorithm, which gives the mathematical prior to the Transformer, and then further decoded by the Transformer model. This combination of a mathematical prior with deep learning makes the solution more accurate and faster, and the reconstructed high-resolution CT image is of better quality.
  • the high-resolution image reconstructed by the ADMM algorithm combined with the Transformer model in an embodiment of the present invention, i.e., the high-resolution CT image obtained through the processing paths (101→102→103→104→105) and (201→202→203→204→205) shown in FIG. 3, is shown in FIG. 5. Since the Transformer can capture long-distance dependencies and recover finer details, the image quality of FIG. 5 is higher and clearer than that of FIG. 4, and FIG. 5 contains more detail information.
  • the present invention can obtain high-resolution CT images without the need for CT super-resolution scanning, thereby improving the resolution of CT images while reducing the risk of damaging the health of patients during the scanning process.
  • This embodiment provides a computer-readable storage medium, on which a computer program is stored.
  • When the computer program is executed by a processor, the steps in the above-mentioned image processing method embodiment are implemented, such as S11-S13 shown in FIG. 1;
  • or the functions of the modules in the above-mentioned image processing apparatus embodiment are implemented, such as modules 601-604 in FIG. 6:
  • the constraint calculation module 601 is used to obtain any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;
  • the position acquisition module 602 is used to acquire the position information of any image block in the original image
  • the model processing module 603 is used to input the super-resolved image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolved image corresponding to any image block output by the Transformer model;
  • the image fusion module 604 is used to obtain the super-resolution image of each image block contained in the original image; according to the position information corresponding to each image block, the super-resolution image is spliced to obtain a spliced target image, and the size of the target image matches the size of the original image.
  • the electronic device 70 includes a memory 701 , at least one processor 702 , at least one communication bus 703 and a transceiver 704 .
  • the structure of the electronic device shown in FIG. 7 does not constitute a limitation of the embodiments of the present invention, and may be either a bus structure or a star structure.
  • the electronic device 70 may also include more or less other hardware or software than shown in the figure, or a different component arrangement.
  • the electronic device 70 is a device that can automatically perform numerical calculations and/or information processing according to pre-set or stored instructions, and its hardware includes but is not limited to microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices.
  • the electronic device 70 may also include client devices, which include but are not limited to any electronic product that can interact with a client through a keyboard, mouse, remote control, touchpad, or voice-controlled device, such as a personal computer, tablet computer, smart phone, digital camera, etc.
  • the electronic device 70 is only an example. Other existing or future electronic products that are suitable for the present invention should also be included in the protection scope of the present invention and are included here by reference.
  • the memory 701 stores a computer program, and when the computer program is executed by the at least one processor 702, all or part of the steps in the image processing method are implemented.
  • the memory 701 includes a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), a one-time programmable read-only memory (OTPROM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
  • the computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application required for at least one function, etc.; the data storage area may store data created according to the use of the blockchain node, etc.
  • the blockchain referred to in the present invention is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanism, encryption algorithm, etc.
  • Blockchain is essentially a decentralized database, a string of data blocks generated by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and generate the next block.
  • Blockchain can include the blockchain underlying platform, platform product service layer, and application service layer.
  • the at least one processor 702 is the control core (Control Unit) of the electronic device 70, and uses various interfaces and lines to connect each component of the entire electronic device 70, and executes various functions and processes data of the electronic device 70 by running or executing the program or module stored in the memory 701, and calling the data stored in the memory 701.
  • When the at least one processor 702 executes the computer program stored in the memory, it implements all or part of the steps of the image processing method described in the embodiments of the present invention, or implements all or part of the functions of the image processing apparatus.
  • the at least one processor 702 can be composed of an integrated circuit, for example, it can be composed of a single packaged integrated circuit, or it can be composed of multiple integrated circuits with the same function or different functions, including one or more central processing units (CPU), microprocessors, digital processing chips, graphics processors, and various control chips.
  • the at least one communication bus 703 is configured to implement connection and communication between the memory 701 and the at least one processor 702, etc.
  • the electronic device 70 may also include a power source (such as a battery) for supplying power to each component.
  • the power source may be logically connected to the at least one processor 702 through a power management device, so that the power management device can manage charging, discharging, and power consumption.
  • the power source may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and other arbitrary components.
  • the electronic device 70 may also include a variety of sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the above-mentioned integrated unit implemented in the form of a software function module can be stored in a computer-readable storage medium.
  • the above-mentioned software function module is stored in a storage medium and includes a number of instructions for enabling a computer device (which can be a personal computer, electronic device, or network device, etc.) or a processor to execute a part of the method described in each embodiment of the present invention.
  • modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, and may be located in one place or distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present invention may be integrated into one processing unit, each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional modules.


Abstract

The present disclosure relates to the technical field of image processing, and provides an image processing method, an apparatus, an electronic device and a storage medium. The method comprises: inputting any image block of a low-resolution original image into a preset constraint model for constraint computation, so as to obtain a corresponding super-resolution image feature vector; then inputting the super-resolution image feature vector corresponding to said any image block and position information of said any image block in the original image into a Transformer model for processing, such that the Transformer model can achieve better convergence during a training process and does not need a large amount of data for training; and finally obtaining a super-resolution image output by the Transformer model and corresponding to said any image block. Therefore, the efficiency of converting low-resolution original images into super-resolution images is remarkably improved.

Description

Image processing method, apparatus, electronic device and storage medium

Technical Field

The present invention relates to the field of image processing technology, and in particular to an image processing method, apparatus, electronic device and storage medium.

Background

At present, the demand for high-resolution images in video surveillance, medical diagnosis and remote sensing applications is increasingly urgent. Using image acquisition equipment to directly obtain high-resolution images suffers from high cost, long acquisition time and high radiation dose, making it difficult to meet this demand. Super-resolution reconstruction methods based on machine learning rely on a training library for training, resulting in low reconstruction efficiency.

Summary of the Invention

In view of the above, it is necessary to propose an image processing method, apparatus, electronic device and computer-readable storage medium that can quickly reconstruct a low-resolution image into a high-resolution image.
A first aspect of the present invention provides an image processing method, the method comprising:

obtaining any image block corresponding to a low-resolution original image, and inputting the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

obtaining position information of the image block in the original image;

inputting the super-resolution image feature vector and position information corresponding to the image block into a Transformer model for processing, to obtain a super-resolution image corresponding to the image block output by the Transformer model.
According to an optional implementation of the present invention, obtaining any image block corresponding to the low-resolution original image comprises:

determining an image mask matching the image block;

obtaining the image block by computing the image mask against the original image.

According to an optional implementation of the present invention, inputting the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:

inputting the image block into the preset constraint model to obtain a constraint function;

performing dual decomposition on the constraint function, and iteratively calculating the decomposed constraint function based on the alternating direction method of multipliers to obtain the super-resolution image feature vector.
According to an optional implementation of the present invention, obtaining the position information of the image block in the original image comprises:

obtaining a designated point in the image block;

obtaining the position coordinates of each designated point in the original image;

normalizing the position coordinates to obtain normalized coordinates;

generating the position information according to the normalized coordinates.

According to an optional implementation of the present invention, inputting the super-resolution image feature vector and position information corresponding to the image block into a Transformer model for processing to obtain a super-resolution image corresponding to the image block output by the Transformer model comprises:

generating a combined feature vector according to the super-resolution image feature vector and the position information corresponding to the image block;

inputting each combined feature vector into the corresponding Transformer model for processing to obtain a target feature vector;

generating a super-resolution image of the image block according to the target feature vector.
According to an optional implementation of the present invention, the Transformer model comprises: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer and an output layer, and inputting each combined feature vector into the corresponding Transformer model for processing to obtain a target feature vector comprises:

normalizing the combined feature vector by the first normalization layer to obtain a first normalized feature vector;

processing the first normalized feature vector through the multi-head attention layer to obtain an attention feature vector;

performing a residual connection on the first normalized feature vector and the attention feature vector to obtain a residual feature vector;

normalizing the residual feature vector by the second normalization layer to obtain a second normalized feature vector;

performing a fully connected calculation on the second normalized feature vector through the fully connected layer to obtain a connection feature vector;

performing a residual connection on the connection feature vector and the residual feature vector to obtain the target feature vector.

According to an optional implementation of the present invention, the method further comprises:

obtaining the super-resolution image of each image block contained in the original image;

splicing the obtained super-resolution images according to the position information corresponding to each image block to obtain a spliced target image, wherein the size of the target image matches the size of the original image.
A second aspect of the present invention provides an image processing apparatus, the apparatus comprising:

a constraint calculation module, used to obtain any image block corresponding to a low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;

a position acquisition module, used to obtain position information of the image block in the original image;

a model processing module, used to input the super-resolution image feature vector and position information corresponding to the image block into a Transformer model for processing, to obtain a super-resolution image corresponding to the image block output by the Transformer model.

A third aspect of the present invention provides a computer device, comprising a processor and a memory, wherein the processor implements the image processing method when executing a computer program stored in the memory.

A fourth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, wherein the computer program implements the image processing method when executed by a processor.

The present invention obtains any image block corresponding to a low-resolution original image, inputs the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector, obtains the position information of the image block in the original image, and inputs the super-resolution image feature vector and position information into a Transformer model for processing to obtain the super-resolution image corresponding to the image block output by the Transformer model. The preset constraint model first converts any image block of the low-resolution image into a corresponding super-resolution image feature vector; the feature vector and position information are then input into the Transformer model, so that the Transformer model does not need to process the low-resolution image directly. This allows the Transformer model to converge better during training without large amounts of training data, significantly improving the efficiency of converting low-resolution original images into super-resolution images. At the same time, combining the preset constraint model with the Transformer model allows the detail information in the image to be processed, improving the quality of the obtained super-resolution image.
BRIEF DESCRIPTION OF THE DRAWINGS
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the image processing method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of the multiple block images obtained by partitioning an image according to an embodiment of the present application;
FIG. 3 is a schematic diagram of reconstructing a low-resolution CT image into a high-resolution CT image with ADMM and a Transformer according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a CT image reconstructed by the ADMM algorithm alone according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a high-resolution CT image reconstructed by the ADMM algorithm combined with a Transformer according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of the image processing apparatus provided in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of the electronic device provided in an embodiment of the present application.
DETAILED DESCRIPTION OF EMBODIMENTS
To make the above objects, features and advantages of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments. Where no conflict arises, the embodiments of the present invention and the features within the embodiments may be combined with one another.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which the present invention belongs. The terms used in this specification are intended only to describe particular optional embodiments and are not intended to limit the present invention.
The image processing method provided by the embodiments of the present invention is executed by an electronic device; accordingly, the image processing apparatus runs in the electronic device.
The embodiments of the present invention may be implemented based on artificial intelligence technology. Artificial Intelligence (AI) refers to theories, methods, technologies and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning/deep learning.
Embodiment 1
FIG. 1 is a flowchart of the image processing method provided in Embodiment 1 of the present invention. The method includes the following steps; depending on requirements, the order of the steps in the flowchart may be changed, and some steps may be omitted.
S11: Obtain any image block of the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain the corresponding super-resolution image feature vector.
Here, the original image refers to a low-resolution digital image. The electronic device may capture the original image with its own camera or receive it from another device.
When the present invention is applied to a digital medical scenario, the original image may be a digital medical image. The original image may be obtained from a digital medical database, which may be a digital library of patient cases kept by a single hospital or a networked database shared by multiple hospitals; the present invention imposes no limitation in this respect.
For ease of understanding, the following embodiments take as an example an original image that is a CT image acquired by a bronchoscope.
In an optional implementation, obtaining any image block of the low-resolution original image includes:
determining an image mask matching the image block; and
obtaining the image block by computing the image mask with the original image.
After acquiring the low-resolution original image, the electronic device may partition it, according to its size, into multiple block images of equal size. For example, a 64*64 original image can be partitioned into 4 block images of 32*32 each. After obtaining the block images, the electronic device records the position of each block image in the original image and sets an image mask according to that position; the block images correspond to the image masks one to one, and each image mask has the same size as the original image. FIG. 2 shows the 4 block images obtained by partitioning the original image.
The image mask is a preset binary image composed of 0s and 1s. The electronic device sets the size of the image mask according to the size of the original image; for example, if the original image is W*H, the image mask is also W*H.
The electronic device multiplies the original image by each image mask; specifically, each pixel of the original image is ANDed with the corresponding pixel of the mask, yielding a region-of-interest image in which pixel values inside the region of interest are unchanged while all pixel values outside it are 0. This produces the required image block.
In this optional implementation, computing multiple different image masks against the original image shields the other regions of the original image from processing: when one image block is being processed, the remaining blocks do not participate.
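The block partitioning and masking described above can be sketched in a few lines of numpy. This is an illustrative example, not code from the patent; the 64*64 image, the 32*32 block size, and the helper `make_block_mask` are assumptions for the sake of the demonstration.

```python
import numpy as np

def make_block_mask(h, w, top, left, bh, bw):
    """Binary mask with the same size as the original image: 1 over one block, 0 elsewhere."""
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[top:top + bh, left:left + bw] = 1
    return mask

# Stand-in 64*64 original image; a real one would come from the camera or database.
original = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)

# Mask for the top-left 32*32 block; multiplying zeroes out everything outside it.
mask = make_block_mask(64, 64, 0, 0, 32, 32)
block = original * mask

assert block.shape == original.shape              # the block image keeps the original size
assert np.all(block[32:, :] == 0)                 # pixels outside the region of interest are 0
assert np.array_equal(block[:32, :32], original[:32, :32])
```

The other three blocks are obtained the same way by moving the 1-region of the mask, so each block can be processed independently while the rest of the image stays masked out.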
In an optional implementation, inputting the image block into the preset constraint model for constraint calculation to obtain the corresponding super-resolution image feature vector includes:
inputting the image block into the preset constraint model to obtain a constraint function; and
performing dual decomposition on the constraint function, and iteratively solving the decomposed constraint function with the alternating direction method of multipliers to obtain the super-resolution image feature vector.
The Alternating Direction Method of Multipliers (ADMM) is an important method for solving separable convex optimization problems, offering fast processing and good convergence.
The principle of the alternating direction method of multipliers is explained next. Suppose the low-resolution CT image is modeled by the augmented Lagrangian of a regularized reconstruction problem:

L_ρ(x, z, α) = (1/2)||y - SBx||_2^2 + λ||z||_1 + α^T(Dx - z) + (ρ/2)||Dx - z||_2^2   (1)

In formula (1), Y is the CT value of each voxel in the low-resolution CT image, B is the blur operator, S is the downsampling operator, X is the CT value of each voxel in the high-resolution CT image, Z is the transform domain of X with the splitting Z = DX, D is the transform-domain function, λ is the parameter weighting the L1 norm, α is the Lagrangian parameter, ρ is the penalty parameter, and x and z denote the column/row vectors of X and Z respectively.
Through dual decomposition, formula (1) is split into three sub-problems, which the ADMM algorithm solves alternately:

x^(k+1) = argmin_x (1/2)||y - SBx||_2^2 + (ρ/2)||Dx - z^(k) + u^(k)||_2^2   (2)

z^(k+1) = argmin_z λ||z||_1 + (ρ/2)||Dx^(k+1) - z + u^(k)||_2^2 = S_(λ/ρ)(Dx^(k+1) + u^(k))   (3)

u^(k+1) = u^(k) + Dx^(k+1) - z^(k+1)

where u = α/ρ is the scaled dual variable and S_(λ/ρ)(·) denotes the soft-thresholding operator with threshold λ/ρ.
Exploiting the separability of the objective function, ADMM decomposes the variables of the original problem into three sub-problems that are solved alternately, which suits large-scale data and simplifies the calculation of the optimal solution for x. Here x, y and z correspond to the column/row vectors of X, Y and Z.
CT super-resolution carries a strong mathematical prior. Formula (1) is therefore used as the preset constraint model: any image block is input into the model to obtain a constraint function, the constraint function is dually decomposed, and the decomposed constraint function is solved iteratively with ADMM to obtain the super-resolution image feature vector. The solution process is comparatively simple; it avoids using a deep learning/neural network approach to learn the transformation between low-resolution and high-resolution CT images, avoids a complex solution process, and requires no training on large amounts of data.
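As a rough illustration of the ADMM iteration in formulas (2) and (3), the numpy sketch below solves a toy one-dimensional version of the problem. The operators SB (modeled here as blur plus 2x downsampling by pairwise averaging) and D (first differences), the problem sizes, and the parameter values λ and ρ are all illustrative assumptions, not the patent's actual operators.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 8

# SB: each low-resolution sample is the average of two adjacent high-resolution samples.
SB = np.zeros((m, n))
for i in range(m):
    SB[i, 2 * i:2 * i + 2] = 0.5

# D: first-difference operator (full rank, so the x-subproblem is well posed).
D = np.eye(n) - np.eye(n, k=1)

x_true = np.repeat(rng.normal(size=m), 2)   # piecewise-constant ground truth
y = SB @ x_true                              # simulated low-resolution observation

lam, rho = 0.001, 1.0
x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
lhs = SB.T @ SB + rho * D.T @ D              # fixed system matrix for the x-update

for _ in range(200):
    # x-subproblem: quadratic, solved exactly (formula (2))
    x = np.linalg.solve(lhs, SB.T @ y + rho * D.T @ (z - u))
    # z-subproblem: soft-thresholding (formula (3))
    v = D @ x + u
    z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
    # scaled dual update
    u = u + D @ x - z

assert np.linalg.norm(SB @ x - y) < 0.1      # reconstruction is consistent with the data
```

With a small λ the fitted high-resolution signal reproduces the observed low-resolution data closely while the L1 term on Dx favors the piecewise-constant structure, which is the behavior the constraint model relies on.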
S12: Obtain the position information of the image block in the original image.
The position information of an image block in the original image may be expressed as position coordinates. For example, the coordinates of the block's four vertices in the original image may be taken as its position information; alternatively, the coordinates of the block's geometric center may be used. The present invention imposes no limitation.
According to an optional implementation of the present invention, obtaining the position information of the image block in the original image includes:
obtaining a specified point in the image block;
obtaining the position coordinates of each specified point in the original image;
normalizing the position coordinates to obtain normalized coordinates; and
generating the position information from the normalized coordinates.
The specified point is a pixel designated in advance in the image block; it may be the geometric center of the block, the top-left vertex, the bottom-right vertex, or any other point.
Because the position coordinates of the specified points of different image blocks differ considerably, the coordinates need to be normalized so that the coordinates of every specified point fall within a common range.
In a specific implementation, denoting the position coordinates of each specified point as (X, Y), the maximum and minimum abscissa values among all position coordinates are obtained, and the abscissa of each position coordinate is normalized according to them.
The abscissa is normalized as follows:

x' = (x - X_min)/(X_max - X_min)   (4)
In formula (4), X_min is the minimum abscissa among all position coordinates, X_max is the maximum abscissa, and x' is the normalized abscissa.
Similarly, the maximum and minimum ordinate values among all position coordinates are obtained, and the ordinate of each position coordinate is normalized according to them.
The ordinate is normalized as follows:

y' = (y - Y_min)/(Y_max - Y_min)   (5)
In formula (5), Y_min is the minimum ordinate among all position coordinates, Y_max is the maximum ordinate, and y' is the normalized ordinate.
The normalized coordinates are denoted (x', y').
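Formulas (4) and (5) amount to min-max normalization applied separately to the two coordinate axes. The short example below applies them to four hypothetical block corner points; the coordinate values are illustrative only.

```python
# Top-left corners of the four 32*32 blocks of a 64*64 image (illustrative values).
points = [(0, 0), (32, 0), (0, 32), (32, 32)]

xs = [p[0] for p in points]
ys = [p[1] for p in points]
x_min, x_max = min(xs), max(xs)
y_min, y_max = min(ys), max(ys)

# Apply formulas (4) and (5) to every point.
normalized = [((x - x_min) / (x_max - x_min), (y - y_min) / (y_max - y_min))
              for x, y in points]

assert normalized == [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
```

After normalization every specified point lies in [0, 1] on both axes, so position encodings for blocks of images of different sizes share a common range.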
S13: Input the super-resolution image feature vector and the position information of the image block into the Transformer model for processing, and obtain the super-resolution image of that block output by the Transformer model.
The Transformer model in this embodiment may include a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer, and an output layer.
In an optional implementation, inputting the super-resolution image feature vector and position information of the image block into the Transformer model and obtaining the super-resolution image of that block output by the model includes:
generating a combined feature vector from the super-resolution image feature vector and the position information of the image block;
inputting each combined feature vector into the corresponding Transformer model for processing to obtain a target feature vector; and
generating the super-resolution image of the image block from the target feature vector.
The combined feature vector is obtained by concatenating the super-resolution image feature vector with the corresponding position information; for example, it may be written as (position information, super-resolution image feature vector).
In an optional implementation, inputting each combined feature vector into the corresponding Transformer model to obtain a target feature vector includes:
normalizing the combined feature vector through the first normalization layer to obtain a first normalized feature vector;
processing the first normalized feature vector through the multi-head attention layer to obtain an attention feature vector;
applying a residual connection to the first normalized feature vector and the attention feature vector to obtain a residual feature vector;
normalizing the residual feature vector through the second normalization layer to obtain a second normalized feature vector;
performing a fully connected calculation on the second normalized feature vector through the fully connected layer to obtain a connected feature vector; and
applying a residual connection to the connected feature vector and the residual feature vector to obtain the target feature vector.
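The block structure listed above (normalization, attention, residual, normalization, fully connected, residual) can be sketched in plain numpy. Single-head attention, a ReLU after the fully connected layer, and random weights are simplifying assumptions for brevity; the patent's model uses multi-head attention with learned weights.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def transformer_block(tokens, Wq, Wk, Wv, Wf):
    h1 = layer_norm(tokens)                                   # first normalization layer
    q, k, v = h1 @ Wq, h1 @ Wk, h1 @ Wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v        # attention feature vector
    res = h1 + attn                                           # first residual connection
    h2 = layer_norm(res)                                      # second normalization layer
    fc = np.maximum(h2 @ Wf, 0.0)                             # fully connected calculation
    return fc + res                                           # second residual -> target feature vector

rng = np.random.default_rng(1)
d = 8
tokens = rng.normal(size=(4, d))          # 4 combined feature vectors (one per image block)
Ws = [rng.normal(size=(d, d)) * 0.1 for _ in range(4)]
out = transformer_block(tokens, *Ws)

assert out.shape == (4, d)                # one target feature vector per input vector
```

Each row of `out` is a target feature vector from which the super-resolution image of the corresponding block is then generated.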
In an optional implementation, the method further includes:
obtaining the super-resolution image of every image block contained in the original image; and
stitching the obtained super-resolution images according to the position information of each image block to obtain a stitched target image.
Each image block corresponds to one super-resolution image. The multiple super-resolution images of the multiple image blocks are stitched according to the positions of the blocks in the original image; the resulting image is called the target image, whose size matches that of the original image.
In this optional implementation, the multiple super-resolution images are stitched according to the position information of their corresponding image blocks; since the resolution of each super-resolution image is higher than that of the original image, super-resolution processing of the low-resolution original image is achieved.
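The stitching step above can be sketched as placing each per-block super-resolution output at the scaled version of its recorded position. The 2x scale factor, block sizes, and constant-valued blocks below are illustrative assumptions.

```python
import numpy as np

scale = 2  # assumed super-resolution factor: 32*32 blocks become 64*64

# Per-block super-resolution outputs keyed by each block's (top, left) position
# in the low-resolution original image (constant values stand in for real content).
blocks = {
    (0, 0):   np.full((64, 64), 1.0),
    (0, 32):  np.full((64, 64), 2.0),
    (32, 0):  np.full((64, 64), 3.0),
    (32, 32): np.full((64, 64), 4.0),
}

target = np.zeros((128, 128))
for (top, left), sr_block in blocks.items():
    r, c = top * scale, left * scale          # low-res position -> high-res position
    target[r:r + 64, c:c + 64] = sr_block     # paste the block into the target image

assert target.shape == (128, 128)
assert target[0, 0] == 1.0 and target[127, 127] == 4.0
```

Because every block carries its own position information, the stitched target image reproduces the layout of the original image at the higher resolution.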
The present invention combines the ADMM algorithm with the Transformer. ADMM reconstructs the low-resolution original image through a simple procedure that requires no training on large amounts of data, and feeding the reconstructed result into the Transformer model yields a higher-quality high-resolution image. By unifying the mathematical prior with the Transformer, the low-resolution image is reconstructed quickly and the resulting high-resolution image is of good quality.
The process of reconstructing a low-resolution CT image into a high-resolution CT image using the method of the present invention is described below with reference to FIG. 3.
First, the low-resolution CT image is partitioned into multiple block images, and the position of each block image within the low-resolution CT image is recorded. Suppose, as shown in FIG. 2, the low-resolution CT image is partitioned into 4 block images (① ② ③ ④), and mask images 1-Mask, 2-Mask, 3-Mask and 4-Mask are set for block images ①, ②, ③ and ④ respectively. Each mask image is a binary image composed of 0s and 1s and has the same size as the low-resolution CT image. In mask image 1-Mask, the target pixels whose value is 1 occupy the same position that block image ① occupies in the low-resolution CT image, and all other pixels are 0; mask images 2-Mask, 3-Mask and 4-Mask are defined analogously for block images ②, ③ and ④.
Then, block image ① is processed through processing path (101→102→103→104→105) and block image ② through processing path (201→202→203→204→205); block images ③ and ④ are processed through similar paths. FIG. 3 depicts only the paths (101→102→103→104→105) and (201→202→203→204→205).
At 101, the low-resolution CT image is computed with mask image 1-Mask to obtain the image block corresponding to block image ①; this is equivalent to masking out block images ② ③ ④ of FIG. 2.
At 102, K iterations of the ADMM algorithm are applied to the image block obtained at 101:

x^(k+1) = argmin_x (1/2)||y - SBx||_2^2 + (ρ/2)||Dx - z^(k) + u^(k)||_2^2

z^(k+1) = S_(λ/ρ)(Dx^(k+1) + u^(k))

u^(k+1) = u^(k) + Dx^(k+1) - z^(k+1),  k = 0, 1, ..., K-1
The ADMM algorithm processes every small sub-block within the image block obtained at 101, so that after 102 the output is a set of high-resolution image blocks, one for each small sub-block contained in the block from 101.
At 103, the recorded position information of block image ① is obtained and encoded, producing a position-information encoding result, i.e. a position vector obtained by encoding the position information in vector form.
At 104, the high-resolution image blocks from 102 are concatenated with the position-information encoding result from 103 to obtain a vector sequence of high-resolution image blocks carrying position encodings (i.e., the combined feature vector), which is input into the Transformer model.
At 105, the combined feature vector is normalized by the first normalization layer of the Transformer model to obtain a first normalized feature vector; the multi-head attention layer processes the first normalized feature vector to obtain an attention feature vector; a residual connection of the first normalized feature vector and the attention feature vector yields a residual feature vector; the second normalization layer normalizes the residual feature vector to obtain a second normalized feature vector; the fully connected layer performs a fully connected calculation on it to obtain a connected feature vector; and a residual connection of the connected feature vector and the residual feature vector yields the target feature vector. Processing by the Transformer model highlights the key information contained in the image block and removes its noise, improving the quality of the high-resolution CT image obtained subsequently.
It should be understood that at 201 the low-resolution CT image is computed with mask image 2-Mask to obtain the image block corresponding to block image ②, which is equivalent to masking out block images ① ③ ④ of FIG. 2, and that each image block thus obtained has the same size as the low-resolution CT image.
Finally, at 106, the target feature vectors obtained through processing path (101→102→103→104→105), processing path (201→202→203→204→205), and the other similar processing paths are stitched according to the position-information encoding results, yielding the complete high-resolution CT image.
A Transformer model normally requires training on large amounts of data; its training process is complex and hard to make converge. Converting low-resolution CT images into high-resolution CT images therefore usually cannot be done with a Transformer model alone.
In the present invention, the ADMM algorithm processes any image block of the low-resolution CT image to produce the corresponding super-resolution image feature vector, and the feature vector together with the corresponding position information is input into the Transformer model, which outputs the super-resolution image of that block. Because ADMM carries a mathematical prior, it solves for the super-resolution image feature vector and the values of the relevant parameters more effectively, and the subsequent Transformer model only needs to continue processing the resulting feature vector and position information rather than the initial low-resolution CT image, so it converges quickly and processing efficiency improves. Moreover, by unrolling the traditional ADMM algorithm into a differentiable form and combining it with the Transformer, the invention first encodes and decodes with ADMM, handing the mathematical prior to the Transformer, which then decodes further; this combination of mathematical prior and deep learning makes the solution more accurate and faster, so the reconstructed high-resolution CT image is correspondingly better.
Furthermore, compared with a high-resolution image reconstructed by the ADMM algorithm alone, i.e. a CT image obtained by stitching blocks directly from processing path (101→102→103→104) or (201→202→203→204) of FIG. 3, shown in FIG. 4, the high-resolution image reconstructed in this embodiment by ADMM combined with the Transformer model, i.e. the high-resolution CT image obtained through processing paths (101→102→103→104→105) and (201→202→203→204→205) of FIG. 3, shown in FIG. 5, is clearer and of higher quality, and FIG. 5 contains more detail, because the Transformer captures long-range dependencies and recovers finer detail.
The present invention obtains high-resolution CT images without requiring a CT super-resolution scan, improving CT image resolution while reducing the health risks to the patient during scanning.
实施例二Embodiment 2
图6是本发明实施例二提供的图像处理装置的结构图。FIG. 6 is a structural diagram of an image processing device provided in Embodiment 2 of the present invention.
在一些实施例中,所述图像处理装置60可以包括多个由计算机程序段所组成的功能模块。所述图像处理装置60中的每个程序段的计算机程序可以存储于电子设备的存储器中,并由至少一个处理器所执行,以执行(详见图1描述)图像处理的功能。In some embodiments, the image processing device 60 may include a plurality of functional modules composed of computer program segments. The computer program of each program segment in the image processing device 60 may be stored in a memory of an electronic device and executed by at least one processor to perform the image processing functions (described in detail with reference to FIG. 1).
本实施例中,所述图像处理装置60根据其所执行的功能,可以被划分为多个功能模块。所述功能模块可以包括:约束计算模块601、位置获取模块602、模型处理模块603及图像融合模块604。本发明所称的模块是指一种能够被至少一个处理器所执行并且能够完成固定功能的一系列计算机程序段,其存储在存储器中。在本实施例中,关于各模块的功能将在后续的实施例中详述。In this embodiment, the image processing device 60 can be divided into multiple functional modules according to the functions it performs. The functional modules may include: a constraint calculation module 601, a position acquisition module 602, a model processing module 603 and an image fusion module 604. The module referred to in the present invention refers to a series of computer program segments that can be executed by at least one processor and can complete fixed functions, which are stored in a memory. In this embodiment, the functions of each module will be described in detail in subsequent embodiments.
所述约束计算模块601,用于获取对应于低分辨率的原始图像的任一图像块,并将所述任一图像块输入预设约束模型进行约束计算,获得相应的超分图像特征向量。The constraint calculation module 601 is used to obtain any image block corresponding to the low-resolution original image, and input the any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector.
其中,所述原始图像是指低分辨的数字图像。电子设备可以通过自带的摄像头采集原始图像,也可以接收其他设备发送的原始图像。The original image refers to a low-resolution digital image. The electronic device can collect the original image through its own camera, or receive the original image sent by other devices.
将本发明应用于数字医疗场景中时,则所述原始图像可以为数字医疗图像。所述原始图像可以从数字医疗数据库中获取,所述数字医疗数据库可以是某个医院中存储有患者病例的数字库,所述预设医疗库也可以是多个医院的联网数据库,本发明不做限制。When the present invention is applied to a digital medical scenario, the original image may be a digital medical image. The original image may be obtained from a digital medical database, which may be a digital library storing patient cases in a hospital, or a networked database of multiple hospitals, which is not limited by the present invention.
为了便于理解,以下实施例均以原始图像为由支气管镜采集到的CT图像为例进行说明。For ease of understanding, the following embodiments are described by taking the original image as a CT image acquired by a bronchoscope as an example.
在一个可选的实施方式中,所述获取对应于低分辨率的原始图像的任一图像块包括: In an optional implementation, the acquiring any image block corresponding to the low-resolution original image includes:
确定匹配于所述任一图像块的图像掩膜;Determining an image mask matching any one of the image blocks;
通过将所述图像掩膜与所述原始图像进行计算,得到所述任一图像块。The arbitrary image block is obtained by calculating the image mask and the original image.
电子设备在获取到低分辨率的原始图像之后,可以对原始图像进行分块得到多个分块图像。电子设备根据原始图像的大小对原始图像进行分块,得到多个大小一致的分块图像。示例性的,假设原始图像为64*64,那么可以将原始图像分块为4个大小一致的分块图像,分块图像为32*32。电子设备在得到多个分块图像之后,获取每个分块图像在原始图像中的位置,并根据每个分块图像在原始图像中的位置设置图像掩膜,分块图像与图像掩膜一一对应,每个图像掩膜的大小与原始图像的大小一致。如图2所示,电子设备对原始图像进行分块得到的4个分块图像。After acquiring the low-resolution original image, the electronic device can divide the original image into blocks to obtain multiple block images. The electronic device divides the original image into blocks according to the size of the original image to obtain multiple block images of the same size. Exemplarily, assuming that the original image is 64*64, the original image can be divided into 4 block images of the same size, and the block images are 32*32. After obtaining the multiple block images, the electronic device obtains the position of each block image in the original image, and sets an image mask according to the position of each block image in the original image. The block images correspond to the image masks one by one, and the size of each image mask is consistent with the size of the original image. As shown in Figure 2, the electronic device divides the original image into blocks to obtain 4 block images.
所述图像掩膜为预先设置的由0和1组成的二进制图像。电子设备可以根据原始图像的大小设置图像掩膜的大小,例如,原始图像的大小为W*H,则图像掩膜的大小可以设置为W*H。The image mask is a preset binary image consisting of 0 and 1. The electronic device can set the size of the image mask according to the size of the original image. For example, if the size of the original image is W*H, the size of the image mask can be set to W*H.
电子设备可以利用每个图像掩膜与原始图像进行相乘,具体而言,将原始图像中的每个像素与图像掩膜中的每个对应像素进行与操作,得到感兴趣区图像,感兴趣区内的像素值保持不变,而感兴趣区外的像素值都为0,从而得到需要的图像块。The electronic device can use each image mask to multiply the original image. Specifically, each pixel in the original image is ANDed with each corresponding pixel in the image mask to obtain an image of the region of interest. The pixel values in the region of interest remain unchanged, while the pixel values outside the region of interest are all 0, thereby obtaining the required image block.
上述可选的实施方式,通过设置多个不同的图像掩膜,与原始图像进行计算,能够对原始图像上某些区域进行屏蔽,使其不参与处理。即,对某一个图像块进行处理时,其他图像块不参与处理。The above optional implementation, by setting a plurality of different image masks and performing calculations with the original image, can shield certain areas on the original image so that they do not participate in the processing. That is, when a certain image block is processed, other image blocks do not participate in the processing.
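As a purely illustrative sketch (not part of the patent text; all function and variable names are hypothetical), the mask-and-multiply block extraction described above could look as follows in NumPy:

```python
import numpy as np

def block_masks(img_h, img_w, rows, cols):
    # One full-size binary mask (0/1) per block of a rows x cols tiling;
    # each mask has the same size as the original image.
    bh, bw = img_h // rows, img_w // cols
    masks = []
    for r in range(rows):
        for c in range(cols):
            m = np.zeros((img_h, img_w), dtype=np.uint8)
            m[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw] = 1
            masks.append(m)
    return masks

def extract_block(image, mask):
    # Pixel-wise multiplication: values inside the region of interest are
    # kept, everything outside becomes 0 and takes no part in processing.
    return image * mask
```

For a 64*64 original image split into a 2*2 tiling, `block_masks(64, 64, 2, 2)` would return four masks, each the same size as the original image, with 1s only over its own block.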
在一个可选的实施方式中,所述将所述任一图像块输入预设约束模型进行约束计算,获得相应的超分图像特征向量包括:In an optional implementation, inputting any image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector comprises:
将所述任一图像块输入所述预设约束模型中,得到约束函数;Inputting any one of the image blocks into the preset constraint model to obtain a constraint function;
对所述约束函数进行对偶分解,并基于交替方向乘子法对经过对偶分解得到的约束函数进行迭代计算,得到所述超分图像特征向量。The constraint function is dually decomposed, and the constraint function obtained by the dual decomposition is iteratively calculated based on the alternating direction multiplier method to obtain the super-resolution image feature vector.
交替方向乘子法(Alternating Direction Method of Multipliers,ADMM)是一种求解具有可分离的凸优化问题的重要方法,处理速度快,收敛性能好。Alternating Direction Method of Multipliers (ADMM) is an important method for solving separable convex optimization problems with fast processing speed and good convergence performance.
接下来对交替方向乘子法的原理进行解释,假设对低分辨率的CT图像进行建模如下:
min_{x,z} (1/2)||y-SBx||_2^2 + λ||z||_1,  s.t.  Dx = z,
L_ρ(x,z,α) = (1/2)||y-SBx||_2^2 + λ||z||_1 + α^T(Dx-z) + (ρ/2)||Dx-z||_2^2   (1)
Next, the principle of the alternating direction method of multipliers is explained, assuming that the low-resolution CT image is modeled as follows:
min_{x,z} (1/2)||y-SBx||_2^2 + λ||z||_1,  s.t.  Dx = z,
L_ρ(x,z,α) = (1/2)||y-SBx||_2^2 + λ||z||_1 + α^T(Dx-z) + (ρ/2)||Dx-z||_2^2   (1)
公式(1)中的Y为低分辨率的CT图像中各个体素点的CT值,B为模糊算子,S为下采样函数,X为高分辨率的CT图像中各个体素点的CT值,Z为X的变换域,D为变换域函数,λ为约束L1范数的参数,α是拉格朗日参数,ρ是惩罚参数,x表示X里的每列/行向量,z表示Z里的每列/行向量。In formula (1), Y is the CT value of each voxel in the low-resolution CT image, B is the blur operator, S is the downsampling function, X is the CT value of each voxel in the high-resolution CT image, Z is the transform domain of X, D is the transform-domain function, λ is the parameter constraining the L1 norm, α is the Lagrangian parameter, ρ is the penalty parameter, x denotes each column/row vector of X, and z denotes each column/row vector of Z.
通过对偶分解将上述公式(1)转换为3个子问题:
x^{k+1} = argmin_x (1/2)||y-SBx||_2^2 + (ρ/2)||Dx - z^k + α^k/ρ||_2^2
z^{k+1} = argmin_z λ||z||_1 + (ρ/2)||Dx^{k+1} - z + α^k/ρ||_2^2   (2)
α^{k+1} = α^k + ρ(Dx^{k+1} - z^{k+1})
The above formula (1) is decomposed into three sub-problems through dual decomposition:
x^{k+1} = argmin_x (1/2)||y-SBx||_2^2 + (ρ/2)||Dx - z^k + α^k/ρ||_2^2
z^{k+1} = argmin_z λ||z||_1 + (ρ/2)||Dx^{k+1} - z + α^k/ρ||_2^2   (2)
α^{k+1} = α^k + ρ(Dx^{k+1} - z^{k+1})
利用ADMM算法对上面三个子问题求解如下:
x^{k+1} = ((SB)^T(SB) + ρD^T D)^{-1}((SB)^T y + ρD^T(z^k - α^k/ρ))
z^{k+1} = soft_{λ/ρ}(Dx^{k+1} + α^k/ρ)   (3)
α^{k+1} = α^k + ρ(Dx^{k+1} - z^{k+1})
The ADMM algorithm solves the above three sub-problems as follows:
x^{k+1} = ((SB)^T(SB) + ρD^T D)^{-1}((SB)^T y + ρD^T(z^k - α^k/ρ))
z^{k+1} = soft_{λ/ρ}(Dx^{k+1} + α^k/ρ)   (3)
α^{k+1} = α^k + ρ(Dx^{k+1} - z^{k+1})
其中,soft_κ(a) = sign(a)·max(|a|-κ, 0)为软阈值算子。where soft_κ(a) = sign(a)·max(|a|-κ, 0) is the soft-thresholding operator.
ADMM算法基于目标函数的可分性,在面对大规模数据的处理,将原问题的变量分解为三个子问题交替求解,可以计算出x的最优解,从而简化了计算,x,y,z对应为X,Y,Z里的每列/行向量。The ADMM algorithm is based on the separability of the objective function. When processing large-scale data, it decomposes the variables of the original problem into three sub-problems that are solved alternately, so the optimal solution for x can be computed and the calculation is simplified; x, y and z correspond to the column/row vectors of X, Y and Z, respectively.
CT超分辨率的数学先验性比较强,将上述公式(1)作为所述预设约束模型,将任一图像块输入所述预设约束模型中,从而得到约束函数,再对所述约束函数进行对偶分解,并基于交替方向乘子法对经过对偶分解得到的约束函数进行迭代计算,最终得到所述超分图像特征向量,求解过程较为简便,避免了采用深度学习/神经网络的方式,来实现低分辨率CT图像与高分辨率CT图像之间的变换,避免了复杂的求解过程,也无需利用大量数据进行训练。The mathematical prior of CT super-resolution is relatively strong. Taking the above formula (1) as the preset constraint model, any image block is input into the preset constraint model to obtain a constraint function; the constraint function is then dually decomposed, and the decomposed constraint function is iteratively computed with the alternating direction method of multipliers to finally obtain the super-resolution image feature vector. The solution process is relatively simple: it avoids using a deep learning/neural network to realize the transformation between low-resolution and high-resolution CT images, avoids a complex solution process, and requires no training on large amounts of data.
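For illustration only, the x-, z- and α-updates described above can be sketched with dense matrices as follows; here `A` stands for the combined blur-and-downsampling operator (B followed by downsampling), and every name and default value is an assumption rather than the patent's implementation:

```python
import numpy as np

def soft_threshold(a, kappa):
    # Proximal operator of kappa * ||.||_1 (elementwise shrinkage).
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def admm_super_resolve(y, A, D, lam=0.1, rho=1.0, iters=50):
    """ADMM for  min_x 0.5*||y - A x||^2 + lam*||D x||_1  with split z = D x.
    A models blur + downsampling (S*B), D the transform-domain operator."""
    x = np.zeros(A.shape[1])
    z = np.zeros(D.shape[0])
    alpha = np.zeros(D.shape[0])
    lhs = A.T @ A + rho * D.T @ D              # normal-equation matrix, fixed
    for _ in range(iters):
        rhs = A.T @ y + rho * D.T @ (z - alpha / rho)
        x = np.linalg.solve(lhs, rhs)          # x-subproblem (least squares)
        z = soft_threshold(D @ x + alpha / rho, lam / rho)  # z-subproblem
        alpha = alpha + rho * (D @ x - z)      # multiplier (dual) update
    return x
```

With `A` and `D` set to identity matrices the problem reduces to elementwise soft-thresholding of `y`, which gives a quick sanity check of the iteration.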
所述位置获取模块602,用于获取所述任一图像块在所述原始图像中的位置信息。The position acquisition module 602 is used to acquire the position information of any image block in the original image.
所述任一图像块在所述原始图像中的位置信息可以用位置坐标进行表示。例如,可以获取任一图像块的四个顶点在原始图像中的位置坐标,将获取的四个顶点的位置坐标作为该任一图像块在原始图像中的位置信息。又如,可以获取任一图像块的几何中心点在原始图像中的位置坐标,将获取的几何中心点的位置坐标作为该任一图像块在原始图像中的位置信息。本发明不做任何限制。The position information of any image block in the original image can be represented by position coordinates. For example, the position coordinates of the four vertices of any image block in the original image can be obtained, and the obtained position coordinates of the four vertices are used as the position information of the any image block in the original image. For another example, the position coordinates of the geometric center point of any image block in the original image can be obtained, and the obtained position coordinates of the geometric center point are used as the position information of the any image block in the original image. The present invention does not impose any restrictions.
根据本发明的一个可选的实施方式,所述获取所述任一图像块在所述原始图像中的位置信息包括:According to an optional implementation manner of the present invention, obtaining the position information of any image block in the original image includes:
获取所述任一图像块中的指定点;Obtaining a specified point in any one of the image blocks;
获取每个所述指定点在所述原始图像中的位置坐标;Obtaining the position coordinates of each of the specified points in the original image;
对所述位置坐标进行归一化,得到归一化坐标;Normalizing the position coordinates to obtain normalized coordinates;
根据所述归一化坐标生成所述位置信息。The position information is generated according to the normalized coordinates.
其中,指定点为预先在任一图像块中指定的像素点,可以指定任一图像块的几何中心点,也可以指定任一图像块的左上角的顶点,或者右下角的顶点,或者其他任一点。The designated point is a pixel point pre-designated in any image block, and may be the geometric center point of any image block, or the vertex of the upper left corner, or the vertex of the lower right corner, or any other point of any image block.
由于任一图像块中的指定点在原始图像中的位置坐标差别较大,因此,需要对位置坐标进行归一化,从而将每个指定点的位置坐标都统一到一个数据范围内。Since the position coordinates of a designated point in any image block in the original image are quite different, it is necessary to normalize the position coordinates so as to unify the position coordinates of each designated point into one data range.
具体实施时,假设每个指定点的位置坐标记为(X,Y),则获取所有位置坐标中的横坐标值的最大横坐标值及最小横坐标值,根据最大横坐标值及最小横坐标值将每个位置坐标的横坐标值进行归一化。In specific implementation, assuming that the position coordinates of each specified point are marked as (X, Y), the maximum and minimum horizontal coordinate values of all the horizontal coordinate values in the position coordinates are obtained, and the horizontal coordinate value of each position coordinate is normalized according to the maximum and minimum horizontal coordinate values.
横坐标值归一化的计算公式如下:
x'=(x-X_min)/(X_max-X_min)   (4)
The calculation formula for normalizing the horizontal axis value is as follows:
x'=(x-X_min)/(X_max-X_min) (4)
公式(4)中,X_min为所有位置坐标中的横坐标值的最小横坐标值,X_max为所有位置坐标中的横坐标值的最大横坐标值,x'为归一化横坐标。In formula (4), X_min is the minimum abscissa value of all position coordinates, X_max is the maximum abscissa value of all position coordinates, and x' is the normalized abscissa.
获取所有位置坐标中的纵坐标值的最大纵坐标值及最小纵坐标值,根据最大纵坐标值及最小纵坐标值将每个位置坐标的纵坐标进行归一化。The maximum ordinate value and the minimum ordinate value of the ordinate values in all position coordinates are obtained, and the ordinate of each position coordinate is normalized according to the maximum ordinate value and the minimum ordinate value.
纵坐标值归一化的计算公式如下:
y'=(y-Y_min)/(Y_max-Y_min)   (5)
The calculation formula for normalizing the vertical coordinate value is as follows:
y'=(y-Y_min)/(Y_max-Y_min) (5)
公式(5)中,Y_min为所有位置坐标中的纵坐标值的最小纵坐标值,Y_max为所有位置坐标中的纵坐标值的最大纵坐标值,y'为归一化纵坐标。In formula (5), Y_min is the minimum ordinate value of all position coordinates, Y_max is the maximum ordinate value of all position coordinates, and y' is the normalized ordinate.
归一化坐标记为(x',y')。Normalized coordinates are denoted by (x', y').
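The min-max normalisation above can be sketched as a small helper (hypothetical code, not part of the patent text):

```python
def normalize_coords(points):
    # Min-max normalise the (x, y) coordinates of the designated points of
    # all image blocks into the common range [0, 1].
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return [((x - x_min) / (x_max - x_min),
             (y - y_min) / (y_max - y_min)) for x, y in points]
```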
所述模型处理模块603,用于将所述任一图像块对应的超分图像特征向量和位置信息输入Transformer模型进行处理,获得由所述Transformer模型输出的对应于所述任一图像块的超分图像。The model processing module 603 is used to input the super-resolved image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolved image corresponding to any image block output by the Transformer model.
本实施例中的Transformer模型可以包括:第一归一化层、多头注意力层、第二归一化层、全连接层及输出层。The Transformer model in this embodiment may include: a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer and an output layer.
在一个可选的实施方式中,所述将所述任一图像块对应的超分图像特征向量和位置信息输入Transformer模型进行处理,获得由所述Transformer模型输出的对应于所述任一图像块的超分图像包括:In an optional embodiment, the step of inputting the super-resolved image feature vector and position information corresponding to any image block into a Transformer model for processing to obtain a super-resolved image corresponding to any image block output by the Transformer model comprises:
根据所述任一图像块对应的所述超分图像特征向量及所述位置信息生成组合特征向量;Generate a combined feature vector according to the super-resolved image feature vector corresponding to any one of the image blocks and the position information;
将每个所述组合特征向量输入对应的Transformer模型中进行处理,得到目标特征向量; Input each of the combined feature vectors into a corresponding Transformer model for processing to obtain a target feature vector;
根据所述目标特征向量生成所述任一图像块的超分图像。A super-resolution image of any image block is generated according to the target feature vector.
其中,组合特征向量为超分图像特征向量及对应的位置信息进行拼接得到的特征向量。例如,组合特征向量记为(位置信息,超分图像特征向量)。The combined feature vector is a feature vector obtained by concatenating the super-resolved image feature vector and the corresponding position information. For example, the combined feature vector is recorded as (position information, super-resolved image feature vector).
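A minimal sketch of this concatenation, assuming the (position information, super-resolution feature vector) ordering stated above and hypothetical names:

```python
import numpy as np

def combine(position_info, sr_feature):
    # Concatenate as (position information, super-resolution feature vector),
    # matching the ordering described in the text.
    return np.concatenate([np.asarray(position_info, dtype=float),
                           np.asarray(sr_feature, dtype=float)])
```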
在一个可选的实施方式中,所述将每个所述组合特征向量输入对应的Transformer模型中进行处理,得到目标特征向量包括:In an optional implementation, inputting each of the combined feature vectors into a corresponding Transformer model for processing to obtain a target feature vector comprises:
通过所述第一归一化层对所述组合特征向量进行归一化,得到第一归一化特征向量;Normalizing the combined feature vector by the first normalization layer to obtain a first normalized feature vector;
通过所述多头注意力层对所述第一归一化特征向量进行处理,得到注意力特征向量;Processing the first normalized feature vector through the multi-head attention layer to obtain an attention feature vector;
对所述第一归一化特征向量及所述注意力特征向量进行残差连接,得到残差特征向量;Performing a residual connection on the first normalized feature vector and the attention feature vector to obtain a residual feature vector;
通过所述第二归一化层对所述残差特征向量进行归一化,得到第二归一化特征向量;Normalizing the residual feature vector by the second normalization layer to obtain a second normalized feature vector;
通过所述全连接层对所述第二归一化特征向量进行全连接计算,得到连接特征向量;Performing a full connection calculation on the second normalized feature vector through the fully connected layer to obtain a connected feature vector;
对所述连接特征向量及所述残差特征向量进行残差连接,得到所述目标特征向量。Perform residual connection on the connection feature vector and the residual feature vector to obtain the target feature vector.
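The six steps above can be sketched in NumPy as follows. This toy version uses identity Q/K/V projections and a single weight matrix for the fully connected layer, so it only illustrates the data flow of the block, not a trained model; all names are assumptions:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalise each feature vector to zero mean / unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(x, n_heads):
    # Toy self-attention: identity Q/K/V projections, features split by head.
    seq_len, d = x.shape
    hd = d // n_heads
    out = np.empty_like(x)
    for h in range(n_heads):
        q = k = v = x[:, h * hd:(h + 1) * hd]
        attn = softmax(q @ k.T / np.sqrt(hd))   # (seq_len, seq_len) weights
        out[:, h * hd:(h + 1) * hd] = attn @ v
    return out

def transformer_block(x, w_fc, n_heads=2):
    x1 = layer_norm(x)                      # first normalisation layer
    a = multi_head_attention(x1, n_heads)   # multi-head attention layer
    r = x1 + a                              # residual connection
    x2 = layer_norm(r)                      # second normalisation layer
    f = x2 @ w_fc                           # fully connected layer
    return r + f                            # residual -> target feature vector
```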
所述图像融合模块604,用于获取所述原始图像所含的每个所述图像块的所述超分图像;根据每个所述图像块对应的所述位置信息,对获取到的所述超分图像进行拼接,获得拼接后的目标图像。The image fusion module 604 is used to obtain the super-resolution image of each image block contained in the original image; and to splice the super-resolution images obtained according to the position information corresponding to each image block to obtain a spliced target image.
一个图像块对应一个超分图像,将多个图像块对应的多个超分图像按照图像块在原始图像中的位置信息进行拼接,得到的图像称之为目标图像。所述目标图像的大小匹配于所述原始图像的大小。One image block corresponds to one super-resolution image, and multiple super-resolution images corresponding to multiple image blocks are spliced according to the position information of the image blocks in the original image, and the obtained image is called a target image. The size of the target image matches the size of the original image.
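A sketch of this stitching step, assuming each block's position information is given as a (top, left) pixel offset (a hypothetical convention chosen for illustration):

```python
import numpy as np

def stitch(patches, positions, out_h, out_w):
    # Paste every super-resolved patch back at its recorded (top, left)
    # position so the target image matches the original image's layout.
    target = np.zeros((out_h, out_w), dtype=patches[0].dtype)
    for patch, (top, left) in zip(patches, positions):
        h, w = patch.shape
        target[top:top + h, left:left + w] = patch
    return target
```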
上述可选的实施方式,通过将多个超分图像按照对应的图像块的位置信息进行拼接,超分图像的分辨率高于原始图像的分辨率,因而实现了对低分辨率的原始图像的超分辨率的处理。In the above optional implementation, by splicing multiple super-resolution images according to the position information of corresponding image blocks, the resolution of the super-resolution image is higher than the resolution of the original image, thereby achieving super-resolution processing of the low-resolution original image.
本发明将ADMM算法和Transformer进行结合,通过ADMM算法实现对低分辨率的原始图像的重建,重建过程简单,不需要大量数据训练,并通过将重建后的图像输入到Transformer模型中,得到效果更佳的高分辨率图像,即将数学先验和Transformer进行统一,快速的实现了对低分辨的图像的重建,得到的高分辨的图像的效果较佳。The present invention combines the ADMM algorithm with the Transformer and realizes the reconstruction of the low-resolution original image through the ADMM algorithm. The reconstruction process is simple and does not require a large amount of data training. By inputting the reconstructed image into the Transformer model, a high-resolution image with better effect is obtained. That is, the mathematical prior and the Transformer are unified, and the reconstruction of the low-resolution image is realized quickly, and the obtained high-resolution image has better effect.
下面结合图3来具体描述使用本发明所述的方法将低分辨率CT图像重建为高分辨率CT图像的过程。The following specifically describes the process of reconstructing a low-resolution CT image into a high-resolution CT image using the method of the present invention in conjunction with FIG. 3 .
首先,对低分辨率CT图像进行分块得到多个分块图像,并记录每个分块图像在低分辨率CT图像中的位置关系。假设如图2所示,对低分辨率CT图像进行分块得到4个分块图像(①②③④),为分块图像①设置一个掩模图像1-Mask,为分块图像②设置一个掩模图像2-Mask,为分块图像③设置一个掩模图像3-Mask,为分块图像④设置一个掩模图像4-Mask。其中,掩模图像1-Mask,2-Mask,3-Mask及4-Mask均为由0和1组成的二进制图像,且均与低分辨率CT图像的大小相同。掩模图像1-Mask中像素值为1的目标像素点所在的位置与对应的分块图像①在低分辨率CT图像中的位置一致,掩模图像1-Mask中除了目标像素点之外的其余像素点的像素值为0。掩模图像2-Mask中像素值为1的目标像素点所在的位置与对应的分块图像②在低分辨率CT图像中的位置一致,掩模图像2-Mask中除了目标像素点之外的其余像素点的像素值为0。掩模图像3-Mask中目标像素值为1的像素点所在的位置与对应的分块图像③在低分辨率CT图像中的位置一致,掩模图像3-Mask中除了目标像素点之外的其余像素点的像素值为0。掩模图像4-Mask中目标像素值为1的像素点所在的位置与对应的分块图像④在低分辨率CT图像中的位置一致,掩模图像4-Mask中除了目标像素点之外的其余像素点的像素值为0。First, the low-resolution CT image is divided into multiple block images, and the positional relationship of each block image in the low-resolution CT image is recorded. Assume that as shown in Figure 2, the low-resolution CT image is divided into 4 block images (①②③④), a mask image 1-Mask is set for block image ①, a mask image 2-Mask is set for block image ②, a mask image 3-Mask is set for block image ③, and a mask image 4-Mask is set for block image ④. Among them, mask images 1-Mask, 2-Mask, 3-Mask and 4-Mask are all binary images composed of 0 and 1, and they are all the same size as the low-resolution CT image. The position of the target pixel with a pixel value of 1 in mask image 1-Mask is consistent with the position of the corresponding block image ① in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 1-Mask except the target pixel are 0. The position of the target pixel with a pixel value of 1 in mask image 2-Mask is consistent with the position of the corresponding block image ② in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 2-Mask except the target pixel are 0. 
The position of the pixel with a target pixel value of 1 in mask image 3-Mask is consistent with the position of the corresponding block image ③ in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 3-Mask except the target pixel are 0. The position of the pixel with a target pixel value of 1 in mask image 4-Mask is consistent with the position of the corresponding block image ④ in the low-resolution CT image, and the pixel values of the remaining pixels in mask image 4-Mask except the target pixel are 0.
然后,通过处理路径(101→102→103→104→105)对分块图像①进行处理;通过处理路径(201→202→203→204→205)对分块图像②进行处理。同理使用类似的处理路径对分块图像③和分块图像④进行处理。图3中仅描述了处理路径(101→102→103→104→105)和处理路径(201→202→203→204→205)。Then, block image ① is processed through the processing path (101→102→103→104→105); block image ② is processed through the processing path (201→202→203→204→205). Similarly, block images ③ and ④ are processed using similar processing paths. FIG3 only describes the processing path (101→102→103→104→105) and the processing path (201→202→203→204→205).
在101处,通过掩模图像1-Mask对低分辨率CT图像进行计算,得到分块图像①对应的图像块,分块图像①对应的图像块相当于掩盖掉了图2中的分块图像②③④。 At 101, the low-resolution CT image is calculated through the mask image 1-Mask to obtain the image block corresponding to the block image ①. The image block corresponding to the block image ① is equivalent to covering up the block images ②③④ in FIG. 2 .
在102处,对经过101处理得到的图像块,使用ADMM算法进行K次迭代计算:
x^{k+1} = ((SB)^T(SB) + ρD^T D)^{-1}((SB)^T y + ρD^T(z^k - α^k/ρ))
z^{k+1} = soft_{λ/ρ}(Dx^{k+1} + α^k/ρ),  k = 0, 1, ..., K-1
α^{k+1} = α^k + ρ(Dx^{k+1} - z^{k+1})
At 102, the ADMM algorithm is used to perform K iterations on the image block obtained from the processing at 101:
x^{k+1} = ((SB)^T(SB) + ρD^T D)^{-1}((SB)^T y + ρD^T(z^k - α^k/ρ))
z^{k+1} = soft_{λ/ρ}(Dx^{k+1} + α^k/ρ),  k = 0, 1, ..., K-1
α^{k+1} = α^k + ρ(Dx^{k+1} - z^{k+1})
采用ADMM算法可以对经过101处理得到的图像块中的每一小分块进行处理,从而在经过102处理后,通过102输出高分辨率的图像块,这些高分辨率的图像块对应于经过101处理得到的图像块中包含的每一小分块。The ADMM algorithm can be used to process each small block in the image block obtained through 101, so that after being processed by 102, high-resolution image blocks are output through 102, and these high-resolution image blocks correspond to each small block contained in the image block obtained through 101.
在103处,可以获取记录的分块图像①的位置信息,并对该位置信息进行编码,得到位置信息编码结果。其中,位置信息编码结果为将位置信息编码为向量形式得到的位置向量。At 103, the position information of the recorded block image ① can be obtained, and the position information can be encoded to obtain a position information encoding result. The position information encoding result is a position vector obtained by encoding the position information into a vector form.
在104处,对经过102处理得到的高分辨率的图像块和经过103处理得到的位置信息编码结果进行拼接,得到包含位置信息编码结果的高分辨的图像块组成的向量序列(即,组合特征向量),并将组合特征向量输入到Transformer模型中。At 104, the high-resolution image blocks obtained by processing 102 and the position information encoding results obtained by processing 103 are concatenated to obtain a vector sequence (i.e., a combined feature vector) consisting of high-resolution image blocks containing position information encoding results, and the combined feature vector is input into the Transformer model.
在105处,通过Transformer模型的第一归一化层对组合特征向量进行归一化,得到第一归一化特征向量;通过多头注意力层对所述第一归一化特征向量进行处理,得到注意力特征向量;对第一归一化特征向量及注意力特征向量进行残差连接,得到残差特征向量;通过第二归一化层对所述残差特征向量进行归一化,得到第二归一化特征向量;通过全连接层对第二归一化特征向量进行全连接计算,得到连接特征向量;对连接特征向量及残差特征向量进行残差连接,得到目标特征向量。经过Transformer模型进行处理后,可以突出图像块中所含的关键信息,去除图像块所含的噪声信息,提升后续获取到的高分辨率CT图像的质量。At 105, the combined feature vector is normalized by the first normalization layer of the Transformer model to obtain a first normalized feature vector; the first normalized feature vector is processed by the multi-head attention layer to obtain an attention feature vector; the first normalized feature vector and the attention feature vector are residually connected to obtain a residual feature vector; the residual feature vector is normalized by the second normalization layer to obtain a second normalized feature vector; the second normalized feature vector is fully connected by the fully connected layer to obtain a connection feature vector; the connection feature vector and the residual feature vector are residually connected to obtain a target feature vector. After being processed by the Transformer model, the key information contained in the image block can be highlighted, the noise information contained in the image block can be removed, and the quality of the high-resolution CT image subsequently obtained can be improved.
应当理解的是,在201处,通过掩模图像2-Mask对低分辨率CT图像进行计算,得到分块图像②对应的图像块,分块图像②对应的图像块相当于掩盖掉了图2中的分块图像①③④。应当理解的是,得到的每个图像块与低分辨率CT图像的大小一致。It should be understood that at 201, the low-resolution CT image is calculated through the mask image 2-Mask to obtain the image block corresponding to the block image ②, and the image block corresponding to the block image ② is equivalent to covering up the block images ①③④ in Figure 2. It should be understood that each image block obtained is consistent with the size of the low-resolution CT image.
最后,在106处,将通过处理路径(101→102→103→104→105)得到的目标特征向量和通过处理路径(201→202→203→204→205)得到的目标特征向量,及其他类似的处理路径得到的目标特征向量,按照位置信息编码结果进行拼接,即可得到完整的高分辨率CT图像。Finally, at 106, the target feature vector obtained through the processing path (101→102→103→104→105) and the target feature vector obtained through the processing path (201→202→203→204→205), as well as the target feature vectors obtained through other similar processing paths, are spliced according to the position information encoding result to obtain a complete high-resolution CT image.
Transformer模型需要大量数据训练,训练过程较为复杂且难以收敛,因此,将低分辨率CT图像转换为高分辨率CT图像通常不能直接采用Transformer模型进行处理。The Transformer model requires a large amount of data training, and the training process is relatively complex and difficult to converge. Therefore, the Transformer model cannot usually be used directly to convert low-resolution CT images into high-resolution CT images.
本发明通过加入ADMM算法对低分辨率CT图像中的任一图像块进行处理,得到相应的超分图像特征向量,并将超分图像特征向量和相应的位置信息输入Transformer模型进行处理,获得由Transformer模型输出的对应于该任一图像块的超分图像,而由于ADMM算法有数学先验,可以更好的求解相应的超分图像特征向量以及相关参数的取值,那么经过ADMM算法可以获得对应于低分辨率CT图像的超分图像特征向量,那么后续Transformer模型只需要针对得到的超分图像特征向量和位置信息进行继续处理即可,无需针对初始的低分辨率CT图像进行处理,从而能够快速收敛,提升处理效率。同时,由于本发明将ADMM这种传统算法展开为可导形式,和Transformer结合在一起,先通过了ADMM算法进行编码解码,把数学先验给Transformer,再通过Transformer模型进一步解码,实现了数学先验和深度学习的结合,使得求解更加精确和快速,因而重建得到的高分辨率CT图像的效果更佳。The present invention applies the ADMM algorithm to each image block of a low-resolution CT image to obtain the corresponding super-resolution image feature vector, and then inputs that feature vector together with the corresponding position information into a Transformer model, which outputs the super-resolution image for that image block. Because the ADMM algorithm carries a mathematical prior, it can better solve for the super-resolution image feature vector and the values of the related parameters. Once the ADMM algorithm has produced the super-resolution image feature vector for the low-resolution CT image, the subsequent Transformer model only needs to continue processing that feature vector and the position information, rather than the initial low-resolution CT image, so it converges quickly and processing efficiency is improved. At the same time, the invention unrolls the traditional ADMM algorithm into a differentiable form and combines it with the Transformer: the ADMM algorithm first performs encoding and decoding, handing the mathematical prior to the Transformer, and the Transformer model then decodes further. This combination of a mathematical prior with deep learning makes the solution more accurate and faster, so the reconstructed high-resolution CT image is of better quality.
此外,相较于直接采用ADMM算法重建得到的高分辨率图像,即直接经过如图3所示的处理路径(101→102→103→104)或者处理路径(201→202→203→204)等图像块拼接得到的CT图像,如图4所示,本发明实施例中采用ADMM算法结合Transformer模型重建得到的高分辨率图像,即经由如图3所示的处理路径(101→102→103→104→105)以及处理路径(201→202→203→204→205)处理得到的高分辨率CT图像,如图5所示,由于Transformer可以捕捉长距离依赖,能得到更好的细节,因而图5相较于图4的图像质量会更高,更加清晰,并且图5包含的细节信息更多。In addition, FIG. 4 shows the high-resolution image reconstructed directly by the ADMM algorithm alone, that is, the CT image stitched from image blocks produced by processing paths such as (101→102→103→104) or (201→202→203→204) in FIG. 3, while FIG. 5 shows the high-resolution CT image reconstructed in an embodiment of the present invention by the ADMM algorithm combined with the Transformer model, that is, produced by the processing paths (101→102→103→104→105) and (201→202→203→204→205) in FIG. 3. Because the Transformer can capture long-range dependencies and recover finer detail, FIG. 5 is of higher quality and clearer than FIG. 4 and contains more detail information.
本发明无需使用CT超分辨率扫描也能得到高分辨率的CT图像,从而在提高CT图像分辨率的同时降低患者在扫描过程中损害健康的风险。The present invention can obtain high-resolution CT images without using CT super-resolution scanning, thereby improving the resolution of CT images while reducing the risk of harming the patient's health during the scanning process.
实施例三Embodiment 3
本实施例提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器执行时实现上述图像处理方法实施例中的步骤,例如图1所示的S11-S13:This embodiment provides a computer-readable storage medium, on which a computer program is stored. When the computer program is executed by a processor, the steps in the above-mentioned image processing method embodiment are implemented, such as S11-S13 shown in FIG1 :
S11,获取对应于低分辨率的原始图像的任一图像块,并将所述任一图像块输入预设约束模型进行约束计算,获得相应的超分图像特征向量;S11, obtaining any image block corresponding to the low-resolution original image, and inputting the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;
S12,获取所述任一图像块在所述原始图像中的位置信息;S12, obtaining position information of any image block in the original image;
S13,将所述任一图像块对应的超分图像特征向量和位置信息输入Transformer模型进行处理,获得由所述Transformer模型输出的对应于所述任一图像块的超分图像。S13, inputting the super-resolution image feature vector and position information corresponding to any image block into the Transformer model for processing, and obtaining the super-resolution image corresponding to any image block output by the Transformer model.
或者,该计算机程序被处理器执行时实现上述装置实施例中各模块/单元的功能,例如图6中的模块601-604:Alternatively, when the computer program is executed by a processor, the functions of each module/unit in the above-mentioned device embodiment are implemented, such as modules 601-604 in FIG. 6:
所述约束计算模块601,用于获取对应于低分辨率的原始图像的任一图像块,并将所述任一图像块输入预设约束模型进行约束计算,获得相应的超分图像特征向量;The constraint calculation module 601 is used to obtain any image block corresponding to the low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;
所述位置获取模块602,用于获取所述任一图像块在所述原始图像中的位置信息;The position acquisition module 602 is used to acquire the position information of any image block in the original image;
所述模型处理模块603,用于将所述任一图像块对应的超分图像特征向量和位置信息输入Transformer模型进行处理,获得由所述Transformer模型输出的对应于所述任一图像块的超分图像;The model processing module 603 is used to input the super-resolved image feature vector and position information corresponding to any image block into the Transformer model for processing, so as to obtain the super-resolved image corresponding to any image block output by the Transformer model;
所述图像融合模块604,用于获取所述原始图像所含的每个所述图像块的所述超分图像;根据每个所述图像块对应的所述位置信息,对获取到的所述超分图像进行拼接,获得拼接后的目标图像,所述目标图像的大小匹配于所述原始图像的大小。The image fusion module 604 is used to obtain the super-resolution image of each image block contained in the original image; according to the position information corresponding to each image block, the super-resolution image is spliced to obtain a spliced target image, and the size of the target image matches the size of the original image.
实施例四Embodiment 4
参阅图7所示,为本发明实施例三提供的电子设备的结构示意图。在本发明较佳实施例中,所述电子设备70包括存储器701、至少一个处理器702、至少一条通信总线703及收发器704。7 is a schematic diagram of the structure of an electronic device provided in Embodiment 3 of the present invention. In a preferred embodiment of the present invention, the electronic device 70 includes a memory 701 , at least one processor 702 , at least one communication bus 703 and a transceiver 704 .
Those skilled in the art should understand that the structure of the electronic device shown in FIG. 7 does not limit the embodiments of the present invention; it may be a bus topology or a star topology, and the electronic device 70 may also include more or fewer hardware or software components than shown, or a different arrangement of components.
In some embodiments, the electronic device 70 is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices. The electronic device 70 may also include a client device, which includes, but is not limited to, any electronic product capable of human-machine interaction with a user via a keyboard, mouse, remote control, touchpad, or voice-control device, such as a personal computer, tablet computer, smartphone, or digital camera.
The electronic device 70 is only an example; other existing or future electronic products adaptable to the present invention should also fall within the protection scope of the present invention and are incorporated herein by reference.
In some embodiments, the memory 701 stores a computer program which, when executed by the at least one processor 702, implements all or part of the steps of the image processing method described above. The memory 701 includes read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium capable of carrying or storing data.
Furthermore, the computer-readable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application required for at least one function, and the like, and the data storage area may store data created according to the use of blockchain nodes, and the like.
The blockchain referred to in the present invention is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and cryptographic algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.
In some embodiments, the at least one processor 702 is the control unit of the electronic device 70, connecting every component of the entire electronic device 70 through various interfaces and lines, and performing the various functions of the electronic device 70 and processing data by running or executing programs or modules stored in the memory 701 and invoking data stored in the memory 701. For example, when executing the computer program stored in the memory, the at least one processor 702 implements all or part of the steps of the image processing method described in the embodiments of the present invention, or implements all or part of the functions of the image processing apparatus. The at least one processor 702 may consist of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips.
In some embodiments, the at least one communication bus 703 is configured to enable connection and communication between the memory 701, the at least one processor 702, and the like.
Although not shown, the electronic device 70 may also include a power supply (such as a battery) for powering each component. Preferably, the power supply may be logically connected to the at least one processor 702 through a power management apparatus, so that functions such as charge management, discharge management, and power consumption management are implemented by the power management apparatus. The power supply may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and other such components. The electronic device 70 may also include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described in detail here.
The integrated unit implemented in the form of a software functional module described above may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, or the like) or a processor to execute parts of the methods described in the embodiments of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into modules is only a division by logical function, and other divisions are possible in actual implementations.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in each embodiment of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the present invention can be implemented in other specific forms without departing from its spirit or essential features. From any point of view, therefore, the embodiments should be regarded as exemplary and non-restrictive; the scope of the present invention is defined by the appended claims rather than by the above description, and all changes falling within the meaning and scope of equivalents of the claims are intended to be embraced by the present invention. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" obviously does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or apparatuses stated in the specification may also be implemented by a single unit or apparatus through software or hardware. Terms such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may be modified or replaced by equivalents without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

  1. An image processing method, the method comprising:
    obtaining any image block of a low-resolution original image, and inputting the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;
    obtaining position information of the image block in the original image;
    inputting the super-resolution image feature vector and the position information corresponding to the image block into a Transformer model for processing, to obtain a super-resolution image of the image block output by the Transformer model.
  2. The image processing method according to claim 1, wherein obtaining any image block of the low-resolution original image comprises:
    determining an image mask matching the image block;
    obtaining the image block by computing the image mask with the original image.
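The mask computation of claim 2 can be sketched as an element-wise product of a binary mask with the original image, followed by cropping to the mask's support. The binary-mask interpretation and the NumPy names are illustrative assumptions, since the claim does not fix the exact computation:

```python
import numpy as np

img = np.arange(16.0).reshape(4, 4)   # toy low-resolution original image
mask = np.zeros_like(img)
mask[1:3, 1:3] = 1.0                  # image mask matching the desired 2x2 block

masked = mask * img                   # compute the mask with the original image
ys, xs = np.nonzero(mask)             # crop to the mask's support to get the block
block = masked[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

assert block.shape == (2, 2)
assert np.allclose(block, img[1:3, 1:3])
```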
  3. The image processing method according to claim 1, wherein inputting the image block into the preset constraint model for constraint calculation to obtain the corresponding super-resolution image feature vector comprises:
    inputting the image block into the preset constraint model to obtain a constraint function;
    performing dual decomposition on the constraint function, and iteratively computing the dually decomposed constraint function based on the alternating direction method of multipliers (ADMM) to obtain the super-resolution image feature vector.
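Claim 3 does not disclose the specific constraint function, but the split-and-iterate pattern it names is the textbook ADMM scheme: decompose the objective into two subproblems coupled by a dual (multiplier) variable, then alternate between them. As a stand-in, the sketch below applies ADMM to an L1-constrained least-squares problem; the choice of this particular objective is an assumption for illustration only:

```python
import numpy as np

def soft_threshold(v, k):
    """Proximal operator of the L1 term."""
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=200):
    """Dual decomposition + ADMM: alternate between a data-fidelity
    subproblem (x), a constraint subproblem (z), and a dual update (u)."""
    n = A.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    inv = np.linalg.inv(AtA + rho * np.eye(n))
    for _ in range(iters):
        x = inv @ (Atb + rho * (z - u))       # minimize over x with z, u fixed
        z = soft_threshold(x + u, lam / rho)  # minimize over z with x, u fixed
        u = u + x - z                         # gradient ascent on the multiplier
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = np.zeros(10); x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
x_hat = admm_lasso(A, b, lam=0.01)
assert np.max(np.abs(x_hat - x_true)) < 0.1   # iterates recover the sparse signal
```

In the patented method, the iterate playing the role of `z` would be the super-resolution image feature vector produced by the constraint model.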
  4. The image processing method according to claim 1, wherein obtaining the position information of the image block in the original image comprises:
    obtaining specified points in the image block;
    obtaining the position coordinates of each specified point in the original image;
    normalizing the position coordinates to obtain normalized coordinates;
    generating the position information according to the normalized coordinates.
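The four steps of claim 4 can be sketched as follows. Using the block's four corners as the "specified points" and dividing by the image extent are illustrative assumptions; the claim fixes neither the points nor the normalization:

```python
import numpy as np

def block_position_info(top_left, block, img_shape):
    """Take specified points of a block (here: its four corners), look up
    their coordinates in the original image, normalize them to [0, 1],
    and flatten the result into a position-information vector."""
    y, x = top_left
    h, w = img_shape
    corners = np.array([(y, x), (y, x + block - 1),
                        (y + block - 1, x), (y + block - 1, x + block - 1)],
                       dtype=float)
    corners[:, 0] /= (h - 1)   # normalize row coordinates
    corners[:, 1] /= (w - 1)   # normalize column coordinates
    return corners.ravel()

info = block_position_info((4, 0), 4, (8, 8))
assert info.min() >= 0.0 and info.max() <= 1.0
assert np.allclose(info[:2], [4 / 7, 0.0])   # first corner, normalized
```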
  5. The image processing method according to claim 4, wherein inputting the super-resolution image feature vector and the position information corresponding to the image block into the Transformer model for processing, to obtain the super-resolution image of the image block output by the Transformer model, comprises:
    generating a combined feature vector according to the super-resolution image feature vector and the position information corresponding to the image block;
    inputting each combined feature vector into the corresponding Transformer model for processing to obtain a target feature vector;
    generating the super-resolution image of the image block according to the target feature vector.
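The combination step of claim 5 is left open by the claim language; concatenating the feature vector with the position vector, as sketched below, is one common candidate and purely an assumption here (additive positional encoding would be another):

```python
import numpy as np

feat = np.random.default_rng(1).standard_normal(16)          # super-resolution feature vector
pos = np.array([0.0, 0.0, 0.0, 0.5, 0.5, 0.0, 0.5, 0.5])    # normalized position information

combined = np.concatenate([feat, pos])  # combined feature vector fed to the Transformer
assert combined.shape == (24,)
```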
  6. The image processing method according to claim 5, wherein the Transformer model comprises a first normalization layer, a multi-head attention layer, a second normalization layer, a fully connected layer, and an output layer, and inputting each combined feature vector into the corresponding Transformer model for processing to obtain the target feature vector comprises:
    normalizing the combined feature vector through the first normalization layer to obtain a first normalized feature vector;
    processing the first normalized feature vector through the multi-head attention layer to obtain an attention feature vector;
    performing a residual connection on the first normalized feature vector and the attention feature vector to obtain a residual feature vector;
    normalizing the residual feature vector through the second normalization layer to obtain a second normalized feature vector;
    performing a fully connected computation on the second normalized feature vector through the fully connected layer to obtain a connected feature vector;
    performing a residual connection on the connected feature vector and the residual feature vector to obtain the target feature vector.
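The six steps of claim 6 compose a pre-norm Transformer sub-block (note that, per the claim, the first residual adds the *normalized* vector rather than the raw input). A minimal NumPy sketch is below; the toy attention uses identity Q/K/V projections and the weight shapes are assumptions, so it illustrates the data flow, not a trained model:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def attention(x, heads=2):
    """Toy multi-head self-attention: each head attends over its own
    slice of the feature dimension, with identity Q/K/V projections."""
    d = x.shape[-1] // heads
    out = []
    for h in range(heads):
        q = k = v = x[:, h * d:(h + 1) * d]
        out.append(softmax(q @ k.T / np.sqrt(d)) @ v)
    return np.concatenate(out, axis=-1)

def transformer_block(combined, W_fc):
    h1 = layer_norm(combined)   # first normalization layer
    attn = attention(h1)        # multi-head attention layer
    res = h1 + attn             # residual: first normalized + attention vector
    h2 = layer_norm(res)        # second normalization layer
    fc = h2 @ W_fc              # fully connected computation
    return fc + res             # residual: connected + residual vector

x = np.random.default_rng(0).standard_normal((5, 8))  # 5 tokens, dim 8
y = transformer_block(x, np.eye(8))                   # target feature vectors
assert y.shape == x.shape
```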
  7. The image processing method according to any one of claims 1 to 6, wherein the method further comprises:
    obtaining the super-resolution image of each image block contained in the original image;
    stitching the obtained super-resolution images according to the position information corresponding to each image block to obtain a stitched target image, wherein the size of the target image matches the size of the original image.
  8. An image processing apparatus, wherein the apparatus comprises:
    a constraint calculation module, configured to obtain any image block of a low-resolution original image, and input the image block into a preset constraint model for constraint calculation to obtain a corresponding super-resolution image feature vector;
    a position acquisition module, configured to acquire position information of the image block in the original image;
    a model processing module, configured to input the super-resolution image feature vector and the position information corresponding to the image block into a Transformer model for processing, to obtain a super-resolution image of the image block output by the Transformer model.
  9. A computer device, wherein the computer device comprises a processor and a memory, and the processor implements the image processing method according to any one of claims 1 to 7 when executing a computer program stored in the memory.
  10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the image processing method according to any one of claims 1 to 7.
PCT/CN2023/134026 2022-12-05 2023-11-24 Image processing method, apparatus, electronic device and storage medium WO2024120224A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211546911.5 2022-12-05
CN202211546911.5A CN118154422A (en) 2022-12-05 2022-12-05 Image processing method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2024120224A1 true WO2024120224A1 (en) 2024-06-13

Family

ID=91291422

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/134026 WO2024120224A1 (en) 2022-12-05 2023-11-24 Image processing method, apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN118154422A (en)
WO (1) WO2024120224A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170024855A1 (en) * 2015-07-26 2017-01-26 Macau University Of Science And Technology Single Image Super-Resolution Method Using Transform-Invariant Directional Total Variation with S1/2+L1/2-norm
CN112669214A (en) * 2021-01-04 2021-04-16 东北大学 Fuzzy image super-resolution reconstruction method based on alternative direction multiplier algorithm
CN114049255A (en) * 2021-11-08 2022-02-15 Oppo广东移动通信有限公司 Image processing method and device, integrated storage and calculation chip and electronic equipment
CN114841859A (en) * 2022-04-28 2022-08-02 南京信息工程大学 Single-image super-resolution reconstruction method based on lightweight neural network and Transformer
CN115311187A (en) * 2022-10-12 2022-11-08 湖南大学 Hyperspectral fusion imaging method, system and medium based on internal and external prior


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHEYUAN LI, CHEN XIANGYU, QIAO YU, DONG CHAO, JING KUN: "Research of Single Image Super Resolution Based on Attention Mechanism", JOURNAL OF INTEGRATION TECHNOLOGY, KEXUE CHUBANSHE,SCIENCE PRESS, CN, vol. 11, no. 5, 15 September 2022 (2022-09-15), CN, pages 58 - 79, XP093178080, ISSN: 2095-3135, DOI: 10.12146/j.issn.2095-3135.20211209001 *

Also Published As

Publication number Publication date
CN118154422A (en) 2024-06-07

Similar Documents

Publication Publication Date Title
WO2020199693A1 (en) Large-pose face recognition method and apparatus, and device
WO2020042720A1 (en) Human body three-dimensional model reconstruction method, device, and storage medium
WO2020019738A1 (en) Plaque processing method and device capable of performing magnetic resonance vessel wall imaging, and computing device
CN115456161A (en) Data processing method and data processing system
Sengan et al. Cost-effective and efficient 3D human model creation and re-identification application for human digital twins
DE112020003547T5 (en) Transfer learning for neural networks
DE102021113690A1 (en) VIDEO SYNTHESIS USING ONE OR MORE NEURAL NETWORKS
CN109616197A (en) Tooth data processing method, device, electronic equipment and computer-readable medium
Hsieh An efficient development of 3D surface registration by Point Cloud Library (PCL)
Xin et al. Skeleton mixformer: Multivariate topology representation for skeleton-based action recognition
CN113570634B (en) Object three-dimensional reconstruction method, device, electronic equipment and storage medium
Leng et al. Self-sampling meta SAM: enhancing few-shot medical image segmentation with meta-learning
WO2024120224A1 (en) Image processing method, apparatus, electronic device and storage medium
CN113781653B (en) Object model generation method and device, electronic equipment and storage medium
CN114820861A (en) MR synthetic CT method, equipment and computer readable storage medium based on cycleGAN
CN117094362B (en) Task processing method and related device
US20220415481A1 (en) Mesh topology adaptation
Hao et al. HyperGraph based human mesh hierarchical representation and reconstruction from a single image
CN114155400B (en) Image processing method, device and equipment
CN114743248A (en) Face key point detection method and device, readable storage medium and terminal equipment
Kumar et al. CIS2VR: CNN-based Indoor Scan to VR Environment Authoring Framework
Zhang et al. Self-sampling meta sam: Enhancing few-shot medical image segmentation with meta-learning
US20150142626A1 (en) Risk scenario generation
CN118379372B (en) Image processing acceleration method, device and product
Wang et al. A New Parallel Scheduling Algorithm Based on MPI

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23899803

Country of ref document: EP

Kind code of ref document: A1