CN114494442A - Image processing method, device and equipment - Google Patents

Image processing method, device and equipment Download PDF

Info

Publication number
CN114494442A
CN114494442A CN202210339991.0A CN202210339991A
Authority
CN
China
Prior art keywords
processing
algorithm
dimensional
convolution
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210339991.0A
Other languages
Chinese (zh)
Inventor
周波
田欣兴
苗瑞
邹小刚
梁书玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen HQVT Technology Co Ltd
Original Assignee
Shenzhen HQVT Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen HQVT Technology Co Ltd filed Critical Shenzhen HQVT Technology Co Ltd
Priority to CN202210339991.0A priority Critical patent/CN114494442A/en
Publication of CN114494442A publication Critical patent/CN114494442A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image processing method, apparatus and device relating to image processing technology. The method comprises the following steps: acquiring a plurality of images to be recognized; performing two-dimensional convolution processing and three-dimensional convolution processing on the images to be recognized according to an encoding algorithm in a preset hybrid U-Net network algorithm to obtain a preliminary feature map, where the hybrid U-Net network algorithm comprises an encoding algorithm and a decoding algorithm, and the encoding algorithm and the decoding algorithm perform convolution processing on the image in two-dimensional convolution layers and three-dimensional convolution layers through a residual learning network; and decoding the preliminary feature map according to the decoding algorithm to obtain a target feature map. The method improves the segmentation accuracy of the network and solves the technical problem that the accuracy of the lesion position identified in a medical image is low.

Description

Image processing method, device and equipment
Technical Field
The present application relates to image processing technologies, and in particular, to an image processing method, an image processing apparatus, and an image processing device.
Background
Currently, identifying the position of a lesion in a medical image requires the image to be analyzed and segmented.
In the prior art, a medical image segmentation method based on texture features may be used to identify the lesion position: pixel points in an image or an image region are counted according to the texture distribution characteristics of the corresponding target, and the pixel distribution state of the image is then calculated.
However, texture characteristics reflect only the local distribution characteristics of the target object, so higher-level image information cannot be obtained from texture alone; in particular, when an image contains multiple textures, over-segmentation easily occurs, and the accuracy of the identified lesion position in the medical image is therefore low.
Disclosure of Invention
The application provides an image processing method, device and equipment, which are used for solving the technical problem that the accuracy of the lesion position identified in a medical image is low.
In a first aspect, the present application provides an image processing method, comprising:
acquiring a plurality of images to be recognized;
performing two-dimensional convolution processing and three-dimensional convolution processing on the image to be recognized according to an encoding algorithm in a preset hybrid U-Net network algorithm to obtain a preliminary feature map; the hybrid U-Net network algorithm comprises an encoding algorithm and a decoding algorithm, and the encoding algorithm and the decoding algorithm perform convolution processing on the image in two-dimensional convolution layers and three-dimensional convolution layers through a residual learning network;
decoding the preliminary feature map according to the decoding algorithm to obtain a target feature map; the target feature map represents a plurality of pixel points obtained through the decoding processing, and the pixel points comprise pixel points occupied by the target position and pixel points occupied by the background other than the target position.
Further, performing two-dimensional convolution processing and three-dimensional convolution processing on the image to be recognized according to the encoding algorithm in the preset hybrid U-Net network algorithm to obtain the preliminary feature map includes:
performing two-dimensional convolution processing on the image to be recognized in two-dimensional convolution layers according to a residual learning network of the encoding algorithm to obtain a two-dimensional feature map; wherein the encoding algorithm comprises 2 two-dimensional convolution layers;
performing three-dimensional convolution processing on the two-dimensional feature map in three-dimensional convolution layers to obtain the preliminary feature map; wherein the encoding algorithm comprises 3 three-dimensional convolution layers.
Further, decoding the preliminary feature map according to the decoding algorithm to obtain the target feature map includes:
performing three-dimensional convolution processing on the preliminary feature map in three-dimensional convolution layers according to a residual learning network in the decoding algorithm to obtain a three-dimensional feature map; wherein the decoding algorithm comprises 2 three-dimensional convolution layers;
performing two-dimensional convolution processing on the three-dimensional feature map in two-dimensional convolution layers to obtain a convolution feature map; wherein the decoding algorithm comprises 2 two-dimensional convolution layers;
performing convolution processing on the convolution feature map by using a preset normalized exponential function to obtain the target feature map; the target feature map represents a plurality of pixel points obtained through the decoding processing, and the pixel points comprise pixel points occupied by the target position and pixel points occupied by the background other than the target position.
Further, after performing two-dimensional convolution processing and three-dimensional convolution processing on the image to be recognized according to the encoding algorithm in the preset hybrid U-Net network algorithm to obtain the preliminary feature map, the method further includes:
performing hole convolution processing on the preliminary feature map according to a preset hole convolution algorithm to obtain a first feature map; the receptive field of the first feature map is larger than that of the preliminary feature map, and the receptive field represents the mapping area, on the image to be recognized, of the pixel points of a local feature map within the first feature map.
Further, the method further comprises:
respectively performing 1 × 1 convolution operations on the first feature map according to a preset spatial position attention algorithm to obtain a plurality of second feature maps;
performing size reshaping processing, dimension transformation processing and multiplication processing on the plurality of second feature maps to obtain a first channel attention heat map;
and summing the first channel attention heat map and the first feature map to obtain a third feature map.
Further, the method further comprises:
according to a preset channel attention algorithm, performing size reshaping processing, dimension transformation processing and multiplication processing on the first feature map to obtain a second channel attention heat map;
and summing the second channel attention heat map and the first feature map to obtain a fourth feature map.
Further, the method further comprises:
performing residual connection processing on the third feature map and the fourth feature map, and processing the result through a preset linear rectification function to obtain a fifth feature map; wherein the fifth feature map represents a preliminary feature map that has undergone multiple convolutions.
In a second aspect, the present application provides an image processing apparatus comprising:
the device comprises a first acquisition unit, a second acquisition unit and a recognition unit, wherein the first acquisition unit is used for acquiring a plurality of images to be recognized;
the first convolution unit is used for performing two-dimensional convolution processing and three-dimensional convolution processing on the image to be identified according to a coding algorithm in a preset mixed U-Net network algorithm to obtain a preliminary characteristic map; the hybrid U-Net network algorithm comprises an encoding algorithm and a decoding algorithm, wherein the encoding algorithm and the decoding algorithm are used for performing convolution processing on the image on the two-dimensional convolution layer and the three-dimensional convolution layer through a residual learning network;
the decoding unit is used for decoding the preliminary characteristic graph according to the decoding algorithm to obtain a target characteristic graph; the target feature map represents a plurality of pixel points obtained through decoding, and the pixel points comprise pixel points occupied by target positions and pixel points occupied by backgrounds except the target positions.
Further, the first convolution unit includes:
the first convolution module is used for performing two-dimensional convolution processing on the image to be identified in a two-dimensional convolution layer according to a residual learning network of a coding algorithm in a preset mixed U-Net network algorithm to obtain a two-dimensional characteristic diagram; wherein the two-dimensional convolutional layer of the coding algorithm comprises 2 layers;
the second convolution module is used for performing three-dimensional convolution processing on the two-dimensional characteristic diagram on the three-dimensional convolution layer to obtain a preliminary characteristic diagram; wherein the three-dimensional convolutional layer of the coding algorithm comprises 3 layers.
Further, the decoding unit includes:
the third convolution module is used for performing three-dimensional convolution processing on the preliminary characteristic graph in a three-dimensional convolution layer according to the residual learning network in the decoding algorithm to obtain a three-dimensional characteristic image; wherein, the three-dimensional convolution layer in the decoding algorithm comprises 2 layers;
the fourth convolution module is used for performing two-dimensional convolution processing on the three-dimensional characteristic graph on the two-dimensional convolution layer to obtain a convolution characteristic graph; wherein, the two-dimensional convolution layer in the decoding algorithm comprises 2 layers;
the fifth convolution module is used for carrying out convolution processing on the convolution characteristic graph by utilizing a preset normalization index function to obtain a target characteristic graph; the target feature map represents a plurality of pixel points obtained through decoding, and the pixel points comprise pixel points occupied by target positions and pixel points occupied by backgrounds except the target positions.
Further, the apparatus further comprises:
the second convolution unit is used for performing two-dimensional convolution processing and three-dimensional convolution processing on the image to be identified according to a coding algorithm in a preset mixed U-Net network algorithm to obtain a preliminary feature map, and then performing hole convolution processing on the preliminary feature map according to a preset hole convolution algorithm to obtain a first feature map; the receptive field of the first characteristic diagram is larger than that of the preliminary characteristic diagram, and the receptive field represents a mapping area of pixel points of the local characteristic diagram in the first characteristic diagram on the image to be identified.
Further, the apparatus further comprises:
the third convolution unit is used for respectively carrying out 1 multiplied by 1 convolution operation processing on the first feature map according to a preset spatial position attention algorithm to obtain a plurality of second feature maps;
the first processing unit is used for performing size reshaping processing, dimension transformation processing and multiplication processing on the plurality of second feature maps to obtain a first channel attention heat map;
and the second processing unit is used for summing the first channel attention heat map and the first feature map to obtain a third feature map.
Further, the apparatus further comprises:
the second acquisition unit is used for performing size remodeling processing, dimension transformation processing and multiplication processing on the first feature map according to a preset channel attention algorithm to obtain a second channel attention heat map;
and the third processing unit is used for summing the second channel attention heat map and the first feature map to obtain a fourth feature map.
Further, the apparatus further comprises:
the residual error unit is used for performing residual error connection processing on the third characteristic diagram and the fourth characteristic diagram and processing the third characteristic diagram and the fourth characteristic diagram through a preset linear rectification function to obtain a fifth characteristic diagram; wherein the fifth feature map characterizes a preliminary feature map that has undergone multiple convolutions.
In a third aspect, the present application provides an electronic device, comprising a memory and a processor, wherein the memory stores a computer program operable on the processor, and the processor implements the method of the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer-executable instructions for implementing the method of the first aspect when executed by a processor.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect.
The application provides an image processing method, device and equipment. A plurality of images to be recognized are acquired;
two-dimensional convolution processing and three-dimensional convolution processing are performed on the image to be recognized according to an encoding algorithm in a preset hybrid U-Net network algorithm to obtain a preliminary feature map, where the hybrid U-Net network algorithm comprises an encoding algorithm and a decoding algorithm, and the encoding algorithm and the decoding algorithm perform convolution processing on the image in two-dimensional convolution layers and three-dimensional convolution layers through a residual learning network; the preliminary feature map is then decoded according to the decoding algorithm to obtain a target feature map, which represents a plurality of pixel points obtained through the decoding processing, including pixel points occupied by the target position and pixel points occupied by the background other than the target position. In this scheme, according to the residual learning network of the encoding algorithm in the preset hybrid U-Net network algorithm, two-dimensional convolution processing in the two-dimensional convolution layers and three-dimensional convolution processing in the three-dimensional convolution layers are performed in sequence on the image to be recognized to obtain the preliminary feature map, and the preliminary feature map is finally decoded according to the residual learning network in the decoding algorithm to obtain the target feature map. Therefore, when the image to be recognized is processed, performing the two-dimensional and three-dimensional convolution operations through the residual learning network extracts both the shallow information and the deep information of the image, prevents the problems of gradient vanishing and overfitting during training on the liver and its tumor, and yields better position information of the target, thereby improving the segmentation accuracy of the network and solving the technical problem that the accuracy of the lesion position identified in a medical image is low.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of another image processing method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a hybrid U-Net network algorithm provided in an embodiment of the present application;
fig. 4 is a schematic flowchart of another image processing method according to an embodiment of the present application;
fig. 5 is a scene schematic diagram of a hole convolution algorithm according to an embodiment of the present application;
FIG. 6 is a schematic view of a spatial attention algorithm provided in an embodiment of the present application;
fig. 7 is a schematic view of a scenario of a channel attention algorithm provided in an embodiment of the present application;
fig. 8 is a schematic view of a scenario of a residual double attention module according to an embodiment of the present disclosure;
fig. 9 is a schematic flowchart of another image processing method according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 13 is a block diagram of an electronic device according to an embodiment of the present application.
The accompanying drawings illustrate certain embodiments of the disclosure, which are described in more detail below. These drawings and the written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure.
In one example, identifying the position of a lesion in a medical image requires the image to be analyzed. In the prior art, a medical image segmentation method based on texture features may be used: pixel points in an image or an image region are counted according to the texture distribution characteristics of the corresponding target, and the pixel distribution state of the image is then calculated. However, texture characteristics reflect only the local distribution characteristics of the target object, so higher-level image information cannot be obtained from texture alone; in particular, when an image contains multiple textures, over-segmentation easily occurs, and the accuracy of the identified lesion position in the medical image is therefore low.
The application provides an image processing method, an image processing device and image processing equipment, and aims to solve the above technical problems in the prior art.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present application, and as shown in fig. 1, the method includes:
101. a plurality of images to be recognized are acquired.
For example, the execution subject of this embodiment may be an electronic device, a terminal device, an image processing apparatus or device, or another apparatus or device that can execute this embodiment, which is not limited here. In this embodiment, the execution subject is described as an electronic device.
First, the electronic device needs to acquire a plurality of images to be recognized. Exemplarily, fig. 2 is a schematic flowchart of another image processing method provided in an embodiment of the present application. As can be seen from fig. 2, the first step is window adjustment: all gray values of an original image, for example a liver CT image, that are less than -100 are set to -100 (that is, the gray range below -100 is rendered black), and all gray values greater than 400 are set to 400 (the gray range above 400 is rendered white), so as to enhance the black-white contrast of the liver and its tumor. The second step is image enhancement, achieved through operations such as denoising, equalization and normalization. The third step is image cropping: because the original image is 512 × 512, whose pixel size is too large for network training, the image is cropped to 256 × 256. In the fourth step, the processed images are fed into a trained hybrid U-Net network, and predicted images of the liver and the liver tumor are obtained through multiple rounds of training. The electronic device therefore needs to acquire the plurality of images to be recognized obtained by processing the original image in the first, second and third steps.
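Exemplarily, the preprocessing described above can be sketched as follows. This is a minimal illustration assuming each CT slice is a 512 × 512 NumPy array of Hounsfield-unit gray values; the function name and the choice of a center crop are illustrative, not details fixed by the application:

```python
import numpy as np

def preprocess_ct_slice(img: np.ndarray) -> np.ndarray:
    """Window, enhance, and crop one CT slice (assumed 512x512, HU values)."""
    # Step 1: window adjustment - clip gray values to [-100, 400] so that
    # everything below -100 is black and everything above 400 is white,
    # enhancing the contrast of the liver and its tumor.
    img = np.clip(img.astype(np.float32), -100.0, 400.0)

    # Step 2: enhancement - min-max normalization to [0, 1] as a stand-in;
    # denoising and histogram equalization would be applied similarly.
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)

    # Step 3: crop 512x512 -> 256x256 (a center crop) to ease network training.
    h0 = (img.shape[0] - 256) // 2
    w0 = (img.shape[1] - 256) // 2
    return img[h0:h0 + 256, w0:w0 + 256]
```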
102. According to an encoding algorithm in a preset hybrid U-Net network algorithm, performing two-dimensional convolution processing and three-dimensional convolution processing on the image to be recognized to obtain a preliminary feature map; the hybrid U-Net network algorithm comprises an encoding algorithm and a decoding algorithm, and the encoding algorithm and the decoding algorithm perform convolution processing on the image in two-dimensional convolution layers and three-dimensional convolution layers through a residual learning network.
Illustratively, the U-Net network is an algorithm for semantic segmentation using a full convolution network, and the hybrid U-Net network algorithm improves on the U-Net network while keeping the classic structure of an encoding part on the left half and a decoding part on the right half. Fig. 3 is a schematic structural diagram of the hybrid U-Net network algorithm provided in an embodiment of the present application. As can be seen from fig. 3, the encoding algorithm of the encoding part includes five convolution layers: the two-dimensional convolution layers L1 and L2 perform two-dimensional convolution operations, the three-dimensional convolution layers L3, L4 and L5 perform three-dimensional convolution operations, and all five layers introduce a residual learning network. In the decoding part, symmetric to the encoding part, the first two layers perform three-dimensional convolution operations, the next two layers perform two-dimensional convolution operations, and the last layer performs a convolution and a normalized exponential function (Softmax function) operation; these five layers likewise introduce a residual learning network. The reason for this arrangement is that the in-plane resolution of a medical image is about 4 times the inter-slice resolution; after the two two-dimensional convolution layers, the in-plane and inter-slice resolutions become comparable, so three-dimensional convolution can then be applied. A plurality of consecutive adjacent two-dimensional images are input into the hybrid U-Net network algorithm. First, shallow information of the images is extracted in sequence by the two-dimensional Convolution neural Network in layers L1 and L2, and the position information of the liver and its tumor is determined. The coarsely segmented images are then stacked in their original order and input into a three-dimensional Convolution neural Network, and deep information of the images is extracted in sequence in layers L3, L4 and L5, so that the fully convolutional network (FCN) can obtain better position information, thereby improving the segmentation accuracy of the network.
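As a concrete illustration of this encoder structure, the following PyTorch sketch stacks two 2D residual blocks ahead of three 3D residual blocks. The channel widths, the omission of downsampling, and all module names are assumptions made for brevity, not details fixed by the application:

```python
import torch
import torch.nn as nn

class Res2DBlock(nn.Module):
    """2D residual block: two 3x3 convolutions plus a skip connection."""
    def __init__(self, cin: int, cout: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(cin, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
            nn.Conv2d(cout, cout, 3, padding=1), nn.BatchNorm2d(cout))
        self.skip = nn.Conv2d(cin, cout, 1) if cin != cout else nn.Identity()
    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))

class Res3DBlock(nn.Module):
    """3D residual block, mirroring Res2DBlock with 3x3x3 convolutions."""
    def __init__(self, cin: int, cout: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(cin, cout, 3, padding=1), nn.BatchNorm3d(cout), nn.ReLU(inplace=True),
            nn.Conv3d(cout, cout, 3, padding=1), nn.BatchNorm3d(cout))
        self.skip = nn.Conv3d(cin, cout, 1) if cin != cout else nn.Identity()
    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))

class HybridEncoder(nn.Module):
    """Layers L1-L2 are 2D (shallow, per-slice); L3-L5 are 3D (volumetric)."""
    def __init__(self):
        super().__init__()
        self.l1, self.l2 = Res2DBlock(1, 32), Res2DBlock(32, 64)
        self.l3, self.l4, self.l5 = Res3DBlock(64, 64), Res3DBlock(64, 128), Res3DBlock(128, 256)

    def forward(self, slices: torch.Tensor) -> torch.Tensor:
        b, d = slices.shape[:2]                   # slices: (B, D, 1, H, W)
        x = slices.flatten(0, 1)                  # treat each slice as a 2D image
        x = self.l2(self.l1(x))                   # shallow per-slice features
        x = x.view(b, d, -1, *x.shape[-2:])       # restack in the original order
        x = x.permute(0, 2, 1, 3, 4)              # (B, C, D, H, W) for Conv3d
        return self.l5(self.l4(self.l3(x)))       # deep volumetric features
```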
103. Decoding the preliminary feature map according to the decoding algorithm to obtain a target feature map; the target feature map represents a plurality of pixel points obtained through the decoding processing, and the pixel points comprise pixel points occupied by the target position and pixel points occupied by the background other than the target position.
Exemplarily, the electronic device sequentially performs three-dimensional convolution processing on the preliminary feature map in two three-dimensional convolution layers according to the residual learning network in the decoding algorithm to obtain a three-dimensional feature map, then sequentially performs two-dimensional convolution processing on the three-dimensional feature map in two two-dimensional convolution layers to obtain a convolution feature map, and finally performs convolution processing on the convolution feature map by using a preset Softmax function to obtain the target feature map. The target feature map represents a plurality of pixel points obtained through the decoding processing, and these pixel points comprise pixel points occupied by the target position and pixel points occupied by the background other than the target position, so that the target position can be determined.
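Continuing the encoder sketch above, a matching decoder might look as follows. It reuses the residual blocks defined earlier, and the three-class output (background / liver / tumor) is an assumption for illustration:

```python
import torch
import torch.nn as nn

class HybridDecoder(nn.Module):
    """Mirrors the encoder: two 3D residual blocks, two 2D residual blocks,
    then a 1x1 convolution followed by Softmax over the classes."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.d3a, self.d3b = Res3DBlock(256, 128), Res3DBlock(128, 64)
        self.d2a, self.d2b = Res2DBlock(64, 64), Res2DBlock(64, 32)
        self.head = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.d3b(self.d3a(x))                    # x: (B, 256, D, H, W) in
        x = x.permute(0, 2, 1, 3, 4).flatten(0, 1)   # back to per-slice 2D maps
        x = self.d2b(self.d2a(x))                    # two 2D convolution layers
        return torch.softmax(self.head(x), dim=1)    # per-pixel class scores
```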
In the embodiment of the application, a plurality of images to be recognized are acquired; two-dimensional convolution processing and three-dimensional convolution processing are performed on the image to be recognized according to the encoding algorithm in the preset hybrid U-Net network algorithm to obtain a preliminary feature map, where the hybrid U-Net network algorithm comprises an encoding algorithm and a decoding algorithm that perform convolution processing on the image in the two-dimensional and three-dimensional convolution layers through the residual learning network; and the preliminary feature map is decoded according to the decoding algorithm to obtain a target feature map, which represents a plurality of pixel points obtained through the decoding processing, including pixel points occupied by the target position and pixel points occupied by the background other than the target position. In this scheme, two-dimensional convolution processing in the two-dimensional convolution layers and three-dimensional convolution processing in the three-dimensional convolution layers are performed in sequence according to the residual learning network of the encoding algorithm to obtain the preliminary feature map, which is finally decoded according to the residual learning network in the decoding algorithm to obtain the target feature map. Therefore, when the image to be recognized is processed, performing the two-dimensional and three-dimensional convolution operations through the residual learning network extracts both the shallow and the deep information of the image, prevents gradient vanishing and overfitting during training on the liver and its tumor, and yields better position information of the target, thereby improving the segmentation accuracy of the network and solving the technical problem that the accuracy of the lesion position identified in a medical image is low.
Fig. 4 is a schematic flowchart of another image processing method according to an embodiment of the present application, and as shown in fig. 4, the method includes:
201. a plurality of images to be recognized are acquired.
For example, this step may refer to step 101 in fig. 1, and is not described again.
202. According to a residual learning network of the encoding algorithm in a preset hybrid U-Net network algorithm, performing two-dimensional convolution processing on the image to be recognized in the two-dimensional convolution layers to obtain a two-dimensional feature map; wherein the encoding algorithm comprises 2 two-dimensional convolution layers.
For example, the electronic device may sequentially perform two-dimensional convolution processing on each image to be recognized in the two-dimensional convolution layers L1 and L2 according to the residual learning network of the encoding algorithm in the preset hybrid U-Net network algorithm to obtain two-dimensional feature maps, from which the position information of the liver and its tumor can be determined; finally, the coarsely segmented two-dimensional feature maps can be stitched in the original slice order to form a complete image.
203. Performing three-dimensional convolution processing on the two-dimensional feature map in the three-dimensional convolution layers to obtain a preliminary feature map; wherein the encoding algorithm comprises 3 three-dimensional convolution layers.
For example, the electronic device may perform three-dimensional convolution processing on the two-dimensional feature map in the three-dimensional convolution layers L3, L4 and L5 in sequence, extracting deep information from the two-dimensional feature map so that the fully convolutional network can obtain better position information.
204. Performing hole convolution processing on the preliminary feature map according to a preset hole convolution algorithm to obtain a first feature map; the receptive field of the first feature map is larger than that of the preliminary feature map, and the receptive field represents the mapping area, on the image to be recognized, of the pixel points of a local feature map within the first feature map.
Exemplarily, as shown in fig. 5, fig. 5 is a scene schematic diagram of the hole convolution algorithm provided in an embodiment of the present application. As can be seen from fig. 5, the hole (dilated) convolution algorithm combines five different dilation rates into a hole convolution branch applied to the original feature map, where the original feature map is the image to be recognized before the two-dimensional convolution. The electronic device can combine the five different dilation rates of the hole convolution algorithm into a hole convolution branch and perform hole convolution processing on the preliminary feature map to obtain the first feature map. Adding hole convolution improves the receptive field of image feature extraction without sacrificing image resolution, compensates for the resolution loss of image information caused by the preceding deep convolution downsampling, and increases the global information of the image. The dilation rate parameter enlarges the receptive field of the image, which benefits tumor detection and segmentation accuracy. The receptive field is the size of the mapping area, on the image to be recognized, of the pixel points of a local feature map after convolution with a specified convolution kernel. The formulas for the dilated convolution kernel and the receptive field are as follows:

$$r_1 = k_{size} + (k_{size} - 1)(d - 1), \qquad RF_{i+1} = RF_i + (r_1 - 1) \times stride$$

where $k_{size}$ is the size of the convolution kernel in the first two-dimensional convolution layer L1, $r_1$ is the size of the receptive field of the hole convolution kernel, $d$ is the dilation rate ($(d-1)$ is the number of holes filled between kernel elements), $stride$ is the step size of the convolution operation, $RF_i$ is the receptive field of the previous layer, and $RF_{i+1}$ is the current receptive field.
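These two formulas can be checked numerically. The following sketch (function names are illustrative) computes the effective kernel size of a dilated convolution and the growth of the receptive field across a branch of stacked dilated convolutions, assuming stride 1 and a 3 × 3 kernel:

```python
def dilated_kernel_size(ksize: int, d: int) -> int:
    """Effective kernel size r1: (d - 1) holes are inserted between taps."""
    return ksize + (ksize - 1) * (d - 1)

def next_receptive_field(rf_prev: int, ksize: int, d: int, stride: int) -> int:
    """Receptive field after one more dilated convolution layer."""
    return rf_prev + (dilated_kernel_size(ksize, d) - 1) * stride

rf = 1
for d in (1, 2, 4, 8, 16):      # five illustrative dilation rates
    rf = next_receptive_field(rf, ksize=3, d=d, stride=1)
    print(f"dilation {d:2d}: receptive field {rf}")
# The receptive field grows to 63 pixels without any loss of resolution.
```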
After step 204, steps 205 and 206 are parallel branches, and they proceed as follows:
205. respectively carrying out 1 × 1 convolution operation processing on the first feature map according to a preset spatial position attention algorithm to obtain a plurality of second feature maps; performing size reshaping processing, dimension transformation processing and multiplication processing on the plurality of second feature maps to obtain a first channel attention heat map; and summing the first channel attention heat map and the first feature map to obtain a third feature map.
Exemplarily, as shown in fig. 6, fig. 6 is a scene schematic diagram of the spatial position attention algorithm provided in an embodiment of the present application. As can be seen from fig. 6, the electronic device first reshapes the first feature map A to size C × N, where N = D × H × W. Three 1 × 1 convolution operations are applied to the first feature map A to obtain second feature maps B, C and D respectively. The second feature map B undergoes size reshaping and dimension transformation to obtain E, whose size changes from the original C × D × H × W to N × C. The product of E and the size-reshaped second feature map C is passed through a softmax function to obtain a spatial attention map S of size N × N. S is multiplied by the size-reshaped second feature map D to obtain the first channel attention heat map, which is multiplied by a scale coefficient α and reshaped back to its original size; finally, it is added to the first feature map A to obtain the output of the spatial position attention algorithm, namely a third feature map G.
The elements of the S matrix are as follows:

$$s_{ji} = \frac{\exp(B_i \cdot C_j)}{\sum_{i=1}^{N} \exp(B_i \cdot C_j)}$$

where $B_i$ is an element of B, $C_j$ is an element of C, and $s_{ji}$ measures the influence of the pixel at the $i$-th position on the pixel at the $j$-th position in the feature map: the more similar the feature representations of two positions are, the greater their relevance, and conversely, the smaller the relevance.
The formula of the spatial position attention algorithm is as follows:

$$E_j = \alpha \sum_{i=1}^{N} (s_{ji} D_i) + A_j$$

where $s_{ji}$ are the elements of the S matrix, $\alpha$ is a scale factor that is initialized to 0 and gradually receives more weight during training, $D_i$ is an element of D, and $A_j$ is an element of A. As can be seen from this output formula, $E_j$ is the sum of the features of all positions and the first feature map A; the spatial position attention algorithm therefore incorporates the context information of the image and highlights the features of key positions by selectively aggregating the spatial-position context of the feature map, improving the segmentation accuracy of the image. Moreover, adding a 1 × 1 convolution operation after each hole convolution branch introduces non-linearity while keeping the feature scale unchanged, allowing the network to express more complex features.
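A compact PyTorch rendering of this spatial position attention branch is sketched below. It follows the reshaping and matrix products described above for a 3D feature map; all module and variable names are illustrative:

```python
import torch
import torch.nn as nn

class SpatialPositionAttention(nn.Module):
    """Position attention over a feature map A of shape (B, C, D, H, W)."""
    def __init__(self, channels: int):
        super().__init__()
        # Three 1x1 convolutions producing the second feature maps B, C, D.
        self.to_b = nn.Conv3d(channels, channels, kernel_size=1)
        self.to_c = nn.Conv3d(channels, channels, kernel_size=1)
        self.to_d = nn.Conv3d(channels, channels, kernel_size=1)
        self.alpha = nn.Parameter(torch.zeros(1))   # scale, initialized to 0

    def forward(self, a: torch.Tensor) -> torch.Tensor:
        bsz, c = a.shape[:2]
        n = a.shape[2] * a.shape[3] * a.shape[4]    # N = D * H * W
        b = self.to_b(a).view(bsz, c, n)            # reshape to C x N
        cmap = self.to_c(a).view(bsz, c, n)
        d = self.to_d(a).view(bsz, c, n)
        # Spatial attention map S (N x N): softmax over pairwise affinities.
        s = torch.softmax(b.transpose(1, 2) @ cmap, dim=-1)
        out = d @ s.transpose(1, 2)                 # aggregate all positions
        return self.alpha * out.view_as(a) + a      # residual sum with A
```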
206. According to a preset channel attention algorithm, performing size remodeling processing, dimension transformation processing and multiplication processing on the first feature map to obtain a second channel attention heat map; and summing the second channel attention heat map and the first feature map to obtain a fourth feature map.
Exemplarily, each convolution layer includes a plurality of filters, and the channel information within the local receptive field of an image can be learned through these convolution filters. As shown in fig. 7, fig. 7 is a scene schematic diagram of the channel attention algorithm provided in an embodiment of the present application. For the convolution of a two-dimensional image yielding a two-dimensional feature map, two-dimensional parameter information is obtained, comprising the length, the width and the channel pixel points of the image; for a three-dimensional image yielding a preliminary feature map, three-dimensional parameter information is obtained, comprising the length, the width, the height and the channel of the image. By adding the channel attention algorithm, the correlated features among all channel maps can be integrated. The integration process is as follows: the weight of each channel is learned, the features are recombined according to the proportion of each channel in the feature map, and global downsampling, convolution operations and an activation function (such as the softmax function) are applied; the second channel attention heat map is obtained by encoding the channel features, and the feature map after two-dimensional convolution processing and the feature map after three-dimensional convolution processing are each multiplied element-wise with the second channel attention heat map, so that the dependency relationships among all channels can be integrated.
Unlike the spatial attention mechanism, the channel attention mechanism operates as follows: first, the first feature map A is reshaped to size C × N, where N = D × H × W; A also undergoes size reshaping to obtain B, whose size is N × C. The reshaped A (C × N) is multiplied by B and passed through a softmax function to obtain X, whose size is C × C. The size-reshaped result B of the first feature map A is then multiplied by X to obtain the second channel attention heat map, which is multiplied by a scale coefficient β and restored to the preset size through size reshaping; finally, it is added to the first feature map A to obtain the output of the channel attention algorithm, namely a fourth feature map E.
The elements of the X matrix are as follows:

$$x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i=1}^{C} \exp(A_i \cdot A_j)}$$

where $A_i$ and $A_j$ are elements of A, and $x_{ji}$ measures the influence of the $i$-th channel on the $j$-th channel in the feature map: the more similar the feature representations of two channels are, the greater their relevance, and conversely, the smaller the relevance.
The formula for the channel attention algorithm is as follows:

$$E_j = \beta \sum_{i=1}^{C} (x_{ji} A_i) + A_j$$

where $\beta$ is a scale factor that is initialized to 0 and gradually receives more weight during training, $A_j$ is an element of A, and $x_{ji}$ are the elements of the matrix X. As can be seen from this output formula, $E_j$ is the sum of the features of all channels and the original features A, so the dependency relationships among channels are established and the discriminability of the features is improved.
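The channel attention branch translates into very little code, since it involves no convolutions, only reshaping and matrix products. The sketch below mirrors the formula above; the module name is illustrative:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention over a feature map A of shape (B, C, D, H, W)."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))    # scale, initialized to 0

    def forward(self, a: torch.Tensor) -> torch.Tensor:
        bsz, c = a.shape[:2]
        flat = a.view(bsz, c, -1)                   # reshape A to C x N
        # Channel affinity matrix X (C x C), normalized with softmax.
        x = torch.softmax(flat @ flat.transpose(1, 2), dim=-1)
        out = (x @ flat).view_as(a)                 # reweight every channel
        return self.beta * out + a                  # residual sum with A
```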
207. Performing residual connection processing on the third feature map and the fourth feature map, and processing the result through a preset linear rectification function to obtain a fifth feature map; the fifth feature map represents the preliminary feature map after multiple convolutions.
Exemplarily, as shown in fig. 8, fig. 8 is a scene schematic diagram of the residual double attention module according to an embodiment of the present application. As can be seen from fig. 8, the spatial position attention algorithm and the channel attention algorithm are connected in parallel: the spatial position attention algorithm outputs the third feature map U, and the channel attention algorithm outputs the fourth feature map V. The electronic device may perform residual connection processing on the third feature map U and the fourth feature map V and process the result through a preset linear rectification function (ReLU) to obtain the fifth feature map, that is, a preliminary feature map after multiple convolutions.
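Putting the two attention branches of fig. 8 together, a residual dual-attention block might be fused as follows. Whether the block's input itself also joins the residual sum is an assumption here, as the figure-level description leaves that detail open:

```python
import torch
import torch.nn as nn

class ResidualDualAttention(nn.Module):
    """Parallel spatial and channel attention fused by residual connection
    and a linear rectification function (ReLU), as described for fig. 8."""
    def __init__(self, channels: int):
        super().__init__()
        self.spatial = SpatialPositionAttention(channels)  # outputs U
        self.channel = ChannelAttention()                  # outputs V

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u = self.spatial(x)           # third feature map U
        v = self.channel(x)           # fourth feature map V
        return torch.relu(u + v + x)  # residual connection + ReLU (x assumed)
```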
208. Performing three-dimensional convolution processing on the preliminary feature map in the three-dimensional convolution layers according to the residual learning network in the decoding algorithm to obtain a three-dimensional feature map; wherein the decoding algorithm comprises 2 three-dimensional convolution layers.
Exemplarily, the electronic device may sequentially perform three-dimensional convolution processing on the preliminary feature map in the 2 three-dimensional convolution layers according to the residual learning network in the decoding algorithm to obtain the three-dimensional feature map.
209. Performing two-dimensional convolution processing on the three-dimensional feature map in the two-dimensional convolution layers to obtain a convolution feature map; wherein the decoding algorithm comprises 2 two-dimensional convolution layers.
For example, the electronic device may perform two-dimensional convolution processing on the three-dimensional feature map in 2 two-dimensional convolution layers in sequence to obtain a convolution feature map.
210. Performing convolution processing on the convolution feature map by using a preset normalized exponential function to obtain a target feature map; the target feature map represents a plurality of pixel points obtained through the decoding processing, and the pixel points comprise pixel points occupied by the target position and pixel points occupied by the background other than the target position.
Exemplarily, the electronic device may perform convolution processing on the convolution feature map by using the preset softmax function to obtain the target feature map. The target feature map represents a plurality of pixel points obtained through the decoding processing, comprising pixel points occupied by the target position and pixel points occupied by the background other than the target position; the target position can then be displayed according to these pixel points.
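Exemplarily, reading the target position out of the softmax output reduces to an argmax over the class dimension. In the sketch below the tensor shape and the class indices (0 = background, 1 = liver, 2 = tumor) are illustrative assumptions:

```python
import torch

# Stand-in for the decoder output: (num_slices, num_classes, 256, 256)
# probabilities that sum to 1 over the class dimension.
probs = torch.softmax(torch.randn(4, 3, 256, 256), dim=1)

labels = probs.argmax(dim=1)            # per-pixel class, (num_slices, 256, 256)
tumor_coords = (labels == 2).nonzero()  # (slice, row, col) of the target pixels
```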
In the embodiment of the application, a plurality of images to be recognized are acquired. According to the residual learning network of the encoding algorithm in the preset hybrid U-Net network algorithm, two-dimensional convolution processing is performed on the image to be recognized in the two-dimensional convolution layers to obtain a two-dimensional feature map, the encoding algorithm comprising 2 two-dimensional convolution layers. Three-dimensional convolution processing is performed on the two-dimensional feature map in the three-dimensional convolution layers to obtain a preliminary feature map, the encoding algorithm comprising 3 three-dimensional convolution layers. Hole convolution processing is performed on the preliminary feature map according to a preset hole convolution algorithm to obtain a first feature map, whose receptive field is larger than that of the preliminary feature map; the receptive field represents the mapping area, on the image to be recognized, of the pixel points of a local feature map within the first feature map. 1 × 1 convolution operations are performed on the first feature map respectively according to a preset spatial position attention algorithm to obtain a plurality of second feature maps; size reshaping processing, dimension transformation processing and multiplication processing are performed on the plurality of second feature maps to obtain a first channel attention heat map, which is summed with the first feature map to obtain a third feature map. According to a preset channel attention algorithm, size reshaping processing, dimension transformation processing and multiplication processing are performed on the first feature map to obtain a second channel attention heat map, which is summed with the first feature map to obtain a fourth feature map. Residual connection processing is performed on the third feature map and the fourth feature map, and the result is processed through a preset linear rectification function to obtain a fifth feature map. Three-dimensional convolution processing is performed on the preliminary feature map in the three-dimensional convolution layers according to the residual learning network in the decoding algorithm to obtain a three-dimensional feature map, the decoding algorithm comprising 2 three-dimensional convolution layers. Two-dimensional convolution processing is performed on the three-dimensional feature map in the two-dimensional convolution layers to obtain a convolution feature map, the decoding algorithm comprising 2 two-dimensional convolution layers. Convolution processing is performed on the convolution feature map by using a preset normalized exponential function to obtain a target feature map, which represents a plurality of pixel points obtained through the decoding processing, comprising pixel points occupied by the target position and pixel points occupied by the background other than the target position.
Therefore, when the image to be recognized is processed, performing the two-dimensional and three-dimensional convolution operations through the residual learning network extracts both the shallow and the deep information of the image, prevents the problems of gradient vanishing and overfitting during training on the liver and its tumor, and yields better position information of the target, thereby improving the segmentation accuracy of the network and solving the technical problem that the accuracy of the lesion position identified in a medical image is low. Moreover, adding the hole convolution module at the output of the two-dimensional and three-dimensional convolution operations reduces feature loss and enlarges the visual receptive field over the liver and its tumor, which improves the detection and classification accuracy for liver and tumor images, enhances the ability of target detection to analyze the image background and the target distribution, and improves the detection and classification accuracy for liver-tumor images of varying sizes, with complex backgrounds and hard-to-classify tumor boundaries. Meanwhile, by introducing the spatial position attention algorithm and the channel attention algorithm, the semantic information of the spatial position dimension and the channel dimension is correlated, and the dependency relationship between channels and spatial positions is strengthened, which in turn enhances the feature expression capability of the network and achieves more accurate segmentation.
Exemplarily, fig. 9 is a schematic flowchart of another image processing method provided in an embodiment of the present application.
Fig. 10 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application, and as shown in fig. 10, the apparatus includes:
a first acquiring unit 31 for acquiring a plurality of images to be recognized.
The first convolution unit 32 is configured to perform two-dimensional convolution processing and three-dimensional convolution processing on the image to be recognized according to an encoding algorithm in a preset hybrid U-Net network algorithm to obtain a preliminary feature map; the hybrid U-Net network algorithm comprises an encoding algorithm and a decoding algorithm, and the encoding algorithm and the decoding algorithm perform convolution processing on the image in the two-dimensional convolution layers and the three-dimensional convolution layers through the residual learning network.
The decoding unit 33 is configured to decode the preliminary feature map according to the decoding algorithm to obtain a target feature map; the target feature map represents a plurality of pixel points obtained through the decoding processing, and the pixel points comprise pixel points occupied by the target position and pixel points occupied by the background other than the target position.
The apparatus of this embodiment may execute the technical solution in the method, and the specific implementation process and the technical principle are the same, which are not described herein again.
Fig. 11 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present application, and based on the embodiment shown in fig. 10, as shown in fig. 11, the first convolution unit 32 includes:
the first convolution module 321 is configured to perform two-dimensional convolution processing on an image to be recognized in a two-dimensional convolution layer according to a residual learning network of a coding algorithm in a preset hybrid U-Net network algorithm to obtain a two-dimensional feature map; wherein, the two-dimensional convolution layer of the coding algorithm comprises 2 layers.
A second convolution module 322, configured to perform three-dimensional convolution processing on the two-dimensional feature map in the three-dimensional convolution layer to obtain a preliminary feature map; wherein, the three-dimensional convolution layer of the coding algorithm comprises 3 layers.
In one example, the decoding unit 33 includes:
the third convolution module 331 is configured to perform three-dimensional convolution processing on the preliminary feature map at the three-dimensional convolution layer according to a residual learning network in the decoding algorithm to obtain a three-dimensional feature image; wherein, the three-dimensional convolution layer in the decoding algorithm comprises 2 layers.
A fourth convolution module 332, configured to perform two-dimensional convolution processing on the three-dimensional feature map in the two-dimensional convolution layer to obtain a convolution feature map; wherein, the two-dimensional convolution layer in the decoding algorithm comprises 2 layers.
A fifth convolution module 333, configured to perform convolution processing on the convolution feature map by using a preset normalized exponential function to obtain a target feature map; the target characteristic graph represents a plurality of pixel points obtained through decoding processing, and the pixel points comprise pixel points occupied by target positions and pixel points occupied by backgrounds except the target positions.
In one example, the apparatus further comprises:
The second convolution unit 41 is configured to, after the preliminary feature map is obtained through the two-dimensional and three-dimensional convolution processing of the encoding algorithm, perform hole (dilated) convolution processing on the preliminary feature map according to a preset hole convolution algorithm to obtain a first feature map. The receptive field of the first feature map is larger than that of the preliminary feature map, where the receptive field represents the region of the image to be recognized onto which a pixel of the first feature map is mapped.
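For reference, a hole (dilated) convolution spreads a small kernel over a wider window, enlarging the receptive field without extra parameters or downsampling. A minimal sketch, in which the dilation rate of 2 and the channel count are assumptions:

```python
import torch
import torch.nn as nn

# 3x3 kernel with dilation rate 2: the 9 weights are spread over a 5x5 window,
# so each output pixel sees a larger region of the input feature map.
# padding = dilation keeps the spatial size of the feature map unchanged.
dilated_conv = nn.Conv2d(in_channels=64, out_channels=64,
                         kernel_size=3, padding=2, dilation=2)

x = torch.randn(1, 64, 32, 32)        # a preliminary feature map (assumed shape)
first_feature_map = dilated_conv(x)   # same 32x32 size, larger receptive field
```

With dilation rate r, a 3 × 3 kernel covers a (2r + 1) × (2r + 1) neighbourhood, which is why the first feature map sees a larger region of the input image than the preliminary feature map does.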
In one example, the apparatus further comprises:
The third convolution unit 42 is configured to apply separate 1 × 1 convolution operations to the first feature map according to a preset spatial position attention algorithm, to obtain a plurality of second feature maps.
The first processing unit 43 is configured to perform size reshaping processing, dimension transformation processing, and multiplication processing on the plurality of second feature maps to obtain a first channel attention heat map.
The second processing unit 44 is configured to sum the first channel attention heat map and the first feature map to obtain a third feature map.
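One common realisation of a spatial position attention step with exactly these operations (1 × 1 convolutions, size reshaping, dimension transposition, matrix multiplication, and a residual sum) is the position attention module popularised by DANet. The sketch below follows that pattern; the channel reduction factor, the learnable scale gamma, and the module name are assumptions, not details fixed by this application.

```python
import torch
import torch.nn as nn

class SpatialPositionAttention(nn.Module):
    """Sketch: three 1x1 convolutions produce query/key/value maps; reshaping,
    transposition and matrix multiplication yield a pixel-to-pixel attention
    heat map whose output is summed back onto the input feature map."""
    def __init__(self, ch):
        super().__init__()
        self.query = nn.Conv2d(ch, ch // 8, kernel_size=1)  # second feature maps
        self.key   = nn.Conv2d(ch, ch // 8, kernel_size=1)  # via 1x1 convolutions
        self.value = nn.Conv2d(ch, ch, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned weight of the attention term

    def forward(self, x):
        # x: (B, C, H, W), the first feature map.
        b, c, h, w = x.shape
        q = self.query(x).reshape(b, -1, h * w).permute(0, 2, 1)  # (B, HW, C//8)
        k = self.key(x).reshape(b, -1, h * w)                     # (B, C//8, HW)
        attn = torch.softmax(torch.bmm(q, k), dim=-1)             # (B, HW, HW) heat map
        v = self.value(x).reshape(b, -1, h * w)                   # (B, C, HW)
        out = torch.bmm(v, attn.permute(0, 2, 1)).reshape(b, c, h, w)
        return self.gamma * out + x   # sum with the first feature map -> third feature map
```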
In one example, the apparatus further comprises:
The second acquiring unit 45 is configured to perform size reshaping processing, dimension transformation processing, and multiplication processing on the first feature map according to a preset channel attention algorithm, to obtain a second channel attention heat map.
The third processing unit 46 is configured to sum the second channel attention heat map and the first feature map to obtain a fourth feature map.
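A channel attention step of this kind is commonly realised without any convolution: the first feature map is reshaped, multiplied against its own transpose to score inter-channel dependencies, and the result is summed back. A sketch under the same caveats as above:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Sketch: size reshaping, dimension transposition and matrix
    multiplication of the feature map against itself yield a channel-to-channel
    attention heat map, summed back onto the input feature map."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))  # learned weight of the attention term

    def forward(self, x):
        # x: (B, C, H, W), the first feature map.
        b, c, h, w = x.shape
        flat = x.reshape(b, c, h * w)                    # size reshaping
        attn = torch.bmm(flat, flat.permute(0, 2, 1))    # (B, C, C) channel affinities
        attn = torch.softmax(attn, dim=-1)               # channel attention heat map
        out = torch.bmm(attn, flat).reshape(b, c, h, w)
        return self.gamma * out + x   # sum with the first feature map -> fourth feature map
```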
In one example, the apparatus further comprises:
The residual unit 47 is configured to perform residual connection processing on the third feature map and the fourth feature map and process the result through a preset linear rectification function to obtain a fifth feature map; the fifth feature map represents the preliminary feature map after multiple convolutions.
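Under the assumptions of the two attention sketches above, the residual fusion of the third and fourth feature maps reduces to an element-wise sum passed through the linear rectification function (ReLU):

```python
import torch

def fuse_attention_branches(third_map: torch.Tensor,
                            fourth_map: torch.Tensor) -> torch.Tensor:
    # Residual connection of the spatial-attention output (third feature map)
    # and the channel-attention output (fourth feature map), followed by the
    # linear rectification function, yields the fifth feature map.
    return torch.relu(third_map + fourth_map)

# Example wiring with the sketches above (shapes are assumptions):
# x = torch.randn(2, 64, 32, 32)                       # a first feature map
# fifth = fuse_attention_branches(SpatialPositionAttention(64)(x),
#                                 ChannelAttention()(x))
```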
The apparatus of this embodiment may execute the technical solution of the foregoing method embodiments; the implementation process and technical principle are the same and are not repeated here.
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application, and as shown in fig. 12, the electronic device includes: a memory 51, a processor 52;
the memory 51 stores a computer program that can be run on the processor 52.
The processor 52 is configured to perform the methods provided in the embodiments described above.
The electronic device further comprises a receiver 53 and a transmitter 54. The receiver 53 is used for receiving commands and data transmitted from an external device, and the transmitter 54 is used for transmitting commands and data to an external device.
Fig. 13 is a block diagram of an electronic device, which may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, etc., according to an embodiment of the present application.
Apparatus 600 may include one or more of the following components: processing component 602, memory 604, power component 606, multimedia component 608, audio component 610, input/output (I/O) interface 612, sensor component 614, and communication component 616.
The processing component 602 generally controls overall operation of the device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 can include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operations at the apparatus 600. Examples of such data include instructions for any application or method operating on device 600, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 604 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power supply component 606 provides power to the various components of device 600. The power components 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front-facing camera and/or a rear-facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 600 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, audio component 610 includes a Microphone (MIC) configured to receive external audio signals when apparatus 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing status assessments of various aspects of the apparatus 600. For example, the sensor component 614 may detect the open/closed state of the device 600 and the relative positioning of components, such as the display and keypad of the device 600; it may also detect a change in the position of the device 600 or of a component of the device 600, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in the temperature of the device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate communications between the apparatus 600 and other devices in a wired or wireless manner. The apparatus 600 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 604 comprising instructions, executable by the processor 620 of the apparatus 600 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Embodiments of the present application also provide a non-transitory computer-readable storage medium, where instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method provided by the above embodiments.
An embodiment of the present application further provides a computer program product, where the computer program product includes: a computer program, stored in a readable storage medium, from which at least one processor of the electronic device can read the computer program, the at least one processor executing the computer program causing the electronic device to perform the solution provided by any of the embodiments described above.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (17)

1. An image processing method, comprising:
acquiring a plurality of images to be recognized;
performing two-dimensional convolution processing and three-dimensional convolution processing on the images to be recognized according to an encoding algorithm in a preset hybrid U-Net network algorithm, to obtain a preliminary feature map; wherein the hybrid U-Net network algorithm comprises the encoding algorithm and a decoding algorithm, and the encoding algorithm and the decoding algorithm convolve the image in two-dimensional and three-dimensional convolution layers through a residual learning network;
decoding the preliminary feature map according to the decoding algorithm to obtain a target feature map; wherein the target feature map represents a plurality of pixels obtained by the decoding, comprising pixels occupied by a target position and pixels occupied by a background other than the target position.
2. The method according to claim 1, wherein performing the two-dimensional convolution processing and the three-dimensional convolution processing on the image to be recognized according to the encoding algorithm in the preset hybrid U-Net network algorithm to obtain the preliminary feature map comprises:
performing two-dimensional convolution processing on the image to be recognized in the two-dimensional convolution layers according to the residual learning network of the encoding algorithm, to obtain a two-dimensional feature map; wherein the encoding algorithm comprises 2 two-dimensional convolution layers;
performing three-dimensional convolution processing on the two-dimensional feature map in the three-dimensional convolution layers to obtain the preliminary feature map; wherein the encoding algorithm comprises 3 three-dimensional convolution layers.
3. The method according to claim 1, wherein decoding the preliminary feature map according to the decoding algorithm to obtain the target feature map comprises:
performing three-dimensional convolution processing on the preliminary feature map in the three-dimensional convolution layers according to the residual learning network in the decoding algorithm, to obtain a three-dimensional feature map; wherein the decoding algorithm comprises 2 three-dimensional convolution layers;
performing two-dimensional convolution processing on the three-dimensional feature map in the two-dimensional convolution layers to obtain a convolution feature map; wherein the decoding algorithm comprises 2 two-dimensional convolution layers;
processing the convolution feature map with a preset normalized exponential function to obtain the target feature map; wherein the target feature map represents a plurality of pixels obtained by the decoding, comprising pixels occupied by the target position and pixels occupied by the background other than the target position.
4. The method according to claim 1, wherein after the two-dimensional convolution processing and the three-dimensional convolution processing are performed on the image to be recognized according to the encoding algorithm in the preset hybrid U-Net network algorithm to obtain the preliminary feature map, the method further comprises:
performing hole convolution processing on the preliminary feature map according to a preset hole convolution algorithm to obtain a first feature map; wherein the receptive field of the first feature map is larger than that of the preliminary feature map, and the receptive field represents the region of the image to be recognized onto which a pixel of the first feature map is mapped.
5. The method of claim 4, further comprising:
applying separate 1 × 1 convolution operations to the first feature map according to a preset spatial position attention algorithm, to obtain a plurality of second feature maps;
performing size reshaping processing, dimension transformation processing and multiplication processing on the plurality of second feature maps to obtain a first channel attention heat map;
and summing the first channel attention heat map and the first feature map to obtain a third feature map.
6. The method of claim 4, further comprising:
performing size reshaping processing, dimension transformation processing, and multiplication processing on the first feature map according to a preset channel attention algorithm, to obtain a second channel attention heat map;
and summing the second channel attention heat map and the first feature map to obtain a fourth feature map.
7. The method according to any one of claims 5 to 6, further comprising:
performing residual connection processing on the third feature map and the fourth feature map, and processing the result through a preset linear rectification function to obtain a fifth feature map; wherein the fifth feature map represents the preliminary feature map after multiple convolutions.
8. An image processing apparatus characterized by comprising:
a first acquiring unit, configured to acquire a plurality of images to be recognized;
a first convolution unit, configured to perform two-dimensional convolution processing and three-dimensional convolution processing on the images to be recognized according to an encoding algorithm in a preset hybrid U-Net network algorithm, to obtain a preliminary feature map; wherein the hybrid U-Net network algorithm comprises the encoding algorithm and a decoding algorithm, and the encoding algorithm and the decoding algorithm convolve the image in two-dimensional and three-dimensional convolution layers through a residual learning network;
a decoding unit, configured to decode the preliminary feature map according to the decoding algorithm to obtain a target feature map; wherein the target feature map represents a plurality of pixels obtained by the decoding, comprising pixels occupied by a target position and pixels occupied by a background other than the target position.
9. The apparatus of claim 8, wherein the first convolution unit comprises:
a first convolution module, configured to perform two-dimensional convolution processing on the image to be recognized in the two-dimensional convolution layers according to the residual learning network of the encoding algorithm in the preset hybrid U-Net network algorithm, to obtain a two-dimensional feature map; wherein the encoding algorithm comprises 2 two-dimensional convolution layers;
a second convolution module, configured to perform three-dimensional convolution processing on the two-dimensional feature map in the three-dimensional convolution layers to obtain the preliminary feature map; wherein the encoding algorithm comprises 3 three-dimensional convolution layers.
10. The apparatus of claim 8, wherein the decoding unit comprises:
a third convolution module, configured to perform three-dimensional convolution processing on the preliminary feature map in the three-dimensional convolution layers according to the residual learning network in the decoding algorithm, to obtain a three-dimensional feature map; wherein the decoding algorithm comprises 2 three-dimensional convolution layers;
a fourth convolution module, configured to perform two-dimensional convolution processing on the three-dimensional feature map in the two-dimensional convolution layers to obtain a convolution feature map; wherein the decoding algorithm comprises 2 two-dimensional convolution layers;
a fifth convolution module, configured to process the convolution feature map with a preset normalized exponential function to obtain the target feature map; wherein the target feature map represents a plurality of pixels obtained by the decoding, comprising pixels occupied by the target position and pixels occupied by the background other than the target position.
11. The apparatus of claim 8, further comprising:
a second convolution unit, configured to, after the two-dimensional convolution processing and the three-dimensional convolution processing produce the preliminary feature map, perform hole convolution processing on the preliminary feature map according to a preset hole convolution algorithm to obtain a first feature map; wherein the receptive field of the first feature map is larger than that of the preliminary feature map, and the receptive field represents the region of the image to be recognized onto which a pixel of the first feature map is mapped.
12. The apparatus of claim 11, further comprising:
a third convolution unit, configured to apply separate 1 × 1 convolution operations to the first feature map according to a preset spatial position attention algorithm, to obtain a plurality of second feature maps;
a first processing unit, configured to perform size reshaping processing, dimension transformation processing, and multiplication processing on the plurality of second feature maps to obtain a first channel attention heat map;
a second processing unit, configured to sum the first channel attention heat map and the first feature map to obtain a third feature map.
13. The apparatus of claim 11, further comprising:
a second acquiring unit, configured to perform size reshaping processing, dimension transformation processing, and multiplication processing on the first feature map according to a preset channel attention algorithm, to obtain a second channel attention heat map;
a third processing unit, configured to sum the second channel attention heat map and the first feature map to obtain a fourth feature map.
14. The apparatus according to any one of claims 12 to 13, further comprising:
a residual unit, configured to perform residual connection processing on the third feature map and the fourth feature map and process the result through a preset linear rectification function to obtain a fifth feature map; wherein the fifth feature map represents the preliminary feature map after multiple convolutions.
15. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.
16. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1 to 7.
17. A computer program product, characterized in that it comprises a computer program which, when executed by a processor, carries out the method of any one of claims 1 to 7.
CN202210339991.0A 2022-04-02 2022-04-02 Image processing method, device and equipment Pending CN114494442A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210339991.0A CN114494442A (en) 2022-04-02 2022-04-02 Image processing method, device and equipment

Publications (1)

Publication Number Publication Date
CN114494442A true CN114494442A (en) 2022-05-13

Family

ID=81489163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210339991.0A Pending CN114494442A (en) 2022-04-02 2022-04-02 Image processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN114494442A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889852A (en) * 2018-09-07 2020-03-17 天津大学 Liver segmentation method based on residual error-attention deep neural network
CN110335217A (en) * 2019-07-10 2019-10-15 东北大学 One kind being based on the decoded medical image denoising method of 3D residual coding
CN111179237A (en) * 2019-12-23 2020-05-19 北京理工大学 Image segmentation method and device for liver and liver tumor
CN112508973A (en) * 2020-10-19 2021-03-16 杭州电子科技大学 MRI image segmentation method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王辉涛等: "基于全局时空感受野的高效视频分类方法", 《小型微型计算机系统》 *
马金林等: "肝脏肿瘤CT图像深度学习分割方法综述", 《中国图象图形学报》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115018862A (en) * 2022-05-26 2022-09-06 杭州深睿博联科技有限公司 Liver tumor segmentation method and device based on hybrid neural network
CN115170510A (en) * 2022-07-04 2022-10-11 北京医准智能科技有限公司 Focus detection method and device, electronic equipment and readable storage medium
CN116245951A (en) * 2023-05-12 2023-06-09 南昌大学第二附属医院 Brain tissue hemorrhage localization and classification and hemorrhage quantification method, device, medium and program
CN116245951B (en) * 2023-05-12 2023-08-29 南昌大学第二附属医院 Brain tissue hemorrhage localization and classification and hemorrhage quantification method, device, medium and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220513