WO2021147257A1 - Network training method and apparatus, image processing method and apparatus, electronic device, and storage medium


Info

Publication number
WO2021147257A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature image
decoding layer
feature
trained
Prior art date
Application number
PCT/CN2020/100723
Other languages
English (en)
Chinese (zh)
Inventor
王国泰
顾然
宋涛
Original Assignee
上海商汤智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Priority to KR1020217034486A (KR20210140757A)
Priority to JP2021539612A (JP2022521130A)
Publication of WO2021147257A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/443 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V 10/449 - Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V 10/451 - Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V 10/454 - Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the embodiments of the present application relate to the field of computer technology, and in particular to a network training method and apparatus, an image processing method and apparatus, an electronic device, and a storage medium.
  • Image segmentation refers to the image processing process of dividing an image into several specific, disjoint "connected" regions according to the distribution attributes within each region: related features have a certain consistency or similarity of category within the same region, and the difference between regions is most obvious at the boundary of each region.
  • Medical image segmentation has important academic research significance and application value in research and practice fields such as medical research, clinical diagnosis, pathological analysis, and image information processing. It is mainly used to: extract regions of interest in medical images to facilitate medical image analysis; calculate the volume of human organs, tissues or lesions in medical images to facilitate the calculation of clinical parameters; perform three-dimensional reconstruction or visualization of medical images; and support medical image retrieval research, among other uses. Therefore, an effective image segmentation method is urgently needed.
  • the embodiments of the present application provide a network training method and apparatus, an image processing method and apparatus, an electronic device, and a storage medium.
  • the embodiment of the application provides a network training method; the network training method is used to train a neural network model, and an image is segmented according to the neural network model obtained by the training.
  • the method includes: using, through a segmentation network, an attention mechanism in preset dimensions to perform feature extraction on the sample image included in the training sample to obtain a feature extraction result; the preset dimensions include a spatial dimension, a channel dimension, and a scale dimension, and the training sample also includes segmentation label information corresponding to the sample image.
  • In this way, the attention mechanism is used in the preset dimensions to perform feature extraction on the sample image included in the training sample to obtain the feature extraction result, the sample image is subjected to image segmentation processing according to the feature extraction result to obtain the image segmentation result, and the segmentation network is trained according to the image segmentation result and the segmentation label information corresponding to the sample image included in the training sample, so that the trained segmentation network can improve the segmentation accuracy when performing image segmentation processing.
  • the segmentation network includes an encoder and a decoder, the encoder includes a plurality of coding layers, and the decoder includes a plurality of decoding layers;
  • Using the attention mechanism to perform feature extraction on the sample image included in the training sample to obtain the feature extraction result includes: inputting the sample image into the encoder to determine the first feature image corresponding to each coding layer, where the scales of the first feature images corresponding to different coding layers are different; for any decoding layer, using the first feature image corresponding to the scale of the decoding layer to train, by using the attention mechanism in the spatial dimension and the channel dimension, the second feature image input to the decoding layer, so as to determine the third feature image corresponding to the decoding layer, where the second feature image input to the decoding layer is determined according to the third feature image corresponding to the previous decoding layer of the decoding layer, and the scales of the third feature images corresponding to different decoding layers are different; and determining the feature extraction result according to a plurality of third feature images of different scales determined by a plurality of decoding layers.
  • In this way, the second feature image is trained by using the attention mechanism in the spatial and channel dimensions to determine the third feature image corresponding to each decoding layer, and then, according to the third feature images of different scales, a feature extraction result that enhances the spatial feature information and channel feature information of the region of interest in the sample image and suppresses the spatial feature information and channel feature information of the region of non-interest in the image can be effectively determined.
  • Using the first feature image corresponding to the scale of the decoding layer and training the second feature image input to the decoding layer by using the attention mechanism in the spatial dimension and the channel dimension to determine the third feature image corresponding to the decoding layer includes: using the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using an attention mechanism in the spatial dimension, to determine the fourth feature image corresponding to the decoding layer, where the first feature image to be trained is the second feature image input to the decoding layer; splicing the second feature image input to the decoding layer and the fourth feature image corresponding to the decoding layer to obtain the second feature image to be trained; and training the second feature image to be trained by using the attention mechanism in the channel dimension to determine the third feature image corresponding to the decoding layer.
  • In this way, the corresponding first feature image in the coding layer is used to train, in the spatial dimension by using the attention mechanism, the first feature image to be trained corresponding to the decoding layer, so that a fourth feature image that enhances the spatial feature information of the region of interest in the sample image and suppresses the spatial feature information of the region of non-interest in the image can be effectively determined; the fourth feature image is then spliced with the second feature image input to the decoding layer to obtain the second feature image to be trained, and the attention mechanism is used in the channel dimension to train the second feature image to be trained, so that a third feature image that enhances the channel feature information of the region of interest in the sample image and suppresses the channel feature information of the region of non-interest in the image can be effectively determined.
  • Using the first feature image corresponding to the scale of the decoding layer and training the second feature image input to the decoding layer by using the attention mechanism in the spatial dimension and the channel dimension to determine the third feature image corresponding to the decoding layer includes: splicing the first feature image corresponding to the scale of the decoding layer and the second feature image input to the decoding layer to determine the second feature image to be trained; training the second feature image to be trained by using the attention mechanism in the channel dimension to determine the first feature image to be trained; and using the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using the attention mechanism in the spatial dimension, so as to determine the third feature image corresponding to the decoding layer.
  • In this way, the second feature image input to the decoding layer is spliced with the first feature image of the corresponding coding layer to obtain the second feature image to be trained, and the attention mechanism is used in the channel dimension to train the second feature image to be trained, so that a first feature image to be trained that enhances the channel feature information of the region of interest in the sample image and suppresses the channel feature information of the region of non-interest in the image can be effectively determined; the attention mechanism is then used in the spatial dimension to train the first feature image to be trained, which makes it possible to effectively determine a third feature image that enhances the spatial feature information of the region of interest in the sample image and suppresses the spatial feature information of the region of non-interest in the image.
  • using the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using an attention mechanism in the spatial dimension includes: determining, according to the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained, the spatial attention weight distribution corresponding to the decoding layer, where the spatial attention weight distribution corresponding to the decoding layer is used to indicate the weight of each pixel in the first feature image to be trained; and calibrating each pixel in the first feature image to be trained according to the spatial attention weight distribution corresponding to the decoding layer. Calibrating each pixel completes the training that uses the attention mechanism in the spatial dimension, so that the spatial feature information of the region of interest in the sample image can be effectively enhanced, and the spatial feature information of the region of non-interest in the image can be suppressed.
  • the decoding layer includes a plurality of spatial attention training layers; determining the spatial attention weight distribution corresponding to the decoding layer according to the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained includes: inputting the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained into the plurality of spatial attention training layers to determine multiple weights of each pixel in the first feature image to be trained; and determining the spatial attention weight distribution corresponding to the decoding layer according to the multiple weights of each pixel in the first feature image to be trained.
  • In this way, the corresponding first feature image in the coding layer and the first feature image to be trained corresponding to the decoding layer are input into the multiple spatial attention training layers to determine multiple weights of each pixel in the first feature image to be trained, and the spatial attention weight distribution corresponding to the decoding layer is then comprehensively determined according to these multiple weights, so that the accuracy of the spatial attention weight distribution can be effectively improved.
  • the training of the second feature image to be trained by using the attention mechanism in the channel dimension includes: determining the channel attention weight distribution corresponding to the decoding layer, where the channel attention weight distribution corresponding to the decoding layer is used to indicate the weight of each channel in the second feature image to be trained; and calibrating each channel in the second feature image to be trained according to the channel attention weight distribution corresponding to the decoding layer.
  • determining the channel attention weight distribution corresponding to the decoding layer includes: performing an average pooling operation on the second feature image to be trained to obtain an average pooling result; performing a maximum pooling operation on the second feature image to be trained to obtain a maximum pooling result; and determining the channel attention weight distribution corresponding to the decoding layer according to the average pooling result and the maximum pooling result.
  • determining the feature extraction result according to multiple third feature images of different scales determined by multiple decoding layers includes: stitching the third feature images of different scales to obtain the third feature image to be trained, wherein the scale of the third feature image to be trained is the same as the scale of the sample image; and training the third feature image to be trained by using the attention mechanism in the scale dimension to determine the feature extraction result.
  • training the third feature image to be trained by using the attention mechanism in the scale dimension includes: determining the scale attention weight distribution, where the scale attention weight distribution is used to indicate the weights of different scales; and calibrating the third feature image to be trained according to the scale attention weight distribution.
  • the sample image is a medical image
  • the segmentation label information is a gold standard obtained by manual labeling
  • An embodiment of the present application provides an image processing method, including: performing image segmentation processing on an image to be segmented through a segmentation network to obtain a segmentation result; wherein the segmentation network is obtained by training using the above-mentioned network training method.
  • In this way, the attention mechanism is used in the preset dimensions to perform feature extraction on the sample image included in the training sample to obtain the feature extraction result, the sample image is subjected to image segmentation processing according to the feature extraction result to obtain the image segmentation result, the segmentation network is trained according to the image segmentation result and the segmentation label information corresponding to the sample image included in the training sample, and the trained segmentation network is then used to perform image segmentation processing on the image to be segmented, which can effectively improve the segmentation accuracy.
  • the image to be segmented is a medical image to be segmented; performing image segmentation processing on the image to be segmented through the segmentation network to obtain the segmentation result includes: performing image segmentation processing on the medical image to be segmented through the segmentation network to obtain the segmented lesion area or target organ area.
  • An embodiment of the application provides a network training device; the network training device is used to train a neural network model, and an image is segmented according to the neural network model obtained by the training.
  • the device includes: a feature extraction module configured to perform, through the segmentation network, feature extraction on the sample image included in the training sample by using the attention mechanism in preset dimensions to obtain a feature extraction result, wherein the preset dimensions include a spatial dimension, a channel dimension, and a scale dimension, and the training sample also includes segmentation label information corresponding to the sample image; a segmentation module configured to perform image segmentation processing on the sample image according to the feature extraction result to obtain an image segmentation result; and a training module configured to train the segmentation network according to the image segmentation result and the segmentation label information.
  • An embodiment of the present application provides an electronic device, including: a processor; and a memory configured to store executable instructions of the processor; wherein the processor is configured to call the instructions stored in the memory to execute the aforementioned network training method.
  • the embodiment of the present application provides a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing network training method is implemented.
  • An embodiment of the application provides an image processing device, including: an image processing module configured to perform image segmentation processing on an image to be segmented through a segmentation network to obtain a segmentation result; wherein the segmentation network is obtained by training using the aforementioned network training method.
  • the image to be segmented is a medical image to be segmented; the image processing module is configured to perform image segmentation processing on the medical image to be segmented through a segmentation network to obtain the segmented lesion area or target Organ area.
  • An embodiment of the present application provides an electronic device, including: a processor; and a memory configured to store executable instructions of the processor; wherein the processor is configured to call the instructions stored in the memory to execute the above-mentioned image processing method.
  • the embodiment of the present application provides a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing image processing method is implemented.
  • FIG. 1 is a schematic flowchart of a network training method provided by an embodiment of this application
  • FIG. 2 is a schematic structural diagram of a segmentation network provided by an embodiment of this application.
  • FIG. 3 is a schematic structural diagram of the spatial attention module 2022 in FIG. 2 provided by an embodiment of the application;
  • FIG. 4 is a schematic structural diagram of the spatial attention module 2025 in FIG. 2 provided by an embodiment of the application;
  • FIG. 5 is a schematic structural diagram of the channel attention module 2026 in FIG. 2 provided by an embodiment of the application;
  • FIG. 6 is a schematic structural diagram of the scale attention module 2049 in FIG. 2 provided by an embodiment of the application;
  • FIG. 7 is a schematic flowchart of an image processing method provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a network training device provided by an embodiment of this application.
  • FIG. 9 is a schematic structural diagram of an image processing device provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
  • FIG. 11 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • FIG. 1 is a schematic flowchart of a network training method provided by an embodiment of this application.
  • the network training method can be executed by a terminal device or other processing device, where the terminal device can be a user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc.
  • Other processing equipment can be servers or cloud servers.
  • the network training method can be implemented by a processor invoking computer-readable instructions stored in a memory. As shown in Figure 1, the method may include:
  • Step S11 using the attention mechanism in the preset dimensions to perform feature extraction on the sample image included in the training sample through the segmentation network to obtain a feature extraction result, where the preset dimensions include the spatial dimension, the channel dimension, and the scale dimension, and the training sample also includes segmentation label information corresponding to the sample image.
  • Step S12 Perform image segmentation processing on the sample image according to the feature extraction result to obtain an image segmentation result.
  • Step S13 training a segmentation network according to the image segmentation result and segmentation label information.
  • a training sample is created in advance, and the training sample includes the sample image and segmentation label information corresponding to the sample image, where the segmentation label information corresponding to the sample image is used to indicate the reference segmentation result of the sample image.
  • the segmentation network can be trained using the attention mechanism in the preset dimensions of the spatial dimension, the channel dimension, and the scale dimension, so that the trained segmentation network can improve the segmentation accuracy when performing image segmentation processing.
  • the segmentation network may be a convolutional neural network improved based on the U-net network model, or other network models that can implement corresponding processing, which is not specifically limited in the embodiment of the present application.
  • the sample image may be obtained after preprocessing the medical image.
  • Obtain the medical image, resample the medical image to the 256*342 scale, and then normalize the resampled medical image to between 0 and 1 to obtain a first image; perform random flipping, random rotation, and random cropping on the first image to realize data enhancement and obtain the sample image, where the number of channels of the sample image is 3 and the scale is 224*300.
  • the method for determining the sample image may adopt other methods, and the number of channels and the scale of the sample image may be determined according to the actual situation, which is not specifically limited in the embodiment of the present application.
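  • As a hedged illustration of the preprocessing described above (the function name, the augmentation policy and the channels-first layout are assumptions; the resampling to the 256*342 scale is assumed to have been done beforehand with an imaging library):

```python
import numpy as np

def preprocess(resampled: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # resampled: (3, 256, 342) array, already resampled to the 256*342 scale.
    arr = resampled.astype(np.float32)
    # Normalize intensities to between 0 and 1.
    arr = (arr - arr.min()) / (arr.max() - arr.min() + 1e-8)
    # Random horizontal flip (random rotation could be added in the same spirit).
    if rng.random() < 0.5:
        arr = arr[:, :, ::-1]
    # Random crop to the 224*300 scale of the sample image.
    h0 = rng.integers(0, arr.shape[1] - 224 + 1)
    w0 = rng.integers(0, arr.shape[2] - 300 + 1)
    return np.ascontiguousarray(arr[:, h0:h0 + 224, w0:w0 + 300])  # 3 channels, 224*300

# Example: sample = preprocess(np.zeros((3, 256, 342), dtype=np.float32), np.random.default_rng(0))
```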
  • the segmentation network includes an encoder and a decoder, the encoder includes multiple coding layers, and the decoder includes multiple decoding layers; performing feature extraction on the sample image included in the training sample through the segmentation network by using the attention mechanism in the preset dimensions to obtain the feature extraction result includes: inputting the sample image into the encoder and determining the first feature image corresponding to each coding layer, wherein the scales of the first feature images corresponding to different coding layers are different; for any decoding layer, using the first feature image corresponding to the scale of the decoding layer to train the second feature image input to the decoding layer by using the attention mechanism in the spatial and channel dimensions, so as to determine the third feature image corresponding to the decoding layer; and determining the feature extraction result according to the multiple third feature images of different scales determined by the multiple decoding layers.
  • Fig. 2 shows a schematic structural diagram of a segmentation network provided by an embodiment of this application.
  • the segmentation network is an improvement that uses the U-net network model as the backbone network.
  • the segmentation network can be based on the U-net network model as the backbone network, or based on other network models as the backbone network, which is not specifically limited in the embodiment of the present application.
  • the segmentation network includes an encoder 2001 and a decoder 2002.
  • Encoder 2001 includes coding layers 2003 to 2007, of which, coding layer 2003 includes convolutional layer 2008, coding layer 2004 includes maximum pooling layer 2009 and convolutional layer 2010, coding layer 2005 includes maximum pooling layer 2011 and Convolutional layer 2012, coding layer 2006 includes maximum pooling layer 2013 and convolutional layer 2014, coding layer 2007 includes maximum pooling layer 2015 and convolutional layer 2016.
  • the decoder 2002 includes decoding layers 2017 to 2020.
  • the decoding layer 2017 includes a convolutional layer 2021, a spatial attention module 2022, and a channel attention module 2023.
  • the decoding layer 2018 includes a convolutional layer 2024, a spatial attention module 2025 and a channel attention module 2026.
  • decoding layer 2019 includes a convolutional layer 2027, a spatial attention module 2028 and a channel attention module 2029;
  • decoding layer 2020 includes a convolutional layer 2030, a spatial attention module 2031 and a channel attention module 2032.
  • the convolutional layer in the segmentation network can be a standard convolutional layer with 3*3 convolution kernels, and the maximum pooling layer can implement down-sampling of the input data and reduce the scale of the input data.
  • the sample image 2033 is input to the encoder 2001 of the segmentation network.
  • the scale of the sample image 2033 may be 224*300.
  • After the sample image 2033 passes through the convolutional layer 2008 in the coding layer 2003, the first feature image corresponding to the coding layer 2003, with a scale of 224*300 and 16 channels, is obtained; the first feature image with a scale of 224*300 and 16 channels passes through the maximum pooling layer 2009 and the two convolutional layers 2010 in the coding layer 2004 to obtain the first feature image corresponding to the coding layer 2004, with a scale of 112*150 and 32 channels; the first feature image with a scale of 112*150 and 32 channels passes through the maximum pooling layer 2011 and the two convolutional layers 2012 in the coding layer 2005 to obtain the first feature image corresponding to the coding layer 2005, with a scale of 56*75 and 64 channels; the first feature image with a scale of 56*75 and 64 channels passes through the maximum pooling layer 2013 and the two convolutional layers 2014 in the coding layer 2006 to obtain the first feature image corresponding to the coding layer 2006, with a scale of 28*37 and 128 channels; and the first feature image with a scale of 28*37 and 128 channels passes through the maximum pooling layer 2015 and the two convolutional layers 2016 in the coding layer 2007 to obtain the first feature image corresponding to the coding layer 2007, with a scale of 14*18 and 256 channels.
  • the scale and the number of channels of the first feature image corresponding to different coding layers may be determined according to actual conditions, which is not specifically limited in the embodiment of the present application.
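  • A minimal PyTorch sketch of an encoder with this structure is shown below; the class and function names are illustrative assumptions, and only the channel counts and pooling pattern follow the example above.

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # Two standard 3*3 convolutional layers with ReLU activations.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class Encoder(nn.Module):
    def __init__(self, in_ch: int = 3, chs=(16, 32, 64, 128, 256)):
        super().__init__()
        self.blocks = nn.ModuleList()
        prev = in_ch
        for c in chs:
            self.blocks.append(conv_block(prev, c))
            prev = c
        self.pool = nn.MaxPool2d(2)  # maximum pooling layer halves the scale

    def forward(self, x: torch.Tensor):
        first_feature_images = []  # one first feature image per coding layer
        for i, block in enumerate(self.blocks):
            if i > 0:
                x = self.pool(x)   # down-sample before every coding layer except the first
            x = block(x)
            first_feature_images.append(x)
        return first_feature_images  # scales e.g. 224*300, 112*150, 56*75, 28*37, 14*18
```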
  • the first feature image corresponding to the scale of the decoding layer is used to train the second feature image input to the decoding layer by using the attention mechanism in the spatial dimension and the channel dimension, so as to determine the third feature image corresponding to the decoding layer; this process is described below.
  • the first feature image corresponding to the lowest coding layer is up-sampled and spliced with the first feature image corresponding to the previous coding layer to obtain the second feature image input to the highest decoding layer;
  • since the first feature image corresponding to the lowest coding layer (the first feature image with the smallest scale) includes the global feature information of the sample image, after the up-sampled first feature image corresponding to the lowest coding layer is spliced with the first feature image corresponding to the previous coding layer, the attention training in the spatial dimension and the channel dimension can be performed to realize global training.
  • For example, the first feature image corresponding to the lowest coding layer (coding layer 2007) is up-sampled and spliced with the first feature image corresponding to the previous coding layer (coding layer 2006, 28*37 scale) to obtain the second feature image (28*37 scale, 256 channels) input to the highest decoding layer (decoding layer 2017); the second feature image input to the decoding layer 2017 is used as the first feature image to be trained corresponding to the decoding layer 2017 and input to the spatial attention module 2022 for spatial attention training, and the fourth feature image (28*37 scale, 256 channels) corresponding to the decoding layer 2017 is obtained; the fourth feature image corresponding to the decoding layer 2017 then passes through the convolutional layer 2021, the channel attention module 2023, and the convolutional layer 2021 for channel attention training, and the third feature image (28*37 scale, 128 channels) corresponding to the decoding layer 2017 is obtained.
  • "↑2" is used to indicate up-sampling processing, where the up-sampling processing can be carried out through an up-pooling layer, through a deconvolution layer, or in other ways, which is not specifically limited in the embodiment of the present application.
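  • As an illustrative sketch only (the layer choices and channel counts here are assumptions, not taken from the figure), the up-sampling options mentioned above could be realized in PyTorch as follows:

```python
import torch.nn as nn

# Up-pooling layer: restores resolution using the indices saved by a max pooling layer.
upsample_unpool = nn.MaxUnpool2d(kernel_size=2)
# Interpolation-based x2 up-sampling.
upsample_interp = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
# Deconvolution (transposed convolution) layer; 256 -> 128 channels is an assumed example.
upsample_deconv = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)
```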
  • FIG. 3 is a schematic structural diagram of the spatial attention module 2022 in FIG. 2 provided by an embodiment of the application.
  • the spatial attention module 2022 includes a plurality of 1 ⁇ 1 convolutional layers 2034, a plurality of transpose layers (Transpose layers) 2035, and a normalization layer 2036.
  • the first feature image corresponding to the scale of the decoding layer 2017 (the first feature image corresponding to the coding layer 2006) and the first feature image to be trained corresponding to the decoding layer 2017 are input to the spatial attention module 2022, and each passes through the multiple 1×1 convolutional layers 2034, the multiple transpose layers 2035, and the normalization layer 2036 to obtain the spatial attention weight distribution corresponding to the decoding layer 2017.
  • the spatial attention module 2022 can determine the spatial attention weight distribution ω corresponding to the decoding layer 2017 through the following formula (1-1):

    ω = Softmax(θ^T(x) · φ(x))   (1-1)

  • where Softmax(·) is a normalization function, x is a pixel in the first feature image to be trained corresponding to the decoding layer 2017, and θ^T(·) and φ(·) are convolution operations.
  • According to the spatial attention weight distribution corresponding to the decoding layer 2017, each pixel in the first feature image to be trained corresponding to the decoding layer 2017 is calibrated to obtain the second feature image to be trained corresponding to the decoding layer 2017, which then needs to be trained using the attention mechanism in the channel dimension.
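  • A minimal sketch of this kind of dot-product spatial attention is given below. It is only illustrative: the class and parameter names are assumptions, the encoder feature and the feature to be trained are assumed to share the same spatial size, and the exact channel handling of the module in FIG. 3 may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalSpatialAttention(nn.Module):
    """Dot-product spatial attention in the spirit of formula (1-1)."""

    def __init__(self, skip_channels: int, channels: int, inter_channels: int = 64):
        super().__init__()
        self.theta = nn.Conv2d(skip_channels, inter_channels, kernel_size=1)  # 1x1 conv on the encoder feature
        self.phi = nn.Conv2d(channels, inter_channels, kernel_size=1)         # 1x1 conv on the feature to be trained

    def forward(self, skip: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # skip: first feature image from the coding layer at the same scale
        # x:    first feature image to be trained in the decoding layer
        b, c, h, w = x.shape
        q = self.theta(skip).flatten(2)                       # (b, k, h*w)
        kproj = self.phi(x).flatten(2)                        # (b, k, h*w)
        attn = F.softmax(q.transpose(1, 2) @ kproj, dim=-1)   # (b, h*w, h*w): each pixel vs. all other pixels
        out = x.flatten(2) @ attn.transpose(1, 2)             # re-weight the feature to be trained
        return out.view(b, c, h, w)
```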
  • Using the first feature image corresponding to the scale of the decoding layer and training the second feature image input to the decoding layer by using the attention mechanism in the spatial dimension and the channel dimension to determine the third feature image corresponding to the decoding layer may include: splicing the first feature image corresponding to the scale of the decoding layer and the second feature image input to the decoding layer to determine the second feature image to be trained; training the second feature image to be trained by using the attention mechanism in the channel dimension to determine the first feature image to be trained; and using the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using the attention mechanism in the spatial dimension, so as to determine the third feature image corresponding to the decoding layer.
  • That is, the first feature image corresponding to the scale of the decoding layer and the second feature image input to the decoding layer can first be spliced to determine the second feature image to be trained, which is then trained by using the attention mechanism in the channel dimension; the attention mechanism is then used in the spatial dimension to train the first feature image to be trained obtained by the channel-dimension attention training, so as to determine the third feature image corresponding to the decoding layer.
  • the embodiments of the present application may first use the attention mechanism in the channel dimension to train the second feature image to be trained and then use the attention mechanism in the spatial dimension to train the first feature image to be trained, or may first use the attention mechanism in the spatial dimension to train the first feature image to be trained and then use the attention mechanism in the channel dimension to train the second feature image to be trained, which is not specifically limited in the embodiment of this application.
  • the following uses the attention mechanism to train the first feature image to be trained in the spatial dimension, and then uses the attention mechanism to train the second feature image to be trained in the channel dimension as an example for detailed introduction.
  • Using the first feature image corresponding to the scale of the decoding layer and training the second feature image input to the decoding layer by using the attention mechanism in the spatial dimension and the channel dimension to determine the third feature image corresponding to the decoding layer may include: using the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using an attention mechanism in the spatial dimension to determine the fourth feature image corresponding to the decoding layer, where the first feature image to be trained is the second feature image input to the decoding layer; splicing the second feature image input to the decoding layer and the fourth feature image corresponding to the decoding layer to obtain the second feature image to be trained; and training the second feature image to be trained by using the attention mechanism in the channel dimension to determine the third feature image corresponding to the decoding layer.
  • using the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using an attention mechanism in the spatial dimension includes: determining, according to the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained, the spatial attention weight distribution corresponding to the decoding layer, where the spatial attention weight distribution corresponding to the decoding layer is used to indicate the weight of each pixel in the first feature image to be trained; and calibrating each pixel in the first feature image to be trained according to the spatial attention weight distribution corresponding to the decoding layer.
  • the decoding layer includes a plurality of spatial attention training layers; determining the spatial attention weight distribution corresponding to the decoding layer according to the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained includes: inputting the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained into the multiple spatial attention training layers to determine multiple weights of each pixel in the first feature image to be trained; and determining the spatial attention weight distribution corresponding to the decoding layer according to the multiple weights of each pixel in the first feature image to be trained.
  • For example, the third feature image (28*37 scale, 128 channels) corresponding to the decoding layer 2017 is up-sampled to obtain the second feature image (56*75 scale, 64 channels) input to the decoding layer 2018; the second feature image input to the decoding layer 2018 is used as the first feature image to be trained corresponding to the decoding layer 2018 and input to the spatial attention module 2025 for spatial attention training, to obtain the fourth feature image (56*75 scale, 64 channels) corresponding to the decoding layer 2018; the second feature image input to the decoding layer 2018 and the fourth feature image corresponding to the decoding layer 2018 are spliced to obtain the second feature image to be trained (56*75 scale, 128 channels) corresponding to the decoding layer 2018; and after channel attention training, the third feature image corresponding to the decoding layer 2018 is obtained.
  • FIG. 4 is a schematic structural diagram of the spatial attention module 2025 in FIG. 2 provided by an embodiment of the application.
  • the spatial attention module 2025 includes two spatial attention training layers 2037 to 2038; the first feature image corresponding to the scale of the decoding layer 2018 (the first feature image corresponding to the coding layer 2005) is used as the source value of the query (query), and the first feature image to be trained corresponding to the decoding layer 2018 is used as the key value of the query (key); the two are input into the spatial attention training layer 2037 and the spatial attention training layer 2038, respectively.
  • the number of spatial attention training layers can be determined according to actual conditions, which is not specifically limited in the embodiment of the present application.
  • each spatial attention training layer includes multiple 1×1 convolutional layers 2039, an up-sampling layer 2040, an activation layer (rectified linear unit (ReLU) layer) 2041, an activation layer (Sigmoid layer) 2042, and a resample layer (Resample layer) 2043.
  • Any one of the spatial attention training layers in the spatial attention module 2025 can determine the weight of each pixel in the first feature image to be trained corresponding to the decoding layer 2018. For example, for any spatial attention training layer in the spatial attention module 2025, the weight α_i of the pixel i in the first feature image to be trained corresponding to the decoding layer 2018 can be determined according to the following formula (1-2):
    α_i = sigmoid(Conv_1,1(ReLU(Conv_1,1(F_i) + Conv_1,1(F_g) + b)))   (1-2)

  • where sigmoid(·) is an activation function, F_i is the first feature image corresponding to the scale of the decoding layer 2018, F_g is the first feature image to be trained corresponding to the decoding layer 2018, Conv_1,1(·) is a 1×1 convolution, ReLU(·) is an activation function, and b is a bias term.
  • According to the multiple weights of each pixel in the first feature image to be trained, the spatial attention weight distribution corresponding to the decoding layer 2018 is determined; then, according to the spatial attention weight distribution corresponding to the decoding layer 2018, each pixel in the first feature image to be trained corresponding to the decoding layer 2018 is calibrated to obtain the fourth feature image corresponding to the decoding layer 2018.
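  • A hedged sketch of one such spatial attention training layer is shown below; the additive gate structure follows formula (1-2), but the class name, channel sizes and the handling of the up-sampling/resample layers are assumptions, and the two inputs are assumed to have already been brought to the same spatial size.

```python
import torch
import torch.nn as nn

class SpatialAttentionTrainingLayer(nn.Module):
    """Additive spatial attention gate following formula (1-2)."""

    def __init__(self, enc_channels: int, dec_channels: int, inter_channels: int):
        super().__init__()
        self.w_f = nn.Conv2d(enc_channels, inter_channels, kernel_size=1, bias=False)  # Conv_1,1 on F_i
        self.w_g = nn.Conv2d(dec_channels, inter_channels, kernel_size=1, bias=True)   # Conv_1,1 on F_g, carries b
        self.psi = nn.Conv2d(inter_channels, 1, kernel_size=1)                          # outer Conv_1,1
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, f_i: torch.Tensor, f_g: torch.Tensor) -> torch.Tensor:
        # f_i: first feature image at the scale of the decoding layer (from the coding layer)
        # f_g: first feature image to be trained (assumed resampled to the same spatial size)
        alpha = self.sigmoid(self.psi(self.relu(self.w_f(f_i) + self.w_g(f_g))))  # one weight per pixel
        return f_g * alpha  # first feature image to be trained, calibrated by the spatial weights

# With several such layers, their per-pixel weights could, for example, be averaged to form
# the spatial attention weight distribution of the decoding layer (the combination rule is an assumption).
```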
  • the manner of determining the fourth feature image corresponding to the decoding layer 2019 and the fourth feature image corresponding to the decoding layer 2020 is similar to the manner of determining the fourth feature image corresponding to the decoding layer 2018, and will not be repeated here.
  • the structures of the spatial attention module 2028 and the spatial attention module 2031 are similar to the spatial attention module 2025, and will not be repeated here.
  • the trained segmentation network can enhance the spatial feature information of the region of interest in the image and suppress the spatial feature information of the uninterested region in the image during image segmentation processing, which in turn can improve the segmentation accuracy of the segmentation network.
  • Then, the second feature image input to the decoding layer and the fourth feature image corresponding to the decoding layer are spliced (channel cascading) to obtain the second feature image to be trained corresponding to the decoding layer. For example, for the decoding layer 2018, the second feature image (56*75 scale, 64 channels) input to the decoding layer 2018 and the fourth feature image (56*75 scale, 64 channels) corresponding to the decoding layer 2018 are subjected to channel cascade splicing, and the second feature image to be trained (56*75 scale, 128 channels) corresponding to the decoding layer 2018 is obtained.
  • training the second feature image to be trained by using the attention mechanism in the channel dimension includes: determining the channel attention weight distribution corresponding to the decoding layer, where the channel attention weight distribution corresponding to the decoding layer is used to indicate the weight of each channel in the second feature image to be trained; and calibrating each channel in the second feature image to be trained according to the channel attention weight distribution corresponding to the decoding layer.
  • determining the channel attention weight distribution corresponding to the decoding layer includes: performing an average pooling operation on the second feature image to be trained to obtain an average pooling result; performing a maximum pooling operation on the second feature image to be trained to obtain a maximum pooling result; and determining the channel attention weight distribution corresponding to the decoding layer according to the average pooling result and the maximum pooling result.
  • FIG. 5 is a schematic structural diagram of the channel attention module 2026 in FIG. 2 provided by an embodiment of the application.
  • the channel attention module 2026 includes a maximum pooling layer 2044, an average pooling layer 2045, a fully connected layer (Fully Connected layer, FC layer) 2046, an activation layer (ReLU layer) 2047, and a fully connected layer (FC layer) 2048; the second feature image to be trained corresponding to the decoding layer 2018 passes through the maximum pooling layer 2044, the average pooling layer 2045, the fully connected layer (FC layer) 2046, the activation layer (ReLU layer) 2047, and the fully connected layer (FC layer) 2048 to determine the channel attention weight distribution corresponding to the decoding layer 2018.
  • the channel attention module 2026 may determine the weight β_j of the channel j in the second feature image to be trained corresponding to the decoding layer 2018 by the following formula (1-3):
    β_j = sigmoid(L_1(L_0(P_avg(F_j))) + L_1(L_0(P_max(F_j))))   (1-3)
  • where sigmoid(·) is the activation function, F_j is the second feature image to be trained corresponding to the decoding layer 2018, L_0(·) is the fully connected operation followed by the ReLU operation, L_1(·) is the fully connected operation, P_avg(·) is the average pooling function, and P_max(·) is the maximum pooling function.
  • Then, each channel in the second feature image to be trained corresponding to the decoding layer 2018 is calibrated according to the channel attention weight distribution corresponding to the decoding layer 2018 to obtain the third feature image corresponding to the decoding layer 2018.
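  • A minimal sketch of this channel attention computation is given below; the reduction ratio of the fully connected layers and the class name are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention following formula (1-3): avg/max pooling -> FC -> ReLU -> FC -> sigmoid."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)   # P_avg
        self.max_pool = nn.AdaptiveMaxPool2d(1)   # P_max
        self.l0 = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True))  # L_0
        self.l1 = nn.Linear(channels // reduction, channels)                                        # L_1
        self.sigmoid = nn.Sigmoid()

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f.shape
        avg = self.l1(self.l0(self.avg_pool(f).view(b, c)))
        mx = self.l1(self.l0(self.max_pool(f).view(b, c)))
        beta = self.sigmoid(avg + mx).view(b, c, 1, 1)    # channel attention weight distribution
        return f * beta                                    # each channel calibrated by its weight
```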
  • the method of determining the third feature image corresponding to the decoding layer 2017, the third feature image corresponding to the decoding layer 2019, and the third feature image corresponding to the decoding layer 2020 is similar to the method of determining the third feature image corresponding to the decoding layer 2018, and will not be repeated here.
  • the structures of the channel attention module 2023, the channel attention module 2029, and the channel attention module 2032 are similar to that of the channel attention module 2026, and will not be repeated here.
  • the trained segmentation network can enhance the channel feature information of the region of interest in the image and suppress the channel feature information of the uninterested region in the image during image segmentation processing, which in turn can improve the segmentation accuracy of the segmentation network.
  • determining the feature extraction result according to the multiple third feature images of different scales determined by the multiple decoding layers includes: stitching the third feature images of different scales to obtain the third feature image to be trained, wherein the scale of the third feature image to be trained is the same as the scale of the sample image; and training the third feature image to be trained by using the attention mechanism in the scale dimension to determine the feature extraction result.
  • training the third feature image to be trained by using the attention mechanism in the scale dimension includes: determining the scale attention weight distribution, where the scale attention weight distribution is used to indicate the weights of different scales; and calibrating the third feature image to be trained according to the scale attention weight distribution.
  • the segmentation network also includes a scale attention module 2049.
  • the third feature image corresponding to the decoding layer 2017, the third feature image corresponding to the decoding layer 2018, the third feature image corresponding to the decoding layer 2019, and the third feature image corresponding to the decoding layer 2020 are stitched together. Before stitching, the third feature image corresponding to the decoding layer 2017 (28*37 scale), the third feature image corresponding to the decoding layer 2018 (56*75 scale), and the third feature image corresponding to the decoding layer 2019 (112*150 scale) are all up-sampled to the 224*300 scale (the same scale as the sample image); during the stitching process, the third feature image corresponding to each decoding layer retains only 4 channels, and after stitching, a sixth feature image (224*300 scale, 16 channels) is obtained.
  • the sixth feature image is input to the scale attention module 2049 for attention training in the scale dimension.
  • FIG. 6 is a schematic structural diagram of the scale attention module 2049 in FIG. 2 provided by an embodiment of this application.
  • the scale attention module 2049 includes a maximum pooling layer 2050, an average pooling layer 2051, a fully connected layer (FC layer) 2052, an activation layer (ReLU layer) 2053, a fully connected layer (FC layer) 2054, a convolutional layer 2055, an activation layer (ReLU layer) 2056, a convolutional layer 2057, and an activation layer (Sigmoid layer) 2058.
  • the sixth feature image is input to the scale attention module 2049; the maximum pooling operation is performed through the maximum pooling layer 2050 to obtain the maximum pooling result, and the average pooling operation is performed through the average pooling layer 2051 to obtain the average pooling result; then the average pooling result and the maximum pooling result respectively pass through the fully connected layer (FC layer) 2052, the activation layer (ReLU layer) 2053, and the fully connected layer (FC layer) 2054 to determine the scale attention weight distribution.
  • the scale attention module 2049 can determine the weight λ_s of the scale s through the following formula (1-4):
    λ_s = Sigmoid(L_1(L_0(P_avg(F))) + L_1(L_0(P_max(F))))   (1-4)
  • where Sigmoid(·) is the activation function, F is the sixth feature image, L_0(·) is the fully connected operation followed by the ReLU operation, L_1(·) is the fully connected operation, P_avg(·) is the average pooling function, and P_max(·) is the maximum pooling function.
  • the sixth feature image is calibrated for the first time based on the scale attention weight distribution, and the sixth feature image after the first calibration is obtained.
  • In this way, the trained segmentation network can enhance the feature information at appropriate scales and suppress the feature information at inappropriate scales in the image when performing image segmentation, which in turn can improve the segmentation accuracy of the segmentation network.
  • the weight of each pixel used in the second calibration of the sixth feature image can be determined by the following formula (1-5):

    ω' = Sigmoid(ψ_2(ReLU(ψ_1(F'))))   (1-5)

  • where Sigmoid(·) is the activation function, ReLU(·) is the activation function, ψ_1(·) is a convolution operation followed by a batch normalization operation, ψ_2(·) is a convolution operation followed by a batch normalization operation, and F' is the sixth feature image after the first calibration.
  • According to the weight of each pixel in the sixth feature image after the first calibration, each pixel in the sixth feature image after the first calibration is recalibrated to obtain the sixth feature image after the second calibration, and the sixth feature image after the second calibration is determined as the feature extraction result of the sample image.
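  • A hedged sketch of the scale attention module combining both calibrations is given below; the 16-channel input (4 channels per scale) follows the example above, while the kernel sizes of the convolutional layers and the reduction ratio are assumptions.

```python
import torch
import torch.nn as nn

class ScaleAttention(nn.Module):
    """Scale attention: first calibration per scale group (1-4), second calibration per pixel (1-5)."""

    def __init__(self, channels: int = 16, reduction: int = 4):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.l0 = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True))
        self.l1 = nn.Linear(channels // reduction, channels)
        self.psi1 = nn.Sequential(nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                                  nn.BatchNorm2d(channels))   # psi_1: convolution + batch normalization
        self.psi2 = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=3, padding=1),
                                  nn.BatchNorm2d(1))           # psi_2: convolution + batch normalization
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f.shape
        # First calibration, formula (1-4): one weight per channel/scale group.
        lam = self.sigmoid(self.l1(self.l0(self.avg_pool(f).view(b, c)))
                           + self.l1(self.l0(self.max_pool(f).view(b, c)))).view(b, c, 1, 1)
        f1 = f * lam
        # Second calibration, formula (1-5): one weight per pixel of the once-calibrated image.
        omega = self.sigmoid(self.psi2(self.relu(self.psi1(f1))))
        return f1 * omega   # sixth feature image after the second calibration (feature extraction result)
```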
  • the segmentation network also includes a classifier (class) 2059 and a normalization layer (Softmax layer) 2060.
  • the segmentation loss of the segmentation network is determined according to the segmentation result of the sample image and the segmentation label information corresponding to the sample image, and the network parameters of the segmentation network are adjusted according to the segmentation loss; the segmentation network is trained iteratively until the segmentation loss of the segmentation network converges or the number of iterations reaches a preset number.
  • the DICE loss function, the Soft Dice loss function, the Cross Entropy loss function, the Focal loss function, or other loss functions can be used to determine the segmentation loss, which is not specifically limited in the embodiment of the present application.
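  • As an illustration, a minimal soft Dice segmentation loss could look like the sketch below; the smoothing constant and the per-class averaging are assumptions, not the patent's exact formulation.

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # pred:   (batch, classes, h, w) probabilities after Softmax
    # target: one-hot encoded labels of the same shape
    dims = (0, 2, 3)
    intersection = (pred * target).sum(dims)
    union = pred.sum(dims) + target.sum(dims)
    dice = (2.0 * intersection + eps) / (union + eps)  # per-class Dice coefficient
    return 1.0 - dice.mean()                           # loss decreases as the overlap improves
```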
  • In the embodiment of the present application, the segmentation network is trained with comprehensive attention in the spatial dimension, the channel dimension, and the scale dimension, so that the trained segmentation network can improve the segmentation accuracy when performing image segmentation. This is suitable for medical image segmentation problems, for example, the segmentation of tumors, tissue damage and necrosis, and specific organs, and assists the doctor in judging the condition or making a more accurate assessment of the patient's health.
  • In this way, the attention mechanism is used in the preset dimensions to perform feature extraction on the sample image included in the training sample to obtain the feature extraction result, the sample image is subjected to image segmentation processing according to the feature extraction result to obtain the image segmentation result, and the segmentation network is trained according to the image segmentation result and the segmentation label information corresponding to the sample image included in the training sample, so that the trained segmentation network can improve the segmentation accuracy when performing image segmentation processing.
  • the embodiment of the application provides a network training method, which is applied to medical image analysis.
  • the network training method can be executed by a terminal device or other processing equipment; the terminal device can be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc.
  • Other processing equipment can be servers or cloud servers.
  • the network training method can be implemented by a processor invoking computer-readable instructions stored in a memory.
  • the method can include:
  • Step S31, the medical image is preprocessed, and the image is cropped and normalized.
  • Step S32, a U-Net network model with a very stable effect in medical image analysis is selected as the backbone network. A dot-product summation method is used to relate each pixel to the correlation of all other pixels, and the decoded information is used to query the features at the same level of the encoding process; this step is the spatial attention method.
  • Step S33, channel attention is embedded in the middle of each decoding layer, where the average pooling and maximum pooling information are used at the same time to calibrate the feature channel information of the current layer.
  • Step S34, the intermediate output of each decoding layer is unified to the same size as the original input image through up-sampling, the channels containing features of different scales are spliced, and finally an attention mechanism is introduced for the different scale information.
  • Step S35 Perform image segmentation on the sample image to obtain a segmentation result of the sample image.
  • the segmentation result is compared with the gold standard labeled manually (including but not limited to by doctors, nurses, etc.), and the loss function is iteratively optimized using the gradient descent method through the backpropagation algorithm to update the model parameters.
  • the loss function adopts the DICE loss function for segmentation.
  • the network training method provided by the embodiments of the present application introduces an attention mechanism in multiple feature dimensions on a network widely used in medical imaging; compared with previous attention mechanisms, it strengthens the attention to the region of interest and improves the self-adaptability of the network.
  • While greatly improving the network's ability on segmentation tasks, the network adds only a small number of parameters and little computational overhead; therefore, the network training method can be well adapted to devices with memory constraints.
  • FIG. 7 is a schematic flowchart of an image processing method provided by an embodiment of the application.
  • the image processing method can be executed by a terminal device or other processing device, where the terminal device can be a user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc.
  • Other processing equipment can be servers or cloud servers.
  • the image processing method may be implemented by a processor invoking computer-readable instructions stored in the memory. As shown in Figure 7, the method may include:
  • Step S71: image segmentation processing is performed on the image to be segmented through the segmentation network to obtain a segmentation result; wherein the segmentation network is obtained by training using the network training method of the foregoing embodiment.
  • the segmentation network trained in the foregoing embodiment may be used to perform image segmentation processing on the image to be processed.
  • the image to be processed is input to a segmentation network, and the output of the segmentation network is an image segmentation result of the image to be processed. Since the segmentation network is trained using the attention mechanism in the spatial dimension, channel dimension and scale dimension, the segmentation accuracy of the image to be processed by the segmentation network is improved.
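  • Purely as an illustration of this inference step, the sketch below runs a trained segmentation network on a preprocessed image tensor and thresholds the output into a binary mask; the sigmoid output and the 0.5 threshold are assumptions.

```python
import torch

def segment(model, image):
    """Sketch: apply the trained segmentation network to one preprocessed image tensor."""
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0))             # add batch dimension
        mask = (torch.sigmoid(logits) > 0.5).float()   # assumed binarization rule
    return mask.squeeze(0)
```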
  • the image processing method provided in the embodiments of the present application may include:
  • Step S701: the dermoscopy picture is preprocessed; the picture is resampled to a size of 224*300 and then normalized to values between 0 and 1.
  • Step S702: the dermoscopy picture preprocessed to 3*224*300 is input into the network as training data.
  • before being fed into the network for training, the images are randomly flipped, rotated, and cropped for data augmentation, and then the augmented training data and the corresponding annotations are input into the network for training.
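  • A minimal sketch of this preprocessing and augmentation pipeline is given below, assuming the pictures are ordinary RGB files; the interpolation modes, rotation range, and crop margin are assumptions, and the same geometric transforms would also have to be applied to the annotation masks.

```python
import random
import numpy as np
from PIL import Image

def load_and_augment(path):
    """Sketch: random flip/rotation/crop, resample to 224*300, normalize to [0, 1]."""
    img = Image.open(path).convert('RGB')
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)                        # random flip
    img = img.rotate(random.uniform(-15, 15), resample=Image.BILINEAR)    # random rotation
    w, h = img.size
    left, top = random.randint(0, w // 10), random.randint(0, h // 10)
    img = img.crop((left, top, w - w // 10 + left, h - h // 10 + top))    # random crop
    img = img.resize((300, 224), Image.BILINEAR)                          # resample to 224*300
    arr = np.asarray(img, dtype=np.float32) / 255.0                       # normalize to [0, 1]
    return arr.transpose(2, 0, 1)                                         # 3*224*300 layout
```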
  • Step S703: a network structure based on a Fully Convolutional Network (FCN) or U-Net is adopted, and the dermoscopy image of size 3*224*300 is passed through convolution layers at different levels.
  • the feature image at a given resolution (for example, 32*32*32) in the down-sampling path is fused with the feature image of the same size in the up-sampling path, and then combined with the spatial attention mechanism.
  • the resulting feature combines the local and global information in the image while strengthening attention on the relevant feature regions.
  • Step S704: an improved channel attention mechanism is inserted in the middle of the convolution operations on the up-sampled features of size 3*224*300. For each decoding layer, the intermediate feature results are then upsampled to the size of the input picture, and the scale attention mechanism strengthens attention across feature scales. Finally, the segmentation results are compared with the original annotated segmentation results, and the DICE loss function, the intersection-over-union (IOU) loss function, or another loss function is used to calculate the loss and form the final loss function.
  • the backpropagation algorithm is used to update the model parameters according to the loss function, and the model is iteratively optimized until it converges or the maximum number of iterations is reached.
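  • A minimal training-loop sketch under these assumptions is shown below; the Adam optimizer, the learning rate, the epoch count, and the inline DICE loss are illustrative choices, not details fixed by the method.

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    # Soft DICE loss, as in the earlier sketch.
    inter = (pred * target).flatten(1).sum(dim=1)
    union = pred.flatten(1).sum(dim=1) + target.flatten(1).sum(dim=1)
    return 1 - ((2 * inter + eps) / (union + eps)).mean()

def train(model, loader, epochs=100, lr=1e-4):
    """Sketch: forward pass, DICE loss, backpropagation, gradient-descent update."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for image, label in loader:
            pred = torch.sigmoid(model(image))   # per-pixel foreground probability
            loss = dice_loss(pred, label)        # compare with the annotation
            opt.zero_grad()
            loss.backward()                      # backpropagation
            opt.step()                           # update model parameters
```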
  • Step S705: the trained model is used to perform image processing on the dermoscopy picture to be processed to obtain a segmentation result.
  • the DICE coefficient, IOU, or average symmetric surface distance (ASSD) can be used as evaluation indicators to evaluate the training effect of the network.
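  • By way of illustration, the DICE coefficient and IOU for binary masks can be computed as in the sketch below; ASSD additionally requires surface-distance computations and is omitted here, and the epsilon smoothing term is an assumption.

```python
import numpy as np

def dice_and_iou(pred, target, eps=1e-6):
    """Sketch: DICE coefficient and IOU between two binary segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    dice = 2 * inter / (pred.sum() + target.sum() + eps)
    iou = inter / (np.logical_or(pred, target).sum() + eps)
    return dice, iou
```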
  • the image processing method provided in the embodiments of this application adopts a fully attention-based network approach, which is highly versatile for medical image segmentation problems and can also be applied to MRI, CT, ultrasound, X-ray, and other medical images for segmenting lesion regions such as tumors, tissue damage, and necrosis, or specific organs. Only the data parameters input to the network need to be set to achieve training and testing for different tasks.
  • the tumor or organ that needs to be segmented can be segmented in real time, so that CT radiotherapy target delineation, remote medical diagnosis, cloud-platform-assisted intelligent diagnosis, and the like can be realized, assisting doctors in judging the condition of a disease or making more accurate evaluations of a patient's health.
  • the intelligent diagnosis equipment based on the image processing method provided by the embodiments of this application can be deployed on cloud platforms, large servers, and mobile devices alike, so that imaging doctors, clinicians, and others can conveniently use various devices for real-time viewing according to diagnostic needs.
  • this application also provides network training apparatuses, image processing apparatuses, electronic devices, computer-readable storage media, and programs, all of which can be used to implement any of the network training methods and image processing methods provided in this application; for the corresponding technical solutions and descriptions, refer to the corresponding records in the method section, which will not be repeated here.
  • FIG. 8 is a schematic structural diagram of a network training device provided by an embodiment of the application. As shown in Fig. 8, the device 80 includes:
  • the feature extraction module 81 is configured to perform, through the segmentation network, feature extraction on the sample images included in the training samples by using the attention mechanism in preset dimensions to obtain feature extraction results, where the preset dimensions include the spatial dimension, the channel dimension, and the scale dimension, and the training sample also includes segmentation annotation information corresponding to the sample image;
  • the segmentation module 82 is configured to perform image segmentation processing on the sample image according to the feature extraction result to obtain an image segmentation result
  • the training module 83 is configured to train the segmentation network according to the image segmentation result and segmentation label information.
  • the segmentation network includes an encoder and a decoder, the encoder includes multiple coding layers, and the decoder includes multiple decoding layers;
  • the feature extraction module 81 includes:
  • the first determining submodule is configured to input the sample image into the encoder and determine the first feature image corresponding to each coding layer, wherein the scales of the first feature images corresponding to different coding layers are different;
  • the second determining sub-module is configured to, for any decoding layer, use the first feature image corresponding to the scale of that decoding layer and apply the attention mechanism in the spatial dimension and the channel dimension to train the second feature image input to the decoding layer, so as to determine the third feature image corresponding to the decoding layer, where the second feature image input to the decoding layer is determined according to the third feature image corresponding to the previous decoding layer of the decoding layer, and the third feature images corresponding to different decoding layers have different scales;
  • the third determining submodule is configured to determine a feature extraction result according to a plurality of third feature images of different scales determined by a plurality of decoding layers.
  • the second determining submodule includes:
  • the first training unit is configured to use the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using the attention mechanism in the spatial dimension, so as to determine the fourth feature image corresponding to the decoding layer, where the first feature image to be trained is the second feature image input to the decoding layer;
  • the first splicing unit is configured to splice the second feature image input to the decoding layer and the fourth feature image corresponding to the decoding layer to obtain a second feature image to be trained;
  • the second training unit is configured to train the second feature image to be trained by using the attention mechanism in the channel dimension to determine the third feature image corresponding to the decoding layer.
  • the second determining submodule includes:
  • the second splicing unit is configured to splice the first feature image corresponding to the scale of the decoding layer and the second feature image input to the decoding layer to determine the second feature image to be trained;
  • the second training unit is configured to train the second feature image to be trained by using an attention mechanism in the channel dimension to determine the first feature image to be trained;
  • the first training unit uses the first feature image corresponding to the scale of the decoding layer to train the first feature image to be trained by using the attention mechanism in the spatial dimension to determine the third feature image corresponding to the decoding layer.
  • the first training unit includes:
  • the first determining subunit is configured to determine the spatial attention weight distribution corresponding to the decoding layer according to the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained, where the spatial attention weight distribution corresponding to the decoding layer is used to indicate the weight of each pixel in the first feature image to be trained;
  • the first calibration subunit is configured to calibrate each pixel in the first feature image to be trained according to the spatial attention weight distribution corresponding to the decoding layer.
  • the decoding layer includes multiple spatial attention training layers;
  • the specific configuration of the first determining subunit is:
  • the first feature image corresponding to the scale of the decoding layer and the first feature image to be trained are input into the multiple spatial attention training layers, and multiple weights of each pixel in the first feature image to be trained are determined;
  • the spatial attention weight distribution corresponding to the decoding layer is then determined according to the multiple weights.
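  • One possible reading of this multi-layer arrangement is sketched below: each spatial attention training layer produces its own per-pixel weight map from the two inputs, and the maps are averaged into the final spatial attention weight distribution. The number of layers, the 1x1 convolutions, and the averaging rule are assumptions.

```python
import torch
import torch.nn as nn

class MultiLayerSpatialAttention(nn.Module):
    """Sketch: several spatial attention layers whose per-pixel weights are averaged."""
    def __init__(self, channels, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Conv2d(2 * channels, 1, kernel_size=1) for _ in range(num_layers)]
        )

    def forward(self, enc_feat, dec_feat):
        x = torch.cat([enc_feat, dec_feat], dim=1)                     # fuse the two inputs
        weights = [torch.sigmoid(layer(x)) for layer in self.layers]   # one weight map per layer
        attn = torch.stack(weights, dim=0).mean(dim=0)                 # combined distribution
        return dec_feat * attn                                         # calibrate each pixel
```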
  • the second training unit includes:
  • the second determining subunit is configured to determine the channel attention weight distribution corresponding to the decoding layer, where the channel attention weight distribution corresponding to the decoding layer is used to indicate the weight of each channel in the second feature image to be trained;
  • the second calibration subunit is configured to calibrate each channel in the second feature image to be trained according to the channel attention weight distribution corresponding to the decoding layer.
  • the second determining subunit is specifically configured to determine the channel attention weight distribution corresponding to the decoding layer.
  • the third determining sub-module includes:
  • the third splicing unit is configured to splice third feature images at different scales to obtain a third feature image to be trained, wherein the scale of the third feature image to be trained is the same as the scale of the sample image;
  • the determining unit is configured to train the third feature image to be trained by using the attention mechanism in the scale dimension to determine the feature extraction result.
  • the determining unit is specifically configured to calibrate the third feature image to be trained according to the scale attention weight distribution.
  • the sample image is a medical image
  • the segmentation annotation information is a manually annotated gold standard.
  • FIG. 9 is a schematic structural diagram of an image processing apparatus provided by an embodiment of this application. As shown in Fig. 9, the device 90 includes:
  • the image processing module 91 is configured to perform image segmentation processing on the image to be segmented through a segmentation network to obtain a segmentation result;
  • the segmentation network is obtained by training using the network training method of the foregoing embodiment.
  • the image to be segmented is a medical image to be segmented; the image processing module 91 is configured to perform image segmentation processing on the medical image to be segmented through a segmentation network to obtain the segmented lesion area or target organ area.
  • the functions or modules included in the device provided in the embodiments of the application can be configured to execute the methods described in the above method embodiments.
  • the embodiment of the present application also proposes a computer-readable storage medium on which computer program instructions are stored, and the computer program instructions implement the foregoing method when executed by a processor.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • An embodiment of the present application also proposes an electronic device, including: a processor; a memory configured to store executable instructions of the processor; wherein the processor is configured to call the instructions stored in the memory to execute the above method.
  • the embodiments of the present application also provide a computer program product including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for the network training/image processing method provided in the above embodiments.
  • the embodiments of the present application also provide another computer program product configured to store computer-readable instructions, which when executed, cause the computer to perform the operations of the network training/image processing method provided by any of the foregoing embodiments.
  • the electronic device can be provided as a terminal, server or other form of device.
  • FIG. 10 is a schematic diagram of an electronic device provided by an embodiment of the application.
  • the electronic device 1000 may be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and other terminals.
  • the electronic device 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power supply component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
  • the processing component 1002 generally controls overall operations of the electronic device 1000, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations.
  • the processing component 1002 may include one or more processors 1020 to execute instructions to complete all or part of the steps of the foregoing method.
  • the processing component 1002 may include one or more modules to facilitate the interaction between the processing component 1002 and other components.
  • the processing component 1002 may include a multimedia module to facilitate the interaction between the multimedia component 1008 and the processing component 1002.
  • the memory 1004 is configured to store various types of data to support operations in the electronic device 1000. Examples of these data include instructions for any application or method operating on the electronic device 1000, contact data, phone book data, messages, pictures, videos, etc.
  • the memory 1004 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
  • the power supply component 1006 provides power for various components of the electronic device 1000.
  • the power supply component 1006 may include a power management system, one or more power supplies, and other components associated with the generation, management, and distribution of power for the electronic device 1000.
  • the multimedia component 1008 includes a screen that provides an output interface between the electronic device 1000 and the user.
  • the screen may include a liquid crystal display (Liquid Crystal Display, LCD) and a touch panel (Touch Pad, TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
  • the multimedia component 1008 includes a front camera and/or a rear camera. When the electronic device 1000 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 1010 is configured to output and/or input audio signals.
  • the audio component 1010 includes a microphone (Microphone, MIC).
  • the microphone is configured to receive external audio signals.
  • the received audio signal may be further stored in the memory 1004 or transmitted via the communication component 1016.
  • the audio component 1010 further includes a speaker configured to output audio signals.
  • the I/O interface 1012 provides an interface between the processing component 1002 and a peripheral interface module.
  • the above-mentioned peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include, but are not limited to: home button, volume button, start button, and lock button.
  • the sensor component 1014 includes one or more sensors configured to provide the electronic device 1000 with various aspects of state evaluation.
  • the sensor component 1014 can detect the on/off status of the electronic device 1000 and the relative positioning of components, for example the display and keypad of the electronic device 1000; the sensor component 1014 can also detect a change in position of the electronic device 1000 or a component of the electronic device 1000, the presence or absence of contact between the user and the electronic device 1000, the orientation or acceleration/deceleration of the electronic device 1000, and a change in temperature of the electronic device 1000.
  • the sensor assembly 1014 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact.
  • the sensor component 1014 may also include a light sensor, such as a complementary metal oxide semiconductor (Complementary Metal Oxide Semiconductor, CMOS) or a charge coupled device (Charge Coupled Device, CCD) image sensor, configured to be used in imaging applications.
  • the sensor component 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • the communication component 1016 is configured to facilitate wired or wireless communication between the electronic device 1000 and other devices.
  • the electronic device 1000 can access a wireless network based on communication standards, such as Wireless Fidelity (WiFi), second generation (2G) or third generation (3G) mobile communication, or a combination thereof.
  • the communication component 1016 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 1016 further includes a Near Field Communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wide Band (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • the electronic device 1000 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, configured to perform the above methods.
  • a non-volatile computer-readable storage medium such as a memory 1004 including computer program instructions, which can be executed by the processor 1020 of the electronic device 1000 to complete the foregoing method.
  • Fig. 11 shows a block diagram of an electronic device according to an embodiment of the present application.
  • the electronic device 1100 may be provided as a server.
  • the electronic device 1100 includes a processing component 1122, which further includes one or more processors, and a memory resource represented by a memory 1132, configured to store instructions executable by the processing component 1122, such as an application program.
  • the application program stored in the memory 1132 may include one or more modules each corresponding to a set of instructions.
  • the processing component 1122 is configured to execute instructions to perform the above-mentioned method.
  • the electronic device 1100 may also include a power component 1126 configured to perform power management of the electronic device 1100, a wired or wireless network interface 1150 configured to connect the electronic device 1100 to a network, and an input output (I/O) interface 1158 .
  • the electronic device 1100 can operate based on an operating system stored in the memory 1132, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM or the like.
  • a non-volatile computer-readable storage medium such as a memory 1132 including computer program instructions, which can be executed by the processing component 1122 of the electronic device 1100 to complete the foregoing method.
  • the embodiments of the present application may be systems, methods, and/or computer program products.
  • the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the embodiments of the present application.
  • the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: a portable computer disk, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or raised structures in a groove on which instructions are stored, and any suitable combination of the above.
  • the computer-readable storage medium used here is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse through a fiber-optic cable), or an electrical signal transmitted through a wire.
  • the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network, and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device .
  • the computer program instructions used to perform the operations of the embodiments of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, executed as a stand-alone software package, partly on the user's computer and partly executed on a remote computer, or entirely on the remote computer or server implement.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), can be personalized by using the state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement various aspects of the embodiments of the present application.
  • These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that when these instructions are executed by the processor of the computer or other programmable data processing apparatus, an apparatus that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams is produced. These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause computers, programmable data processing apparatuses, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions for implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowchart or block diagram may represent a module, a program segment, or a part of an instruction, and the module, program segment, or part of an instruction contains one or more executable instructions for realizing the specified logical function.
  • the blocks may also be executed in an order different from the order marked in the drawings. For example, two consecutive blocks can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram and/or flowchart, and the combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
  • the computer program product can be specifically implemented by hardware, software, or a combination thereof.
  • the computer program product is specifically embodied as a computer storage medium.
  • the computer program product is specifically embodied as a software product, such as a software development kit (SDK).
  • the embodiments of the present application provide a network training method, an image processing method and apparatus, an electronic device, and a storage medium.
  • the method includes: performing, through a segmentation network, feature extraction on sample images included in training samples by using an attention mechanism in preset dimensions to obtain a feature extraction result, where the preset dimensions include the spatial dimension, the channel dimension, and the scale dimension, and the training sample also includes segmentation annotation information corresponding to the sample image; performing image segmentation processing on the sample image according to the feature extraction result to obtain an image segmentation result; and training the segmentation network according to the image segmentation result and the segmentation annotation information.
  • the embodiment of the present application can realize the training of the segmentation network, and can perform image segmentation processing through the segmentation network obtained by the training.

Abstract

The present invention relates to a network training method and apparatus, an image processing method and apparatus, an electronic device, and a storage medium. The network training method comprises: performing, by means of a segmentation network, feature extraction on a sample image included in a training sample by using an attention mechanism in preset dimensions, so as to obtain a feature extraction result, the preset dimensions comprising a spatial dimension, a channel dimension, and a scale dimension, and the training sample further comprising segmentation annotation information corresponding to the sample image; performing image segmentation processing on the sample image according to the feature extraction result, so as to obtain an image segmentation result; and training the segmentation network according to the image segmentation result and the segmentation annotation information.
PCT/CN2020/100723 2020-01-20 2020-07-07 Procédé et appareil d'apprentissage de réseau, procédé et appareil de traitement d'images et dispositif électronique et support de stockage WO2021147257A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217034486A KR20210140757A (ko) 2020-01-20 2020-07-07 네트워크 훈련 방법, 이미지 처리 방법 및 장치, 전자 기기 및 저장 매체
JP2021539612A JP2022521130A (ja) 2020-01-20 2020-07-07 ネットワークトレーニング、画像処理方法および電子機器、記憶媒体並びにコンピュータプログラム

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010065998.9 2020-01-20
CN202010065998.9A CN111310764B (zh) 2020-01-20 2020-01-20 网络训练、图像处理方法及装置、电子设备和存储介质

Publications (1)

Publication Number Publication Date
WO2021147257A1 true WO2021147257A1 (fr) 2021-07-29

Family

ID=71146977

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/100723 WO2021147257A1 (fr) 2020-01-20 2020-07-07 Procédé et appareil d'apprentissage de réseau, procédé et appareil de traitement d'images et dispositif électronique et support de stockage

Country Status (5)

Country Link
JP (1) JP2022521130A (fr)
KR (1) KR20210140757A (fr)
CN (1) CN111310764B (fr)
TW (1) TWI743931B (fr)
WO (1) WO2021147257A1 (fr)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989593A (zh) * 2021-10-29 2022-01-28 北京百度网讯科技有限公司 图像处理方法、检索方法、训练方法、装置、设备及介质
CN114119351A (zh) * 2021-11-08 2022-03-01 清华大学 图像处理方法、装置、电子设备及存储介质
CN114418069A (zh) * 2022-01-19 2022-04-29 腾讯科技(深圳)有限公司 一种编码器的训练方法、装置及存储介质
CN114429548A (zh) * 2022-01-28 2022-05-03 北京百度网讯科技有限公司 图像处理方法、神经网络及其训练方法、装置和设备
CN114596370A (zh) * 2022-03-04 2022-06-07 深圳万兴软件有限公司 视频色彩转换方法、装置、计算机设备及存储介质
CN114764858A (zh) * 2022-06-15 2022-07-19 深圳大学 复制粘贴图像识别方法、装置、计算机设备及存储介质
CN114782440A (zh) * 2022-06-21 2022-07-22 杭州三坛医疗科技有限公司 医学图像分割方法及电子设备
CN115034375A (zh) * 2022-08-09 2022-09-09 北京灵汐科技有限公司 数据处理方法及装置、神经网络模型、设备、介质
CN115330808A (zh) * 2022-07-18 2022-11-11 广州医科大学 一种分割引导的磁共振图像脊柱关键参数自动测量方法
WO2023116507A1 (fr) * 2021-12-22 2023-06-29 北京沃东天骏信息技术有限公司 Procédé et appareil de formation de modèle de détection de cible, et procédé et appareil de détection de cible
CN116402779A (zh) * 2023-03-31 2023-07-07 北京长木谷医疗科技有限公司 基于深度学习注意力机制的颈椎图像分割方法及装置
CN116704666A (zh) * 2023-06-21 2023-09-05 合肥中科类脑智能技术有限公司 售卖方法及计算机可读存储介质、自动售卖机
CN116955965A (zh) * 2023-09-20 2023-10-27 山东鑫泰莱光电股份有限公司 一种基于太阳能数据故障预测方法、设备以及存储介质
CN117351183A (zh) * 2023-10-09 2024-01-05 广州医科大学附属第一医院(广州呼吸中心) 子宫内膜癌淋巴结转移智能识别方法及系统
CN117437463A (zh) * 2023-10-19 2024-01-23 上海策溯科技有限公司 基于图像处理的医学影像数据处理方法及处理平台

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111310764B (zh) * 2020-01-20 2024-03-26 上海商汤智能科技有限公司 网络训练、图像处理方法及装置、电子设备和存储介质
CN112102251B (zh) * 2020-08-20 2023-10-31 上海壁仞智能科技有限公司 一种分割影像的方法及装置、电子设备和存储介质
CN112183507B (zh) * 2020-11-30 2021-03-19 北京沃东天骏信息技术有限公司 图像分割方法、装置、设备、存储介质
CN112733886A (zh) * 2020-12-24 2021-04-30 西人马帝言(北京)科技有限公司 样本图像的处理方法、装置、设备及存储介质
CN113223730B (zh) * 2021-03-30 2023-06-06 武汉市疾病预防控制中心 基于人工智能的疟疾分类方法及设备
CN113377986B (zh) * 2021-06-23 2023-11-07 泰康保险集团股份有限公司 图像检索方法和装置
CN114267443B (zh) * 2021-11-08 2022-10-04 东莞市人民医院 基于深度学习的胰腺肿瘤纤维化程度预测方法及相关装置
WO2023101276A1 (fr) * 2021-11-30 2023-06-08 삼성전자 주식회사 Appareil de traitement d'image et son procédé de fonctionnement
CN115430066A (zh) * 2022-09-13 2022-12-06 苏州雷泰医疗科技有限公司 超声装置、包括该超声装置的放射治疗设备及其工作方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109410216A (zh) * 2018-09-14 2019-03-01 北京市商汤科技开发有限公司 一种缺血性脑卒中图像区域分割方法及装置
CN110176012A (zh) * 2019-05-28 2019-08-27 腾讯科技(深圳)有限公司 图像中的目标分割方法、池化方法、装置及存储介质
US10482603B1 (en) * 2019-06-25 2019-11-19 Artificial Intelligence, Ltd. Medical image segmentation using an integrated edge guidance module and object segmentation network
CN111310764A (zh) * 2020-01-20 2020-06-19 上海商汤智能科技有限公司 网络训练、图像处理方法及装置、电子设备和存储介质

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW514513B (en) * 1996-02-06 2002-12-21 Deus Technologies Inc Method for the detection of lung nodule in radiological images using digital image processing and artificial neural network
US10049279B2 (en) * 2016-03-11 2018-08-14 Qualcomm Incorporated Recurrent networks with motion-based attention for video understanding
US10558750B2 (en) * 2016-11-18 2020-02-11 Salesforce.Com, Inc. Spatial attention model for image captioning
CN108830157B (zh) * 2018-05-15 2021-01-22 华北电力大学(保定) 基于注意力机制和3d卷积神经网络的人体行为识别方法
CN109446970B (zh) * 2018-10-24 2021-04-27 西南交通大学 一种基于深度学习的变电站巡检机器人道路场景识别方法
CN109614991A (zh) * 2018-11-19 2019-04-12 成都信息工程大学 一种基于Attention的多尺度扩张性心肌的分割分类方法
CN109829501B (zh) * 2019-02-01 2021-02-19 北京市商汤科技开发有限公司 图像处理方法及装置、电子设备和存储介质
CN110188765B (zh) * 2019-06-05 2021-04-06 京东方科技集团股份有限公司 图像语义分割模型生成方法、装置、设备及存储介质
CN110648334A (zh) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 一种基于注意力机制的多特征循环卷积显著性目标检测方法
CN110633755A (zh) * 2019-09-19 2019-12-31 北京市商汤科技开发有限公司 网络训练方法、图像处理方法及装置、电子设备

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109410216A (zh) * 2018-09-14 2019-03-01 北京市商汤科技开发有限公司 一种缺血性脑卒中图像区域分割方法及装置
CN110176012A (zh) * 2019-05-28 2019-08-27 腾讯科技(深圳)有限公司 图像中的目标分割方法、池化方法、装置及存储介质
US10482603B1 (en) * 2019-06-25 2019-11-19 Artificial Intelligence, Ltd. Medical image segmentation using an integrated edge guidance module and object segmentation network
CN111310764A (zh) * 2020-01-20 2020-06-19 上海商汤智能科技有限公司 网络训练、图像处理方法及装置、电子设备和存储介质

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989593A (zh) * 2021-10-29 2022-01-28 北京百度网讯科技有限公司 图像处理方法、检索方法、训练方法、装置、设备及介质
CN114119351A (zh) * 2021-11-08 2022-03-01 清华大学 图像处理方法、装置、电子设备及存储介质
WO2023116507A1 (fr) * 2021-12-22 2023-06-29 北京沃东天骏信息技术有限公司 Procédé et appareil de formation de modèle de détection de cible, et procédé et appareil de détection de cible
CN114418069A (zh) * 2022-01-19 2022-04-29 腾讯科技(深圳)有限公司 一种编码器的训练方法、装置及存储介质
CN114429548A (zh) * 2022-01-28 2022-05-03 北京百度网讯科技有限公司 图像处理方法、神经网络及其训练方法、装置和设备
CN114596370A (zh) * 2022-03-04 2022-06-07 深圳万兴软件有限公司 视频色彩转换方法、装置、计算机设备及存储介质
CN114764858B (zh) * 2022-06-15 2022-11-01 深圳大学 一种复制粘贴图像识别方法、装置、计算机设备及存储介质
CN114764858A (zh) * 2022-06-15 2022-07-19 深圳大学 复制粘贴图像识别方法、装置、计算机设备及存储介质
CN114782440A (zh) * 2022-06-21 2022-07-22 杭州三坛医疗科技有限公司 医学图像分割方法及电子设备
CN115330808A (zh) * 2022-07-18 2022-11-11 广州医科大学 一种分割引导的磁共振图像脊柱关键参数自动测量方法
CN115034375A (zh) * 2022-08-09 2022-09-09 北京灵汐科技有限公司 数据处理方法及装置、神经网络模型、设备、介质
CN116402779A (zh) * 2023-03-31 2023-07-07 北京长木谷医疗科技有限公司 基于深度学习注意力机制的颈椎图像分割方法及装置
CN116704666A (zh) * 2023-06-21 2023-09-05 合肥中科类脑智能技术有限公司 售卖方法及计算机可读存储介质、自动售卖机
CN116955965A (zh) * 2023-09-20 2023-10-27 山东鑫泰莱光电股份有限公司 一种基于太阳能数据故障预测方法、设备以及存储介质
CN116955965B (zh) * 2023-09-20 2024-02-02 山东鑫泰莱光电股份有限公司 一种基于太阳能数据故障预测方法、设备以及存储介质
CN117351183A (zh) * 2023-10-09 2024-01-05 广州医科大学附属第一医院(广州呼吸中心) 子宫内膜癌淋巴结转移智能识别方法及系统
CN117351183B (zh) * 2023-10-09 2024-06-04 广州医科大学附属第一医院(广州呼吸中心) 子宫内膜癌淋巴结转移智能识别方法及系统
CN117437463A (zh) * 2023-10-19 2024-01-23 上海策溯科技有限公司 基于图像处理的医学影像数据处理方法及处理平台
CN117437463B (zh) * 2023-10-19 2024-05-24 上海策溯科技有限公司 基于图像处理的医学影像数据处理方法及处理平台

Also Published As

Publication number Publication date
CN111310764B (zh) 2024-03-26
TW202129543A (zh) 2021-08-01
CN111310764A (zh) 2020-06-19
TWI743931B (zh) 2021-10-21
KR20210140757A (ko) 2021-11-23
JP2022521130A (ja) 2022-04-06

Similar Documents

Publication Publication Date Title
WO2021147257A1 (fr) Procédé et appareil d'apprentissage de réseau, procédé et appareil de traitement d'images et dispositif électronique et support de stockage
CN111368923B (zh) 神经网络训练方法及装置、电子设备和存储介质
TWI755853B (zh) 圖像處理方法、電子設備和電腦可讀儲存介質
WO2022151755A1 (fr) Procédé et appareil de détection de cible, et dispositif électronique, support de stockage, produit de programme informatique et programme informatique
WO2022036972A1 (fr) Procédé et appareil de segmentation d'image, dispositif électronique et support de stockage
WO2020211284A1 (fr) Procédé et appareil de traitement d'image, dispositif électronique et support de stockage
TWI755175B (zh) 圖像分割方法、電子設備和儲存介質
WO2020211293A1 (fr) Procédé et appareil de segmentation d'image, dispositif électronique et support d'informations
CN112767329B (zh) 图像处理方法及装置、电子设备
EP3998579A1 (fr) Procédé, appareil et dispositif de traitement d'images médicales, support et endoscope
WO2021259391A2 (fr) Procédé et appareil de traitement d'image, dispositif électronique et support d'enregistrement
CN113222038B (zh) 基于核磁图像的乳腺病灶分类和定位方法及装置
CN114820584B (zh) 肺部病灶定位装置
WO2021259390A2 (fr) Procédé et appareil de détection de plaques calcifiées sur des artères coronaires
WO2022022350A1 (fr) Procédé et appareil de traitement d'images, dispositif électronique, support d'enregistrement, et produit programme d'ordinateur
WO2022011984A1 (fr) Procédé et appareil de traitement d'image, dispositif électronique, support de stockage, et produit-programme
WO2021082517A1 (fr) Procédé et appareil d'entraînement de réseau neuronal, procédé et appareil de segmentation d'image, dispositif, support et programme
CN113470029A (zh) 训练方法及装置、图像处理方法、电子设备和存储介质
KR20220012407A (ko) 이미지 분할 방법 및 장치, 전자 기기 및 저장 매체
CN115099293B (zh) 一种模型训练方法及装置、电子设备和存储介质
CN117036750A (zh) 膝关节病灶检测方法及装置、电子设备和存储介质

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021539612

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20915339

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20217034486

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20915339

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 31.01.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20915339

Country of ref document: EP

Kind code of ref document: A1