CN116385427A - Image processing method and device - Google Patents
Image processing method and device
- Publication number
- CN116385427A (application CN202310494634.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- segmented
- image data
- segmentation
- category
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000003672 processing method Methods 0.000 title claims abstract description 23
- 230000011218 segmentation Effects 0.000 claims abstract description 86
- 238000000034 method Methods 0.000 claims abstract description 45
- 238000013145 classification model Methods 0.000 claims abstract description 39
- 238000012545 processing Methods 0.000 claims description 51
- 238000012549 training Methods 0.000 claims description 38
- 238000013528 artificial neural network Methods 0.000 claims description 15
- 230000006870 function Effects 0.000 claims description 14
- 239000002775 capsule Substances 0.000 claims description 13
- 238000004590 computer program Methods 0.000 claims description 13
- 238000003709 image segmentation Methods 0.000 claims description 10
- 238000003860 storage Methods 0.000 claims description 9
- 210000002307 prostate Anatomy 0.000 claims description 6
- 238000007781 pre-processing Methods 0.000 claims description 5
- 238000009877 rendering Methods 0.000 claims description 2
- 238000010586 diagram Methods 0.000 abstract description 10
- 210000004907 gland Anatomy 0.000 description 15
- 206010020718 hyperplasia Diseases 0.000 description 13
- 238000000605 extraction Methods 0.000 description 9
- 230000002390 hyperplastic effect Effects 0.000 description 8
- 230000000694 effects Effects 0.000 description 7
- 230000008569 process Effects 0.000 description 7
- 230000009467 reduction Effects 0.000 description 6
- 238000002271 resection Methods 0.000 description 5
- 206010004446 Benign prostatic hyperplasia Diseases 0.000 description 4
- 208000004403 Prostatic Hyperplasia Diseases 0.000 description 4
- 206010028980 Neoplasm Diseases 0.000 description 3
- 241000934136 Verruca Species 0.000 description 3
- 208000000260 Warts Diseases 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 238000010295 mobile communication Methods 0.000 description 3
- 238000003062 neural network model Methods 0.000 description 3
- 201000010153 skin papilloma Diseases 0.000 description 3
- 238000001356 surgical procedure Methods 0.000 description 3
- 208000037062 Polyps Diseases 0.000 description 2
- 230000006978 adaptation Effects 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 239000000284 extract Substances 0.000 description 2
- 230000007774 longterm Effects 0.000 description 2
- 238000010801 machine learning Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 208000002847 Surgical Wound Diseases 0.000 description 1
- 230000000740 bleeding effect Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 230000008676 import Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000005192 partition Methods 0.000 description 1
- 230000000149 penetrating effect Effects 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 238000011471 prostatectomy Methods 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 230000003238 somatosensory effect Effects 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- 238000009834 vaporization Methods 0.000 description 1
- 230000008016 vaporization Effects 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/225—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on a marking or identifier characterising the area
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30081—Prostate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30204—Marker
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Image Analysis (AREA)
Abstract
The embodiment of the invention discloses an image processing method and device. The method determines, from a plurality of preset categories and based on a pre-trained classification model, the target category to which an object to be segmented in image data belongs; determines, among a plurality of pre-trained segmentation models, the target segmentation model corresponding to the target category; generates a feature mask map corresponding to the object to be segmented based on the target segmentation model; and then identifies the pixel region where the object to be segmented is located in the image data according to the feature mask map. By identifying this pixel region, the method helps a doctor improve the accuracy and efficiency of boundary identification for the portion to be resected.
Description
Technical Field
The present invention relates to the field of image processing, and in particular, to an image processing method and apparatus.
Background
At present, benign prostatic hyperplasia (BPH) is one of the most common benign diseases affecting the quality of life of middle-aged and elderly men. Transurethral resection of the prostate (TURP) is regarded as a common method for treating benign prostatic hyperplasia: the hyperplastic glands on the prostatic capsule, the verumontanum and the bladder neck are removed by peeling, excision or vaporization to treat the condition.
In the prior art, the doctor typically removes the hyperplastic glands on the prostatic capsule, the verumontanum and the bladder neck based on personal experience. The judgment of the resection boundary of the hyperplastic glands therefore depends heavily on the doctor's experience, which easily leads to insufficient removal of the hyperplastic glands, so that the patient's symptoms are not improved, or to rupture of the prostatic capsule during resection, which destroys the venous plexus on the capsule and causes bleeding and other complications, seriously affecting the patient's health.
Disclosure of Invention
In view of this, the embodiments of the present invention provide an image processing method and apparatus to help a physician to improve the accuracy and efficiency of boundary recognition for a portion to be resected.
In a first aspect, an embodiment of the present invention provides an image processing method, including:
acquiring image data to be processed;
inputting the image data into a pre-trained classification model to determine the target category of an object to be segmented in the image data from a plurality of preset categories;
determining a target segmentation model corresponding to the target category from a plurality of segmentation models trained in advance, wherein each segmentation model is respectively used for carrying out image segmentation on an object to be segmented corresponding to a preset category;
inputting the image data into the target segmentation model to generate a feature mask map corresponding to the object to be segmented;
and identifying the pixel region where the object to be segmented is located in the image data according to the feature mask map.
Further, the preset categories include the verumontanum, the prostatic capsule and the bladder neck.
Further, the method further comprises:
acquiring a plurality of sample images, wherein each sample image is provided with a corresponding area tag and a category tag, the area tag is used for marking a pixel area where an object to be segmented is located in the sample image, and the category tag is used for marking a category to which the object to be segmented belongs in the sample image;
training a first neural network based on the plurality of sample images to obtain the classification model.
Further, the method further comprises:
dividing the plurality of sample images into a plurality of sample image sets according to the category labels, wherein the sample images in each sample image set have the same category label;
for each set of sample images, training a second neural network based on the set of sample images to obtain a corresponding segmentation model.
Further, each segmentation model is obtained based on cross entropy loss function training;
wherein the cross entropy loss function L is calculated according to the following formula:

L = -\frac{1}{N}\sum_{i=1}^{N} W_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]

wherein N represents the total number of pixel points in an image, y_i represents the label of the i-th pixel point in the image, W_i represents the weight of the i-th pixel point in the image, and p_i represents the predicted probability that the i-th pixel point belongs to the segmentation target.
Further, the identifying the object to be segmented in the image data according to the feature mask map includes:
determining the outline of the object to be segmented in the image data according to the feature mask map;
and rendering the pixel areas in the outline.
Further, before training the first neural network or the second neural network, the method further comprises:
and performing image preprocessing on the plurality of sample images.
In a second aspect, an embodiment of the present invention provides an image processing apparatus, including:
an acquisition unit configured to acquire image data to be processed;
a category determining unit, configured to input the image data into a pre-trained classification model, so as to determine, from a plurality of preset categories, a target category to which an object to be segmented in the image data belongs;
the model selection unit is used for determining a target segmentation model corresponding to the target category from a plurality of segmentation models trained in advance, wherein each segmentation model is respectively used for carrying out image segmentation on an object to be segmented corresponding to a preset category;
a segmentation unit for inputting the image data into the target segmentation model to generate a feature mask map corresponding to the object to be segmented;
and the identification unit is used for identifying the pixel region where the object to be segmented is located in the image data according to the feature mask map.
In a third aspect, embodiments of the present invention provide a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method according to any of the first aspects.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including:
a memory for storing one or more computer program instructions;
a processor, the one or more computer program instructions being executed by the processor to implement the method of any of the first aspects.
According to the image processing method, the target category to which the object to be segmented in the image data belongs is determined from a plurality of preset categories based on a pre-trained classification model, the target segmentation model corresponding to the target category is determined among a plurality of pre-trained segmentation models, the feature mask map corresponding to the object to be segmented is generated based on the target segmentation model, and the pixel region where the object to be segmented is located in the image data is identified according to the feature mask map. Identifying this pixel region helps a doctor improve the accuracy and efficiency of boundary identification for the portion to be resected.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of an application system of an image processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of an image processing method according to an embodiment of the present invention;
FIG. 3 is a flow chart of a classification model training method according to an embodiment of the invention;
FIG. 4 is a flowchart of a segmentation model training method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an identification process of image data according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention;
fig. 7 is a schematic diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The present invention is described below based on embodiments, but it is not limited to these embodiments. In the following detailed description, certain specific details are set forth; the invention can be fully understood by those skilled in the art even without these details. Well-known methods, procedures, flows, components and circuits have not been described in detail so as not to obscure the essence of the invention.
Moreover, those of ordinary skill in the art will appreciate that the drawings are provided herein for illustrative purposes and that the drawings are not necessarily drawn to scale.
Unless the context clearly requires otherwise, the words "comprise," "comprising," and the like in the description are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, in the sense of "including but not limited to".
In the description of the present invention, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present invention, unless otherwise indicated, the meaning of "a plurality" is two or more.
Fig. 1 is a schematic diagram of an application system of an image processing method according to an embodiment of the present invention. As shown in fig. 1, the application system of the image processing method includes an acquisition device 11 and an image processing device 12.
The acquisition device 11 is used for acquiring video data. In this embodiment, the acquisition device 11 may be advanced into the patient's body through a corresponding surgical incision under the control of a physician to acquire video data of the object to be segmented, and may transmit the acquired video data to the image processing device 12 in real time. The acquisition device 11 may in particular be an endoscope.
It should be understood that the object to be segmented refers to a site on which the portion to be resected grows. In this embodiment, the portion to be resected may be a hyperplastic gland, and the object to be segmented may be the verumontanum, the prostatic capsule or the bladder neck on which the hyperplastic gland grows.
The image processing device 12 may be a general purpose computing device or a data processing device or a storage device. In this embodiment, the image processing device 12 may receive the video data transmitted by the acquisition device 11, and process the video data to identify, in the video data, a pixel area where the object to be segmented is located.
Specifically, the acquisition device 11 may be advanced into the patient under the control of a physician to acquire video data of the object to be segmented, and transmit the acquired video data to the image processing device 12 in real time. Upon receiving the video data, the image processing apparatus 12 extracts the video data frame by frame to acquire a plurality of pieces of image data. Further, for each image data, the image processing device 12 determines, from among a plurality of preset categories, a target category to which an object to be segmented belongs in the image data, determines, from among a plurality of segmentation models, a target segmentation model corresponding to the target category, generates a feature mask map corresponding to the object to be segmented based on the target segmentation model, and further identifies a pixel region in which the object to be segmented is located in the image data according to the feature mask map. After identifying the pixel region in each image data where the object to be segmented is located, the image processing device 12 reassembles each image data into video data and outputs the video data to be displayed to a doctor.
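For illustration, a minimal sketch of how the per-frame processing flow described above could be wired together is given below; the type aliases, callables and function names are assumptions introduced here, not the disclosed implementation.

```python
from typing import Callable, Dict, List, Tuple

import numpy as np

# Illustrative type aliases: a "model" is any callable mapping an image to its output.
Image = np.ndarray          # H x W x 3 video frame
Mask = np.ndarray           # H x W, 0 = background, 255 = foreground
Classifier = Callable[[Image], str]
Segmenter = Callable[[Image], Mask]

def process_frames(
    frames: List[Tuple[float, Image]],          # (timestamp, frame) pairs extracted from the video
    classifier: Classifier,                     # pre-trained classification model
    segmenters: Dict[str, Segmenter],           # one segmentation model per preset category
    identify: Callable[[Image, Mask], Image],   # draws the contour / renders the region
) -> List[Tuple[float, Image]]:
    """Per-frame pipeline: classify, pick the matching segmentation model, mask, identify."""
    annotated = []
    for ts, frame in frames:
        category = classifier(frame)            # e.g. "verumontanum", "prostatic_capsule", "bladder_neck"
        mask = segmenters[category](frame)      # feature mask map for the object to be segmented
        annotated.append((ts, identify(frame, mask)))
    return annotated                            # later reassembled into video in timestamp order
```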
In this way, by identifying the pixel region where the object to be segmented is located in the video data, the boundary between the object to be segmented and the hyperplastic gland can be displayed to the doctor, so that the doctor can quickly and accurately determine the resection boundary of the hyperplastic gland, avoiding insufficient removal of the hyperplastic gland and rupture of the prostatic capsule during its resection.
Alternatively, the classification model and the plurality of segmentation models in this embodiment may be pre-trained neural network models. The classification model is used to determine, from a plurality of preset categories, the target category to which the object to be segmented in the image data belongs, and each segmentation model is used to perform image segmentation on the object to be segmented of its corresponding preset category, where the preset categories specifically refer to the verumontanum, the prostatic capsule and the bladder neck. Specifically, in resection surgery for prostatic hyperplasia, the physician needs to remove the hyperplastic glands on the verumontanum, the prostatic capsule and the bladder neck, respectively. The segmentation model in this embodiment therefore needs to be able to segment the verumontanum, the prostatic capsule and the bladder neck in the image. However, if a single segmentation model were used to segment the verumontanum, the prostatic capsule and the bladder neck at the same time, its segmentation accuracy would be reduced and it would be difficult to meet the recognition requirement. To improve segmentation accuracy, this embodiment trains a separate segmentation model for each preset category, so that the object to be segmented of each preset category is segmented by its own model. Correspondingly, this embodiment also trains a classification model to judge, from the plurality of preset categories, the target category to which the object to be segmented in the current image belongs.
Optionally, to present the identified video data to the physician, the image processing device 12 may also be connected to a corresponding image output means, such as a display screen, through which the image data is presented to the physician.
Optionally, the acquisition device 11 and the image processing device 12 may be connected through a wireless network, a wired network, or a combination of the two, so as to implement data interaction. The wireless network may include any one or a combination of a fifth-generation (5G) mobile communication network, a Long Term Evolution (LTE) system, the Global System for Mobile Communications (GSM), Bluetooth (BT), Wireless Fidelity (Wi-Fi), a Code Division Multiple Access (CDMA) network, a Wideband Code Division Multiple Access (WCDMA) network, Long Range (LoRa) technology, or ZigBee technology. The wired network may include a fiber-optic communication network, a network of coaxial cables, or the like.
It should be understood that the image processing method in this embodiment is not limited to resection surgery for prostatic hyperplasia; with appropriate adaptation it can also be applied to other related resection procedures. For example, by setting the portion to be resected to a tumor, setting the object to be segmented to the site where the tumor grows, and adjusting the training data of the relevant models, the image processing method in this embodiment can be applied to tumor resection surgery. Likewise, by setting the portion to be resected to a polyp and the object to be segmented to the site where the polyp grows, and adjusting the training data of the relevant models, the method can also be applied to polypectomy.
It should also be understood that the image processing method in this embodiment is not limited to a particular surgical scene; specifically, it may be applied in an open surgery scene or in a minimally invasive surgery scene.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in fig. 2, the image processing method specifically may include the following steps:
it should be understood that the execution subject of the image processing method may specifically be the image processing apparatus in the above-described embodiment.
S100, acquiring image data to be processed.
Specifically, the image processing apparatus may acquire the video data acquired by the acquisition apparatus, and extract the video data frame by frame to acquire a plurality of pieces of image data to be processed in the video data.
It should be understood that the image processing method in this embodiment may also be used to identify the pixel region where the object to be segmented is located in a single image, in which case the image processing apparatus may acquire the single image data to be processed in step S100.
Alternatively, when extracting video data frame by frame to acquire each image data, the image processing apparatus may also record a time stamp corresponding to each extracted image data so that each image data can be recombined into video data according to the time stamp in a subsequent step.
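A minimal sketch of frame-by-frame extraction that also records per-frame timestamps, assuming OpenCV is used to read the video stream; the function name is illustrative.

```python
import cv2

def extract_frames(video_path: str):
    """Yield (timestamp_ms, frame) pairs so the frames can later be reassembled in order."""
    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            ts_ms = cap.get(cv2.CAP_PROP_POS_MSEC)   # timestamp of the current frame in milliseconds
            yield ts_ms, frame
    finally:
        cap.release()
```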
S200, inputting the image data into a pre-trained classification model to determine the target category of the object to be segmented in the image data from a plurality of preset categories.
The classification model may be a pre-trained neural network model, and the classification model is used for determining the category to which an object to be segmented in an input image belongs from a plurality of preset categories, where the preset categories may include the verumontanum, the prostatic capsule and the bladder neck.
Specifically, after acquiring the image data, the image processing apparatus may input each image data into the classification model to determine, from among a plurality of preset categories, the target category to which the object to be segmented in the current image data belongs.
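One way this classification step could look with a PyTorch model is sketched below; the category names and the network passed in are assumptions for illustration.

```python
import torch

PRESET_CATEGORIES = ["verumontanum", "prostatic_capsule", "bladder_neck"]

@torch.no_grad()
def classify(classifier: torch.nn.Module, image: torch.Tensor) -> str:
    """image: 1 x 3 x H x W float tensor; returns the predicted preset category."""
    classifier.eval()
    logits = classifier(image)              # shape: 1 x number of preset categories
    index = int(torch.argmax(logits, dim=1))
    return PRESET_CATEGORIES[index]
```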
Optionally, the present embodiment further provides a classification model training method, and the classification model training method may be used to train the classification model.
FIG. 3 is a flowchart of a classification model training method according to an embodiment of the invention. As shown in fig. 3, the classification model training method specifically may include the following steps:
S1000, acquiring a plurality of sample images.
The sample images are images containing the object to be segmented that were acquired during previous operations. Each sample image may have a corresponding region label and category label, where the region label marks the pixel region where the object to be segmented is located in the sample image, and the category label marks the category to which the object to be segmented belongs in the sample image.
Specifically, the image processing apparatus may take images containing the object to be segmented that were acquired during previous operations as original sample images. A doctor then annotates, based on experience, the category to which the object to be segmented belongs in each original sample image as the category label, and the pixel region where the object to be segmented is located as the region label, thereby obtaining sample images that can be used for training the classification model.
Alternatively, the sample image acquired in step S1000 may be a sample image acquired and marked by a person on another device.
Alternatively, in this embodiment a doctor may annotate each original sample image through an image annotator. In particular, the doctor may import the original sample images into the image annotator, annotate the category to which the object to be segmented belongs in the original sample image, and annotate the contour of the object to be segmented in the original sample image. After the labeling is completed, the doctor may export the labeled sample image in json format through the image annotator. It should be understood that the category marked by the doctor is the category label of the original sample image, and the position information of the annotated contour coordinates is the region label of the original sample image. The category label information and region label information can be stored in the json-format sample image, so that the json-format sample image can be used as training data for the classification model.
Alternatively, the image annotator may specifically be the VGG-16 image annotator, a deep-learning image annotation tool based on a deep convolutional neural network (DCNN).
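For illustration only, the sketch below shows how one annotated sample exported in json format might be read back; the json keys used here are hypothetical, since the exact output schema of the annotator is not specified.

```python
import json

def load_annotation(json_path: str):
    """Return (category_label, contour_points) from one annotated sample; schema is assumed."""
    with open(json_path, "r", encoding="utf-8") as f:
        annotation = json.load(f)
    category = annotation["category"]                            # hypothetical key for the category label
    contour = [(p["x"], p["y"]) for p in annotation["contour"]]  # hypothetical key for the region label
    return category, contour
```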
S2000, training the first neural network based on the plurality of sample images to obtain the classification model.
Specifically, after the image samples are acquired and labeled, the image processing device may perform multiple rounds of training on the first neural network based on the plurality of sample images to obtain the classification model.
Alternatively, the training process of a model is a process of learning and updating the model's parameters. In step S2000, the image processing apparatus may save the model parameters after each round of training. After a preset number of rounds has been completed, the image processing device may evaluate the classification effect of the classification model under each saved set of parameters and, according to the evaluation result, take the set of parameters that gives the best classification effect as the final model parameters of the classification model, thereby improving the classification effect of the trained model.
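A minimal sketch of this save-every-round, pick-the-best strategy is shown below; `train_one_epoch` and `evaluate` stand for the training and evaluation routines and are assumptions introduced for illustration, as is the PyTorch-style `state_dict` interface.

```python
import copy

def train_and_select(model, train_one_epoch, evaluate, rounds: int):
    """Keep a copy of the parameters after every round and load back the best-performing set."""
    checkpoints = []
    for _ in range(rounds):
        train_one_epoch(model)                        # one round of training
        checkpoints.append(copy.deepcopy(model.state_dict()))
    scores = []
    for state in checkpoints:
        model.load_state_dict(state)
        scores.append(evaluate(model))                # e.g. mean accuracy on a validation set
    best = max(range(len(scores)), key=lambda i: scores[i])
    model.load_state_dict(checkpoints[best])
    return model
```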
Alternatively, the image processing apparatus may determine the classification effect of the classification model by calculating its Global Accuracy (GA), the Classification Accuracy (CA) for each class, and the Mean Accuracy (mAcc). Global accuracy refers to the ratio of correctly classified sample images to the total number of sample images; classification accuracy refers to the ratio of the number of sample images correctly predicted as the current category to the actual number of sample images of that category; and mean accuracy refers to the sum of the per-category classification accuracies divided by the number of categories.
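The three metrics could be computed as in the following sketch, assuming integer class labels; the function name is illustrative.

```python
import numpy as np

def classification_metrics(y_true: np.ndarray, y_pred: np.ndarray, num_classes: int):
    """Global accuracy, per-class classification accuracy and mean accuracy."""
    ga = float(np.mean(y_true == y_pred))
    ca = []
    for c in range(num_classes):
        in_class = y_true == c
        # ratio of samples correctly predicted as class c to the actual samples of class c
        ca.append(float(np.mean(y_pred[in_class] == c)) if in_class.any() else 0.0)
    macc = float(np.mean(ca))
    return ga, ca, macc
```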
S300, determining a target segmentation model corresponding to the target category from a plurality of segmentation models trained in advance.
The segmentation models can be pre-trained neural network models, and each segmentation model is used for carrying out image segmentation on the object to be segmented corresponding to the preset category.
Specifically, after determining a target class to which an object to be segmented belongs in the current image data, the image processing apparatus may determine a target segmentation model corresponding to the target class among a plurality of segmentation models.
S400, inputting the image data into the target segmentation model to generate a feature mask map corresponding to the object to be segmented.
Specifically, after determining the target segmentation model, the image processing apparatus may input the current image data into the target segmentation model to image-segment an object to be segmented in the current image data through the target segmentation model to generate a feature mask map corresponding to the object to be segmented.
Further, the target segmentation model may include an encoder part and a decoder part. The encoder part performs feature extraction on the input image, and the decoder part performs feature restoration on the encoded feature extraction map. Specifically, after the current image data is input into the target segmentation model, the encoder part applies multiple convolution and pooling operations to the input image to extract features from shallow to deep layers, obtaining a corresponding feature extraction map. The decoder part then applies multiple upsampling operations to the feature extraction map to restore its features, obtaining a restored feature map of the same size as the input image. Finally, a convolutional prediction layer in the target segmentation model traverses each pixel point of the restored feature map to determine the probability that the pixel point belongs to the foreground; pixel points whose probability is greater than or equal to a preset threshold are determined to be foreground pixel points, and pixel points whose probability is below the threshold are determined to be background pixel points, yielding the feature mask map.
It should be appreciated that the resulting feature mask map contains foreground pixel points with a gray value of 255 and background pixel points with a gray value of 0. The foreground pixel points represent pixels belonging to the object to be segmented, and the background pixel points represent pixels not belonging to the object to be segmented.
Optionally, a residual structure may further be established between the decoder and the corresponding layers of the encoder of the segmentation model. During feature restoration, this residual structure superimposes the feature extraction map of the corresponding encoder layer onto the current restored feature map, which helps ensure that features of the image are not lost during feature extraction.
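A compact sketch of such an encoder-decoder with a superimposing (residual-style) skip connection and a thresholded convolutional prediction layer is given below; the depth, channel counts and class names are illustrative assumptions, not the disclosed architecture.

```python
import torch
import torch.nn as nn

def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
    )

class TinySegNet(nn.Module):
    """Two-level encoder-decoder: convolution + pooling, then upsampling + residual superposition."""
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 16)                       # shallow features, full resolution
        self.enc2 = conv_block(16, 16)                      # deeper features, half resolution
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = conv_block(16, 16)
        self.head = nn.Conv2d(16, 1, 1)                     # convolutional prediction layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(self.up(e2) + e1)                    # superimpose encoder features (residual link)
        return torch.sigmoid(self.head(d1))                 # per-pixel foreground probability

def to_mask(prob: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Threshold the probabilities into a feature mask map: 255 = foreground, 0 = background."""
    return (prob >= threshold).to(torch.uint8) * 255
```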
Optionally, the present embodiment further provides a segmentation model training method, and the segmentation model training method may be used for training the segmentation model.
Fig. 4 is a flowchart of a segmentation model training method according to an embodiment of the present invention. As shown in fig. 4, the segmentation model training method specifically may include the following steps:
S3000, dividing the plurality of sample images into a plurality of sample image sets according to the category labels.
The sample images are the same as those used to train the classification model and are not described again here; the sample images in each sample image set have the same category label.
Specifically, the image processing apparatus may assign sample images having the same category label to the same sample image set according to their category labels, whereby the plurality of sample images is divided into a plurality of sample image sets.
S4000, for each sample image set, training a second neural network based on the sample image set to obtain a corresponding segmentation model.
In particular, for each set of sample images, the image processing device may train the second neural network multiple times with the set of sample images to obtain a corresponding segmentation model.
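Grouping the annotated samples by category label, as described above, could look like this minimal sketch; the tuple layout of a sample is an assumption.

```python
from collections import defaultdict

def split_by_category(samples):
    """samples: iterable of (image, region_label, category_label) tuples; returns one set per category."""
    image_sets = defaultdict(list)
    for image, region_label, category_label in samples:
        image_sets[category_label].append((image, region_label))
    return dict(image_sets)
```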
It should be understood that, similarly to the training of the classification model, when training a segmentation model the image processing apparatus may save the model parameters after each round of training, evaluate the segmentation effect of the segmentation model under each saved set of parameters after the preset number of rounds has been completed, and, according to the evaluation result, take the set of parameters that gives the best segmentation effect as the final model parameters of the segmentation model.
Compared with the classification model, the image processing apparatus can determine the segmentation effect of a segmentation model by calculating not only its global accuracy, per-class classification accuracy and mean accuracy, but also similarity metrics, namely the Dice coefficient, the Intersection over Union (IoU) and the Mean Intersection over Union (mIoU).
Unlike for the classification model, the accuracies of the segmentation model are computed per pixel point. Specifically, the global accuracy of the segmentation model refers to the ratio of correctly predicted pixel points to the total number of pixel points; the classification accuracy refers to the ratio of pixel points correctly predicted as the current class to the actual pixel points of that class; the mean accuracy is the sum of the classification accuracies of the foreground and background pixel points divided by two; the Dice coefficient is the ratio of twice the number of pixel points correctly predicted as the current class to the sum of the actual and predicted pixel points of that class; the intersection over union is the ratio of the intersection to the union of the actual and predicted pixel points of the class; and the mean intersection over union is the sum of the foreground and background intersection over union divided by two.
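A sketch of how the Dice coefficient, IoU and mIoU could be computed from a predicted mask and its ground truth is shown below; binary masks and the small epsilon for numerical safety are assumptions.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8):
    """pred/target: binary masks where 1 marks the foreground. Returns (Dice, foreground IoU, mIoU)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter_fg = np.logical_and(pred, target).sum()
    union_fg = np.logical_or(pred, target).sum()
    dice = 2.0 * inter_fg / (pred.sum() + target.sum() + eps)
    iou_fg = inter_fg / (union_fg + eps)
    inter_bg = np.logical_and(~pred, ~target).sum()      # background treated the same way
    union_bg = np.logical_or(~pred, ~target).sum()
    iou_bg = inter_bg / (union_bg + eps)
    miou = (iou_fg + iou_bg) / 2.0
    return dice, iou_fg, miou
```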
Alternatively, a loss function is used to measure the difference between the predicted output of a machine learning model for a sample and the true value of that sample (which may also be referred to as the supervision value); the specific value of the loss function may be used to determine the adjustment magnitude of the model parameters during training.
In this embodiment, the loss function of each segmentation model may be a cross entropy loss function, where the cross entropy loss function L is calculated according to the following formula:

L = -\frac{1}{N}\sum_{i=1}^{N} W_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]

wherein N represents the total number of pixel points in an image, y_i represents the label of the i-th pixel point in the image, W_i represents the weight of the i-th pixel point in the image, and p_i represents the predicted probability that the i-th pixel point belongs to the segmentation target.
Optionally, W_i may specifically be the reciprocal of the actual number of pixel points of the category to which the i-th pixel point belongs.
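A minimal sketch of this weighted cross entropy, with the per-pixel weight taken as the reciprocal of the pixel count of its class, is given below; tensor shapes and the clamping constant are assumptions.

```python
import torch

def weighted_cross_entropy(prob: torch.Tensor, label: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """prob: H x W predicted probability of the segmentation target; label: H x W float mask (1 = target).

    The weight W_i of each pixel is the reciprocal of the number of pixels of its class.
    """
    n_fg = label.sum().clamp(min=1.0)
    n_bg = (1.0 - label).sum().clamp(min=1.0)
    weights = label / n_fg + (1.0 - label) / n_bg        # W_i = 1 / |class of pixel i|
    prob = prob.clamp(eps, 1.0 - eps)
    per_pixel = label * prob.log() + (1.0 - label) * (1.0 - prob).log()
    return -(weights * per_pixel).mean()                 # mean over the N pixels gives the 1/N factor
```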
Optionally, the image processing device may further perform image preprocessing on the plurality of sample images before training the first neural network or the second neural network. In particular, to increase the adaptability of the trained model and enhance its robustness, the image processing apparatus may randomly adjust the training data before training. The image preprocessing may include any one or a combination of randomly adjusting the hue, saturation and brightness (HSV) of the image, or randomly flipping, scaling and cropping the image. It should be understood that after image preprocessing, the image processing apparatus needs to resize each sample image to a prescribed size, for example 512 pixels by 512 pixels.
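One possible preprocessing pipeline along these lines, using torchvision transforms, is sketched below; the jitter magnitudes and crop scale are assumptions, and geometric transforms must be applied identically to the region labels so the annotations stay aligned.

```python
from torchvision import transforms

# Color jitter approximates the random HSV adjustment; RandomResizedCrop combines random
# scaling and cropping and already outputs the prescribed 512 x 512 size.
train_preprocess = transforms.Compose([
    transforms.ColorJitter(brightness=0.2, saturation=0.2, hue=0.05),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(512, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
```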
S500, identifying the pixel region where the object to be segmented is located in the image data according to the feature mask map.
Specifically, after determining the feature mask map, the image processing apparatus may identify, according to the feature mask map, a pixel region in the image data where the object to be segmented is located.
Alternatively, in step S500, since the feature mask map is the same size as the input image data, the image processing apparatus may determine the contour of the object to be segmented in the image data from the feature mask map. After determining the outline of the object to be segmented, the image processing device may render the pixel region within the outline to identify the pixel region in which the object to be segmented is located in the image data.
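A sketch of determining the contour from the mask and rendering the enclosed pixel region, assuming OpenCV; the color and transparency values are illustrative.

```python
import cv2
import numpy as np

def identify_region(image: np.ndarray, mask: np.ndarray,
                    color=(0, 255, 0), alpha: float = 0.4) -> np.ndarray:
    """Draw the contour of the mask on the image and tint the pixel region inside it."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    overlay = image.copy()
    cv2.drawContours(overlay, contours, -1, color, thickness=cv2.FILLED)   # fill the region
    out = cv2.addWeighted(overlay, alpha, image, 1.0 - alpha, 0.0)         # translucent rendering
    cv2.drawContours(out, contours, -1, color, thickness=2)                # outline of the object
    return out
```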
Fig. 5 is a schematic diagram of an identification process of image data according to an embodiment of the present invention. As shown in fig. 5, the feature mask graphs 511, 521, and 531 are feature mask graphs output by the respective segmentation models.
The categories to which the objects to be segmented in the feature mask maps 511, 521 and 531 belong are the prostatic capsule, the verumontanum and the bladder neck, respectively. The white areas composed of foreground pixels in the feature mask maps 511, 521 and 531 represent the regions where the objects to be segmented are located, and the black areas composed of background pixels represent the background regions other than the object to be segmented.
After obtaining the feature mask map, the image processing apparatus may determine, according to the feature mask map, contours of the object to be segmented in the corresponding image data, as shown by contours 5121, 5221, and 5321 in the image data 512, 522, and 532. After determining the contour of the object to be segmented in the image data, the image processing device may identify the pixel region in which the object to be segmented is located in the image data according to the contour, so that identified image data may be obtained, as shown by identified image data 513, 523 and 533.
Optionally, after identifying the pixel region where the object to be segmented is located in each image data, the image processing device may re-synthesize each identified image data into video data according to the timestamp corresponding to each image data and output and display the video data.
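Reassembling the identified frames into a video in timestamp order could look like the following sketch, assuming OpenCV's video writer; the codec and frame rate are illustrative.

```python
import cv2

def write_video(frames, out_path: str, fps: float = 25.0) -> None:
    """frames: list of (timestamp, identified_frame) pairs; written in timestamp order."""
    frames = sorted(frames, key=lambda item: item[0])
    height, width = frames[0][1].shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))
    for _, frame in frames:
        writer.write(frame)
    writer.release()
```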
It should be understood that, when the image data is a single image data acquired by the image processing apparatus, the image processing apparatus may directly output and present the identified single image data.
According to the image processing method, the target category to which the object to be segmented in the image data belongs is determined from a plurality of preset categories based on a pre-trained classification model, the target segmentation model corresponding to the target category is determined among a plurality of pre-trained segmentation models, the feature mask map corresponding to the object to be segmented is generated based on the target segmentation model, and the pixel region where the object to be segmented is located in the image data is identified according to the feature mask map. Identifying this pixel region helps a doctor improve the accuracy and efficiency of boundary identification for the portion to be resected.
Fig. 6 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention. As shown in fig. 6, the image processing apparatus of the embodiment of the present invention includes an acquisition unit 61, a category determination unit 62, a model selection unit 63, a segmentation unit 64, and an identification unit 65.
Specifically, the acquiring unit 61 is configured to acquire image data to be processed;
the class determining unit 62 is configured to input the image data into a pre-trained classification model, so as to determine, from a plurality of preset classes, a target class to which an object to be segmented in the image data belongs;
the model selecting unit 63 is configured to determine a target segmentation model corresponding to the target class from a plurality of segmentation models trained in advance, where each segmentation model is used for performing image segmentation on an object to be segmented corresponding to a preset class;
the segmentation unit 64 is configured to input the image data into the target segmentation model to generate a feature mask map corresponding to the object to be segmented;
the identifying unit 65 is configured to identify, according to the feature mask map, a pixel region in which the object to be segmented is located in the image data.
According to the image processing device, the target category to which the object to be segmented in the image data belongs is determined from a plurality of preset categories based on a pre-trained classification model, the target segmentation model corresponding to the target category is determined among a plurality of pre-trained segmentation models, the feature mask map corresponding to the object to be segmented is generated based on the target segmentation model, and the pixel region where the object to be segmented is located in the image data is identified according to the feature mask map. Identifying this pixel region helps a doctor improve the accuracy and efficiency of boundary identification for the portion to be resected.
Fig. 7 is a schematic diagram of an electronic device according to an embodiment of the invention. As shown in fig. 7, the electronic device is a general-purpose data processing apparatus including a general-purpose computer hardware structure including at least a processor 71 and a memory 72. The processor 71 and the memory 72 are connected by a bus 73. The memory 72 is adapted to store instructions or programs executable by the processor 71. The processor 71 may be a separate microprocessor or a collection of one or more microprocessors. Thus, the processor 71 performs the process flow of the embodiment of the present invention described above to realize the processing of data and the control of other devices by executing the instructions stored in the memory 72. Bus 73 connects the above components together, as well as to display controller 74 and display devices and input/output (I/O) devices 75. Input/output (I/O) devices 75 may be a mouse, keyboard, modem, network interface, touch input device, somatosensory input device, printer, and other devices known in the art. Typically, an input/output device 75 is connected to the system through an input/output (I/O) controller 76.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, apparatus (device) or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may employ a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each of the flows in the flowchart may be implemented by computer program instructions.
These computer program instructions may be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows.
These computer program instructions may also be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows.
Another embodiment of the present invention is directed to a non-volatile storage medium storing a computer readable program for causing a computer to perform some or all of the method embodiments described above.
That is, it will be understood by those skilled in the art that all or part of the steps in the methods of the embodiments described above may be implemented by a program instructing relevant hardware, where the program is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip or the like) or a processor to perform all or part of the steps of the methods described in the embodiments herein. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations may be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. An image processing method, the method comprising:
acquiring image data to be processed;
inputting the image data into a pre-trained classification model to determine the target category of an object to be segmented in the image data from a plurality of preset categories;
determining a target segmentation model corresponding to the target category from a plurality of segmentation models trained in advance, wherein each segmentation model is respectively used for carrying out image segmentation on an object to be segmented corresponding to a preset category;
inputting the image data into the target segmentation model to generate a feature mask map corresponding to the object to be segmented;
and identifying the pixel region where the object to be segmented is located in the image data according to the feature mask map.
2. The method of claim 1, wherein the preset categories include the verumontanum, the prostatic capsule and the bladder neck.
3. The method according to claim 1, wherein the method further comprises:
acquiring a plurality of sample images, wherein each sample image is provided with a corresponding area tag and a category tag, the area tag is used for marking a pixel area where an object to be segmented is located in the sample image, and the category tag is used for marking a category to which the object to be segmented belongs in the sample image;
training a first neural network based on the plurality of sample images to obtain the classification model.
4. A method according to claim 3, characterized in that the method further comprises:
dividing the plurality of sample images into a plurality of sample image sets according to the category labels, wherein the sample images in each sample image set have the same category label;
for each set of sample images, training a second neural network based on the set of sample images to obtain a corresponding segmentation model.
5. The method of claim 4, wherein each of the segmentation models is obtained based on cross entropy loss function training;
wherein the cross entropy loss function L is calculated according to the following formula:

L = -\frac{1}{N}\sum_{i=1}^{N} W_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]

wherein N represents the total number of pixel points in an image, y_i represents the label of the i-th pixel point in the image, W_i represents the weight of the i-th pixel point in the image, and p_i represents the predicted probability that the i-th pixel point belongs to the segmentation target.
6. The method of claim 1, wherein the identifying the object to be segmented in the image data according to the feature mask map comprises:
determining the outline of the object to be segmented in the image data according to the feature mask map;
and rendering the pixel areas in the outline.
7. The method of any one of claims 3 or 4, wherein prior to training the first neural network or the second neural network, the method further comprises:
and performing image preprocessing on the plurality of sample images.
8. An image processing apparatus, characterized in that the apparatus comprises:
an acquisition unit configured to acquire image data to be processed;
a category determining unit, configured to input the image data into a pre-trained classification model, so as to determine, from a plurality of preset categories, a target category to which an object to be segmented in the image data belongs;
the model selection unit is used for determining a target segmentation model corresponding to the target category from a plurality of segmentation models trained in advance, wherein each segmentation model is respectively used for carrying out image segmentation on an object to be segmented corresponding to a preset category;
a segmentation unit for inputting the image data into the target segmentation model to generate a feature mask map corresponding to the object to be segmented;
and the identification unit is used for identifying the pixel region where the object to be segmented is located in the image data according to the feature mask map.
9. A computer readable storage medium, on which computer program instructions are stored, which computer program instructions, when executed by a processor, implement the method of any of claims 1-7.
10. An electronic device, the device comprising:
a memory for storing one or more computer program instructions;
a processor, the one or more computer program instructions being executed by the processor to implement the method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310494634.6A CN116385427A (en) | 2023-05-05 | 2023-05-05 | Image processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310494634.6A CN116385427A (en) | 2023-05-05 | 2023-05-05 | Image processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116385427A true CN116385427A (en) | 2023-07-04 |
Family
ID=86967604
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310494634.6A Pending CN116385427A (en) | 2023-05-05 | 2023-05-05 | Image processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116385427A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180315188A1 (en) * | 2017-04-21 | 2018-11-01 | General Electric Company | Automated organ risk segmentation machine learning methods and systems |
CN110059697A (en) * | 2019-04-29 | 2019-07-26 | 上海理工大学 | A kind of Lung neoplasm automatic division method based on deep learning |
CN110570432A (en) * | 2019-08-23 | 2019-12-13 | 北京工业大学 | CT image liver tumor segmentation method based on deep learning |
CN112330731A (en) * | 2020-11-30 | 2021-02-05 | 深圳开立生物医疗科技股份有限公司 | Image processing apparatus, image processing method, image processing device, ultrasound system, and readable storage medium |
CN115761365A (en) * | 2022-11-28 | 2023-03-07 | 首都医科大学附属北京友谊医院 | Intraoperative hemorrhage condition determination method and device and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4931027B2 (en) | Medical image diagnosis support apparatus and method, and program | |
JP2023520846A (en) | Image processing method, image processing apparatus, computer program and computer equipment based on artificial intelligence | |
CN110929728B (en) | Image region-of-interest dividing method, image segmentation method and device | |
US20040264749A1 (en) | Boundary finding in dermatological examination | |
CN110263755B (en) | Eye ground image recognition model training method, eye ground image recognition method and eye ground image recognition device | |
CN107464234B (en) | Lung nodule image deep learning identification system based on RGB channel superposition method and method thereof | |
CN108062749B (en) | Identification method and device for levator ani fissure hole and electronic equipment | |
CN111815606B (en) | Image quality evaluation method, storage medium, and computing device | |
CN110097557B (en) | Medical image automatic segmentation method and system based on 3D-UNet | |
CN111508016B (en) | Vitiligo region chromaticity value and area calculation method based on image processing | |
CN113313680B (en) | Colorectal cancer pathological image prognosis auxiliary prediction method and system | |
KR20220001985A (en) | Apparatus and method for diagnosing local tumor progression using deep neural networks in diagnostic images | |
CN114723739B (en) | Blood vessel segmentation model training data labeling method and device based on CTA image | |
CN113570619A (en) | Computer-aided pancreas pathology image diagnosis system based on artificial intelligence | |
WO2010035518A1 (en) | Medical image processing apparatus and program | |
CN117409002A (en) | Visual identification detection system for wounds and detection method thereof | |
WO2022160731A1 (en) | Image processing method and apparatus, electronic device, storage medium, and program | |
CN116779093B (en) | Method and device for generating medical image structured report and computer equipment | |
CN111862118B (en) | Pressure sore staging training method, staging method and staging system | |
JP2022147713A (en) | Image generation device, learning device, and image generation method | |
CN116385427A (en) | Image processing method and device | |
CN110910409B (en) | Gray image processing method, device and computer readable storage medium | |
CN115147360B (en) | Plaque segmentation method and device, electronic equipment and readable storage medium | |
CN111292299A (en) | Mammary gland tumor identification method and device and storage medium | |
KR20210060895A (en) | Apparatus for diagnosis of chest X-ray employing Artificial Intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||