WO2021174739A1 - Neural network training method and apparatus, electronic device and storage medium - Google Patents
Neural network training method and apparatus, electronic device and storage medium
- Publication number
- WO2021174739A1 (PCT/CN2020/100715)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image block
- feature extraction
- sample image
- target area
- block
- Prior art date
Classifications
- G06N3/08 — Computing arrangements based on biological models; Neural networks; Learning methods
- G06F18/2414 — Pattern recognition; Classification techniques relating to the classification model; Smoothing the distance, e.g. radial basis function networks [RBFN]
- G06N3/042 — Neural networks; Architecture, e.g. interconnection topology; Knowledge-based neural networks; Logical representations of neural networks
- G06N3/045 — Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
- G06V10/25 — Image or video recognition or understanding; Image preprocessing; Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/764 — Image or video recognition or understanding using pattern recognition or machine learning; using classification, e.g. of video objects
- G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning; using neural networks
- G06V2201/03 — Indexing scheme relating to image or video recognition or understanding; Recognition of patterns in medical or anatomical images
Definitions
- This application relates to the field of computer technology, in particular to a neural network training method and device, electronic equipment and storage medium.
- Machine learning is widely used in the field of image processing. For example, it can be applied to the classification and detection of ordinary or three-dimensional images. In the processing of medical images, machine learning methods can be used to determine the type of disease and detect the diseased area.
- Taking lung medical images as an example, lung Computed Tomography (CT) images can be used to detect lesions such as ground-glass nodules (GGN).
- the embodiments of the present application provide a neural network training method and device, electronic equipment, and storage medium.
- the embodiment of the application provides a neural network training method.
- the neural network training method is used to train a neural network model and classify images according to the neural network model obtained by training.
- the method includes: obtaining the position information and category information of the target area in the sample image; segmenting the sample image according to the position information of the target area to obtain at least one sample image block; classifying the at least one sample image block according to the category information to obtain N types of sample image blocks, where N is an integer and N ≥ 1; and inputting the N types of sample image blocks into the neural network for training.
- in this way, finely classified sample image blocks can be obtained and used to train the neural network, so that the trained neural network can perform fine classification of images, improving classification efficiency and accuracy.
- the sample image is a medical imaging picture.
- acquiring the location information and category information of the target area in the sample image includes: locating the target area on the medical imaging picture to obtain the location information of the target area; acquiring a pathology picture associated with the medical imaging picture, where the pathology picture is a diagnosed picture that includes pathological information; and determining the category information of each target area on the medical imaging picture according to the pathological information of that target area on the pathology picture.
- inputting the N types of sample image blocks into the neural network for training includes: inputting any sample image block into the neural network for processing to obtain the category prediction information and the predicted target area of the sample image block; determining the classification loss according to at least the category prediction information and the category information of the sample image block; determining the segmentation loss according to the predicted target area and the location information of the sample image block; and training the neural network according to the classification loss and the segmentation loss.
- determining the classification loss according to the category prediction information and the category information of the sample image block includes: determining the first classification loss according to the category prediction information and the category information of the sample image block; determining the second classification loss according to the category prediction information and the category information of the class center of the category to which the sample image block belongs; and performing weighted summation of the first classification loss and the second classification loss to obtain the classification loss.
- in this way, the category features of sample image blocks of the same category become more concentrated during training, and the feature distance between the category information of sample image blocks of different categories becomes larger, which helps to improve classification performance and accuracy.
- determining the segmentation loss according to the predicted target area and the position information of the sample image block includes: determining the first weight of the predicted target area and the second weight of the sample background area in the sample image block according to the first ratio of the number of pixels of the predicted target area in the sample image block; and determining the segmentation loss according to the first weight, the second weight, the predicted target area, and the location information of the sample image block.
- determining the first weight of the predicted target area and the second weight of the sample background area in the sample image block according to the first ratio of the number of pixels of the predicted target area in the sample image block includes: determining the second ratio of the sample background area in the sample image block according to the first ratio; determining the second ratio as the first weight; and determining the first ratio as the second weight.
- the error of the target area and the error of the non-target area can be balanced, which is conducive to the optimization of network parameters, and improves the training efficiency and training effect.
- the category information includes: atypical adenomatous hyperplasia of pre-invasive adenocarcinoma nodules, adenocarcinoma in situ nodules, minimally invasive adenocarcinoma nodules, and invasive adenocarcinoma nodules.
- the neural network includes a shared feature extraction network, a classification network, and a segmentation network.
- the method further includes: inputting the image block to be processed into the shared feature extraction network for processing to obtain the target feature of the image block to be processed, where the shared feature extraction network includes M shared feature extraction blocks, the input feature of the i-th shared feature extraction block includes the output features of the first i-1 shared feature extraction blocks, and i and M are integers with 1 ≤ i ≤ M; inputting the target feature into the classification network for classification processing to obtain the category information of the image block to be processed; and inputting the target feature into the segmentation network for segmentation processing to obtain the target area in the image block to be processed.
- the shared feature extraction blocks of the shared feature extraction network can each obtain the output features of all previous shared feature extraction blocks and provide their own output features to all subsequent shared feature extraction blocks. This strengthens the gradient flow in the network, alleviates gradient vanishing, and improves feature extraction and learning capabilities, which is conducive to finer classification and segmentation of the input image blocks to be processed. In addition, finer category information and target areas of the image blocks to be processed can be obtained, which improves image processing efficiency.
- inputting the image block to be processed into the shared feature extraction network for processing to obtain the target feature of the image block to be processed includes: performing a first feature extraction process on the image block to be processed to obtain the first feature of the image block to be processed; inputting the first feature into the first shared feature extraction block to obtain the output feature of the first shared feature extraction block, and outputting the output feature of the first shared feature extraction block to the subsequent M-1 shared feature extraction blocks; inputting the output features of the first j-1 shared feature extraction blocks into the j-th shared feature extraction block to obtain the output feature of the j-th shared feature extraction block, where j is an integer and 1 < j ≤ M; performing a second feature extraction process on the output feature of the M-th shared feature extraction block to obtain the second feature of the image block to be processed; and pooling the second feature to obtain the target feature.
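- The following is a minimal PyTorch-style sketch of the shared feature extraction process just described, assuming 3-D convolutions; the channel counts, block count and layer composition are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

class SharedFeatureBlock(nn.Module):
    """One shared feature extraction block: 3-D conv + batch norm + activation (illustrative)."""
    def __init__(self, in_channels, growth):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(in_channels, growth, kernel_size=3, padding=1),
            nn.BatchNorm3d(growth),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class SharedFeatureExtractor(nn.Module):
    """Shared feature extraction network with M shared feature extraction blocks.
    Block i receives the (concatenated) output features of all previous blocks,
    as described in the text; channel widths and the number of blocks are assumptions."""
    def __init__(self, in_channels=1, base=16, growth=16, num_blocks=4):
        super().__init__()
        # first feature extraction process: conv + batch norm + activation
        self.stem = nn.Sequential(
            nn.Conv3d(in_channels, base, kernel_size=3, padding=1),
            nn.BatchNorm3d(base),
            nn.ReLU(inplace=True),
        )
        self.blocks = nn.ModuleList()
        channels = base
        for _ in range(num_blocks):
            self.blocks.append(SharedFeatureBlock(channels, growth))
            channels += growth  # concatenating all previous outputs grows the input width
        # second feature extraction process on the M-th block's output, then pooling
        self.head = nn.Conv3d(growth, 64, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool3d(1)

    def forward(self, x):
        features = [self.stem(x)]                     # first feature
        for block in self.blocks:
            out = block(torch.cat(features, dim=1))   # outputs of all previous blocks as input
            features.append(out)
        second = self.head(features[-1])              # second feature
        return self.pool(second).flatten(1)           # target feature

# Example: feats = SharedFeatureExtractor()(torch.randn(1, 1, 64, 64, 64))  # shape (1, 64)
```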
- the method further includes: preprocessing the image to be processed to obtain the first image; positioning the target area on the first image to determine the location information of the target area in the first image; and segmenting the first image according to the location information of the target area to obtain at least one image block to be processed.
- An embodiment of the application provides a neural network training device. The neural network training device is used to train a neural network model and classify images according to the trained neural network model. The device includes: an acquisition module configured to acquire the location information and category information of the target area in a sample image; a first segmentation module configured to segment the sample image according to the location information of the target area to obtain at least one sample image block; a module configured to classify the at least one sample image block according to the category information to obtain N types of sample image blocks, where N is an integer and N ≥ 1; and a training module configured to input the N types of sample image blocks into the neural network for training.
- An embodiment of the present application provides an electronic device, including: a processor; a memory configured to store a computer program executable by the processor; wherein the processor is configured to execute the above neural network training method through the computer program.
- An embodiment of the present application provides a storage medium in which a computer program is stored, and the computer program is configured to execute the above neural network training method when running.
- An embodiment of the present application provides a computer program, including computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes it to implement the neural network training method described above.
- FIG. 1 is a schematic diagram of a system architecture of a neural network training method provided by an embodiment of the present application;
- FIG. 2 is an implementation flowchart of a neural network training method provided by an embodiment of the present application;
- FIG. 3 is a schematic diagram of an application of a neural network training method provided by an embodiment of the present application;
- FIG. 4 is a schematic diagram of a neural network training device provided by an embodiment of the present application;
- FIG. 5 is a schematic diagram of an electronic device provided by an embodiment of the present application;
- FIG. 6 is a schematic diagram of another electronic device provided by an embodiment of the present application.
- methods such as machine learning are widely used in the field of image processing. For example, they can be applied to the classification and detection of ordinary images or three-dimensional images.
- Lung cancer is one of the most common malignant tumors in China. Its mortality rate ranks first among all cancer deaths, whether in urban or rural areas and for both males and females. Adenocarcinoma accounts for about 40% of all lung cancers. With screening based on medical images (for example, lung CT and low-dose spiral CT), more and more early lung adenocarcinomas are found, manifesting as ground-glass nodules (GGN).
- Adenocarcinoma is divided into Atypical Adenomatous Hyperplasia of Preinvasive Adenocarcinoma (AAHOPA), Adenocarcinoma In Situ (AIS), Minimally Invasive Adenocarcinoma (MIA), and Invasive Adenocarcinoma (IA).
- correspondingly, the GGN categories of adenocarcinoma include atypical adenomatous hyperplasia (pre-invasive adenocarcinoma) nodules, adenocarcinoma in situ nodules, minimally invasive adenocarcinoma nodules, and invasive adenocarcinoma nodules.
- the survival period decreases significantly as the disease progresses, which indicates that early detection and diagnosis are an effective and vital means of reducing patient mortality. Therefore, early detection of invasive features before surgery is clinically important and can provide guidance for clinical decision-making.
- taking lung medical images (for example, lung CT) as an example, machine learning and other methods can be used to determine the type of disease and detect the diseased area. For example, it is possible to predict whether an input nodule image shows a malignant or benign tumor. However, the related technology does not provide a detailed classification of the prediction results.
- computer-aided diagnosis based on artificial intelligence is a more effective method to assess the invasiveness of nodules, and is expected to play an important role in clinical evaluation tasks.
- FIG. 1 is a schematic diagram of a system architecture of a neural network training method provided by an embodiment of the present application.
- the system architecture includes a CT instrument 100, a server 200, a network 300, and a terminal device 400.
- the CT instrument 100 can be connected to the terminal device 400 through the network 300, and the terminal device 400 is connected to the server 200 through the network 300.
- the CT instrument 100 can be used to collect CT images; it may be, for example, an X-ray CT instrument or a gamma-ray CT instrument, that is, a terminal that can scan a layer of a certain thickness of a certain part of the human body.
- the terminal device 400 may be a device with a screen display function, such as a notebook computer, a tablet computer, a desktop computer, or a dedicated message device.
- the network 300 may be a wide area network or a local area network, or a combination of the two, and uses wireless links to implement data transmission.
- the server 200 can cut each pathologically proven lung nodule area in the acquired training medical image pictures into small image blocks through the designed three-dimensional classification framework, then classify the image blocks to obtain the training data, and input the training data into the neural network for training, so that the neural network learns to finely classify the training medical image pictures; the trained neural network model is obtained after the training is completed.
- the medical image picture may be a CT image of the lungs of a patient or a medical examiner collected by the CT instrument 100 of a hospital, a medical examination center, and the like.
- the server 200 may obtain the medical image picture collected by the CT machine 100 from the terminal device 400 as a training medical image picture, may also obtain a training medical image picture from the CT machine, and may also obtain a training medical image picture from the Internet.
- the server 200 may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server based on cloud technology.
- Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software, and network within a wide area network or a local area network to realize the calculation, storage, processing, and sharing of data.
- the provided artificial intelligence cloud service may include a neural network model, and the neural network is trained based on the finely classified training data, so that the neural network can finely classify medical image pictures.
- after the server 200 receives the medical image picture to be processed (for example, a lung CT image), the medical image picture is classified and segmented by the trained neural network to obtain a finely classified lesion area. Then, the server 200 returns the finely classified lesion area to the terminal device 400 for display, so that the medical staff can view it.
- alternatively, the trained neural network can be sent to the terminal device 400, and the terminal device 400 performs classification, segmentation and other processing on the collected medical image pictures (e.g., lung CT images) to obtain a finely classified lesion area, which is displayed on its own display screen for medical staff to view.
- the system architecture of the neural network training method includes the CT instrument 100, the network 300, and the terminal device 400.
- the terminal device 400 trains on the training medical image pictures to obtain a trained neural network, and then performs classification, segmentation and other processing on the collected medical image pictures (such as lung CT images) to obtain a finely classified lesion area, which is displayed on its own display screen for medical staff to view.
- the embodiment of the application provides a neural network training method. The method is applied to a neural network training device; the neural network training device may be a server, and is used to train a neural network model and classify images according to the trained neural network model.
- the method provided in the embodiment of the present application may be implemented by a computer program, and when the computer program is executed, each step in the neural network training method provided in the embodiment of the present application is completed.
- the computer program may be executed by a processor.
- Fig. 2 is an implementation flowchart of a neural network training method provided by an embodiment of the present application. As shown in Fig. 2, the method includes:
- Step S11: acquire the location information and category information of the target area in the sample image;
- Step S12: segment the sample image according to the position information of the target area to obtain at least one sample image block;
- Step S13: classify the at least one sample image block according to the category information to obtain N types of sample image blocks, where N is an integer and N ≥ 1;
- Step S14: input the N types of sample image blocks into the neural network for training.
- finely classified sample image blocks can thus be obtained and used to train the neural network, so that the trained neural network can perform fine classification of images, improving classification efficiency and accuracy.
- the neural network training method may be executed by terminal equipment or other processing equipment, where the terminal equipment may be User Equipment (UE), mobile equipment, user terminal, terminal, cellular phone, Cordless phones, personal digital assistants (PDAs), handheld devices, computing devices, in-vehicle devices, wearable devices, etc.
- the neural network training method may be implemented by a processor calling a computer program stored in a memory.
- the sample image is a medical imaging picture, for example, a lung CT image.
- the sample image block may be an image block including the target area in the sample image.
- the sample image may be a three-dimensional medical image that has been annotated (for example, category annotation and segmentation annotation), and the sample image block may be an image block containing nodules in the three-dimensional medical image.
- the position information and category information of the target area in the sample image can be determined to obtain sample image blocks for training the neural network, and label the sample image blocks.
- Step S11 may include: locating a target area on a medical imaging picture to obtain position information of the target area; obtaining pathology pictures associated with the medical imaging picture; and determining the category information of the target area on the medical imaging picture according to the pathological information of each target area on the pathology picture.
- the pathology picture is a diagnosed picture that includes pathology information, which can be obtained from a medical image database, or sent to the neural network training device after being manually annotated by a professional such as a doctor on the terminal.
- the sample image may be resampled to obtain a three-dimensional image with a resolution of 1×1×1, and the three-dimensional image may then be segmented.
- the target area (for example, the lesion area) in the normalized three-dimensional image can be located to obtain the position information of the target area.
- the location information of the target area may be determined by the convolutional neural network used for positioning, or the location information of the target area may be confirmed by professionals such as doctors, etc.
- the embodiment of the present application does not limit the positioning method.
- the medical imaging picture may have related pathological pictures, which can be used to determine the type of the lesion in the medical imaging picture.
- the type of the lesion may include Ground-Glass Nodule (GGN).
- Adenocarcinoma is divided into Atypical Adenomatous Hyperplasia of Preinvasive Adenocarcinoma (AAHOPA), Adenocarcinoma In Situ (AIS), Minimally Invasive Adenocarcinoma (MIA), and Invasive Adenocarcinoma (IA); the embodiments of this application do not limit the types of lesions.
- pathological information of each target area can be obtained based on pathological pictures.
- pathological pictures may be pictures that have been professionally diagnosed and may contain analysis descriptions of each lesion; based on the pathological pictures, the pathological information of each target area can be obtained, and the category information of each target area on the medical imaging picture can then be determined.
- image blocks including the lesion area may be cropped from the medical image picture, that is, sample image blocks are cropped, and N types of sample image blocks are obtained according to the category information of the target area. For example, after statistics on the sizes of the nodules, the size of the sample image block can be determined as 64×64×64; after cropping and classification, four types (AAHOPA, AIS, MIA, and IA) of sample image blocks can be obtained.
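- A minimal sketch of this cropping-and-grouping step is shown below; the annotation format ("center" and "category" fields) and the clamping behaviour at the volume border are illustrative assumptions.

```python
import numpy as np
from collections import defaultdict

def crop_sample_blocks(volume, annotations, block_size=64):
    """Crop a sample image block around each labelled lesion and group the
    blocks by category (e.g. AAHOPA / AIS / MIA / IA)."""
    half = block_size // 2
    blocks_by_class = defaultdict(list)
    for ann in annotations:                      # each ann: {"center": (z, y, x), "category": "AIS"}
        z, y, x = ann["center"]
        z0 = int(np.clip(z - half, 0, volume.shape[0] - block_size))
        y0 = int(np.clip(y - half, 0, volume.shape[1] - block_size))
        x0 = int(np.clip(x - half, 0, volume.shape[2] - block_size))
        block = volume[z0:z0 + block_size, y0:y0 + block_size, x0:x0 + block_size]
        blocks_by_class[ann["category"]].append(block)
    return blocks_by_class                        # N types of sample image blocks
```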
- the sample image block can be rotated, translated, mirrored, zoomed, etc., and the number of samples can be amplified. Moreover, using the amplified sample image block to train the neural network can improve the generalization ability of the neural network and prevent overfitting. In some embodiments of the present application, the positive and negative samples can also be balanced.
- for example, atypical adenomatous hyperplasia of pre-invasive adenocarcinoma, adenocarcinoma in situ, minimally invasive adenocarcinoma and other benign nodules on the one hand, and invasive adenocarcinoma and other malignant nodules on the other, may have a large gap in the number of samples; the class with the smaller number of samples can be amplified by the above method to balance the numbers of positive and negative samples.
- the embodiment of the application does not limit the manner of amplifying the number of samples.
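- The following is a sketch of the sample-amplification step; the patent mentions rotation, translation, mirroring and zooming, and the specific parameters here (90-degree rotations, axial-plane mirroring, a small roll-based translation) are illustrative assumptions.

```python
import numpy as np

def augment(block, rng=np.random.default_rng()):
    """Randomly rotate, mirror and translate a 3-D sample image block."""
    k = int(rng.integers(0, 4))
    block = np.rot90(block, k, axes=(1, 2))        # rotate in the axial plane
    if rng.random() < 0.5:
        block = block[:, :, ::-1]                  # mirror
    shift = int(rng.integers(-3, 4))
    block = np.roll(block, shift, axis=1)          # small translation
    return np.ascontiguousarray(block)
```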
- sample image blocks may be input to the neural network in batches.
- step S14 may include: inputting any sample image block into the neural network for processing to obtain the category prediction information and predicted target area of the sample image block; determining the classification loss at least according to the category prediction information and the category information of the sample image block; determining the segmentation loss according to the predicted target area and the location information of the sample image block; and training the neural network according to the classification loss and the segmentation loss.
- the neural network may include a shared feature extraction network, a classification network, and a segmentation network.
- the sample image block can be feature extracted through the shared feature extraction network to obtain the sample target feature of the sample image block, and the category prediction information of the sample image block can be obtained through the classification network.
- the category prediction information may have errors, and the classification loss of the neural network can be determined according to the category prediction information and the category labeling information of the sample image block.
- determining the classification loss according to the category prediction information and the labeling information of the sample image block includes: determining the first classification loss according to the category prediction information and the labeling information of the sample image block; determining the second classification loss according to the category prediction information and the category information of the class center of the category to which the sample image block belongs; and performing weighted summation of the first classification loss and the second classification loss to obtain the classification loss.
- the labeling information of the sample image block may include category labeling information.
- the category labeling information may be information indicating the category of the nodule in the sample image block.
- the category prediction information may be category information expressed in the form of a vector; the probability distribution of the image block to be processed over the categories, as represented by the vector, can be determined through a probability dictionary or the like, and the category of the image block to be processed can then be determined. Alternatively, the vector of category prediction information may directly represent the probabilities for the image block to be processed, that is, each element of the vector represents the probability that the image block to be processed belongs to the corresponding category.
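- As a brief illustration of mapping such a prediction vector to per-class probabilities and a label, the sketch below assumes a softmax over raw scores; the patent only states that each element represents a class probability, so the softmax and the class names are assumptions.

```python
import numpy as np

def predict_category(prediction_vector, class_names=("AAHOPA", "AIS", "MIA", "IA")):
    """Convert a category-prediction vector into per-class probabilities and a label."""
    v = np.asarray(prediction_vector, dtype=float)
    probs = np.exp(v - v.max())     # numerically stable softmax (assumed)
    probs /= probs.sum()
    return dict(zip(class_names, probs)), class_names[int(np.argmax(probs))]
```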
- the first classification loss can be determined according to the category prediction information and the category labeling information of the sample image block.
- the feature distance (for example, Euclidean distance or cosine distance) between the vector of category prediction information and the vector of category labeling information can be determined, and the first classification loss L_sm can be determined according to this feature distance; for example, the first classification loss L_sm can be calculated according to the softmax loss function.
- the first classification loss L_sm can be determined by formula (1), where x_i represents the category prediction information of the i-th sample image block, y_i represents the category to which the i-th sample image block belongs, n represents the number of categories, m represents the number of sample image blocks input to the neural network in each batch, and b_j represents the offset item of the j-th category.
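- The equation body of formula (1) is not reproduced in this text. A standard form of the softmax loss, consistent with the symbols defined above, would be the following, where the class weight vectors W_j are an assumption (only the offsets b_j are named in the text):

```latex
L_{sm} = -\frac{1}{m}\sum_{i=1}^{m}\log\frac{e^{W_{y_i}^{\top}x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_{j}^{\top}x_i + b_{j}}}
```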
- training with the above first classification loss can expand the inter-class feature distance between the category information of different classes, so that the classification network can distinguish sample image blocks of different classes.
- however, the differences between multiple types of lung nodules are not obvious (for example, the shapes of adenocarcinoma in situ nodules and minimally invasive adenocarcinoma nodules differ little), while two nodules of the same type may have different shapes (for example, invasive adenocarcinoma nodules and other malignant nodules can differ in shape). This results in small inter-class feature distances between category information and large intra-class feature distances, so a classification network trained using only the first classification loss L_sm produces poor classification results.
- the classification network can be trained through the second classification loss.
- the category information of the class centers of each category in the multiple sample image blocks can be determined.
- for example, the class center can be obtained by taking a weighted average of the category information of the multiple sample image blocks of a category, or by aggregating the category information of those sample image blocks.
- the embodiment of the present application does not limit the class information of the class center.
- the second classification loss may be determined according to the category prediction information of the sample image block and the category label information of the category center of the category to which it belongs.
- the feature distance between the category prediction information and the category information of the class center may be determined, and the second classification loss L_ct can be determined according to this feature distance; for example, the second classification loss L_ct can be calculated according to the center loss function.
- Training the classification network with the second classification loss L_ct can reduce the intra-class feature distance of the category information of sample image blocks of the same class, so that features of the same class are more concentrated in the feature space, which is beneficial for determining the class of the sample image block.
- the second classification loss L_ct can be determined by formula (2).
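- The equation body of formula (2) is likewise not reproduced here. A standard form of the center loss, consistent with the description above, is the following, where c_{y_i} denotes the class center of the category to which the i-th sample image block belongs:

```latex
L_{ct} = \frac{1}{2}\sum_{i=1}^{m}\left\lVert x_i - c_{y_i}\right\rVert_2^{2}
```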
- the first classification loss and the second classification loss can be used to jointly determine the classification loss.
- the first classification loss and the second classification loss can be weighted and summed to obtain the classification loss.
- the weight ratio of the first classification loss and the second classification loss is 1:0.8, and the classification loss can be obtained after the weighted summation is performed according to the above weight ratio.
- the embodiment of the present application does not limit the weight ratio.
- the category features of the sample image blocks of the same category can be more concentrated in the training, so that the distance between the category information of the sample image blocks of different categories is larger, which helps to improve the classification performance and improve the classification accuracy.
- the sample target feature can be segmented through a segmentation network to obtain the prediction target region in the sample image block.
- the prediction target area may have an error
- the segmentation loss can be determined according to the error between the prediction target area and the labeled target area of the sample image block, and then training is performed through the segmentation loss.
- determining the segmentation loss according to the predicted target area and the annotation information of the sample image block includes: determining the first weight of the predicted target area and the second weight of the sample background area in the sample image block according to the first ratio of the number of pixels of the predicted target area in the sample image block; and determining the segmentation loss according to the first weight, the second weight, the predicted target area, and the annotation information of the sample image block.
- the labeling information includes the labeled segmentation area, and the segmentation loss could be determined directly according to the error between the predicted target area and the labeled segmentation area.
- the diameter of the nodules is usually between 5 millimeters (mm) and 30mm.
- the area where the nodules are located and the other areas in the sample image block differ greatly in proportion, resulting in an imbalance in the number of pixels between the target area and the non-target area. This imbalance can make the error of the predicted target area account for only a small proportion of the segmentation loss, which is not conducive to the optimization and adjustment of the neural network, resulting in low training efficiency and a poor training effect.
- the pixels in the target area and the pixels in the non-target area may therefore be given different weights.
- the first weight of the prediction target area and the second weight of the sample background area in the sample image block may be determined according to the first proportion of the number of pixels of the prediction target area in the sample image block.
- the pixels of the above two regions are weighted to balance the loss of the target region and the loss of the non-target region.
- determining the first weight of the predicted target area and the second weight of the sample background area in the sample image block according to the first ratio of the number of pixels of the predicted target area in the sample image block includes: determining the second ratio of the sample background area in the sample image block according to the first ratio; determining the second ratio as the first weight; and determining the first ratio as the second weight.
- the sample image block may include a prediction target area and a background area, and the proportion of the number of pixels in the prediction target area can be counted to determine the proportion of the sample background area. For example, if the first ratio of the number of pixels in the prediction target area is 0.2, the second ratio of the number of pixels in the sample background area is 0.8. The embodiment of the application does not limit the first ratio and the second ratio.
- the second ratio is determined as the first weight of the predicted target area, and the first ratio is determined as the second weight of the sample background area. For example, if the first ratio of the number of pixels in the prediction target area is 0.2, the first weight of the prediction target area is 0.8, and the second ratio of the number of pixels in the sample background area is 0.8, and the second weight of the sample background area is 0.2.
- the segmentation loss may be determined according to the first weight, the second weight, the prediction target area, and the labeling target area of the sample image block.
- the segmentation loss can be determined based on the difference between the predicted target area and the target area in the labeling information.
- the pixels in the predicted target area can be weighted with the first weight, the pixels in the sample background area can be weighted with the second weight, and the weighted segmentation loss L_dc is then determined.
- the segmentation loss L_dc can be calculated according to the weighted Dice loss function, and can be determined by formula (3).
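- The equation body of formula (3) is not reproduced in this text. The sketch below implements the weighting rule described above (target-area weight = background pixel ratio, background weight = target-area pixel ratio) as a weighted Dice-style loss; it is an illustrative implementation, not a reproduction of the patent's exact formula.

```python
import numpy as np

def weighted_dice_loss(pred, target, eps=1e-6):
    """pred: predicted target-area probabilities in [0, 1]; target: labelled mask (1 = target area)."""
    pred = pred.astype(np.float64)
    target = target.astype(np.float64)
    first_ratio = pred.mean()                      # first ratio: share of pixels predicted as target area
    w_target, w_background = 1.0 - first_ratio, first_ratio   # second ratio -> first weight, first ratio -> second weight
    dice_fg = (2 * (pred * target).sum() + eps) / (pred.sum() + target.sum() + eps)
    dice_bg = (2 * ((1 - pred) * (1 - target)).sum() + eps) / ((1 - pred).sum() + (1 - target).sum() + eps)
    return 1.0 - (w_target * dice_fg + w_background * dice_bg)
```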
- the error of the target area and the error of the non-target area can be balanced, which is conducive to the optimization of network parameters, and improves the training efficiency and training effect.
- the comprehensive network loss of the shared feature extraction network, the segmentation network, and the classification network may be determined according to the classification loss and the segmentation loss.
- the classification loss and the segmentation loss can be weighted and summed to obtain the comprehensive network loss.
- the comprehensive network loss L total can be determined according to the following formula (4):
- L total ⁇ 1 L sm + ⁇ 2 L ct + ⁇ 3 L dc (4);
- ⁇ 1 represents the weight of L sm
- ⁇ 2 represents the weight of L ct
- ⁇ 3 represents the weight of L dc .
- ⁇ 1 1.2
- ⁇ 2 0.8
- ⁇ 3 2.
- the network parameters of the aforementioned neural network can be adjusted in the reverse direction (i.e., by back-propagation) according to the comprehensive network loss.
- the network parameters can be adjusted through the gradient descent method to optimize the network parameters and improve the accuracy of segmentation and classification.
- the foregoing training method may be iteratively executed multiple times, and training is performed according to a set learning rate.
- the learning rate in the first 20 training cycles can be 0.001*1.1^x (where x represents the training cycle); in subsequent training, the learning rate can be halved at the 40th, 80th, and 120th training cycles respectively.
- the training efficiency can be improved in the early stage, so that the network parameters are optimized substantially; in subsequent training, the learning rate is gradually reduced so that the network parameters can be fine-tuned, which improves the accuracy of the neural network and of the classification and segmentation processing.
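- A sketch of the learning-rate schedule in the example above follows; the behaviour between the 20th and 40th training cycles is an assumption, since the text only specifies the first 20 cycles and the halving points.

```python
def learning_rate(cycle, base_lr=0.001):
    """0.001 * 1.1**x for the first 20 training cycles, then halve at cycles 40, 80 and 120."""
    if cycle < 20:
        return base_lr * (1.1 ** cycle)
    lr = base_lr * (1.1 ** 19)        # assumed: keep the warmed-up rate until the first halving point
    for milestone in (40, 80, 120):
        if cycle >= milestone:
            lr *= 0.5
    return lr
```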
- the training can be completed when the training conditions are met, and the trained shared feature extraction network, segmentation network, and classification network can be obtained.
- the training condition may include the number of training times, that is, the training condition is satisfied when the preset number of training times is reached.
- the training condition may include that the comprehensive network loss is less than or equal to a preset threshold or converges to a preset interval; that is, when the comprehensive network loss is less than or equal to the preset threshold or converges to the preset interval, the accuracy of the neural network can be considered to meet the usage requirements, and the training can be completed.
- the embodiment of the application does not limit the training conditions.
- the trained neural network can be tested after the training is completed.
- for example, three-dimensional image blocks including nodule areas in lung three-dimensional medical images can be input into the above neural network, and the accuracy of the output segmentation results and classification results can be counted, for example, by comparing them with the annotation information of the three-dimensional image blocks; the accuracy of the segmentation and classification results can then determine the training effect of the neural network. If the accuracy rate is higher than a preset threshold, the training effect can be considered good and the neural network performs well, and it can be used to obtain the category of the image block to be processed and segment the target area. If the accuracy rate does not reach the preset threshold, the training effect can be considered poor, and other sample image blocks can be used to continue training.
- the trained neural network can obtain both the category and the target area of the image block to be processed when they are unknown; alternatively, when the category of the image block to be processed is known, only the target area may be acquired, and when the target area of the image block to be processed is known, only the category may be acquired.
- the embodiment of the application does not limit the use method of the neural network.
- the neural network trained by the above training method can be used to determine the lesion area and the lesion category in the image block to be processed.
- the neural network includes a shared feature extraction network, a classification network, and a segmentation network.
- the method further includes: inputting the image block to be processed into the shared feature extraction network for processing to obtain the target feature of the image block to be processed, wherein the shared feature extraction network includes M shared feature extraction blocks, the input feature of the i-th shared feature extraction block includes the output features of the first i-1 shared feature extraction blocks, and i and M are integers with 1 < i ≤ M; inputting the target feature into the classification network for classification processing to obtain the category information of the image block to be processed; and inputting the target feature into the segmentation network for segmentation processing to obtain the target area in the image block to be processed.
- the shared feature extraction network is used to obtain the target feature.
- the shared feature extraction blocks of the shared feature extraction network can each obtain the output features of all previous shared feature extraction blocks and input their own output features to all subsequent shared feature extraction blocks. This strengthens the gradient flow in the network, alleviates gradient vanishing, and improves feature extraction and learning capabilities, which is conducive to finer classification and segmentation of the input image blocks to be processed. In addition, finer category information and target areas of the image blocks to be processed can be obtained, which improves image processing efficiency.
- the image block to be processed may be a partial area in the image to be processed.
- a partial area can be cropped from the image to be processed, for example, an area including the target object can be cropped.
- the image to be processed is a medical image picture, and the area including the lesion can be cropped in the medical image picture.
- the image to be processed may be a three-dimensional medical image of the lung (for example, a CT image of the lung), and the image block to be processed may be a three-dimensional image block of a lesion area (for example, an area with nodules) cut out in the image to be processed.
- the embodiments of the present application do not impose restrictions on the types of images to be processed and image blocks to be processed.
- a medical image picture (for example, a three-dimensional medical image of the lung) has a relatively large size and high resolution, and contains many areas of normal tissue; the medical image picture can therefore be preprocessed, and the cropped areas including the lesions processed, to improve processing efficiency.
- the method further includes: preprocessing the image to be processed to obtain the first image; positioning the target area on the first image to determine the position information of the target area in the first image; and segmenting the first image according to the location information of the target area to obtain at least one image block to be processed.
- the image to be processed may be preprocessed first to improve processing efficiency.
- preprocessing such as resampling and normalization can be performed.
- the three-dimensional medical image of the lung can be resampled to obtain a three-dimensional image with a resolution of 1 ⁇ 1 ⁇ 1 (that is, each pixel represents the content of a 1mm ⁇ 1mm ⁇ 1mm cube).
- the size of the resampled three-dimensional image can be cropped.
- there may be some non-pulmonary areas, and the lung area can be cropped to save calculations and improve processing efficiency.
- the cropped three-dimensional image can be normalized, and the pixel value of each pixel in the three-dimensional image can be normalized to a value range of 0 to 1, so as to improve processing efficiency.
- in this way, the first image is obtained. The embodiment of the present application does not limit the preprocessing method.
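- A minimal preprocessing sketch covering the resampling and normalization steps described above follows; the HU window used for normalization and the omission of lung-region cropping are assumptions.

```python
import numpy as np
from scipy import ndimage

def preprocess_ct(volume, spacing, target_spacing=(1.0, 1.0, 1.0)):
    """Resample a lung CT volume to 1 mm x 1 mm x 1 mm voxels and scale intensities to [0, 1]."""
    zoom = [s / t for s, t in zip(spacing, target_spacing)]
    resampled = ndimage.zoom(volume.astype(np.float32), zoom, order=1)
    lo, hi = -1200.0, 600.0                      # assumed HU window for lung tissue
    clipped = np.clip(resampled, lo, hi)
    return (clipped - lo) / (hi - lo)            # first image with pixel values in [0, 1]
```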
- the target area in the first image can be detected.
- the target area in the first image can be detected by a convolutional neural network for position detection.
- a convolutional neural network may be used to detect a region including nodules in a three-dimensional medical image of the lung.
- the target area may be cropped to obtain the image block to be processed.
- the area including nodules in the three-dimensional medical image of the lung may be cropped to obtain the image block to be processed.
- the size of the image block to be processed can be determined according to the size of the nodules, and the image can be cropped accordingly to obtain one or more image blocks to be processed.
- the neural network can be used to determine the category information of the image block to be processed, and segment the target area.
- the image block to be processed is an image block in the three-dimensional medical image of the lung that includes nodules.
- the type of nodule in the image block to be processed (for example, AAHOPA, AIS, MIA, and IA) can be determined through a neural network, and the area where the nodule is located can be segmented.
- the target feature of the image block to be processed can be extracted through a shared feature extraction network for classification and segmentation processing.
- Inputting the image block to be processed into the shared feature extraction network for processing to obtain the target feature of the image block to be processed may include: performing a first feature extraction process on the image block to be processed to obtain the first feature of the image block to be processed; inputting the first feature into the first shared feature extraction block to obtain the output feature of the first shared feature extraction block, and outputting the output feature of the first shared feature extraction block to the subsequent M-1 shared feature extraction blocks; inputting the output features of the first j-1 shared feature extraction blocks into the j-th shared feature extraction block to obtain the output feature of the j-th shared feature extraction block; performing a second feature extraction process on the output feature of the M-th shared feature extraction block to obtain the second feature of the image block to be processed; and pooling the second feature to obtain the target feature.
- the first feature extraction process can be performed first, for example, through a network module including a three-dimensional convolutional layer, a batch normalization layer and an activation layer, which performs the first feature extraction process to obtain the first feature.
- the embodiment of the present application does not limit the network level where the first feature extraction process is performed.
- the shared feature extraction network may include multiple shared feature extraction blocks, and the shared feature extraction block may include multiple network levels, for example, convolutional layers, activation layers, etc.
- the embodiments of the present application do not limit the network levels included in a shared feature extraction block.
- the first feature can be processed through multiple shared feature extraction blocks.
- the number of shared feature extraction blocks is M
- the first feature can be input into the first shared feature extraction block, that is, the first shared feature extraction block can take the first feature as its input feature and perform feature extraction processing on the input feature to obtain its output feature.
- the output feature of the first shared feature extraction block can be shared by all subsequent shared feature extraction blocks, that is, the output feature of the first shared feature extraction block can be output to the subsequent M-1 shared feature extraction blocks and used as part of their input features.
- the input feature of the second shared feature extraction block is the output feature of the first shared feature extraction block.
- after the second shared feature extraction block performs feature extraction processing on its input features, it can output its output feature to the subsequent 3rd to M-th shared feature extraction blocks as part of their input features.
- the input features of the third shared feature extraction block are the output feature of the first shared feature extraction block and the output feature of the second shared feature extraction block. These two output features can be fused before being input to the third shared feature extraction block (that is, the fused feature serves as its input feature), or the third shared feature extraction block can directly take both output features as input features (for example, the third shared feature extraction block may include a feature fusion layer that fuses the features at this level, or it may retain all feature channels and directly perform subsequent processing on the features of all channels, that is, on the output features of the first and second shared feature extraction blocks), and then perform feature extraction processing on the input features (for example, directly on the features of all feature channels).
- in general, the output features of the first j-1 shared feature extraction blocks can be used as the input features of the j-th shared feature extraction block. The output features may be fused first, with the fused feature used as the input of the j-th shared feature extraction block, or the output features of the first j-1 shared feature extraction blocks may be input directly (for example, fusion is performed inside the j-th shared feature extraction block, or subsequent processing is performed directly on the features of all feature channels, that is, on the output features of the first j-1 shared feature extraction blocks).
- the j-th shared feature extraction block can perform feature extraction processing on its input features to obtain its output feature, and this output feature serves as part of the input features of the (j+1)-th to M-th shared feature extraction blocks.
- the M-th shared feature extraction block may obtain its output feature according to the output features of the first M-1 shared feature extraction blocks.
- the second feature extraction process can be performed through the subsequent network level of the shared feature extraction network.
- for example, a network module including a three-dimensional convolutional layer, a batch normalization layer, and an activation layer can perform the second feature extraction process on the output feature of the M-th shared feature extraction block to obtain the second feature.
- the embodiment of the present application does not limit the network level for performing the second feature extraction process.
- the second feature may be pooled.
- the second feature may be pooled through an average pooling layer to obtain the target feature.
- the embodiment of the application does not limit the type of pooling processing.
- the foregoing processing may be performed multiple times; for example, the neural network may include multiple shared feature extraction networks.
- the first shared feature extraction network can take the first feature as the input feature.
- after the feature extraction of the shared feature extraction blocks, the second feature extraction process and the pooling process, the output feature of the first shared feature extraction network is obtained.
- the second shared feature extraction network can use the output feature of the first shared feature extraction network as its input feature, and the output feature of the second shared feature extraction network can be obtained in the same way.
- after processing by multiple shared feature extraction networks in this manner, the output feature of the last (for example, the fourth) shared feature extraction network is used as the target feature.
- the embodiment of the present application does not limit the number of shared feature extraction networks.
- the shared feature extraction block of the shared feature extraction network can obtain the output features of all previous shared feature extraction blocks, and input its own output feature to all subsequent shared feature extraction blocks. This can strengthen the gradient flow in the network, alleviate gradient disappearance, and improve the feature extraction and learning capabilities, which is conducive to finer classification and segmentation of the input image blocks to be processed.
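- The following PyTorch sketch illustrates one possible shared feature extraction network with densely connected shared feature extraction blocks as described above; the channel counts, growth rate, kernel sizes and number of blocks are illustrative assumptions, not values fixed by this application.

```python
# Minimal sketch of a shared feature extraction network: a first feature
# extraction module, M densely connected shared feature extraction blocks
# (each block sees the outputs of all previous blocks), a second feature
# extraction module, and average pooling.
import torch
import torch.nn as nn

class SharedBlock(nn.Module):
    """One shared feature extraction block operating on all previous outputs."""
    def __init__(self, in_channels: int, growth: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv3d(in_channels, growth, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(growth),
            nn.ReLU(inplace=True),
        )

    def forward(self, prev_features):
        # Input feature = concatenation of the outputs of all previous blocks.
        return self.layers(torch.cat(prev_features, dim=1))

class SharedFeatureExtractionNetwork(nn.Module):
    def __init__(self, in_channels: int = 1, base: int = 32, growth: int = 16, num_blocks: int = 4):
        super().__init__()
        # First feature extraction: 3D convolution + batch normalization + activation.
        self.first = nn.Sequential(
            nn.Conv3d(in_channels, base, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(base),
            nn.ReLU(inplace=True),
        )
        self.blocks = nn.ModuleList(
            [SharedBlock(base + i * growth, growth) for i in range(num_blocks)]
        )
        # Second feature extraction on the output of the last (M-th) block.
        self.second = nn.Sequential(
            nn.Conv3d(growth, base, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(base),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.AvgPool3d(kernel_size=2)

    def forward(self, x):
        features = [self.first(x)]
        for block in self.blocks:
            features.append(block(features))  # each block sees all previous outputs
        return self.pool(self.second(features[-1]))
```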
- the category information of the image block to be processed can be determined according to the target feature.
- when the image block to be processed is an image block including a nodule or other lesion in a three-dimensional medical image of the lung, the category of the nodule can be determined according to the target feature, for example, whether the nodule is preinvasive atypical adenomatous hyperplasia, adenocarcinoma in situ, minimally invasive adenocarcinoma, or invasive adenocarcinoma.
- the target feature can be classified through the classification network to obtain the category information of the image block to be processed.
- the classification network may include multiple network levels, such as convolutional layer, global average pooling layer (Global Average Pooling), and fully connected layer (Fully Connected Layer), etc.
- the above network levels can classify the target feature and output the category information.
- the category information may be category information expressed in the form of a vector or the like, and the probability distribution of the image block to be processed represented by the vector belonging to each category can be determined through a probability dictionary or the like, and then the category information of the image block to be processed can be determined.
- the vector of category information can also directly represent the category probabilities of the image block to be processed, with each element of the vector representing the probability that the image block belongs to the corresponding category. For example, (0.8, 0.1, 0.1) can represent that the probability of the image block to be processed belonging to the first category is 0.8, the probability of belonging to the second category is 0.1, and the probability of belonging to the third category is 0.1. The category with the highest probability can be determined as the category of the image block to be processed, that is, the category information of the image block to be processed is determined to be the first category.
- the embodiment of this application does not limit the representation method of category information.
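- As a toy illustration of reading such a category vector, the index with the highest probability is taken as the predicted category; the category names below are placeholders, not a mapping defined by this application.

```python
# Toy example: pick the category with the highest probability from a
# category vector such as (0.8, 0.1, 0.1).
import torch

probs = torch.tensor([0.8, 0.1, 0.1])
categories = ["category_1", "category_2", "category_3"]
predicted = categories[int(torch.argmax(probs))]
print(predicted)  # -> "category_1"
```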
- the target area in the image block to be processed can also be determined according to the target feature.
- for example, when the image block to be processed is an image block including a nodule or other lesion in a three-dimensional medical image of the lung, the location of the nodule can be determined according to the target feature, and the area where it is located can be segmented.
- segmentation processing can be performed through a segmentation network to obtain the target area in the image block to be processed, for example, the target area can be segmented.
- the segmentation network may include multiple network levels, for example, an upsampling layer (Upsample), a fully connected layer, and so on.
- the target feature is a feature map obtained by performing feature extraction and pooling on the image block to be processed in the shared feature extraction network, and the resolution of the target feature may be lower than that of the image block to be processed.
- Upsampling can be performed through the upsampling layer to reduce the number of feature channels of the target feature and increase the resolution, so that the feature map output by the segmentation network is consistent with the resolution of the image block to be processed. For example, if the shared feature extraction network performs four pooling processing, four upsampling processing can be performed through the upsampling layer, so that the output feature map of the segmentation network is consistent with the resolution of the image block to be processed.
- the target area can be segmented in the feature map output by the segmentation network, for example, the target area where the nodule is located is determined by a contour line or contour surface. The embodiment of the present application does not limit the network levels of the segmentation network.
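- A rough sketch of a segmentation head that restores the resolution of the target feature through repeated upsampling before predicting a per-voxel mask; the channel counts, the number of upsampling steps (4, matching four pooling operations), and the choice of trilinear upsampling are assumptions for illustration.

```python
# Sketch of a segmentation head: repeated upsampling + convolution, then a
# 1x1x1 convolution producing one foreground (nodule) logit per voxel.
import torch.nn as nn

def make_segmentation_head(in_channels: int = 32, num_upsamples: int = 4) -> nn.Sequential:
    layers = []
    channels = in_channels
    for _ in range(num_upsamples):
        out_channels = max(channels // 2, 8)
        layers += [
            nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False),
            nn.Conv3d(channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        ]
        channels = out_channels
    layers.append(nn.Conv3d(channels, 1, kernel_size=1))
    return nn.Sequential(*layers)
```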
- the position of the target area in the image to be processed can also be determined.
- the position of the target area in the image to be processed can be restored according to the position of the image block to be processed in the image to be processed and the position of the target area in the image block to be processed.
- for example, the nodule in the image block to be processed can be segmented, and the position of the nodule in the lung medical image can be restored according to the position of the image block in the lung medical image.
- in this way, finely classified sample image blocks can be obtained and used to train the neural network, so that the neural network can finely classify images and improve classification efficiency and accuracy; in addition, the target feature can be obtained through the shared feature extraction network.
- the shared feature extraction block of the shared feature extraction network can obtain the output features of all previous shared feature extraction blocks, and input its own output features to all subsequent shared feature extraction blocks to strengthen the gradient flow in the network.
- FIG. 3 is a schematic diagram of an application of the neural network training method provided by an embodiment of the present application.
- as shown in FIG. 3, the sample image blocks 32 are image blocks that include lesions (for example, nodules).
- the sample image block may have category annotations.
- the sample image blocks may include four categories: AAH, AIS, MIA, and IA.
- the sample image block 32 may be input to the neural network 33.
- the shared feature extraction network 331 included in the neural network 33 performs feature extraction on each batch of sample image blocks to obtain the sample target features of the sample image blocks, and the classification network 332 included in the neural network 33 obtains the category prediction information of the sample image blocks; the classification loss of the neural network can be determined by formula (1) and formula (2).
- the segmentation network 333 included in the neural network 33 can obtain the prediction target area in the sample image block 32, and can determine the segmentation loss of the neural network according to formula (3).
- the weighted sum of the segmentation loss and the classification loss can be used to obtain the comprehensive network loss of the neural network, and the neural network can be trained through the comprehensive network loss.
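- A minimal sketch of the comprehensive network loss as a weighted sum of the classification loss and the segmentation loss; the weights are unspecified hyperparameters and not values given by this application.

```python
# Sketch: the comprehensive network loss is a weighted sum of the
# classification loss and the segmentation loss.
def comprehensive_loss(classification_loss, segmentation_loss, w_cls=1.0, w_seg=1.0):
    return w_cls * classification_loss + w_seg * segmentation_loss
```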
- the trained neural network can be used to determine the focus area and focus category in the image block of the medical image.
- the image to be processed may be a three-dimensional lung medical image (for example, a lung CT image), and the image block to be processed may be a three-dimensional image block, cut out of the image to be processed, that contains a lesion region (for example, an area with nodules).
- the three-dimensional medical image can be resampled to obtain a three-dimensional image with a resolution of 1×1×1, the area where the lungs are located can be cropped, and then the cropped lung region can be normalized.
- the area where the nodule is located in the area where the lung is located can be detected, and a plurality of image blocks to be processed including the area where the nodule is located can be cropped according to a size of 64 ⁇ 64 ⁇ 64.
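- A rough sketch of the isotropic resampling step, using scipy as an implementation choice; the application does not prescribe a specific resampling library or interpolation order.

```python
# Sketch: resample a CT volume so that each voxel covers 1 mm along every
# axis, given the original voxel spacing (in mm) along (z, y, x).
import numpy as np
from scipy.ndimage import zoom

def resample_to_1mm(volume: np.ndarray, spacing_zyx) -> np.ndarray:
    factors = [float(s) for s in spacing_zyx]  # e.g. (2.5, 0.7, 0.7) -> zoom factors
    return zoom(volume.astype(np.float32), zoom=factors, order=1)
```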
- multiple image blocks to be processed may be subjected to feature extraction processing in batches to obtain target features of the image blocks to be processed.
- the first feature extraction process may be performed first, for example, the first feature extraction process may be performed through a network module including a three-dimensional convolution layer, a batch normalization layer, and an activation layer to obtain the first feature.
- the first feature may be input to the shared feature extraction network.
- the shared feature extraction network may include multiple shared feature extraction blocks.
- the number of shared feature extraction blocks is M
- the first feature can be input to the first shared feature extraction block for processing
- the output feature of the first shared feature extraction block can be output to the subsequent M-1 shared feature extraction blocks.
- the input feature of the second shared feature extraction block is the output feature of the first shared feature extraction block, and the second shared feature extraction block can output its output feature to the subsequent 3rd to M-th shared feature extraction blocks.
- the input features of the third shared feature extraction block are the output feature of the first shared feature extraction block and the output feature of the second shared feature extraction block, and the output feature of the third shared feature extraction block can be output to the 4th to M-th shared feature extraction blocks.
- the output features of the first j-1 shared feature extraction blocks can be input to the j-th shared feature extraction block, and the output feature of the j-th shared feature extraction block can be output to the (j+1)-th to M-th shared feature extraction blocks.
- the M-th shared feature extraction block can obtain its output feature according to the output features of the previous M-1 shared feature extraction blocks, after which the second feature extraction process can be performed; for example, a network module including a three-dimensional convolutional layer, a batch normalization layer, and an activation layer performs the second feature extraction process on the output feature of the M-th shared feature extraction block to obtain the second feature.
- the second feature may be pooled (for example, average pooling) processing to obtain the target feature.
- the foregoing processing may be performed multiple times (for example, 4 times), that is, multiple shared feature extraction networks may be included.
- the classification network may perform classification processing on the target feature to obtain category information of the image block to be processed.
- the classification network can obtain the category information of the image block to be processed through the convolutional layer, the global average pooling layer, and the fully connected layer.
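- A small sketch of such a classification head (convolution, global average pooling, fully connected layer); the channel counts are illustrative, and the four-class output matches the AAH/AIS/MIA/IA categories mentioned above.

```python
# Sketch of the classification network structure: 3D convolution, global
# average pooling, and a fully connected layer producing four class logits.
import torch.nn as nn

classification_head = nn.Sequential(
    nn.Conv3d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool3d(1),  # global average pooling
    nn.Flatten(),
    nn.Linear(64, 4),         # four nodule categories (AAH, AIS, MIA, IA)
)
```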
- the segmentation network can perform segmentation processing on the target feature to obtain the target area (ie, the area where the nodule is located).
- the segmentation network performs four upsampling operations through the upsampling layer, so that the output feature map of the segmentation network is consistent with the resolution of the image block to be processed, and the target area can be segmented in the feature map output by the segmentation network.
- the aforementioned neural network can obtain both the category and the target area of the image block to be processed when the target area and category in the image block to be processed are unknown (for example, the area where the nodule is located can be segmented and the category of the nodule obtained); when the category of the image block to be processed is known, it can obtain only the target area in the image block to be processed (for example, segment the area where the nodule is located); and when the target area in the image block to be processed is known, it can obtain only the category of the image block to be processed (for example, determine the category of the nodule).
- the image processing method can be used to segment and classify lesion regions in medical images such as lung CT images, improving clinical work efficiency and reducing missed diagnosis and misdiagnosis; it can also be used to classify and segment other images.
- the embodiments of the present application do not limit the application field of the image processing method.
- FIG. 4 is a schematic diagram of a neural network training device provided by an embodiment of the present application.
- the device includes: an acquisition module 11 configured to acquire position information and category information of a target area in a sample image; a first segmentation module 12 configured to segment the sample image according to the position information of the target area in the sample image to obtain at least one sample image block; a classification module 13 configured to classify the at least one sample image block according to the category information to obtain N classes of sample image blocks, where N is an integer and N≥1; and a training module 14 configured to input the N classes of sample image blocks into the neural network for training.
- the sample image is a medical imaging picture.
- the acquisition module 11 is further configured to: locate a target area on the medical imaging picture to obtain the location information of the target area; obtain a pathology picture associated with the medical imaging picture, where the pathology picture is a diagnosed picture that includes pathology information; and determine the category information of the target area on the medical imaging picture according to the pathology information of each target area on the pathology picture.
- the training module 14 is further configured to: input any sample image block into the neural network for processing to obtain category prediction information and a prediction target area of the sample image block; determine a classification loss at least according to the category prediction information and the category information of the sample image block; determine a segmentation loss according to the prediction target area and the position information of the sample image block; and train the neural network according to the classification loss and the segmentation loss.
- the training module 14 is further configured to: determine a first classification loss according to the category prediction information and the category information of the sample image block; determine a second classification loss according to the category prediction information and the category information of the class center of the category to which the sample image block belongs; and perform weighted summation on the first classification loss and the second classification loss to obtain the classification loss.
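- The following is only a hedged sketch of the weighted-sum structure of the classification loss; formulas (1) and (2) are not reproduced here, and modelling the first loss as cross-entropy and the second as a center-loss-style distance to the class center of the sample's category are assumptions for illustration.

```python
# Sketch: classification loss as a weighted sum of a per-label loss and a
# class-center-based loss (assumed forms, not the patent's exact formulas).
import torch
import torch.nn.functional as F

def classification_loss(logits, labels, features, class_centers, w1=1.0, w2=0.1):
    first = F.cross_entropy(logits, labels)                                   # vs. category labels
    second = ((features - class_centers[labels]) ** 2).sum(dim=1).mean()      # vs. class centers
    return w1 * first + w2 * second
```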
- the training module 14 is further configured to: determine a first weight of the prediction target area and a second weight of the sample background area in the sample image block according to a first proportion of the number of pixels of the prediction target area in the sample image block; and determine the segmentation loss according to the first weight, the second weight, the prediction target area, and the position information of the sample image block.
- the training module 14 is further configured to: determine a second proportion of the sample background area in the sample image block according to the first proportion of the number of pixels of the prediction target area in the sample image block; determine the second proportion as the first weight, and determine the first proportion as the second weight.
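- A sketch of the weighting rule described above, where the target-area weight is the background proportion and vice versa so that the usually small nodule region is not dominated by background voxels; the use of binary cross-entropy is an assumption, and only the swap of the two proportions as weights follows the description.

```python
# Sketch: compute per-voxel weights from the proportions of the prediction
# target area and the background, then apply them to a segmentation loss.
import torch
import torch.nn.functional as F

def weighted_segmentation_loss(pred_prob, gt_mask):
    pred_area = (pred_prob > 0.5).float()
    first_ratio = pred_area.mean()          # proportion of prediction-target-area voxels
    second_ratio = 1.0 - first_ratio        # proportion of background voxels
    # first weight (target area) = second ratio; second weight (background) = first ratio
    weights = torch.where(gt_mask.bool(), second_ratio, first_ratio)
    return F.binary_cross_entropy(pred_prob, gt_mask.float(), weight=weights)
```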
- the category information includes: preinvasive adenocarcinoma atypical adenomatous hyperplasia nodules, adenocarcinoma in situ nodules, minimally invasive adenocarcinoma nodules, and invasive adenocarcinoma nodules.
- the neural network includes a shared feature extraction network, a classification network, and a segmentation network.
- the device further includes: an obtaining module configured to input an image block to be processed into the shared feature extraction network for processing to obtain the target feature of the image block to be processed, where the shared feature extraction network includes M shared feature extraction blocks, the input features of the i-th shared feature extraction block include the output features of the first i-1 shared feature extraction blocks, and i and M are integers with 1<i≤M; a classification module configured to input the target feature into the classification network for classification processing to obtain the category information of the image block to be processed; and a segmentation module configured to input the target feature into the segmentation network for segmentation processing to obtain the target area in the image block to be processed.
- the obtaining module is further configured to: perform a first feature extraction process on the image block to be processed to obtain the first feature of the image block to be processed; input the first feature into the first shared feature extraction block to obtain the output feature of the first shared feature extraction block, and output the output feature of the first shared feature extraction block to the subsequent M-1 shared feature extraction blocks; input the output features of the first j-1 shared feature extraction blocks into the j-th shared feature extraction block to obtain the output feature of the j-th shared feature extraction block, where j is an integer and 1<j<M; perform a second feature extraction process on the output feature of the M-th shared feature extraction block to obtain the second feature of the image block to be processed; and pool the second feature to obtain the target feature.
- the device further includes: a preprocessing module configured to preprocess the image to be processed to obtain a first image; a positioning module configured to locate a target area on the first image, The location information of the target area in the first image is determined; the second segmentation module is configured to segment to obtain at least one image block to be processed according to the location information of the target area in the first image.
- the functions or modules included in the apparatus provided in the embodiments of the present application may be configured to execute the methods described in the above method embodiments, and for implementation, refer to the description of the above method embodiments.
- the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and the computer program is configured to execute the above method when running.
- the computer-readable storage medium may be a non-volatile computer-readable storage medium.
- An embodiment of the present application further provides an electronic device, including: a processor; a memory configured to store a computer program executable by the processor; wherein the processor is configured to execute the above method through the computer program.
- the electronic device can be provided as a terminal, server or other form of device.
- the embodiments of the present application also provide a computer program product including computer-readable code; when the computer-readable code runs on a device, the processor in the device executes instructions for implementing the neural network training method provided by any of the above embodiments.
- the embodiments of the present application also provide another computer program product configured to store computer-readable instructions, which when executed, cause the computer to perform the operations of the neural network training method provided in any of the foregoing embodiments.
- Fig. 5 is a schematic diagram of an electronic device provided by an embodiment of the present application.
- the electronic device 800 may be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and other terminals.
- the electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
- the processing component 802 generally controls the overall operations of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
- the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the foregoing method.
- the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components.
- the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.
- the memory 804 is configured to store various types of data to support operations in the electronic device 800. Examples of these data include instructions for any application or method to operate on the electronic device 800, contact data, phone book data, messages, pictures, videos, etc.
- the memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
- the power supply component 806 provides power for various components of the electronic device 800, and may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
- the multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and the user.
- the screen may include a liquid crystal display (Liquid Crystal Display, LCD) and a touch panel (TouchPanel, TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
- the touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
- the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
- the audio component 810 is configured to output and/or input audio signals.
- the audio component 810 includes a microphone (Microphone, MIC).
- the microphone is configured to receive an external audio signal.
- the received audio signal may be stored in the memory 804 or transmitted via the communication component 816.
- the audio component 810 further includes a speaker configured to output audio signals.
- the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module.
- the above-mentioned peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include, but are not limited to: home button, volume button, start button, and lock button.
- the sensor component 814 includes one or more sensors configured to provide the electronic device 800 with various aspects of state evaluation.
- the sensor component 814 can detect the on/off status of the electronic device 800 and the relative positioning of components, for example, the display and the keypad of the electronic device 800.
- the sensor component 814 can also detect a change in position of the electronic device 800 or a component of the electronic device 800, the presence or absence of contact between the user and the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800.
- the sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact.
- the sensor component 814 may also include a light sensor, such as a complementary metal oxide semiconductor (Complementary Metal Oxide Semiconductor, CMOS) or a charge coupled device (Charge Coupled Device, CCD) image sensor, which can be used in imaging applications.
- the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
- the communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices.
- the electronic device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof.
- the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel.
- the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communication.
- the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
- the electronic device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, configured to execute the above method.
- a non-volatile computer-readable storage medium, such as the memory 804 including computer program instructions, is also provided; the computer program instructions can be executed by the processor 820 of the electronic device 800 to complete the foregoing method.
- Fig. 6 is a schematic diagram of another electronic device provided by an embodiment of the present application.
- the electronic device 1900 may be provided as a server, as shown in FIG. 6.
- the electronic device 1900 includes a processing component 1922, and may also include one or more processors, and a memory resource represented by the memory 1932, configured to store instructions executable by the processing component 1922, such as application programs.
- the application program stored in the memory 1932 may include one or more modules each corresponding to a set of instructions.
- the processing component 1922 is configured to execute instructions to perform the above-described methods.
- the electronic device 1900 may also include a power supply component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958.
- the electronic device 1900 can operate based on an operating system stored in the memory 1932, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM or the like.
- a non-volatile computer-readable storage medium such as the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to complete the foregoing method.
- the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions configured to enable a processor to implement various aspects of the present application.
- the computer-readable storage medium may be a tangible device that holds and stores instructions used by the instruction execution device.
- the computer-readable storage medium may be, for example, but not limited to: an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
- the computer-readable storage medium may include: a portable computer disk, a hard disk, random access memory (RAM), ROM, erasable programmable read-only memory (EPROM or flash memory), SRAM, portable compact disk read-only memory (CD-ROM), digital video disc (DVD), a memory stick, a floppy disk, a mechanical coding device such as a punch card or an in-groove protrusion structure with instructions stored thereon, and any suitable combination of the above.
- the computer-readable storage medium used here is not to be interpreted as a transient signal itself, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (such as a light pulse through a fiber-optic cable), or an electrical signal transmitted through a wire.
- the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
- the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
- the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network, and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device.
- the computer program instructions configured to perform the operations of this application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
- Computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
- the remote computer may be connected to the user's computer through any kind of network (including a local area network or a wide area network), or may be connected to an external computer (for example, using an Internet service provider to connect through the Internet).
- electronic circuits, such as programmable logic circuits, FPGAs, or programmable logic arrays (PLA), can be customized by using the state information of the computer-readable program instructions, and these electronic circuits can execute the computer-readable program instructions to realize various aspects of this application.
- These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device to produce a machine, so that when the instructions are executed by the processor of the computer or other programmable data processing device, a device that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams is produced. These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions make computers, programmable data processing apparatuses, and/or other devices work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions for implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
- each block in the flowcharts or block diagrams may represent a module, a program segment, or a part of an instruction, and the module, program segment, or part of an instruction includes one or more executable instructions configured to implement the specified logical function.
- the blocks may also be executed in an order different from the order marked in the drawings; for example, two consecutive blocks can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
- each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
- the computer program product can be implemented by hardware, software or a combination thereof.
- the computer program product may be embodied as a computer storage medium.
- the computer program product may be embodied as a software product, such as a software development kit (SDK).
- This application relates to a neural network training method and device, electronic equipment, and storage medium.
- the method includes: obtaining position information and category information of a target area in a sample image; segmenting to obtain at least one sample image block according to the position information of the target area; classifying the at least one sample image block according to the category information to obtain N classes of sample image blocks; and inputting the N classes of sample image blocks into the neural network for training.
- fine classification of sample image blocks can be obtained, and the neural network can be trained, so that the neural network can finely classify images, and improve classification efficiency and accuracy.
Claims (25)
- A neural network training method, wherein the neural network training method is used to train a neural network model and images are classified according to the trained neural network model, the method comprising: acquiring position information and category information of a target area in a sample image; segmenting the sample image according to the position information of the target area in the sample image to obtain at least one sample image block; classifying the at least one sample image block according to the category information to obtain N classes of sample image blocks, where N is an integer and N≥1; and inputting the N classes of sample image blocks into a neural network for training.
- The method according to claim 1, wherein the sample image is a medical imaging picture.
- The method according to claim 2, wherein acquiring the position information and category information of the target area in the sample image comprises: locating the target area on the medical imaging picture to obtain the position information of the target area; acquiring a pathology picture associated with the medical imaging picture, wherein the pathology picture is a diagnosed picture that includes pathology information; and determining the category information of the target area on the medical imaging picture according to the pathology information of each target area on the pathology picture.
- The method according to claim 1, wherein inputting the N classes of sample image blocks into the neural network for training comprises: inputting any sample image block into the neural network for processing to obtain category prediction information and a prediction target area of the sample image block; determining a classification loss at least according to the category prediction information and the category information of the sample image block; determining a segmentation loss according to the prediction target area and the position information of the sample image block; and training the neural network according to the classification loss and the segmentation loss.
- The method according to claim 4, wherein determining the classification loss according to the category prediction information and the category information of the sample image block comprises: determining a first classification loss according to the category prediction information and the category information of the sample image block; determining a second classification loss according to the category prediction information and the category information of the class center of the category to which the sample image block belongs; and performing weighted summation on the first classification loss and the second classification loss to obtain the classification loss.
- The method according to claim 4, wherein determining the segmentation loss according to the prediction target area and the position information of the sample image block comprises: determining a first weight of the prediction target area and a second weight of a sample background area in the sample image block according to a first proportion of the number of pixels of the prediction target area in the sample image block; and determining the segmentation loss according to the first weight, the second weight, the prediction target area, and the position information of the sample image block.
- The method according to claim 6, wherein determining the first weight of the prediction target area and the second weight of the sample background area in the sample image block according to the first proportion of the number of pixels of the prediction target area in the sample image block comprises: determining a second proportion of the sample background area in the sample image block according to the first proportion of the number of pixels of the prediction target area in the sample image block; and determining the second proportion as the first weight and the first proportion as the second weight.
- The method according to any one of claims 1 to 7, wherein the category information includes: preinvasive adenocarcinoma atypical adenomatous hyperplasia nodules, adenocarcinoma in situ nodules, minimally invasive adenocarcinoma nodules, and invasive adenocarcinoma nodules.
- The method according to any one of claims 1 to 8, wherein the neural network includes a shared feature extraction network, a classification network, and a segmentation network; the method further comprising: inputting an image block to be processed into the shared feature extraction network for processing to obtain a target feature of the image block to be processed, wherein the shared feature extraction network includes M shared feature extraction blocks, the input features of the i-th shared feature extraction block include the output features of the first i-1 shared feature extraction blocks, and i and M are integers with 1<i≤M; inputting the target feature into the classification network for classification processing to obtain category information of the image block to be processed; and inputting the target feature into the segmentation network for segmentation processing to obtain a target area in the image block to be processed.
- The method according to claim 9, wherein inputting the image block to be processed into the shared feature extraction network for processing to obtain the target feature of the image block to be processed comprises: performing a first feature extraction process on the image block to be processed to obtain a first feature of the image block to be processed; inputting the first feature into the first shared feature extraction block to obtain an output feature of the first shared feature extraction block, and outputting the output feature of the first shared feature extraction block to the subsequent M-1 shared feature extraction blocks; inputting the output features of the first j-1 shared feature extraction blocks into the j-th shared feature extraction block to obtain an output feature of the j-th shared feature extraction block, where j is an integer and 1<j<M; performing a second feature extraction process on the output feature of the M-th shared feature extraction block to obtain a second feature of the image block to be processed; and pooling the second feature to obtain the target feature.
- The method according to claim 9, wherein the method further comprises: preprocessing an image to be processed to obtain a first image; locating a target area on the first image to determine position information of the target area in the first image; and segmenting the first image according to the position information of the target area in the first image to obtain at least one image block to be processed.
- A neural network training apparatus, wherein the neural network training apparatus is used to train a neural network model and images are classified according to the trained neural network model, the apparatus comprising: an acquisition module configured to acquire position information and category information of a target area in a sample image; a first segmentation module configured to segment the sample image according to the position information of the target area in the sample image to obtain at least one sample image block; a classification module configured to classify the at least one sample image block according to the category information to obtain N classes of sample image blocks, where N is an integer and N≥1; and a training module configured to input the N classes of sample image blocks into a neural network for training.
- The apparatus according to claim 12, wherein the sample image is a medical imaging picture.
- The apparatus according to claim 13, wherein the acquisition module is further configured to: locate the target area on the medical imaging picture to obtain the position information of the target area; acquire a pathology picture associated with the medical imaging picture, wherein the pathology picture is a diagnosed picture that includes pathology information; and determine the category information of the target area on the medical imaging picture according to the pathology information of each target area on the pathology picture.
- The apparatus according to claim 12, wherein the training module is further configured to: input any sample image block into the neural network for processing to obtain category prediction information and a prediction target area of the sample image block; determine a classification loss at least according to the category prediction information and the category information of the sample image block; determine a segmentation loss according to the prediction target area and the position information of the sample image block; and train the neural network according to the classification loss and the segmentation loss.
- The apparatus according to claim 15, wherein the training module is further configured to: determine a first classification loss according to the category prediction information and the category information of the sample image block; determine a second classification loss according to the category prediction information and the category information of the class center of the category to which the sample image block belongs; and perform weighted summation on the first classification loss and the second classification loss to obtain the classification loss.
- The apparatus according to claim 15, wherein the training module is further configured to: determine a first weight of the prediction target area and a second weight of a sample background area in the sample image block according to a first proportion of the number of pixels of the prediction target area in the sample image block; and determine the segmentation loss according to the first weight, the second weight, the prediction target area, and the position information of the sample image block.
- The apparatus according to claim 17, wherein the training module is further configured to: determine a second proportion of the sample background area in the sample image block according to the first proportion of the number of pixels of the prediction target area in the sample image block; and determine the second proportion as the first weight and the first proportion as the second weight.
- The apparatus according to any one of claims 12 to 18, wherein the category information includes: preinvasive adenocarcinoma atypical adenomatous hyperplasia nodules, adenocarcinoma in situ nodules, minimally invasive adenocarcinoma nodules, and invasive adenocarcinoma nodules.
- The apparatus according to any one of claims 12 to 19, wherein the neural network includes a shared feature extraction network, a classification network, and a segmentation network; the apparatus further comprising: an obtaining module configured to input an image block to be processed into the shared feature extraction network for processing to obtain a target feature of the image block to be processed, wherein the shared feature extraction network includes M shared feature extraction blocks, the input features of the i-th shared feature extraction block include the output features of the first i-1 shared feature extraction blocks, and i and M are integers with 1<i≤M; a classification module configured to input the target feature into the classification network for classification processing to obtain category information of the image block to be processed; and a segmentation module configured to input the target feature into the segmentation network for segmentation processing to obtain a target area in the image block to be processed.
- The apparatus according to claim 20, wherein the obtaining module is further configured to: perform a first feature extraction process on the image block to be processed to obtain a first feature of the image block to be processed; input the first feature into the first shared feature extraction block to obtain an output feature of the first shared feature extraction block, and output the output feature of the first shared feature extraction block to the subsequent M-1 shared feature extraction blocks; input the output features of the first j-1 shared feature extraction blocks into the j-th shared feature extraction block to obtain an output feature of the j-th shared feature extraction block, where j is an integer and 1<j<M; perform a second feature extraction process on the output feature of the M-th shared feature extraction block to obtain a second feature of the image block to be processed; and pool the second feature to obtain the target feature.
- The apparatus according to claim 20, wherein the apparatus further comprises: a preprocessing module configured to preprocess an image to be processed to obtain a first image; a positioning module configured to locate a target area on the first image and determine position information of the target area in the first image; and a second segmentation module configured to segment the first image according to the position information of the target area in the first image to obtain at least one image block to be processed.
- An electronic device, comprising: a processor; and a memory configured to store a computer program executable by the processor; wherein the processor is configured to execute the method according to any one of claims 1 to 11 through the computer program.
- A computer-readable storage medium storing a computer program, wherein the computer program is configured to execute the method according to any one of claims 1 to 11 when run.
- A computer program comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method according to any one of claims 1 to 11.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021574781A JP2022537974A (ja) | 2020-03-05 | 2020-07-07 | ニューラルネットワーク訓練方法及び装置、電子機器並びに記憶媒体 |
KR1020217041454A KR20220009451A (ko) | 2020-03-05 | 2020-07-07 | 신경망 훈련 방법 및 장치, 전자 기기 및 저장 매체 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010148544.8 | 2020-03-05 | ||
CN202010148544.8A CN111368923B (zh) | 2020-03-05 | 2020-03-05 | 神经网络训练方法及装置、电子设备和存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021174739A1 true WO2021174739A1 (zh) | 2021-09-10 |
Family
ID=71208701
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/100715 WO2021174739A1 (zh) | 2020-03-05 | 2020-07-07 | 神经网络训练方法及装置、电子设备和存储介质 |
Country Status (5)
Country | Link |
---|---|
JP (1) | JP2022537974A (zh) |
KR (1) | KR20220009451A (zh) |
CN (1) | CN111368923B (zh) |
TW (1) | TWI770754B (zh) |
WO (1) | WO2021174739A1 (zh) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113793323A (zh) * | 2021-09-16 | 2021-12-14 | 云从科技集团股份有限公司 | 一种元器件检测方法、系统、设备及介质 |
CN113989845A (zh) * | 2021-10-29 | 2022-01-28 | 北京百度网讯科技有限公司 | 姿态分类方法和姿态分类模型的训练方法、装置 |
CN113989407A (zh) * | 2021-12-30 | 2022-01-28 | 青岛美迪康数字工程有限公司 | Ct影像中肢体部位识别模型训练方法及系统 |
CN113989721A (zh) * | 2021-10-29 | 2022-01-28 | 北京百度网讯科技有限公司 | 目标检测方法和目标检测模型的训练方法、装置 |
CN114037925A (zh) * | 2021-09-27 | 2022-02-11 | 北京百度网讯科技有限公司 | 目标检测模型的训练、检测方法、装置及电子设备 |
US20220084677A1 (en) * | 2020-09-14 | 2022-03-17 | Novocura Tech Health Services Private Limited | System and method for generating differential diagnosis in a healthcare environment |
CN114612824A (zh) * | 2022-03-09 | 2022-06-10 | 清华大学 | 目标识别方法及装置、电子设备和存储介质 |
CN116077066A (zh) * | 2023-02-10 | 2023-05-09 | 北京安芯测科技有限公司 | 心电信号分类模型的训练方法、装置及电子设备 |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111368923B (zh) * | 2020-03-05 | 2023-12-19 | 上海商汤智能科技有限公司 | 神经网络训练方法及装置、电子设备和存储介质 |
CN111767708A (zh) * | 2020-07-09 | 2020-10-13 | 北京猿力未来科技有限公司 | 解题模型的训练方法及装置、解题公式生成方法及装置 |
CN112017162B (zh) * | 2020-08-10 | 2022-12-06 | 上海杏脉信息科技有限公司 | 病理图像处理方法、装置、存储介质和处理器 |
CN112241760A (zh) * | 2020-08-25 | 2021-01-19 | 浙江大学 | 网络小额贷款服务中的黑中介自动挖掘方法与系统 |
CN112328398B (zh) * | 2020-11-12 | 2024-09-27 | 清华大学 | 任务处理方法及装置、电子设备和存储介质 |
CN112561893B (zh) * | 2020-12-22 | 2024-09-06 | 平安银行股份有限公司 | 图片匹配方法、装置、电子设备及存储介质 |
CN112785565B (zh) * | 2021-01-15 | 2024-01-05 | 上海商汤智能科技有限公司 | 目标检测方法及装置、电子设备和存储介质 |
CN112749801A (zh) * | 2021-01-22 | 2021-05-04 | 上海商汤智能科技有限公司 | 神经网络训练和图像处理方法及装置 |
CN112907517B (zh) * | 2021-01-28 | 2024-07-19 | 上海商汤善萃医疗科技有限公司 | 一种图像处理方法、装置、计算机设备及存储介质 |
CN112925938A (zh) * | 2021-01-28 | 2021-06-08 | 上海商汤智能科技有限公司 | 一种图像标注方法、装置、电子设备及存储介质 |
US11967084B2 (en) * | 2021-03-09 | 2024-04-23 | Ping An Technology (Shenzhen) Co., Ltd. | PDAC image segmentation method, electronic device and storage medium |
CN113139471A (zh) * | 2021-04-25 | 2021-07-20 | 上海商汤智能科技有限公司 | 目标检测方法及装置、电子设备和存储介质 |
AU2021204563A1 (en) * | 2021-06-17 | 2023-01-19 | Sensetime International Pte. Ltd. | Target detection methods, apparatuses, electronic devices and computer-readable storage media |
CN113702719B (zh) * | 2021-08-03 | 2022-11-29 | 北京科技大学 | 一种基于神经网络的宽带近场电磁定位方法及装置 |
CN113688975A (zh) * | 2021-08-24 | 2021-11-23 | 北京市商汤科技开发有限公司 | 神经网络的训练方法、装置、电子设备及存储介质 |
CN114049315B (zh) * | 2021-10-29 | 2023-04-18 | 北京长木谷医疗科技有限公司 | 关节识别方法、电子设备、存储介质及计算机程序产品 |
CN114332547B (zh) * | 2022-03-17 | 2022-07-08 | 浙江太美医疗科技股份有限公司 | 医学目标分类方法和装置、电子设备和存储介质 |
CN114743055A (zh) * | 2022-04-18 | 2022-07-12 | 北京理工大学 | 一种使用分区决策机制提高图像分类准确率的方法 |
CN114839340A (zh) * | 2022-04-27 | 2022-08-02 | 芯视界(北京)科技有限公司 | 水质生物活性检测方法及装置、电子设备和存储介质 |
KR20240018229A (ko) * | 2022-08-02 | 2024-02-13 | 김민구 | 시내퍼 모델을 이용한 자연어 처리 시스템 및 방법 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160034788A1 (en) * | 2014-07-30 | 2016-02-04 | Adobe Systems Incorporated | Learning image categorization using related attributes |
CN107330263A (zh) * | 2017-06-26 | 2017-11-07 | 成都知识视觉科技有限公司 | 一种计算机辅助乳腺浸润性导管癌组织学分级的方法 |
CN108520518A (zh) * | 2018-04-10 | 2018-09-11 | 复旦大学附属肿瘤医院 | 一种甲状腺肿瘤超声图像识别方法及其装置 |
CN109919961A (zh) * | 2019-02-22 | 2019-06-21 | 北京深睿博联科技有限责任公司 | 一种用于颅内cta图像中动脉瘤区域的处理方法及装置 |
CN111368923A (zh) * | 2020-03-05 | 2020-07-03 | 上海商汤智能科技有限公司 | 神经网络训练方法及装置、电子设备和存储介质 |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6031921B2 (ja) * | 2012-09-28 | 2016-11-24 | ブラザー工業株式会社 | 画像処理装置およびプログラム |
CN108229267B (zh) * | 2016-12-29 | 2020-10-16 | 北京市商汤科技开发有限公司 | 对象属性检测、神经网络训练、区域检测方法和装置 |
US10423861B2 (en) * | 2017-10-16 | 2019-09-24 | Illumina, Inc. | Deep learning-based techniques for training deep convolutional neural networks |
CN108335313A (zh) * | 2018-02-26 | 2018-07-27 | 阿博茨德(北京)科技有限公司 | 图像分割方法及装置 |
CN109285142B (zh) * | 2018-08-07 | 2023-01-06 | 广州智能装备研究院有限公司 | 一种头颈部肿瘤检测方法、装置及计算机可读存储介质 |
CN109447169B (zh) * | 2018-11-02 | 2020-10-27 | 北京旷视科技有限公司 | 图像处理方法及其模型的训练方法、装置和电子系统 |
CN110245657B (zh) * | 2019-05-17 | 2021-08-24 | 清华大学 | 病理图像相似性检测方法及检测装置 |
CN110210535B (zh) * | 2019-05-21 | 2021-09-10 | 北京市商汤科技开发有限公司 | 神经网络训练方法及装置以及图像处理方法及装置 |
CN110705555B (zh) * | 2019-09-17 | 2022-06-14 | 中山大学 | 基于fcn的腹部多器官核磁共振图像分割方法、系统及介质 |
CN110705626A (zh) * | 2019-09-26 | 2020-01-17 | 北京市商汤科技开发有限公司 | 一种图像处理方法及装置、电子设备和存储介质 |
CN110796656A (zh) * | 2019-11-01 | 2020-02-14 | 上海联影智能医疗科技有限公司 | 图像检测方法、装置、计算机设备和存储介质 |
Also Published As
Publication number | Publication date |
---|---|
KR20220009451A (ko) | 2022-01-24 |
CN111368923B (zh) | 2023-12-19 |
TWI770754B (zh) | 2022-07-11 |
CN111368923A (zh) | 2020-07-03 |
JP2022537974A (ja) | 2022-08-31 |
TW202133787A (zh) | 2021-09-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021174739A1 (zh) | 神经网络训练方法及装置、电子设备和存储介质 | |
WO2022151755A1 (zh) | 目标检测方法及装置、电子设备、存储介质、计算机程序产品和计算机程序 | |
WO2021147257A1 (zh) | 网络训练、图像处理方法及装置、电子设备和存储介质 | |
CN112767329B (zh) | 图像处理方法及装置、电子设备 | |
WO2021051965A1 (zh) | 图像处理方法及装置、电子设备、存储介质和计算机程序 | |
CN109886243B (zh) | 图像处理方法、装置、存储介质、设备以及系统 | |
WO2020211284A1 (zh) | 图像处理方法及装置、电子设备和存储介质 | |
WO2020211293A1 (zh) | 一种图像分割方法及装置、电子设备和存储介质 | |
JP2022537866A (ja) | 画像分類方法、画像分類装置、画像処理方法、医療用電子機器、画像分類機器、及びコンピュータプログラム | |
Marostica et al. | Development of a histopathology informatics pipeline for classification and prediction of clinical outcomes in subtypes of renal cell carcinoma | |
CN113222038B (zh) | 基于核磁图像的乳腺病灶分类和定位方法及装置 | |
US12118739B2 (en) | Medical image processing method, apparatus, and device, medium, and endoscope | |
CN114820584B (zh) | 肺部病灶定位装置 | |
WO2023050691A1 (zh) | 图像处理方法及装置、电子设备、存储介质和程序 | |
CN112508918A (zh) | 图像处理方法及装置、电子设备和存储介质 | |
WO2021259390A2 (zh) | 一种冠脉钙化斑块检测方法及装置 | |
WO2023142532A1 (zh) | 一种推理模型训练方法及装置 | |
TW202346826A (zh) | 影像處理方法 | |
CN113902730A (zh) | 图像处理和神经网络训练方法及装置 | |
CN115170464A (zh) | 肺图像的处理方法、装置、电子设备和存储介质 | |
Wang et al. | Breast cancer pre-clinical screening using infrared thermography and artificial intelligence: a prospective, multicentre, diagnostic accuracy cohort study | |
Saptasagar et al. | Diagnosis and Prediction of Lung Tumour Using Combined ML Techniques | |
Mikos et al. | An android-based pattern recognition application for the characterization of epidermal melanoma | |
CN113487537A (zh) | 乳腺癌超声高回声晕的信息处理方法、装置及存储介质 | |
Kumar et al. | Automating cancer diagnosis using advanced deep learning techniques for multi-cancer image classification |