CN112102285B - Bone age detection method based on multi-modal adversarial training
- Publication number
- CN112102285B (application CN202010962917.5A)
- Authority
- CN
- China
- Prior art keywords
- dimensional
- bone age
- training
- bone
- convolution operation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10116—X-ray image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30008—Bone
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Computational Linguistics (AREA)
- Public Health (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biophysics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Quality & Reliability (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Radiology & Medical Imaging (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Image Analysis (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention provides a bone age detection method based on multi-modal adversarial training. The method comprises constructing a bone age prediction data set, and constructing and training a bone age detection model based on multi-modal adversarial training. In the prediction stage, only the discriminator of the bone age detection model is retained, a softmax is added after its last layer, and the weights of the best-performing trained discriminator are loaded to predict the result. The method uses adversarial training to improve the accuracy of model prediction, performs multi-modal bone age detection using both medical images and textual medical records, and is trained on an X-ray data set of Chinese adolescents so that the bone age results fit that population. Combining text information with the images makes the identification result more accurate.
Description
Technical Field
The invention relates to the technical field of bone age detection, and in particular to a bone age detection method based on multi-modal adversarial training.
Background
Bone age analysis is an important indicator of growth and development, and plays an important role in medicine, sports, forensic identification and other fields. The degree of skeletal calcification in children is the determining factor of bone age, and bone age accurately reflects an individual's level of development at each stage of growth. A child's bone age is typically determined by a radiologist, who compares an X-ray film of the child's hand with the standard appearance for the corresponding age.
Bone age assessment of adolescents plays an important role in diagnosing pediatric endocrine problems and childhood growth disorders. It is commonly used to screen for conditions such as pediatric endocrine disorders, growth and developmental delay, and congenital adrenal hyperplasia, and to evaluate the effect of interventions. Bone age can also be used to establish the true age of minors, both in juvenile criminal cases and in sports competitions to confirm an athlete's age.
The radius, metacarpals, capitate, hamate, phalanges and other bones of the subject are X-rayed to obtain images at three positions: distal, middle and proximal. These X-ray images, together with necessary information about the subject (such as the parents' heights and the subject's disease history, date of birth and height, which can be dictated through speech recognition or filled in manually), are processed to obtain multi-source, multi-modal data.
Standard methods for assessing skeletal maturity have existed for nearly 100 years. The Tanner-Whitehouse (TW) method and the Greulich-Pyle (G-P) method are the two in common use. The G-P atlas was developed between the 1930s and the 1950s from children of upper-class families in the United States, covering birth to adulthood. Greulich and Pyle published it in 1950 and revised it in 1959, and it has been widely used ever since. There are two ways to assess bone age with the G-P atlas. The first assesses the bone age of each bone separately and takes the average as the bone age of the whole hand. The second, called the whole-hand matching method (or interpolation method), compares the film under test with the atlas plates one by one and takes the closest plate as the bone age; if the film falls between two adjacent plates, the average of their ages is used. The TW method was originally established in the 1930s for white European children. The second edition of the Tanner-Whitehouse method (TW2) was published in 1983 based on data from the 1950s and 1960s, and it was updated in 2001 as the third edition (TW3). Bone age estimated by the TW3 method is slightly lower than that estimated by the TW2 method. The TW method scores the radius, the ulna and the short bones, and each major bone of the hand contributes to the total score. One meta-analysis suggests that the TW3 method predicts the age of Caucasians more accurately than the TW2 or G-P method, and that for Caucasian children both TW3 and TW2 are more accurate than the G-P method. The TW method takes about 7.9 minutes per assessment and is the recommended method among European endocrinologists.
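The two ways of reading a G-P atlas described above reduce to simple arithmetic. The following is a minimal sketch, assuming ages are expressed in years; the function names and the sample readings are hypothetical and only illustrate the averaging.

```python
from statistics import mean


def per_bone_average(bone_ages):
    """First G-P variant: average the ages assigned to the individual bones."""
    return mean(bone_ages)


def whole_hand_match(closest_plate_age, adjacent_plate_age=None):
    """Second G-P variant: use the closest atlas plate, or the mean of two
    adjacent plates when the film falls between them."""
    if adjacent_plate_age is None:
        return closest_plate_age
    return (closest_plate_age + adjacent_plate_age) / 2


# Hypothetical readings, in years.
print(per_bone_average([10.0, 10.5, 11.0]))  # 10.5
print(whole_hand_match(10.0, 11.0))          # 10.5
```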
At present, bone age is mainly assessed by X-raying the phalanges, metacarpals and carpal bones of the subject's left and right hands and evaluating the images with the Greulich-Pyle (G-P) method or the Tanner-Whitehouse (TW) method. With the G-P method, the analyst's subjective judgment can lead to different conclusions. The TW method removes this subjectivity, but the procedure is relatively time-consuming, so a conclusion is hard to obtain quickly. In addition, existing bone age detection equipment cannot produce a conclusion both accurately and rapidly; most assessment standards are based on white children and are poorly suited to children in China and other Asian populations; and detection can rely on only a single source of information.
Traditional bone age detection methods can process only image information and do not make full use of semantic information such as the heights of the subject's parents and the subject's age. Conventional methods use only a convolutional neural network (CNN), whose extensibility is poor when processing images obtained from different subjects and different machines.
Disclosure of Invention
In view of the above technical problems, the invention aims to provide a bone age detection method based on multi-modal adversarial training, which performs multi-modal bone age detection using medical images and textual medical records and uses adversarial training to improve the accuracy of model prediction.
To achieve this purpose, the invention provides a bone age detection method based on multi-modal adversarial training, comprising the following steps:
S1: constructing a bone age prediction data set;
S2: constructing and training a bone age detection model based on multi-modal adversarial training;
S3: in the prediction stage, retaining only the discriminator of the bone age detection model, adding a softmax after its last layer, and loading the weights of the best-performing trained discriminator to predict the result.
Optionally, the data set in step S1 consists of X-ray pictures and case text summaries in one-to-one correspondence.
Preferably, the X-ray pictures are obtained by X-raying the subject's radius, metacarpals, capitate, hamate and phalanges in sequence, giving images at three positions: distal, middle and proximal;
the images of the subject in different poses obtained by X-ray, together with the text information retrieved from the electronic medical record database, are fed into the bone age detection model based on multi-modal adversarial training for prediction.
Further, the bone age detection model in step S2 includes a generator, which generates new sample data according to the distribution of the real data, and a discriminator, which identifies whether data comes from the real data or from the newly generated samples.
Further, the generator is constructed as follows:
the first step is the transposed convolution operation: three hyperparameters are given first, namely the number of samples per batch, the noise dimension and the pixel size of the initial noise sample, and together they form the four-dimensional tensor required by the generative model; next, one two-dimensional transposed convolution with a 4*4 kernel and a stride of 1 converts the tensor to (1, 512, 4, 4); three two-dimensional transposed convolutions with 4*4 kernels and a stride of 2 then convert it to (1, 64, 32, 32); finally, one two-dimensional transposed convolution with a 5*5 kernel and a stride of 3 outputs a tensor of (1, 64, 96, 96);
the second step is the down-sampling operation: two two-dimensional convolutions with 4*4 kernels and a stride of 2 convert the tensor to (1, 256, 24, 24);
the third step is the residual network operation: the tensor passes through a residual network consisting of 6 residual blocks, and its dimensions remain unchanged;
the fourth step is the up-sampling operation: two two-dimensional transposed convolutions with a stride of 2 and three two-dimensional transposed convolutions with a stride of 1 convert the tensor to (1, 64, 102, 102), and finally one two-dimensional convolution with a 7*7 kernel and a stride of 1 outputs a three-channel picture of 102*102 pixels.
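The spatial sizes quoted for the first two generator steps can be checked with the usual output-size formulas for convolutions and transposed convolutions. This is a sketch under stated assumptions: the source gives kernel sizes and strides but no padding, so the padding values below (0 for the first layer, 1 elsewhere) are assumptions chosen because they reproduce the stated sizes, and the function names are hypothetical.

```python
def conv_transpose_out(size, kernel, stride, padding=0):
    """Output length of a transposed convolution (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def conv_out(size, kernel, stride, padding=0):
    """Output length of an ordinary convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def generator_sizes():
    """Spatial size after each group of layers in steps one and two."""
    sizes = []
    size = conv_transpose_out(1, 4, 1)                  # 1*1 noise, 4*4 kernel, stride 1
    sizes.append(size)
    for _ in range(3):                                  # three 4*4, stride-2 transposed convs
        size = conv_transpose_out(size, 4, 2, padding=1)
    sizes.append(size)
    size = conv_transpose_out(size, 5, 3, padding=1)    # 5*5 kernel, stride 3
    sizes.append(size)
    for _ in range(2):                                  # down-sampling: two 4*4, stride-2 convs
        size = conv_out(size, 4, 2, padding=1)
    sizes.append(size)
    return sizes


print(generator_sizes())  # [4, 32, 96, 24]
```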
Further, the discriminator is constructed as follows:
first, one piece of sample data is input and converted to a four-dimensional tensor of (1, 3, 102, 102);
then, a 1*1 convolution converts the tensor to (1, 64, 102, 102);
next, four convolutions with 4*4 kernels and a stride of 2 convert the tensor to (1, 1024, 1, 1);
finally, one convolution with a 6*6 kernel and a stride of 1 converts the tensor to a scalar, which is output.
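The way the discriminator shrinks a 102*102 input can be traced with the standard convolution output-size formula. This is only a sketch: the source does not state padding, so a padding of 1 for the 4*4 convolutions is assumed here, and the sizes printed are those of this sketch rather than figures taken from the source. Under that assumption the four stride-2 convolutions leave a 6*6 map, which a 6*6 kernel then reduces to a single value.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of an ordinary convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def discriminator_sizes(size=102, padding=1):
    """Spatial size after each discriminator layer, under the assumed padding."""
    sizes = [conv_out(size, 1, 1)]                 # 1*1 convolution keeps the size
    for _ in range(4):                             # four 4*4, stride-2 convolutions
        sizes.append(conv_out(sizes[-1], 4, 2, padding))
    sizes.append(conv_out(sizes[-1], 6, 1))        # final 6*6, stride-1 convolution
    return sizes


print(discriminator_sizes())  # [102, 51, 25, 12, 6, 1]
```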
As can be seen from the above, the bone age detection method based on multi-modal adversarial training of the invention has at least the following beneficial effects:
(1) Adversarial training improves the accuracy of model prediction;
(2) Multi-modal bone age detection is performed using both medical images and textual medical records;
(3) Training on an X-ray data set of Chinese adolescents yields bone age identification results that fit Chinese adolescents;
(4) Bone age identification from multi-modal data is realized, and combining text information makes the identification result more accurate.
Drawings
FIG. 1 is a flow chart of the bone age detection method based on multi-modal adversarial training of the present invention;
FIG. 2 is a network structure diagram of the bone age detection model of the bone age detection method based on multi-modal adversarial training of the present invention;
FIG. 3 is a network architecture diagram of the image processing part of the bone age detection method based on multi-modal adversarial training of the present invention;
FIG. 4 shows a sample from the data set of the bone age detection method based on multi-modal adversarial training of the present invention, (a) being an image and (b) being text.
Detailed Description
The bone age detection method based on multi-modal adversarial training according to the present invention is described in detail below with reference to FIGS. 1 to 4.
As shown in FIG. 1, the invention builds a bone age detection method around multi-modal adversarial training. The method first constructs a bone age prediction data set, which consists of X-ray pictures and case text summaries in one-to-one correspondence; it then constructs and trains a bone age detection model based on multi-modal adversarial training; finally, in the prediction stage, only the discriminator of the model is retained, a softmax is added after its last layer, and the weights of the best discriminator are loaded to predict the result. Specifically, the invention mainly uses an attention-based CGAN network structure to realize multi-modal prediction from medical images and textual medical records, and this network structure is highly extensible. The details are as follows:
(1) First, the subject needs to provide necessary information, such as the parents' heights and the subject's disease history, date of birth and height.
When this information is dictated, the speech is converted into text using speech recognition technology, which is summarized as follows:
Automatic Speech Recognition (ASR): the problem speech recognition solves is letting a computer "understand" human speech by converting it into text. Speech recognition is at the forefront of intelligent human-computer interaction and is a prerequisite for machine translation, natural language understanding and the like.
(2) Then, the radius, metacarpals, capitate, hamate, phalanges and other bones of the subject are X-rayed in sequence to obtain images at three positions: distal, middle and proximal. As shown in the figure below, the X-ray images of the distal, middle and proximal positions of the subject's phalanges show the ossification centers clearly, with smooth and continuous edges;
(3) The images of the subject in different poses obtained by X-ray, together with the text information retrieved from the electronic medical record database, are fed into the bone age detection model based on multi-modal adversarial training, and the prediction result is displayed within a short time;
(4) Samples from the prediction process, especially those predicted incorrectly, can be added to the data set for the next round of training, so that the prediction accuracy of the model improves gradually.
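Step (4) amounts to filtering the prediction log for samples the model got wrong. A minimal sketch, assuming a confirmed bone age label later becomes available for each predicted sample; the function name and the sample identifiers are hypothetical.

```python
def collect_for_retraining(samples, predictions, labels):
    """Return the samples whose predicted class differs from the confirmed label."""
    return [s for s, p, y in zip(samples, predictions, labels) if p != y]


# Hypothetical sample identifiers with predicted and confirmed bone ages (years).
hard_samples = collect_for_retraining(["a", "b", "c"], [10, 12, 9], [10, 11, 9])
print(hard_samples)  # ['b']
```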
Because adversarial training improves the accuracy of model prediction, the model is built on a CGAN network. The network framework of the model is shown in FIG. 2. The upper half of FIG. 2 is the text information processing part, which retrieves keyword information from the electronic medical record database, such as the subject's age, sex and height, the father's height, the mother's height, and whether the subject has a given disease, and uses it as the condition (Con) in the model. The lower half is the image information processing part, which combines with the CGAN model to form adversarial training. Here Image denotes the real medical image of the subject and corresponds to the textual medical record in the upper half; Z denotes a random noise distribution; G denotes the generator; and D denotes the discriminator. During training, G learns continuously from the feedback given by D until the samples it generates from Z fit the distribution of the Image data.
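One way to picture the Con input is as a fixed-length numeric vector built from the retrieved record. The sketch below is an assumption throughout: the source does not say how the text is encoded, so the field names, the scaling constants and the zero padding are hypothetical, and the length of 100 follows the condition dimension mentioned for the generator input.

```python
def encode_condition(record, length=100):
    """Encode a medical-record dict as a fixed-length list of floats (hypothetical scheme)."""
    features = [
        record["age_years"] / 18.0,                  # rough scaling to about [0, 1]
        1.0 if record["sex"] == "male" else 0.0,
        record["height_cm"] / 200.0,
        record["father_height_cm"] / 200.0,
        record["mother_height_cm"] / 200.0,
        1.0 if record["has_disease"] else 0.0,
    ]
    # Pad with zeros so every record yields a vector of the same length.
    return features + [0.0] * (length - len(features))


example = {
    "age_years": 9, "sex": "male", "height_cm": 135,
    "father_height_cm": 175, "mother_height_cm": 162, "has_disease": False,
}
condition = encode_condition(example)
print(len(condition))  # 100
```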
As shown in FIG. 3, the upper half of the figure is the generator. The first step in constructing the generator is the transposed convolution operation. Three hyperparameters are given first, namely the number of samples per batch, the noise dimension and the pixel size of the initial noise sample; together they form the four-dimensional tensor required by the generative model, which defaults to (1, 200, 1, 1), where 1 is the number of samples per batch, 200 is the dimension of the input noise (including 100 condition dimensions), and 1*1 is the pixel size of the initial noise sample. Next, one two-dimensional transposed convolution with a 4*4 kernel and a stride of 1 converts the tensor to (1, 512, 4, 4); three two-dimensional transposed convolutions with 4*4 kernels and a stride of 2 then convert it to (1, 64, 32, 32); finally, one two-dimensional transposed convolution with a 5*5 kernel and a stride of 3 outputs a tensor of (1, 64, 96, 96). The second step is the down-sampling operation: two two-dimensional convolutions with 4*4 kernels and a stride of 2 convert the tensor to (1, 256, 24, 24). The main purpose of down-sampling is to reduce the number of parameters and speed up computation, and the receptive field after down-sampling is better suited to feature extraction by the network. The third step is the residual network operation: the tensor passes through a residual network consisting of 6 residual blocks, and its dimensions remain unchanged. The fourth step is the up-sampling operation: two two-dimensional transposed convolutions with 4*4 kernels and a stride of 2 and three two-dimensional transposed convolutions with 4*4 kernels and a stride of 1 convert the tensor to (1, 64, 102, 102), and finally one two-dimensional convolution with a 7*7 kernel and a stride of 1 outputs a three-channel picture of 102*102 pixels.
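The default generator input of shape (1, 200, 1, 1) can be pictured as noise and condition values stacked along the channel axis. The following sketch uses plain nested lists rather than a tensor library; drawing the noise from a standard normal distribution and the helper names are assumptions, while the split into 100 noise and 100 condition dimensions follows the text.

```python
import random


def generator_input(condition, noise_dim=100, seed=None):
    """Stack noise and condition into a nested list of shape (1, C, 1, 1)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]  # assumed N(0, 1) noise
    channels = noise + list(condition)
    return [[[[value]] for value in channels]]


def shape(tensor):
    """Shape of a regular nested list, read along the first element of each level."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


z = generator_input([0.0] * 100, seed=0)
print(shape(z))  # (1, 200, 1, 1)
```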
The lower half is the discriminator. Its construction is the reverse of the generator's: the discriminative network of the deep residual generative adversarial network converts a four-dimensional tensor into a scalar. First, one piece of sample data is input and converted to a four-dimensional tensor of (1, 3, 102, 102); then, a 1*1 convolution converts the tensor to (1, 64, 102, 102); next, four convolutions with 4*4 kernels and a stride of 2 convert the tensor to (1, 1024, 1, 1); finally, one convolution with a 6*6 kernel and a stride of 1 converts the tensor to a scalar, which is output.
The middle part of the figure retrieves text information from the electronic medical record database; this information guides the generator in learning the sample distribution and guides the discriminator in distinguishing between samples.
So that the model captures fine-scale bone features more accurately, the image processing part builds the adversarial model from a deep residual generative network and a deep residual dense network. As shown in FIG. 3, the upper half of the model is the generator: random noise (Z) is concatenated with 10 tag dimensions into 200-dimensional data as input, and several dense residual blocks built from convolutional layers generate the samples. The lower half of the model is the discriminator, which judges whether an input picture is a real picture or a generated one using several convolutional residual blocks.
Residual networks are easy to optimize, and their accuracy can be improved by adding a suitable amount of depth. The residual blocks use skip connections, which mitigate the vanishing-gradient problem caused by increasing depth in deep neural networks. However, a residual network connects only after some of the convolution operations and so does not fully exploit the information from every convolutional layer. To make full use of the receptive-field information obtained by all layers in the generator while reducing unnecessary computation, the generator is built from dense residual modules and the discriminator from a residual network.
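The difference between a skip connection and a dense connection can be shown on plain lists of numbers. This is a toy sketch, not the patent's network: the transform stands in for a convolutional layer, and all names are hypothetical. A residual block adds its input back to the layer output, while a dense block feeds each layer the concatenation of the input and every earlier output.

```python
def residual_block(x, transform):
    """Skip connection: y = x + F(x), element by element."""
    return [xi + fi for xi, fi in zip(x, transform(x))]


def dense_block(x, transforms):
    """Dense connection: each layer sees the input plus all earlier layer outputs."""
    features = list(x)
    for transform in transforms:
        features = features + transform(features)  # list concatenation
    return features


def halve(values):
    """Stand-in for a convolutional layer."""
    return [0.5 * v for v in values]


print(residual_block([1.0, 2.0], halve))            # [1.5, 3.0]
print(len(dense_block([1.0, 2.0], [halve, halve])))  # 8
```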
Overall process of model training and prediction:
(1) Constructing a data set of images and text in one-to-one correspondence
As shown in FIG. 4, the left side is an X-ray picture of a subject and the right side is the text summary of that subject's case. The two correspond one to one, and many such pairs make up the bone age prediction data set of the invention. To improve the accuracy of the trained model, the subject's radius, metacarpals, capitate, hamate and phalanges are X-rayed from multiple angles, giving several pictures (distal, middle, proximal and so on) that capture bone structure information at different scales (similar in spirit to the difference of Gaussians used in SIFT in traditional image algorithms).
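The one-to-one pairing of pictures and case summaries can be sketched as a small record type plus a length check. The class name, field names and file names below are hypothetical, and how the images and text are actually stored is not specified in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoneAgeSample:
    image_path: str     # X-ray picture of one subject
    case_summary: str   # text summary of the same subject's case


def build_dataset(image_paths, summaries):
    """Pair each image with its summary, refusing mismatched inputs."""
    if len(image_paths) != len(summaries):
        raise ValueError("images and summaries must correspond one to one")
    return [BoneAgeSample(p, s) for p, s in zip(image_paths, summaries)]


dataset = build_dataset(
    ["subject_001.png", "subject_002.png"],
    ["male, 9 years, 135 cm", "female, 11 years, 146 cm"],
)
print(len(dataset))  # 2
```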
(2) Constructing and training the model
The invention builds a model, BAGAN, based on the ideas of CGAN and InfoGAN. The generator of BAGAN generates new sample data according to the distribution of the real data, and the discriminator identifies whether data comes from the real data or from the newly generated samples. Training BAGAN is a minimax game: the final goal is for the generator to fully capture the sample distribution of the real data and generate samples accordingly, while the discriminator judges whether a given sample is real. During training, the learning rate of the generator is slightly lower than that of the discriminator, and the two models are trained alternately, the discriminator first and then the generator. To make the model easier to train, each transposed convolution in the generator is followed by Instance Normalization and, to ensure non-linearity, by LeakyReLU; each convolution in the discriminator is followed by Batch Normalization and, to ensure non-linearity, by ReLU. The text information retrieved from the electronic medical record database serves as the Condition in the model, so that the generator learns the probability distribution of the real samples under that constraint.
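The minimax game described above is commonly written, for a conditional GAN, as the value function below. This is the standard textbook form and not a loss stated in the source; here x is a real image, c is the condition taken from the medical record, z is the noise, and G and D are the generator and the discriminator.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x \mid c)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z \mid c) \mid c)\big)\big]
```

The discriminator is updated to increase V and the generator to decrease it, which matches the alternating order of training given in the text.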
(3) Prediction process
In the prediction stage, only the discriminator is retained. It loads the best model obtained during training, and a softmax is appended to the end of the network so that the model, which originally predicted real or fake, outputs a specific class.
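The softmax appended to the discriminator turns a vector of raw scores into class probabilities, and the predicted class is the one with the highest probability. A minimal sketch in plain Python; the scores are made up, and how the classes map to bone ages is not specified in the source.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


scores = [0.1, 2.0, -1.0]      # hypothetical discriminator outputs, one per class
print(predict_class(scores))   # 1
```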
Overall model prediction results:
The BAGAN network of the invention uses the idea of adversarial training and is built by combining InfoGAN with a conditional GAN. The two time-scale update rule (TTUR) allows the discriminator to capture the real sample distribution faster and more fully. Finally, in the prediction stage, the pre-trained discriminator is connected to a softmax layer and used as the classifier.
The multi-source, multi-modal data are input into the trained classifier; the output class that matches the bone feature information determines the growth stage of the bones, from which the specific bone age of the subject is determined and the result analyzed.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.
Claims (3)
1. A bone age detection method based on multi-modal countermeasure training is characterized by comprising the following steps:
S1: constructing a bone age prediction data set;
S2: constructing a bone age detection model based on multi-modal countermeasure training and training it;
S3: in the prediction stage, retaining only the discriminator of the bone age detection model, adding a softmax as the last layer, and loading the weights of the best-trained discriminator for result prediction;
the bone age detection model in the step S2 comprises a generator for generating new sample data according to the distribution rule of the real data and a discriminator for identifying whether the data is from the real data or the newly generated sample data;
the steps of building the generator are as follows:
the first step is the transposed convolution operation: three hyperparameters are given, namely the batch size, the noise dimension and the pixel size of the initial noise sample, and these three parameters form the four-dimensional tensor required by the generator; this tensor is converted into a four-dimensional tensor of (1, 512, 4, 4) by one two-dimensional transposed convolution with a 4*4 kernel and a stride of 1; it is then converted into a four-dimensional tensor of (1, 64, 32, 32) by three two-dimensional transposed convolutions with a 4*4 kernel and a stride of 2; finally, a two-dimensional transposed convolution with a 5*5 kernel and a stride of 3 outputs a tensor of (1, 64, 96, 96);
the second step is the downsampling operation: two two-dimensional convolutions with a 4*4 kernel and a stride of 2 convert the tensor into a four-dimensional tensor of (1, 256, 24, 24);
the third step is the residual network operation: the dimensions of the output are unchanged after passing through a residual network consisting of 6 residual blocks;
the fourth step is the upsampling operation: two-dimensional transposed convolution operations with a stride of 1 and with a stride of 2, using 4*4 convolution kernels, are carried out respectively, converting the tensor into a four-dimensional tensor of (1, 64, 102, 102); finally, a two-dimensional convolution with one 7*7 kernel and a stride of 1 outputs a three-channel picture of 102 x 102 pixels;
the steps of constructing the discriminator are as follows:
firstly, one piece of sample data is input and converted into a four-dimensional tensor of (1, 3, 102, 102);
then, a 1*1 convolution converts it into a four-dimensional tensor of (1, 64, 102, 102);
four convolutions with a 4*4 kernel and a stride of 2 convert it into a four-dimensional tensor of (1, 1024, 1, 1);
finally, a convolution with a 6*6 kernel and a stride of 1 converts the tensor into a scalar, which is output.
2. The bone age detection method based on multi-modal countermeasure training according to claim 1, wherein the data set in step S1 comprises X-ray images and case text summaries in one-to-one correspondence.
3. The bone age detection method based on multi-modal countermeasure training according to claim 2, wherein the X-ray images are obtained by sequentially X-raying the radius, metacarpal, capitate, hamate and phalangeal bones of the subject, yielding X-ray images at three different distances: far, middle and near;
the X-ray images of the subject in different poses and the text information obtained by querying the electronic medical record database are fed into the bone age detection model based on multi-modal countermeasure training for prediction.
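The spatial sizes recited in claim 1 can be checked with the usual output-size formulas for convolution and transposed convolution. The sketch below is an illustration under assumptions that the patent does not state: an initial noise sample of 1 x 1 pixels, a padding of 1 for the stride-2 and stride-3 layers, and no padding for the first transposed convolution, the 1*1 convolution and the 6*6 convolution. All function names are hypothetical. The upsampling step is left out because its layer counts and kernel sizes are not clearly recoverable from the text.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output size of a 2-D convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding=0):
    """Output size of a 2-D transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + kernel


def generator_sizes(noise_pixels=1):
    """Trace the generator's spatial size through steps one and two."""
    sizes = {}
    # One 4*4, stride-1 transposed convolution (assumed padding 0).
    s = conv_transpose_out(noise_pixels, kernel=4, stride=1)
    sizes["first_transposed"] = s
    # Three 4*4, stride-2 transposed convolutions (assumed padding 1).
    for _ in range(3):
        s = conv_transpose_out(s, kernel=4, stride=2, padding=1)
    sizes["after_three_stride2"] = s
    # One 5*5, stride-3 transposed convolution (assumed padding 1).
    s = conv_transpose_out(s, kernel=5, stride=3, padding=1)
    sizes["after_5x5_stride3"] = s
    # Downsampling: two 4*4, stride-2 convolutions (assumed padding 1).
    for _ in range(2):
        s = conv_out(s, kernel=4, stride=2, padding=1)
    sizes["after_downsampling"] = s
    return sizes


def discriminator_sizes(input_pixels=102):
    """Trace the discriminator's spatial size layer by layer."""
    s = conv_out(input_pixels, kernel=1, stride=1)  # 1*1 convolution
    trace = [s]
    # Four 4*4, stride-2 convolutions (assumed padding 1).
    for _ in range(4):
        s = conv_out(s, kernel=4, stride=2, padding=1)
        trace.append(s)
    # Final 6*6, stride-1 convolution (assumed padding 0).
    trace.append(conv_out(s, kernel=6, stride=1))
    return trace


gen = generator_sizes()
disc = discriminator_sizes()
```

Under these assumptions the generator trace reproduces the 4, 32, 96 and 24 pixel sizes in the claim. For the discriminator, the same arithmetic gives a 6 x 6 feature map after the four stride-2 convolutions, and the 6*6 convolution is what reduces it to 1 x 1; the claim lists (1, 1024, 1, 1) one step earlier, so the padding actually used may differ from the value assumed here.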
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010962917.5A CN112102285B (en) | 2020-09-14 | 2020-09-14 | Bone age detection method based on multi-modal countermeasure training |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112102285A (en) | 2020-12-18 |
CN112102285B (en) | 2024-03-12 |
Family
ID=73750971
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010962917.5A Active CN112102285B (en) | 2020-09-14 | 2020-09-14 | Bone age detection method based on multi-modal countermeasure training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112102285B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112785559B (en) * | 2021-01-05 | 2022-07-12 | 四川大学 | Bone age prediction method based on deep learning and formed by mutually combining multiple heterogeneous models |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109345508A (en) * | 2018-08-31 | 2019-02-15 | 北京航空航天大学 | A kind of Assessing Standards For Skeletal method based on two stages neural network |
CN109544518A (en) * | 2018-11-07 | 2019-03-29 | 中国科学院深圳先进技术研究院 | A kind of method and its system applied to the assessment of skeletal maturation degree |
WO2019232960A1 (en) * | 2018-06-04 | 2019-12-12 | 平安科技(深圳)有限公司 | Automatic bone age prediction method and system, and computer device and storage medium |
CN110597628A (en) * | 2019-08-29 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Model distribution method and device, computer readable medium and electronic equipment |
CN111161254A (en) * | 2019-12-31 | 2020-05-15 | 上海体育科学研究所 | Bone age prediction method |
Non-Patent Citations (1)
Title |
---|
Bone Age Assessment Based on Rank-Monotonicity Enhanced Ranking CNN; Bo Liu et al.; IEEE; 2019-09-09; full text * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||