CN112102285A - Bone age detection method based on multi-modal adversarial training - Google Patents

Bone age detection method based on multi-modal adversarial training

Info

Publication number
CN112102285A
Authority
CN
China
Prior art keywords
bone age
training
dimensional
modal
age detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010962917.5A
Other languages
Chinese (zh)
Other versions
CN112102285B (en)
Inventor
陈吉
王星
林清水
杜伟
陈海涛
沈芷佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning Technical University
Original Assignee
Liaoning Technical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning Technical University
Priority to CN202010962917.5A
Publication of CN112102285A
Application granted
Publication of CN112102285B
Active legal status (current)
Anticipated expiration

Links

Images

Classifications

    • G06T 7/0012: Biomedical image inspection (G06T: Image data processing or generation, in general; G06T 7/00: Image analysis; G06T 7/0002: Inspection of images, e.g. flaw detection)
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting (G06F: Electric digital data processing; G06F 18/00: Pattern recognition; G06F 18/21: Design or setup of recognition systems or techniques)
    • G06F 18/24: Classification techniques
    • G06N 3/045: Combinations of networks (G06N: Computing arrangements based on specific computational models; G06N 3/00: Computing arrangements based on biological models; G06N 3/02: Neural networks; G06N 3/04: Architecture, e.g. interconnection topology)
    • G06N 3/08: Learning methods
    • G16H 50/20: ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems (G16H: Healthcare informatics; G16H 50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining)
    • G06T 2207/10116: X-ray image (G06T 2207/00: Indexing scheme for image analysis or image enhancement; G06T 2207/10: Image acquisition modality)
    • G06T 2207/20081: Training; Learning (G06T 2207/20: Special algorithmic details)
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30008: Bone (G06T 2207/30: Subject of image; G06T 2207/30004: Biomedical image processing)
    • Y02P 90/30: Computing systems specially adapted for manufacturing (Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Public Health (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a bone age detection method based on multi-modal adversarial training. The method comprises: constructing a bone age prediction data set; constructing and training a bone age detection model based on multi-modal adversarial training; and, in the prediction stage, retaining only the discriminator of the bone age detection model, appending a softmax layer as the final layer, and loading the optimal discriminator weights obtained during training to predict the result. The method improves prediction accuracy through adversarial training; performs multi-modal bone age detection using both medical images and textual medical records; trains on an X-ray data set of Chinese adolescents so that the bone age results are suited to Chinese adolescents; and achieves bone age recognition from multi-modal data, with the combination of text information making the recognition results more accurate.

Description

Bone age detection method based on multi-modal adversarial training
Technical Field
The invention relates to the technical field of bone age detection, and in particular to a bone age detection method based on multi-modal adversarial training.
Background
Bone age analysis, as an important index of growth and development, plays an important role in medicine, sports, judicial identification and other fields. The degree of skeletal calcification in children is the determinant of bone age, and bone age accurately reflects the level of development at each stage of growth. A child's bone age is usually determined by a radiologist who compares an X-ray of the child's hand against standard references for the corresponding ages.
Adolescent bone age assessment plays an important role in pediatric endocrinology and in diagnosing growth disorders in children. It is commonly used to screen adolescents for conditions such as endocrine disorders, delayed growth and development, and congenital adrenal hyperplasia, and it can also be used to evaluate the effect of therapeutic interventions. In addition, bone age can be used to establish the actual age of minors, both in juvenile criminal cases and in sports competitions where the age of an athlete must be verified.
The radius, metacarpals, capitate, hamate, phalanges and other bones of the examinee are X-rayed to obtain images from three different views: distal, middle and proximal. The resulting X-ray images, together with the necessary information about the examinee (such as the parents' heights and the examinee's medical history, date of birth and height), are digitized to obtain multi-source modal data.
Standard methods for assessing skeletal maturity have existed for nearly 100 years. The Tanner-Whitehouse (TW) and Greulich-Pyle (GP) methods are the two most common. The G-P atlas method was developed between the 1930s and 1950s, based on a longitudinal study, from birth to adulthood, of children from middle- and upper-class American families of that era. The Greulich and Pyle atlas was published in 1950, revised in 1959, and has been widely distributed and used ever since. There are two ways to evaluate bone age with the G-P atlas; one is to compare the radiograph to be assessed with the atlas plates one by one, take the closest plate as the bone age, and take the mean when the radiograph falls between two adjacent age plates, which is called the whole-film matching (interpolation) method. The TW method was initially established in the 1930s for Caucasian European children. The second version of the Tanner-Whitehouse method (TW2) was published in 1983 based on data from the 1950s and 1960s, and was updated in 2001 to the third version (TW3). Bone age estimated with TW3 is slightly lower than that estimated with TW2. The TW method scores the radius, ulna and short bones, and a total score is computed over the major bones of the hand. One meta-analysis suggested that, for Caucasian children, TW3 predicts age more accurately than TW2 or GP, and that both TW3 and TW2 are more accurate than GP. The TW method requires about 7.9 minutes to assess bone age and is the method recommended by European endocrinologists.
At present, the main method of bone age detection is to X-ray the phalanges, metacarpals and carpals of the subject's hands and to evaluate the images with the Greulich-Pyle (G-P) or Tanner-Whitehouse (TW) method. The G-P method can lead to different conclusions because of the analyst's subjective judgment, while the TW method eliminates that subjectivity but is relatively time-consuming, so a bone age conclusion cannot be obtained in a short time. In addition, existing bone age detection equipment cannot produce an accurate conclusion quickly, its detection standards are mostly based on Caucasian children and are therefore poorly suited to Asian children such as those in China, and its detection relies on a single source of information.
Traditional bone age detection methods process only image information and do not fully exploit semantic information such as the heights of the examinee's parents or the examinee's age. They also rely solely on a convolutional neural network (CNN), which scales poorly across images produced by different examinees and different machines.
Disclosure of Invention
In view of the above technical problems, an object of the present invention is to provide a bone age detection method based on multi-modal adversarial training, which performs multi-modal bone age detection using medical images and textual medical records and improves the accuracy of model prediction through adversarial training.
In order to achieve the above object, the present invention provides a bone age detection method based on multi-modal adversarial training, comprising the following steps:
s1: constructing a bone age prediction data set;
s2: constructing a bone age detection model based on multi-modal confrontation training for training;
s3: in the prediction stage, only a discriminator of the bone age detection model is reserved, softmax is added into the last layer, and the optimal discrimination model weight is loaded and trained to predict the result.
Optionally, the data set in step S1 includes X-ray images and case-record text summaries in one-to-one correspondence.
Preferably, the X-ray images are obtained by sequentially X-raying the radius, metacarpals, capitate, hamate and phalanges of the examinee, yielding images from three different views: distal, middle and proximal;
the images of the examinee in different postures obtained by X-ray, together with the text information retrieved from the electronic medical record database, are fed into the bone age detection model based on multi-modal adversarial training for prediction.
Further, the bone age detection model in step S2 includes a generator, which generates new sample data according to the distribution of the real data, and a discriminator, which identifies whether the data come from the real data or from the newly generated sample data.
Further, the generator is constructed as follows:
the first step is a transposed convolution stage: first, three hyper-parameters are given, namely the batch size, the noise dimensionality and the pixel size of the initial noise sample, and these three parameters form the four-dimensional tensor required by the generative model; next, a two-dimensional transposed convolution with a 4 × 4 kernel and stride 1 converts the tensor to shape (1, 512, 4, 4); three two-dimensional transposed convolutions with 4 × 4 kernels and stride 2 then convert it to (1, 64, 32, 32); finally, after a two-dimensional transposed convolution with a 5 × 5 kernel and stride 3, the output tensor is (1, 64, 96, 96);
the second step is a down-sampling stage: a two-dimensional convolution with 4 × 4 kernels and stride 2 converts the tensor to (1, 256, 24, 24);
the third step is a residual network stage: the output dimensions remain unchanged after a residual network consisting of 6 residual blocks;
and the fourth step is an up-sampling stage: two two-dimensional transposed convolutions with 4 × 4 kernels and stride 2 and three with 4 × 4 kernels and stride 1 convert the tensor to (1, 64, 102, 102), and finally a two-dimensional convolution with a 7 × 7 kernel and stride 1 outputs a three-channel picture of 102 × 102 pixels.
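For illustration only, the generator layer sequence described above can be sketched in PyTorch roughly as follows. This sketch is not part of the original disclosure: the framework, the padding values, the normalization and activation layers, and the exact layer counts in the down-sampling and up-sampling stages are assumptions chosen here so that the stated tensor shapes are (approximately) reproduced.

import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Plain residual block that keeps the (C, H, W) shape unchanged."""
    def __init__(self, channels: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
            nn.InstanceNorm2d(channels),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)


class Generator(nn.Module):
    def __init__(self, noise_dim: int = 200):
        super().__init__()
        # Step 1: transposed convolutions, (1, 200, 1, 1) -> (1, 64, 96, 96)
        self.transposed = nn.Sequential(
            nn.ConvTranspose2d(noise_dim, 512, 4, stride=1, padding=0),  # -> 512 x 4 x 4
            nn.InstanceNorm2d(512), nn.LeakyReLU(0.2, True),
            nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1),        # -> 256 x 8 x 8
            nn.InstanceNorm2d(256), nn.LeakyReLU(0.2, True),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1),        # -> 128 x 16 x 16
            nn.InstanceNorm2d(128), nn.LeakyReLU(0.2, True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),         # -> 64 x 32 x 32
            nn.InstanceNorm2d(64), nn.LeakyReLU(0.2, True),
            nn.ConvTranspose2d(64, 64, 5, stride=3, padding=1),          # -> 64 x 96 x 96
            nn.InstanceNorm2d(64), nn.LeakyReLU(0.2, True),
        )
        # Step 2: down-sampling to 256 x 24 x 24 (two stride-2 convolutions assumed)
        self.down = nn.Sequential(
            nn.Conv2d(64, 128, 4, stride=2, padding=1),                  # -> 128 x 48 x 48
            nn.InstanceNorm2d(128), nn.LeakyReLU(0.2, True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1),                 # -> 256 x 24 x 24
            nn.InstanceNorm2d(256), nn.LeakyReLU(0.2, True),
        )
        # Step 3: residual network of 6 blocks, shape unchanged
        self.res = nn.Sequential(*[ResidualBlock(256) for _ in range(6)])
        # Step 4: up-sampling to 64 x 102 x 102, then a 7x7 conv to a 3-channel image.
        # Padding 0 is chosen so that 24 -> 50 -> 102; the patent's exact kernel
        # bookkeeping for this stage is ambiguous in the translated text.
        self.up = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=0),        # -> 128 x 50 x 50
            nn.InstanceNorm2d(128), nn.LeakyReLU(0.2, True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=0),         # -> 64 x 102 x 102
            nn.InstanceNorm2d(64), nn.LeakyReLU(0.2, True),
            nn.Conv2d(64, 3, 7, stride=1, padding=3),                    # -> 3 x 102 x 102
            nn.Tanh(),
        )

    def forward(self, z):
        x = self.transposed(z)
        x = self.down(x)
        x = self.res(x)
        return self.up(x)


if __name__ == "__main__":
    g = Generator()
    fake = g(torch.randn(1, 200, 1, 1))   # noise plus condition, 200 dimensions in total
    print(fake.shape)                      # torch.Size([1, 3, 102, 102])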
Further, the discriminator is constructed as follows:
firstly, a data sample is input and, after digitization, converted into a four-dimensional tensor of shape (1, 3, 102, 102);
secondly, a 1 × 1 convolution converts the tensor to (1, 64, 102, 102);
then, four convolutions with 4 × 4 kernels and stride 2 convert the tensor to (1, 1024, 1, 1);
and finally, a convolution with a 6 × 6 kernel and stride 1 converts the tensor to a scalar, which is output.
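A corresponding PyTorch sketch of the discriminator, again illustrative only: the intermediate channel widths and padding values are assumptions chosen so that the quoted shapes (1, 3, 102, 102) to (1, 64, 102, 102) to (1, 1024, 1, 1) to scalar hold.

import torch
import torch.nn as nn


class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 1, stride=1, padding=0),          # -> 64 x 102 x 102
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),         # -> 128 x 51 x 51
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1),        # -> 256 x 25 x 25
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.Conv2d(256, 512, 4, stride=2, padding=1),        # -> 512 x 12 x 12
            nn.BatchNorm2d(512), nn.ReLU(inplace=True),
            nn.Conv2d(512, 1024, 4, stride=2, padding=1),       # -> 1024 x 6 x 6
            nn.BatchNorm2d(1024), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(1024, 1, 6, stride=1, padding=0)  # -> 1 x 1 x 1

    def forward(self, x):
        return self.head(self.features(x)).view(x.size(0))       # one scalar per sample


if __name__ == "__main__":
    d = Discriminator()
    score = d(torch.randn(1, 3, 102, 102))
    print(score.shape)   # torch.Size([1])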
Therefore, the bone age detection method based on multi-modal adversarial training has at least the following beneficial effects:
(1) the accuracy of model prediction is improved by adopting adversarial training;
(2) multi-modal bone age detection is performed using medical images and textual medical records;
(3) training on an X-ray data set of Chinese adolescents yields bone age recognition results suited to Chinese adolescents;
(4) bone age recognition from multi-modal data is achieved, and combining the text information makes the recognition results more accurate.
Drawings
FIG. 1 is a flow chart of the bone age detection method based on multi-modal adversarial training of the present invention;
FIG. 2 is a network structure diagram of the bone age detection model of the bone age detection method based on multi-modal adversarial training of the present invention;
FIG. 3 is a network structure diagram of the image processing part of the bone age detection method based on multi-modal adversarial training of the present invention;
FIG. 4 is a data set sample of the bone age detection method based on multi-modal adversarial training of the present invention, wherein (a) is an image and (b) is text.
Detailed Description
The bone age detection method based on multi-modal adversarial training according to the present invention will now be described in detail with reference to FIGS. 1 to 4.
As shown in FIG. 1, the bone age detection method based on multi-modal adversarial training combines the technical ideas of multi-modality and adversarial training. First, a bone age prediction data set is constructed, comprising X-ray images and case-record text summaries in one-to-one correspondence; second, a bone age detection model based on multi-modal adversarial training is constructed and trained; in the final prediction stage, only the discriminator of the model is retained, a softmax layer is appended as the final layer, and the optimal discriminator weights obtained during training are loaded to predict the result. Specifically, the invention mainly uses an attention-based CGAN network structure to achieve multi-modal prediction from medical images and textual medical records; this network structure is highly extensible. The details are as follows:
(1) The examinee is required to fill in the necessary information, such as the heights of the examinee's parents and the examinee's medical history, date of birth and height.
Speech is converted into text using speech recognition technology, which is outlined as follows:
Automatic Speech Recognition (ASR): the problem solved by speech recognition is to enable a computer to "understand" human speech and convert it into text. Speech recognition is at the frontier of intelligent human-computer interaction and is a precondition for machine translation, natural language understanding and the like.
(2) The radius, metacarpals, capitate, hamate, phalanges and other bones of the examinee are then X-rayed in sequence to obtain images from the distal, middle and proximal views. As shown in the accompanying figure, the X-rays yield images of the examinee's phalanges in three different postures (distal, middle and proximal), in which the ossification centers are clear and their edges smooth and continuous;
(3) the images of the examinee in different postures obtained by X-ray, together with the text information retrieved from the electronic medical record database, are fed into the bone age detection model based on multi-modal adversarial training, and the prediction result is returned in a short time;
(4) samples from the prediction process, especially misclassified samples, can be used as a data set for the next round of training, so that the prediction accuracy of the model improves gradually.
Adversarial training improves the accuracy of model prediction, so the model is built on a CGAN network to realize adversarial training. The network framework of the model is shown in FIG. 2. The upper half of FIG. 2 is the text information processing part, which retrieves keyword information from the electronic medical record database, such as the examinee's age, sex, father's height, mother's height and whether the examinee has a particular disease, and uses it as the condition (Con) in the model. The lower half is the image information processing part, which is combined with the CGAN model to form the adversarial training, where Image denotes the medical image of a real examinee, corresponding to the textual medical record in the upper half; Z denotes a sample from a random noise distribution; G denotes the generator; and D denotes the discriminator. During training, G learns continuously from the feedback given by D until Z can fit the distribution of the Image data.
As shown in FIG. 3, the generator is constructed in the upper half of the figure. The first step is a transposed convolution stage: three hyper-parameters are given, namely the batch size, the noise dimensionality and the pixel size of the initial noise sample, and these form the four-dimensional tensor required by the generative model, by default (1, 200, 1, 1), where 1 is the batch size, 200 is the dimensionality of the input noise (including the 100-dimensional Condition) and 1 × 1 is the pixel size of the initial noise sample. A two-dimensional transposed convolution with a 4 × 4 kernel and stride 1 converts the tensor to (1, 512, 4, 4); three transposed convolutions with 4 × 4 kernels and stride 2 convert it to (1, 64, 32, 32); and a transposed convolution with a 5 × 5 kernel and stride 3 yields an output of (1, 64, 96, 96). The second step is a down-sampling stage, in which a two-dimensional convolution with 4 × 4 kernels and stride 2 converts the tensor to (1, 256, 24, 24); down-sampling reduces the number of parameters and speeds up computation, and the resulting receptive field is better suited to feature extraction. The third step is a residual network stage, in which the output dimensions remain unchanged after a residual network consisting of 6 residual blocks. The fourth step is an up-sampling stage, in which two transposed convolutions with 4 × 4 kernels and stride 2 and three with 4 × 4 kernels and stride 1 convert the tensor to (1, 64, 102, 102), and a final two-dimensional convolution with a 7 × 7 kernel and stride 1 outputs a three-channel picture of 102 × 102 pixels.
The discriminator is constructed in the lower half of the figure. Its construction is essentially the reverse of the generator network: the discriminator of the deep residual generative adversarial network converts a four-dimensional tensor into a scalar. First, a data sample is input and, after digitization, converted into a four-dimensional tensor of shape (1, 3, 102, 102); second, a 1 × 1 convolution converts the tensor to (1, 64, 102, 102); next, four convolutions with 4 × 4 kernels and stride 2 convert it to (1, 1024, 1, 1); finally, a convolution with a 6 × 6 kernel and stride 1 converts the tensor to a scalar, which is output.
In the middle of the figure, text information retrieved from the electronic medical record database guides the generator to learn the sample distribution and guides the discriminator to distinguish differences among the samples.
To enable the model to capture the fine features of the bones and thus obtain higher accuracy, the image processing part builds the adversarial model with a deep residual generator network and a deep residual dense network, as shown in FIG. 3. The upper half of the model is the generator: random noise (Z) and the 10 label dimensions are concatenated into 200-dimensional data as input, and samples are generated through several dense residual blocks built from convolutional layers. The lower half of the model is the discriminator, which judges whether an input picture is a real picture or a generated picture through several convolutional residual blocks.
Residual networks are easy to optimize, and accuracy can be improved by adding appropriate depth. The skip connections inside the residual blocks alleviate the vanishing-gradient problem caused by increasing depth in deep neural networks. However, a residual network does not fully utilize the information of every convolutional layer; connections exist only after some of the convolution operations. In order to make full use of the acquired receptive-field information while avoiding unnecessary computation across all layers of the generator, the generator is built from dense residual blocks and the discriminator is built from an ordinary residual network.
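The contrast between the two kinds of blocks can be sketched as follows (an illustrative assumption, not the original code): in the dense residual block every convolution sees the concatenated outputs of all previous layers before a 1 × 1 local fusion and a residual connection, whereas the plain residual block has a single skip connection.

import torch
import torch.nn as nn


class DenseResidualBlock(nn.Module):
    """Dense residual block: dense connections inside, residual connection outside."""
    def __init__(self, channels: int = 64, growth: int = 32, layers: int = 4):
        super().__init__()
        self.convs = nn.ModuleList()
        for i in range(layers):
            self.convs.append(nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                nn.LeakyReLU(0.2, inplace=True),
            ))
        # 1x1 "local feature fusion" back to the block's input width
        self.fuse = nn.Conv2d(channels + layers * growth, channels, 1)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))   # dense connections
        return x + self.fuse(torch.cat(feats, dim=1))      # residual connection


class PlainResidualBlock(nn.Module):
    """Ordinary residual block used in the discriminator."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)   # single skip connection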
The overall process of model training and prediction is as follows:
(1) Constructing a data set with images and text in correspondence
As shown in FIG. 4, the left side is an X-ray image of the examinee and the right side is the examinee's case-record text summary; the X-ray images and the case-record text summaries correspond in a many-to-one manner and together constitute the bone age prediction data set of the invention. In order to improve the accuracy of the trained model, the radius, metacarpals, capitate, hamate and phalanges of the examinee are X-rayed from multiple angles to obtain several pictures (distal, middle, proximal and so on), giving bone-structure information at different scales (similar to the difference of Gaussians in the SIFT approach used in traditional image algorithms).
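A minimal sketch of how such an image-text pair could be represented is given below; the field names, the normalization constants and the 100-dimensional condition size are illustrative assumptions, not values taken from the patent.

from dataclasses import dataclass
from typing import List

import torch
from torch.utils.data import Dataset
from torchvision import io, transforms


@dataclass
class CaseRecord:
    image_paths: List[str]      # distal / middle / proximal views of one examinee
    sex: int                    # 0 = female, 1 = male
    age_months: float
    father_height_cm: float
    mother_height_cm: float
    has_endocrine_disease: int  # 0 / 1
    bone_age_months: float      # label


class BoneAgeDataset(Dataset):
    def __init__(self, records: List[CaseRecord], cond_dim: int = 100):
        self.records = records
        self.cond_dim = cond_dim
        self.resize = transforms.Resize((102, 102))

    def _condition(self, r: CaseRecord) -> torch.Tensor:
        # Encode the case-record fields into a fixed-length condition vector.
        raw = torch.tensor([r.sex, r.age_months / 216.0,
                            r.father_height_cm / 200.0,
                            r.mother_height_cm / 200.0,
                            r.has_endocrine_disease], dtype=torch.float32)
        cond = torch.zeros(self.cond_dim)
        cond[: raw.numel()] = raw          # pad the handful of fields to cond_dim
        return cond

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        r = self.records[idx]
        imgs = torch.stack([
            self.resize(io.read_image(p, mode=io.ImageReadMode.RGB).float() / 255.0)
            for p in r.image_paths
        ])                                  # several views share one text record
        return imgs, self._condition(r), torch.tensor(r.bone_age_months)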
(2) The invention builds a model, BAGAN, on the ideas of CGAN and InfoGAN. The generator of BAGAN generates new sample data according to the distribution of the real data, and the discriminator identifies whether data come from the real data or from the newly generated samples. Training BAGAN is a minimax game whose ultimate goal is for the generator to fully capture the distribution of the real data, so that the discriminator judges generated samples to be real. During training, the learning rate of the generator is slightly lower than that of the discriminator, and the two models are trained alternately: the discriminator is trained first, then the generator, and so on. To make the model easier to train, the invention applies Instance Normalization after each deconvolution (transposed convolution) in the generator and uses LeakyReLU to provide nonlinearity; in the discriminator, Batch Normalization is applied after each convolution and ReLU provides nonlinearity. The text information retrieved from the electronic medical record database serves as the Condition in the model, constraining the probability distribution of real samples that the generator learns.
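The alternating training scheme can be sketched as follows; the concrete learning rates and the binary cross-entropy losses are assumptions made here (the patent states only that the discriminator is trained before the generator in each step and that the generator's learning rate is slightly lower than the discriminator's), and the Generator and Discriminator classes are the sketches given earlier.

import torch
import torch.nn as nn


def train_bagan(generator, discriminator, loader, epochs=10, noise_dim=100, device="cpu"):
    generator.to(device)
    discriminator.to(device)
    bce = nn.BCEWithLogitsLoss()
    # Two time scales: the discriminator learns slightly faster than the generator.
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=4e-4, betas=(0.5, 0.999))
    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.999))

    for _ in range(epochs):
        # images: (B, 3, 102, 102), one X-ray view per sample is assumed here;
        # condition: (B, 100), encoded from the electronic medical record.
        for images, condition, _ in loader:
            images, condition = images.to(device), condition.to(device)
            b = images.size(0)
            # Noise and condition are concatenated into the 200-dimensional generator input.
            z = torch.cat([torch.randn(b, noise_dim, device=device), condition], dim=1)
            z = z.view(b, -1, 1, 1)

            # 1) train the discriminator on real vs. generated samples
            fake = generator(z).detach()
            loss_d = (bce(discriminator(images), torch.ones(b, device=device))
                      + bce(discriminator(fake), torch.zeros(b, device=device)))
            opt_d.zero_grad()
            loss_d.backward()
            opt_d.step()

            # 2) then train the generator to fool the discriminator
            loss_g = bce(discriminator(generator(z)), torch.ones(b, device=device))
            opt_g.zero_grad()
            loss_g.backward()
            opt_g.step()
    return generator, discriminator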
(3) Prediction process
In the prediction stage, only the discriminator part of the model needs to be retained; it is loaded with the best model weights obtained during training, and a softmax layer is appended at the end of the network so that the output used for judging real versus fake becomes a concrete classification.
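A sketch of this prediction-stage conversion, reusing the Discriminator sketch above and assuming an illustrative number of bone age classes (the patent does not specify the class count):

import torch
import torch.nn as nn


class BoneAgeClassifier(nn.Module):
    def __init__(self, pretrained_discriminator, num_classes=20):
        super().__init__()
        # Reuse the convolutional trunk of the trained discriminator ...
        self.features = pretrained_discriminator.features
        # ... and replace the real/fake head with a class head followed by softmax.
        self.head = nn.Sequential(
            nn.Conv2d(1024, num_classes, 6, stride=1, padding=0),
            nn.Flatten(),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        return self.head(self.features(x))


# usage sketch (file name is hypothetical):
# disc = Discriminator()
# disc.load_state_dict(torch.load("best_discriminator.pt"))   # optimal weights
# clf = BoneAgeClassifier(disc)
# probs = clf(torch.randn(1, 3, 102, 102))   # (1, num_classes) class probabilities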
The overall prediction results of the model are given in a table in the original filing (published as an image and not reproduced here).
The BAGAN network of the invention uses the idea of adversarial training and combines the InfoGAN construction on top of a conditional GAN. The two time-scale update rule (TTUR) enables the discriminator to capture the real sample distribution more quickly and completely. Finally, in the prediction stage, the pre-trained discriminator followed by a softmax layer serves as the classifier.
The multi-source modal data are then input into the trained classifier, which outputs the class that matches the bone feature information; from this the growth stage of the bones is determined, the specific bone age of the examinee is established, and the result is analyzed.
The above description covers only preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalents, improvements and the like that fall within the spirit and principles of the present invention are intended to be included within its scope.

Claims (6)

1. A bone age detection method based on multi-modal adversarial training, characterized by comprising the following steps:
s1: constructing a bone age prediction data set;
s2: constructing a bone age detection model based on multi-modal confrontation training for training;
s3: in the prediction stage, only a discriminator of the bone age detection model is reserved, softmax is added into the last layer, and the optimal discrimination model weight is loaded and trained to predict the result.
2. The bone age detection method based on multi-modal adversarial training according to claim 1, wherein the data set in step S1 comprises X-ray images and case-record text summaries in one-to-one correspondence.
3. The bone age detection method based on multi-modal adversarial training according to claim 2, wherein the X-ray images are obtained by sequentially X-raying the radius, metacarpals, capitate, hamate and phalanges of the examinee, yielding images from three different views: distal, middle and proximal;
the images of the examinee in different postures obtained by X-ray, together with the text information retrieved from the electronic medical record database, are fed into the bone age detection model based on multi-modal adversarial training for prediction.
4. The bone age detection method based on multi-modal adversarial training according to claim 1, wherein the bone age detection model in step S2 comprises a generator, which generates new sample data according to the distribution of the real data, and a discriminator, which identifies whether the data come from the real data or from the newly generated sample data.
5. The bone age detection method based on multi-modal adversarial training according to claim 4, wherein the generator is constructed as follows:
the first step is a transposed convolution stage: first, three hyper-parameters are given, namely the batch size, the noise dimensionality and the pixel size of the initial noise sample, and these three parameters form the four-dimensional tensor required by the generative model; next, a two-dimensional transposed convolution with a 4 × 4 kernel and stride 1 converts the tensor to shape (1, 512, 4, 4); three two-dimensional transposed convolutions with 4 × 4 kernels and stride 2 then convert it to (1, 64, 32, 32); finally, after a two-dimensional transposed convolution with a 5 × 5 kernel and stride 3, the output tensor is (1, 64, 96, 96);
the second step is a down-sampling stage: a two-dimensional convolution with 4 × 4 kernels and stride 2 converts the tensor to (1, 256, 24, 24);
the third step is a residual network stage: the output dimensions remain unchanged after a residual network consisting of 6 residual blocks;
and the fourth step is an up-sampling stage: two two-dimensional transposed convolutions with 4 × 4 kernels and stride 2 and three with 4 × 4 kernels and stride 1 convert the tensor to (1, 64, 102, 102), and finally a two-dimensional convolution with a 7 × 7 kernel and stride 1 outputs a three-channel picture of 102 × 102 pixels.
6. The bone age detection method based on multi-modal adversarial training according to claim 5, wherein the discriminator is constructed as follows:
firstly, a data sample is input and, after digitization, converted into a four-dimensional tensor of shape (1, 3, 102, 102);
secondly, a 1 × 1 convolution converts the tensor to (1, 64, 102, 102);
then, four convolutions with 4 × 4 kernels and stride 2 convert the tensor to (1, 1024, 1, 1);
and finally, a convolution with a 6 × 6 kernel and stride 1 converts the tensor to a scalar, which is output.
CN202010962917.5A 2020-09-14 2020-09-14 Bone age detection method based on multi-modal adversarial training Active CN112102285B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010962917.5A CN112102285B (en) 2020-09-14 2020-09-14 Bone age detection method based on multi-modal adversarial training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010962917.5A CN112102285B (en) 2020-09-14 2020-09-14 Bone age detection method based on multi-modal adversarial training

Publications (2)

Publication Number Publication Date
CN112102285A true CN112102285A (en) 2020-12-18
CN112102285B CN112102285B (en) 2024-03-12

Family

ID=73750971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010962917.5A Active CN112102285B (en) 2020-09-14 2020-09-14 Bone age detection method based on multi-modal adversarial training

Country Status (1)

Country Link
CN (1) CN112102285B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112785559A (en) * 2021-01-05 2021-05-11 四川大学 Bone age prediction method based on deep learning and formed by mutually combining multiple heterogeneous models

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345508A (en) * 2018-08-31 2019-02-15 北京航空航天大学 A kind of Assessing Standards For Skeletal method based on two stages neural network
CN109544518A (en) * 2018-11-07 2019-03-29 中国科学院深圳先进技术研究院 A kind of method and its system applied to the assessment of skeletal maturation degree
WO2019232960A1 (en) * 2018-06-04 2019-12-12 平安科技(深圳)有限公司 Automatic bone age prediction method and system, and computer device and storage medium
CN110597628A (en) * 2019-08-29 2019-12-20 腾讯科技(深圳)有限公司 Model distribution method and device, computer readable medium and electronic equipment
CN111161254A (en) * 2019-12-31 2020-05-15 上海体育科学研究所 Bone age prediction method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019232960A1 (en) * 2018-06-04 2019-12-12 平安科技(深圳)有限公司 Automatic bone age prediction method and system, and computer device and storage medium
CN109345508A (en) * 2018-08-31 2019-02-15 北京航空航天大学 A kind of Assessing Standards For Skeletal method based on two stages neural network
CN109544518A (en) * 2018-11-07 2019-03-29 中国科学院深圳先进技术研究院 A kind of method and its system applied to the assessment of skeletal maturation degree
CN110597628A (en) * 2019-08-29 2019-12-20 腾讯科技(深圳)有限公司 Model distribution method and device, computer readable medium and electronic equipment
CN111161254A (en) * 2019-12-31 2020-05-15 上海体育科学研究所 Bone age prediction method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BO LIU et al.: "Bone Age Assessment Based on Rank-Monotonicity Enhanced Ranking CNN", IEEE, 9 September 2019 (2019-09-09) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112785559A (en) * 2021-01-05 2021-05-11 四川大学 Bone age prediction method based on deep learning and formed by mutually combining multiple heterogeneous models
CN112785559B (en) * 2021-01-05 2022-07-12 四川大学 Bone age prediction method based on deep learning and formed by mutually combining multiple heterogeneous models

Also Published As

Publication number Publication date
CN112102285B (en) 2024-03-12

Similar Documents

Publication Publication Date Title
CN111259982B (en) Attention mechanism-based premature infant retina image classification method and device
CN108830334B (en) Fine-grained target discrimination method based on antagonistic transfer learning
Meedeniya et al. Chest X-ray analysis empowered with deep learning: A systematic review
CN110490242B (en) Training method of image classification network, fundus image classification method and related equipment
WO2016192612A1 (en) Method for analysing medical treatment data based on deep learning, and intelligent analyser thereof
CN110490239B (en) Training method, quality classification method, device and equipment of image quality control network
JP2018068752A (en) Machine learning device, machine learning method and program
WO2021179514A1 (en) Novel coronavirus patient condition classification system based on artificial intelligence
Li et al. Explainable multi-instance and multi-task learning for COVID-19 diagnosis and lesion segmentation in CT images
Prayogo et al. Classification of pneumonia from X-ray images using siamese convolutional network
Hariri et al. COVID-19 and pneumonia diagnosis from chest X-ray images using convolutional neural networks
Ranjan et al. Transfer learning based approach for pneumonia detection using customized VGG16 deep learning model
CN112102285B (en) Bone age detection method based on multi-modal countermeasure training
CN116452592B (en) Method, device and system for constructing brain vascular disease AI cognitive function evaluation model
Jia et al. Fine-grained precise-bone age assessment by integrating prior knowledge and recursive feature pyramid network
CN113450306B (en) Method for providing fracture detection tool
Ummah et al. Covid-19 and Tuberculosis Detection in X-Ray of Lung Images with Deep Convolutional Neural Network.
Raghav et al. Autism Spectrum Disorder Detection in Children Using Transfer Learning Techniques
Shobha Rani et al. Chronological age assessment based on wrist radiograph processing–Some novel approaches
Mannepalli et al. An Early Detection of Pneumonia in CXR Images using Deep Learning Techniques
Diyasa et al. Classification of Pneumonia Using a Deep Learning Convolutional Neural Network
Rahimi et al. Intracranial Hemorrhage Classification using CBAM Attention Module and Convolutional Neural Networks
احمد منعم حسين Diagnosis of COVID-19 Virus Using Convolutional Neural Network Algorithm
Lu Convolutional Neural Network (CNN) for COVID-19 Lung CT Scans Classification Detection
Wang et al. Image Recognition from COVID-19 X-ray Images Utilizing Transfer Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant