CN115423779A - Method for predicting bone age of children - Google Patents
- Publication number: CN115423779A
- Application number: CN202211076541.3A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/0012 — Biomedical image inspection
- G06T5/70 — Denoising; smoothing
- G06T7/13 — Edge detection
- G06V10/30 — Image preprocessing: noise filtering
- G06V10/36 — Image preprocessing: non-linear local filtering, e.g. median filtering
- G06V10/44 — Local feature extraction (edges, contours, connectivity analysis)
- G06V10/82 — Recognition using neural networks
- G06T2207/10116 — Image acquisition modality: X-ray image
- G06T2207/20028 — Bilateral filtering
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30008 — Subject of image: bone
Abstract
The invention relates to a method for predicting the bone age of children, comprising the following steps: randomly selecting some hand X-ray films from a public hand X-ray data set with bone age labels to form a data set, and resizing all pictures to a specified size; establishing a TENet model and constructing a training set; and taking the training set as the input of the TENet model, training it with an Adam optimizer, and obtaining the trained TENet model when the maximum number of iterations is reached. The model of the invention can accurately predict the bone age of children.
Description
Technical Field
The invention relates to the field of computer image vision and big-data medical image applications, in particular to a method for predicting the bone age of children.
Background
Traditional bone age assessment is usually performed by radiographing the subject's hand and wrist and having a doctor interpret the films. Interpretation methods include the simple counting method, the atlas method, the scoring method and computerized bone age scoring systems; the most common are the G-P atlas method and the TW3 scoring method. Owing to ethnic differences and the secular trend of growth and development, the bone age standard best suited to contemporary Chinese children is CHN-05: in 2006 the China-05 bone age standard was compiled into an industry standard of the People's Republic of China and became the country's only industry bone age standard.
Recently, deep neural networks have been widely used in the medical field, because automatic bone age estimation with deep learning is much faster than manual bone age determination and its accuracy far exceeds that of conventional methods. Hyunkwang Lee et al. used GoogLeNet as a backbone to determine the final bone age by displaying three to five reference images from the G&P atlas. Toan Duc Bui et al. used Faster R-CNN and Inception-v4 networks for region-of-interest (ROI) detection and classification, respectively. Chuanbin Liu et al. introduced an attention agent and a recognition agent for proposing distinctive skeletal parts and for feature learning and age assessment, respectively.
Although these methods achieve good performance in bone age assessment (BAA), they cannot objectively and accurately evaluate the bone age of contemporary Chinese children in actual testing because of secular change. Based on the above problems, there is a need for a method that studies a standardized sample based on China-05: the China-05 standard was established from normal urban Chinese children in 2005, truly reflects the development trend of contemporary Chinese children, and takes the palm-phalange edges and palm-phalange lengths of Chinese children as reference standards for obtaining the correct bone age.
Disclosure of Invention
Aiming at the problems in the prior art, the technical problems to be solved by the invention are as follows: how to accurately predict the bone age of children.
In order to solve the technical problems, the invention adopts the following technical scheme: a method for pediatric bone age prediction comprising the steps of:
S100: Select a public data set containing hand X-ray films with bone age labels; randomly select some of the hand X-ray films to form a data set A, and resize all pictures in A to 560×560;
S200: Establish a TENet model comprising a hand topology module, an edge feature enhancement module and an improved deep learning network model D; the edge feature enhancement module is an improved Canny edge detection algorithm that uses a bilateral-filtering denoising algorithm and the Otsu algorithm;
S300: Construct the training set of the TENet model:
S310: Preprocess the pictures in A with an automatic color balance algorithm to obtain a preprocessed picture for each picture in A;
S320: Apply the hand topology module to the pictures in A to obtain a hand topological graph for each picture in A;
S330: Apply the edge feature enhancement module to the preprocessed pictures to obtain an edge-feature-enhanced picture for each picture in A;
S340: Repeat S310-S330 over all hand X-ray films in A so that every picture in A has a corresponding preprocessed picture, hand topological graph and edge-feature-enhanced picture;
S350: All the preprocessed pictures, hand topological graphs and edge-feature-enhanced pictures obtained in S340 together form the training set of the TENet model, the three pictures corresponding to each original picture forming one training sample;
S400: Preset the maximum number of iterations, preset the initial learning rate of D as μ, and train D:
S410: Use all training samples in the training set as the input of D and train D with the Adam optimizer to update its parameters, using the MAE objective loss function in the Adam optimizer;
Stop training when the maximum number of iterations is reached, obtaining the trained improved deep learning network model D′;
S420: Take the TENet model using D′ as the trained TENet model;
S500: Input the hand X-ray film of a child of unknown bone age into the trained TENet model; the output is the predicted bone age of the child.
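The training step S410 pairs the Adam optimizer with an MAE objective. As an illustration only (not the patent's actual training code), the following minimal numpy sketch runs Adam updates on an MAE loss for a toy one-parameter regressor; all names and hyperparameters here are assumptions for the demonstration.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba); returns new weight and moment estimates."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)          # bias-corrected first moment
    v_hat = v / (1 - b2**t)          # bias-corrected second moment
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy linear model trained with an MAE loss: predict age = w * feature.
rng = np.random.default_rng(0)
x = rng.uniform(0.1, 1.0, 200)
y = 3.0 * x                          # hypothetical ground-truth "bone age"
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    pred = w * x
    # d|pred - y|/dw = sign(pred - y) * x, averaged over the batch (MAE gradient)
    grad = np.mean(np.sign(pred - y) * x)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)
```

Near the optimum the sign-based MAE gradient makes the weight oscillate within roughly one learning-rate step of the true value, which is why a small learning rate (the patent's preset μ) matters.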
Preferably, the improved deep learning network model D in S200 is structured as follows: a deep learning network serves as the backbone; the topmost neural network layer is removed; then a 3×3 down-sampling layer, a 3×3 max-pooling layer and a fully-connected layer with 32 neurons are added in sequence after the bottom layer of the original backbone; and the gender information corresponding to each hand X-ray film in A is blended into the image features through the fully-connected layer.
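The gender-blending step above can be pictured with shape bookkeeping alone. The numpy sketch below is a hedged illustration of the idea: the backbone's pooled image features are concatenated with a 32-neuron fully-connected embedding of the gender scalar. The feature-map size (7×7×512) and the 0/1 gender coding are assumptions, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed backbone output after the added 3x3 down-sampling and 3x3
# max-pooling layers: a 7x7 map with 512 channels (illustrative shape).
feat_map = rng.standard_normal((7, 7, 512))
pooled = feat_map.mean(axis=(0, 1))            # global average pool -> (512,)

# Gender enters through a 32-neuron fully-connected layer and is then
# concatenated with the image features (the "blending" described above).
gender = np.array([1.0])                       # assumed coding: 1 = male, 0 = female
W_g = rng.standard_normal((32, 1)) * 0.1
b_g = np.zeros(32)
gender_emb = np.maximum(W_g @ gender + b_g, 0) # ReLU, shape (32,)

fused = np.concatenate([pooled, gender_emb])   # shape (544,), fed to the regressor
```

Concatenation (rather than addition) keeps the gender signal in dedicated dimensions, so the downstream regressor can weight it independently of the image features.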
Preferably, the specific steps for obtaining in S310 the preprocessed picture corresponding to a picture in A are as follows:
S311: Select the c-th picture from A;
S312: Calculate the adaptive-filtering intermediate variable R_c(p) for any pixel point p in the c-th picture:

R_c(p) = Σ_{q≠p} S_α(I_c(p) − I_c(q)) / d(p, q);  (1)

where p and q denote two pixel points on the c-th picture, I_c(p) and I_c(q) are their gray values, d(p, q) is the distance metric function between p and q, and S_α(·) is the luminance expression function;
S313: Calculate the L(p) values of the c-th picture:

L(p) = 255 × (R_c(p) − min R_c(p)) / (max R_c(p) − min R_c(p));  (2)

where L(p) is the preprocessing result for pixel point p, the mapping interval of pixel p is [0, 255], and [min R_c(p), max R_c(p)] is the full domain of L(p);
S314: Repeat S312-S313 over all pixel points of the c-th picture; once every pixel point has been preprocessed, the preprocessed picture corresponding to the c-th picture is obtained;
Preprocessing the pictures in this way alleviates the low contrast between the background area and the ROI of a hand X-ray film.
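The preprocessing steps S311-S314 can be sketched as follows. This is a simplified single-channel version under stated assumptions: the patent's S_α and d are not reproduced here, so S_α is assumed to be a slope function clipped to [-1, 1] and d the Euclidean distance, with the sum restricted to a small neighborhood for tractability.

```python
import numpy as np

def ace_gray(img, alpha=5.0, radius=2):
    """Simplified automatic-color-balance preprocessing on a grayscale image.

    R_c(p) = sum over neighbors q of S_alpha(I(p) - I(q)) / d(p, q),
    then L(p) min-max stretches R_c into [0, 255] (equations (1)-(2)).
    S_alpha is assumed to be a clipped slope function; d is Euclidean.
    """
    h, w = img.shape
    r_map = np.zeros((h, w))
    for py in range(h):
        for px in range(w):
            acc = 0.0
            for qy in range(max(0, py - radius), min(h, py + radius + 1)):
                for qx in range(max(0, px - radius), min(w, px + radius + 1)):
                    if (qy, qx) == (py, px):
                        continue
                    d = np.hypot(py - qy, px - qx)
                    s = np.clip(alpha * (img[py, px] - img[qy, qx]) / 255.0, -1, 1)
                    acc += s / d
            r_map[py, px] = acc
    lo, hi = r_map.min(), r_map.max()
    return 255.0 * (r_map - lo) / (hi - lo + 1e-12)

demo = np.tile(np.linspace(0, 255, 8), (8, 1))   # simple horizontal gradient
out = ace_gray(demo)
```

The min-max stretch in the last line is what maps the intermediate variable onto the full [0, 255] interval, which is where the contrast gain comes from.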
Preferably, the specific steps for obtaining in S320 the hand topological graph corresponding to a picture in A are as follows:
S321: Select the c-th picture from A;
S322: Convolve the horizontal-direction matrix with the t-th pixel point on c to obtain the approximate luminance difference G_xt in the transverse direction:

G_xt = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * c_t;  (3)

where G_xt denotes the approximate luminance difference of the t-th pixel point in the transverse direction and c_t denotes the t-th pixel point (with its 3×3 neighborhood);
Convolve the vertical-direction matrix with the t-th pixel point on c to obtain the approximate luminance difference G_yt in the longitudinal direction:

G_yt = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * c_t;  (4)

where G_yt denotes the approximate luminance difference of the t-th pixel point in the longitudinal direction;
S323: Calculate the gradient approximation G_t of the t-th pixel point on c:

G_t = sqrt(G_xt² + G_yt²);  (5)

and the direction of the t-th pixel point on c:

α(x_t, y_t) = arctan(G_yt / G_xt);  (6)

where α(x_t, y_t) denotes the direction of the t-th pixel point on c;
S324: Repeat S322-S323 over all pixel points on c to obtain the convolution value of each pixel point; the convolution values of all pixel points form a matrix B, which is the gradient map c′;
S325: Apply threshold binarization to c′ to obtain the rough hand-diaphysis foreground edge map of c;
S326: Use the MediaPipe application framework to predict the positions of the joint points on the rough hand-diaphysis foreground edge map corresponding to c, obtaining the hand topological graph of c.
The hand topological-structure diagram obtained from the rough hand-diaphysis foreground edge map provides structured ROI information and expresses well the corresponding semantic feature in CHN-05, namely the lengths of the hand phalanges, which benefits the later bone age assessment.
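Steps S322-S325 above amount to a Sobel gradient pass followed by threshold binarization. The numpy sketch below shows that stage only (the MediaPipe joint-point prediction of S326 is omitted); the kernels match equations (3)-(4), and the threshold value is an illustrative assumption.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

def sobel_edges(img, thresh=100.0):
    """Gradient magnitude via 3x3 Sobel convolution, then binarized
    into a rough foreground edge map (equations (3)-(5) plus S325)."""
    h, w = img.shape
    mag = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = img[y - 1:y + 2, x - 1:x + 2]
            gx = np.sum(SOBEL_X * patch)       # transverse luminance difference
            gy = np.sum(SOBEL_Y * patch)       # longitudinal luminance difference
            mag[y, x] = np.hypot(gx, gy)       # gradient approximation G_t
    return (mag > thresh).astype(np.uint8)

# A bright square on a dark background: edges fire on the square's border,
# not in its flat interior.
img = np.zeros((10, 10)); img[3:7, 3:7] = 255.0
edges = sobel_edges(img)
```

In the full method, MediaPipe's hand-landmark prediction would then place 21 joint points on this edge map to form the topological graph.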
Preferably, the specific steps for obtaining in S330 the edge-feature-enhanced picture corresponding to a picture in A are as follows:
S331: Select the c-th picture from A;
S332: Calculate the spatial proximity and pixel-value similarity of each pixel point on the c-th picture and apply bilateral-filtering denoising to it. The spatial proximity is expressed as:

w_d(i, j, k, l) = exp(−((i − k)² + (j − l)²) / (2δ_d²));  (7)

where i, j, k, l are the coordinate values of pixel points a and b, i.e. a(i, j) and b(k, l), and δ_d is the Gaussian standard deviation;
The pixel-value similarity is expressed as:

w_r(i, j, k, l) = exp(−(f(i, j) − f(k, l))² / (2δ_r²));  (8)

where f(i, j) is the pixel value of the image at point a(i, j), f(k, l) is the pixel value of the image at point b(k, l), δ_r is the Gaussian standard deviation, and w_r measures the degree of difference between the neighboring point a and the central point b;
Let c″ denote the picture obtained after bilateral-filtering denoising of the c-th picture;
S333: Set a threshold T and divide c″ into background and target;
The optimal pixel threshold of c″ is obtained with the Otsu algorithm, whose calculation formulas are:

p_0 + p_1 = 1;  (9)
m̄ = p_0·m_0 + p_1·m_1;  (10)
σ² = p_0·(m_0 − m̄)² + p_1·(m_1 − m̄)²;  (11)

where σ² is the between-class variance, m̄ is the mean gray value of c″, p_0 is the proportion of the background in the picture and p_1 the proportion of the target:

p_0 = Σ_{n=1..k} p_n;  (12)
p_1 = Σ_{n=k+1..L} p_n;  (13)

where k = {1, …, T}, n denotes the class of gray values, p_n is the frequency of occurrence of each class of gray value, p_0 is the sum of the occurrence probabilities of the gray-value classes from 0 to k, L is the pixel level of the image, and p_1 is the sum of the occurrence probabilities of the gray-value classes from k + 1 to L;
m_0 denotes the mean gray value of the background and m_1 the mean gray value of the target:

m_0 = (1/p_0)·Σ_{n=1..k} n·p_n;  (14)
m_1 = (1/p_1)·Σ_{n=k+1..L} n·p_n;  (15)

S334: Substituting equation (10) into equation (11) and simplifying yields equation (16):

σ² = p_0·p_1·(m_0 − m_1)²;  (16)

S335: Traverse all pixel points on c″, evaluating equations (9)-(16) in turn for the pixel points of each gray-value class, to obtain N values of σ²;
S336: Sort the N values of σ² in descending order and select the threshold with the largest σ² as the optimal pixel threshold of c″;
S337: The picture c″ after the processing of S336 is taken as the edge-feature-enhanced picture corresponding to the c-th picture.
In the traditional Canny edge detection algorithm, linear Gaussian filtering is used for denoising: the image is convolved with a weighted kernel, a weighted average in which nearer points receive larger weights and farther points smaller ones. Averaging in this way blurs edge and feature information in the image, so many features are lost. Bilateral filtering is a nonlinear filtering method that compromises between the spatial proximity and the pixel-value similarity of the image, considering spatial information and gray-level similarity simultaneously by introducing a spatial-domain kernel and a range kernel. The bilateral filter therefore preserves image edge features well while filtering out low-frequency noise components.
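The edge-preserving behavior described above follows directly from equations (7)-(8): across a strong edge the range kernel w_r collapses to near zero, so pixels on the far side barely contribute. A minimal numpy sketch (parameter values are illustrative assumptions):

```python
import numpy as np

def bilateral(img, radius=2, sigma_d=2.0, sigma_r=30.0):
    """Bilateral filter: each output pixel is a normalized average weighted
    by spatial proximity w_d (eq. 7) and pixel-value similarity w_r (eq. 8)."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            num = den = 0.0
            for k in range(max(0, i - radius), min(h, i + radius + 1)):
                for l in range(max(0, j - radius), min(w, j + radius + 1)):
                    wd = np.exp(-((i - k)**2 + (j - l)**2) / (2 * sigma_d**2))
                    wr = np.exp(-(img[i, j] - img[k, l])**2 / (2 * sigma_r**2))
                    num += wd * wr * img[k, l]
                    den += wd * wr
            out[i, j] = num / den
    return out

# A step edge plus mild noise: the filter smooths the flat regions
# but the 50 -> 200 step survives almost intact.
rng = np.random.default_rng(0)
img = np.where(np.arange(16) < 8, 50.0, 200.0)[None, :].repeat(16, axis=0)
noisy = img + rng.normal(0, 5, img.shape)
smooth = bilateral(noisy)
```

With a Gaussian filter of the same radius the step would be smeared over several pixels; here the cross-edge range weight exp(−150²/(2·30²)) is negligible, so the two sides are averaged separately.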
In the traditional Canny edge detection algorithm, a dual-threshold scheme is used to classify pixel points; however, a fixed threshold cannot generalize across different images, which leads to the inherent problem that operators such as the Laplacian, Prewitt and Roberts cannot identify the boundary between the ossification center and the metacarpal end. Using the Otsu algorithm to compute the optimal threshold adapted to each picture, instead of selecting a threshold manually, minimizes the uncertainty of the algorithm's performance and makes the final result conform better to the semantics of CHN-05, namely that the ossification center is clearly visible and disc-shaped, its edge is smooth and continuous, and the degree of fusion between epiphysis and diaphysis can be observed.
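The adaptive thresholding described above can be sketched as an exhaustive search over candidate thresholds, scoring each with the simplified between-class variance of equation (16). This is a plain numpy illustration of the Otsu criterion, not the patent's implementation:

```python
import numpy as np

def otsu_threshold(img, levels=256):
    """Return the threshold k maximizing the between-class variance
    sigma^2 = p0 * p1 * (m0 - m1)^2 (the simplified form in eq. 16)."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                              # p_n: gray-value frequencies
    best_k, best_var = 0, -1.0
    for k in range(1, levels):
        p0, p1 = p[:k].sum(), p[k:].sum()              # background / target weights
        if p0 == 0 or p1 == 0:
            continue
        m0 = (np.arange(k) * p[:k]).sum() / p0         # background mean gray value
        m1 = (np.arange(k, levels) * p[k:]).sum() / p1 # target mean gray value
        var = p0 * p1 * (m0 - m1)**2
        if var > best_var:
            best_k, best_var = k, var
    return best_k

# Bimodal toy image: dark background around 40, bright hand region around 200.
rng = np.random.default_rng(0)
vals = np.concatenate([rng.normal(40, 8, 500), rng.normal(200, 8, 500)])
img = np.clip(vals, 0, 255).astype(np.uint8)
k = otsu_threshold(img)
```

On a bimodal histogram the maximizing threshold lands between the two modes, which is exactly the background/target split S333 asks for.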
Compared with the prior art, the invention has at least the following advantages:
1. The invention constructs a bone age prediction and assessment model, TENet, from a hand topological graph, an edge feature enhancement module and an improved deep learning network model. The hand topological graph used in TENet enhances the extraction of information features from the hand X-ray film, increasing the available information; the improved Canny edge detection algorithm, i.e. the edge feature enhancement module, increases the amount and accuracy of information features extracted from the edge regions of the hand X-ray film; and using the Otsu algorithm instead of manual threshold selection minimizes the uncertainty of the algorithm's performance, ensuring that the final result better conforms to the semantics of CHN-05.
2. Impulse noise is often present in hand X-ray films, and traditional denoising methods such as Gaussian filtering, median filtering and box filtering cannot protect high-frequency information well, so the edge details of the image become blurred. The bilateral-filtering denoising algorithm used by the invention compromises between the spatial proximity and pixel similarity of the image, considers spatial-domain information and gray-level similarity, and retains the edge features of the image while denoising, achieving edge-preserving denoising.
3. Computing the optimal threshold adapted to each picture with the Otsu algorithm, instead of selecting a threshold manually, minimizes the uncertainty of the algorithm's performance.
4. In data preprocessing, an automatic color balance algorithm is used to address the low contrast between the background area and the ROI of a hand X-ray film.
Drawings
Fig. 1 is the overall framework diagram of the TENet model.
Fig. 2 is a network architecture used by the present invention.
Fig. 3 is a flow chart of a modified Canny algorithm.
FIG. 4 is a diagram of the effect of different algorithms on processing a hand X-ray film.
FIG. 5 is a graph of the processing results of a data set used by the method of the present invention. (a) - (d) are sample images using a private database, and (e) - (h) are sample images using an RSNA database.
Detailed Description
The present invention is described in further detail below. A novel automatic bone age assessment model, TENet, is proposed; through feature abstraction of the semantic descriptions of bone age assessment in CHN-05 and corresponding algorithms, it achieves horizontal fusion of multi-feature expression. First, a hand topological graph module is designed to provide length and position structure information of the metacarpals and phalanges, meeting the semantics of bone age assessment in the four directions of CHN-05; second, an enhanced edge feature module is developed to identify the hand structure and extract effective edge information, satisfying the CHN-05 requirement for bone edge clarity and also improving the accuracy of bone age assessment.
Referring to Figs. 1-5, a method for pediatric bone age prediction comprises the following steps:
S100: Select a public data set containing hand X-ray films with bone age labels; randomly select some of the hand X-ray films to form a data set A, and resize all pictures in A to 560×560;
S200: Establish a TENet model comprising a hand topology module, an edge feature enhancement module and an improved deep learning network model D (deep learning networks are prior art); the edge feature enhancement module is an improved Canny edge detection algorithm (the Canny edge detection algorithm is prior art) that uses a bilateral-filtering denoising algorithm and the Otsu algorithm (both prior art);
The improved deep learning network model D in S200 is structured as follows: a deep learning network serves as the backbone; the topmost neural network layer is removed; then a 3×3 down-sampling layer, a 3×3 max-pooling layer and a fully-connected layer with 32 neurons are added in sequence after the bottom layer of the original backbone; and the gender information corresponding to each hand X-ray film in A is fused into the image features through the fully-connected layer.
S300: constructing a training set of the TENet model:
S310: Preprocess the pictures in A with an automatic color balance algorithm to obtain a preprocessed picture for each picture in A. The automatic color balance algorithm is prior art: it performs regional adaptive filtering on the original image to complete color cast correction and obtain a spatial-domain reconstructed image, highlighting the features and valuable information in the image so that it better matches human visual perception;
The specific steps for obtaining in S310 the preprocessed picture corresponding to a picture in A are as follows:
S311: Select the c-th picture from A;
S312: Calculate the adaptive-filtering intermediate variable R_c(p) for any pixel point p in the c-th picture:

R_c(p) = Σ_{q≠p} S_α(I_c(p) − I_c(q)) / d(p, q);  (1)

where p and q denote two pixel points on the c-th picture; I_c(p) and I_c(q) are their gray values, whose difference models quasi-biological lateral inhibition; d(p, q) is the distance metric function between p and q, which controls the regional adaptivity of the filtering; and S_α(·) is the luminance expression function;
S313: Calculate the L(p) values of the c-th picture:

L(p) = 255 × (R_c(p) − min R_c(p)) / (max R_c(p) − min R_c(p));  (2)

where L(p) is the preprocessing result for pixel point p, the mapping interval of pixel p is [0, 255], and [min R_c(p), max R_c(p)] is the full domain of L(p); the purpose of the full-domain mapping is to achieve a global white balance for the image;
S314: Repeat S312-S313 over all pixel points of the c-th picture; once every pixel point has been preprocessed, the preprocessed picture corresponding to the c-th picture is obtained;
S320: Apply the hand topology module to the pictures in A to obtain a hand topological graph for each picture in A;
The specific steps for obtaining in S320 the hand topological graph corresponding to a picture in A are as follows:
S321: Select the c-th picture from A;
S322: Convolve the horizontal-direction matrix with the t-th pixel point on c to obtain the approximate luminance difference G_xt in the transverse direction:

G_xt = [[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]] * c_t;  (3)

where G_xt denotes the approximate luminance difference of the t-th pixel point in the transverse direction and c_t denotes the t-th pixel point (with its 3×3 neighborhood);
Convolve the vertical-direction matrix with the t-th pixel point on c to obtain the approximate luminance difference G_yt in the longitudinal direction:

G_yt = [[-1, -2, -1], [0, 0, 0], [+1, +2, +1]] * c_t;  (4)

where G_yt denotes the approximate luminance difference of the t-th pixel point in the longitudinal direction;
S323: Calculate the gradient approximation G_t of the t-th pixel point on c:

G_t = sqrt(G_xt² + G_yt²);  (5)

and the direction of the t-th pixel point on c:

α(x_t, y_t) = arctan(G_yt / G_xt);  (6)

where α(x_t, y_t) denotes the direction of the t-th pixel point on c;
S324: Repeat S322-S323 over all pixel points on c to obtain the convolution value of each pixel point; the convolution values of all pixel points form a matrix B, which is the gradient map c′;
S325: Apply threshold binarization (prior art) to c′ to obtain the rough hand-diaphysis foreground edge map of c;
S326: Use the MediaPipe application framework (prior art) to predict the positions of the joint points on the rough hand-diaphysis foreground edge map corresponding to c, obtaining the hand topological graph of c.
S330: performing edge enhancement on the preprocessed picture corresponding to each picture in A by using the edge feature enhancement module, obtaining the edge feature enhancement map corresponding to each picture in A;
the specific steps of obtaining the edge feature enhancement map corresponding to each picture in A in S330 are as follows:
s331: selecting the c-th picture from A;
s332: calculating the spatial proximity and the pixel-value similarity of each pixel point on the c-th picture, and performing bilateral filtering denoising on the c-th picture, the expression of the spatial proximity being as follows:
wherein i, j, k, l represent the coordinate values of pixel points a and b, i.e., a(i, j) and b(k, l), and δ_d is the Gaussian standard deviation;
the expression of the pixel-value similarity is as follows:
wherein f(i, j) represents the pixel value of the image at point a(i, j), f(k, l) represents the pixel value of the image at point b(k, l), δ_r is the Gaussian standard deviation, and W_r measures the degree of difference between the neighboring point a and the central point b;
the picture obtained after bilateral filtering and denoising of the c-th picture is denoted c″;
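The two Gaussian terms above combine into the classic bilateral filter weight w = exp(-((i-k)² + (j-l)²)/(2δ_d²)) · exp(-(f(i,j) - f(k,l))²/(2δ_r²)). A compact sketch of S332 (window radius and the two standard deviations are illustrative values, not taken from the patent):

```python
import math

def bilateral_filter(img, radius=1, sigma_d=1.0, sigma_r=30.0):
    """Denoise a 2-D grayscale image with a bilateral filter whose
    weights combine spatial proximity and pixel-value similarity."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = norm = 0.0
            for k in range(max(0, i - radius), min(h, i + radius + 1)):
                for l in range(max(0, j - radius), min(w, j + radius + 1)):
                    wd = math.exp(-((i - k) ** 2 + (j - l) ** 2)
                                  / (2 * sigma_d ** 2))   # spatial proximity
                    wr = math.exp(-(img[i][j] - img[k][l]) ** 2
                                  / (2 * sigma_r ** 2))   # value similarity
                    acc += wd * wr * img[k][l]
                    norm += wd * wr
            out[i][j] = acc / norm
    return out
```

Unlike a plain Gaussian blur, the value-similarity term suppresses averaging across strong intensity steps, which is why the filter smooths noise while preserving bone edges.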
s333: setting a threshold T and dividing c″ into a background and a target;
the optimal pixel threshold of c″ is calculated with the Otsu algorithm, the calculation formula being as follows:
p_0 + p_1 = 1; (9)
wherein σ² represents the between-class variance, m represents the mean gray value of c″, p_0 represents the proportion of the background in the picture, and p_1 represents the proportion of the target in the picture, the expressions being as follows:
wherein k = {1, …, T}, n denotes a gray-value category, p_n denotes the frequency of occurrence of each gray-value category, p_0 represents the sum of the occurrence probabilities of the gray-value categories from 0 to k, L represents the pixel level of the image (typically 255), and p_1 represents the sum of the occurrence probabilities of the gray-value categories from k+1 to L;
m_0 represents the mean gray value of the background and m_1 represents the mean gray value of the target, the expressions being as follows:
s334: substituting equation (10) into equation (11) and simplifying to obtain equation (16):
σ² = p_0 · p_1 · (m_0 − m_1)²; (16)
s335: traversing all pixel points on c″ and evaluating equations (9)–(16) in turn for the pixel points of each gray-value category on c″ to obtain N values of σ², where the category gray value refers to the gray level to which a pixel belongs;
s336: sorting the N values of σ² in descending order and selecting the largest σ² as the optimal pixel threshold of c″; in the calculation process, all gray levels in the range 0–255 are traversed cyclically, so the gray level that maximizes equation (16) is taken as the optimal pixel threshold of the picture;
s337: taking c″ after the processing of S336 as the edge feature enhancement map corresponding to the c-th picture.
S340: repeating S310–S330 to traverse all hand X-ray films in A, obtaining the preprocessed picture, the hand topology map, and the edge feature enhancement map corresponding to each picture in A;
s350: all the preprocessed pictures, hand topology maps, and edge feature enhancement maps obtained in S340 together form the training set of the TENet model, wherein the preprocessed picture, hand topology map, and edge feature enhancement map corresponding to each picture form one training sample;
s400: presetting the maximum number of iterations, presetting the initial learning rate of D as μ, and training D:
s410: all training samples in the training set are used as the input of D, and an Adam optimizer (an existing technique) with an MAE target loss function is used to train D and update its parameters;
training stops when the maximum number of iterations is reached, yielding the trained improved deep learning network model D';
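The MAE target loss minimized by the Adam optimizer is simply the mean absolute difference between the predicted and labeled bone ages. For illustration (bone ages in months):

```python
def mae_loss(predicted, actual):
    """Mean absolute error between predicted and ground-truth bone ages,
    the target loss used with the Adam optimizer."""
    assert len(predicted) == len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

This is also the metric reported in the experiments below, so the training objective and the evaluation criterion coincide.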
s420: the TENet model using D' is taken as the trained TENet model;
s500: inputting a hand X-ray film of a child of unknown bone age into the trained TENet model, the output being the predicted bone age of the child.
Experimental verification
Pediatric Bone Age Assessment (BAA) is an important physical examination that reflects a child's growth potential and sexual-maturation trend. In clinical practice, the Chinese children's carpal-bone maturity assessment method (CHN-05) is the method most widely used for BAA by Chinese radiologists. CHN-05 uses metacarpophalangeal length (ML) and metacarpophalangeal edge (MM) as reference criteria to estimate bone age. Inspired by the semantic description of CHN-05, the present invention proposes a new model, called TENet, for automatic bone age assessment. A hand topology module is designed in the TENet model to identify key positions of the hand, which are used to extract structural semantics; an edge feature enhancement module is designed to provide accurate bone-edge information during training. The TENet model can detect both global edge information and local topological-structure information, thereby assessing bone age through multi-feature fusion.
The experimental results show that the TENet model achieves state-of-the-art performance of 5.35 mean absolute error (MAE) on the public RSNA dataset, and because the design of the TENet model follows the semantic logic of CHN-05, it has very good reliability and interpretability in clinical application.
The experiments evaluated the TENet model on a private dataset and on the Radiological Society of North America bone age assessment (RSNA-BAA) public dataset. The private dataset comprises 4954 hand X-rays in the training set and 275 hand X-rays in the validation and test sets; the RSNA-BAA public dataset contains 5611 hand X-rays in the training set and 275 hand X-rays in the validation and test sets. All hand X-rays were resized to 560×560 before processing by the TENet model. The mean absolute error between the ground-truth bone age and the corresponding predicted bone age is reported.
Table 1. Ablation study of the model of the invention

| EXP. | OI | PHI | ImCAT | L1 | MAE |
| --- | --- | --- | --- | --- | --- |
| 1 (private data set) | √ | | | √ | 7.15 |
| 2 (private data set) | | √ | | √ | 6.92 |
| 3 (private data set) | | √ | √ | √ | 5.76 |
| 4 (public data set) | | √ | √ | √ | 5.80 |
(EXP stands for Experiment; OI for Original Image; PHI for Pre-processed Hand Image, the hand preprocessing map; ImCAT for Improved Adaptive Thresholding Canny Edge Detection, the hand edge enhancement map; L1 for Loss1, the MAE loss function; MAE for mean absolute error.)
Implementation details
TENet was implemented with TensorFlow 1.9, and training was completed in about 8 hours on a system with an NVIDIA TITAN RTX GPU and 32 GB of RAM. The whole TENet model was trained for 100 epochs with a batch size of 32. The initial learning rate was 3×10⁻³; it was reduced to 10⁻³ after 50 epochs and dropped by a further factor of 10 after 80 epochs. The optimizer used was Adam.
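The learning-rate schedule described above (3×10⁻³ initially, 10⁻³ after epoch 50, a further ten-fold drop after epoch 80) can be written as a small step function; the epoch boundaries are taken exactly as stated:

```python
def learning_rate(epoch):
    """Piecewise-constant schedule for the 100-epoch training run:
    3e-3 for the first 50 epochs, 1e-3 until epoch 80, then 1e-4."""
    if epoch < 50:
        return 3e-3
    if epoch < 80:
        return 1e-3
    return 1e-4
```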
TENet model ablation experiment
The role of each module in the TENet model, namely the original image, the Pre-processed Hand Image (PHI), and the edge feature enhancement module (ImCAT), was studied first. The experimental results are shown in Table 1. Without these two modules the network degenerates to Xception, yielding an MAE of 7.19 months; when the preprocessed image is fed to the model instead of the original image, the MAE is 6.92 months; when the edge feature enhancement module is added, the MAE improves by 1.16 months to 5.76 months. These experimental data illustrate the necessity of acquiring edge features and the effectiveness of the ImCAT algorithm.
Combining the two components, the TENet model achieved an MAE of 5.80 months on the public dataset. Taken together, the consistent results on the private and public datasets support the correctness of the TENet model design inspired by the CHN-05 carpal-bone standard. Note: a smaller MAE indicates a better result.
Table 2. Ablation study of the ImCAT algorithm
Hand edge feature ablation experiment
The TENet model detects hand edge features and combines them with the topology for bone age assessment. ImCAT also benefits from its modular design. Peak signal-to-noise ratio (PSNR), mean squared error (MSE), structural similarity (SSIM), root mean squared error (RMSE), and mean absolute error (MAE) were used to measure performance, with the results shown in Table 2. With the traditional Canny (TC) algorithm and four other edge operators as baselines, the ImCAT algorithm used in this method is clearly superior, showing that ImCAT better captures the overall edge details of the hand.
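Two of the edge-map quality metrics in Table 2, MSE and PSNR, are directly related by PSNR = 10·log10(MAX²/MSE). A quick sketch, assuming 8-bit images (MAX = 255) given as flat pixel lists:

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized gray images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB; higher means closer images."""
    m = mse(a, b)
    return float("inf") if m == 0 else 10 * math.log10(peak ** 2 / m)
```

RMSE is simply the square root of MSE, so three of the five reported metrics derive from the same squared-error sum.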
The results on the RSNA public dataset and the private dataset show that the TENet model achieves good performance. In addition, because the TENet model is designed with CHN-05 prior knowledge, it has good interpretability and reliability for clinical practice. The invention also provides a reliable experimental basis for combining the feature acquisition module and the score regression module into an end-to-end automatic bone age assessment framework.
Finally, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (5)
1. A method for pediatric bone age prediction, characterized by comprising the following steps:
s100: selecting a public dataset comprising hand X-ray films with bone age labels; randomly selecting some of the hand X-ray films to form a dataset A, and resizing all pictures in A to 560×560;
s200: establishing a TENet model comprising a hand topology module, an edge feature enhancement module, and an improved deep learning network model D, wherein the edge feature enhancement module is an improved Canny edge detection algorithm that uses a bilateral filtering denoising algorithm and an Otsu algorithm;
s300: constructing the training set of the TENet model:
s310: using an automatic color balance algorithm to preprocess the pictures in A, obtaining the preprocessed picture corresponding to each picture in A;
s320: using the hand topology module to perform feature extraction on the pictures in A, obtaining the hand topology map corresponding to each picture in A;
s330: using the edge feature enhancement module to perform edge enhancement on the preprocessed picture corresponding to each picture in A, obtaining the edge feature enhancement map corresponding to each picture in A;
s340: repeating S310–S330 to traverse all hand X-ray films in A, obtaining the preprocessed picture, the hand topology map, and the edge feature enhancement map corresponding to each picture in A;
s350: all the preprocessed pictures, hand topology maps, and edge feature enhancement maps obtained in S340 together form the training set of the TENet model, wherein the preprocessed picture, hand topology map, and edge feature enhancement map corresponding to each picture form one training sample;
s400: presetting the maximum number of iterations, presetting the initial learning rate of D as μ, and training D:
s410: all training samples in the training set are used as the input of D, and an Adam optimizer with an MAE target loss function is used to train D and update its parameters;
training stops when the maximum number of iterations is reached, yielding the trained improved deep learning network model D';
s420: the TENet model using D' is taken as the trained TENet model;
s500: inputting a hand X-ray film of a child of unknown bone age into the trained TENet model, the output being the predicted bone age of the child.
2. The method for pediatric bone age prediction according to claim 1, wherein: the structure of the improved deep learning network model D in S200 uses a deep learning network as the backbone; the top neural network layer is removed, then a 3×3 down-sampling layer, a 3×3 max-pooling layer, and a fully connected layer with 32 neurons are added in sequence after the bottom layer of the original backbone network, and the gender information corresponding to each hand X-ray film in A is fused into the image features through the fully connected layer.
3. The method for pediatric bone age prediction according to claim 2, wherein the specific steps of obtaining the preprocessed picture corresponding to each picture in A in S310 are as follows:
s311: selecting the c-th picture from A;
s312: calculating the adaptive filtering intermediate variable R_c(p) of any pixel point p in the c-th picture;
wherein p and q represent two pixel points on the c-th picture, I_c(p) and I_c(q) represent the gray values of p and q, respectively, d(p, q) represents the distance metric function between p and q, and S_α( ) represents a brightness expression function;
s313: calculating the L(p) value of the c-th picture, the calculation expression being as follows:
wherein L(p) represents the preprocessing result of pixel point p, the mapping interval of pixel p is [0, 255], and [min R_c(p), max R_c(p)] represents the full domain of L(p);
s314: repeating S312–S313 to traverse and preprocess all pixel points on the c-th picture, obtaining the preprocessed picture corresponding to the c-th picture.
4. The method for pediatric bone age prediction according to claim 3, wherein the specific steps of obtaining the hand topology map corresponding to each picture in A in S320 are as follows:
s321: selecting the c-th picture from A;
s322: performing a planar convolution with the horizontal-direction matrix at the t-th pixel point on c to obtain the approximate horizontal brightness difference G_xt of the t-th pixel point, the specific expression being as follows:
wherein G_xt represents the approximate brightness difference of the t-th pixel point in the horizontal direction, and c_t represents the t-th pixel point;
performing a planar convolution with the vertical-direction matrix at the t-th pixel point on c to obtain the approximate vertical brightness difference G_yt of the t-th pixel point, the specific expression being as follows:
wherein G_yt represents the approximate brightness difference of the t-th pixel point in the vertical direction;
s323: calculating the approximate gradient value G_t of the t-th pixel point on c, the expression being as follows:
calculating the direction of the t-th pixel point on c, the expression being as follows:
wherein α(x_t, y_t) represents the direction of the t-th pixel point on c;
s324: repeating S322–S323 to traverse all pixel points on c and obtain the convolution value of each pixel point; the convolution values of all pixel points form a matrix B, and matrix B is the gradient map c';
s325: performing threshold binarization processing on c' to obtain the rough hand-diaphysis foreground edge map of c;
s326: using the MediaPipe application framework to predict the positions of the joint points on the rough hand-diaphysis foreground edge map corresponding to c, obtaining the hand topology map of c.
5. The method for pediatric bone age prediction according to claim 4, wherein the specific steps of obtaining the edge feature enhancement map corresponding to each picture in A in S330 are as follows:
s331: selecting the c-th picture from A;
s332: calculating the spatial proximity and the pixel-value similarity of each pixel point on the c-th picture, and performing bilateral filtering denoising on the c-th picture, the expression of the spatial proximity being as follows:
wherein i, j, k, l represent the coordinate values of pixel points a and b, i.e., a(i, j) and b(k, l), and δ_d is the Gaussian standard deviation;
the expression of the pixel-value similarity is as follows:
wherein f(i, j) represents the pixel value of the image at point a(i, j), f(k, l) represents the pixel value of the image at point b(k, l), δ_r is the Gaussian standard deviation, and W_r measures the degree of difference between the neighboring point a and the central point b;
the picture obtained after bilateral filtering and denoising of the c-th picture is denoted c″;
s333: setting a threshold T and dividing c″ into a background and a target;
the optimal pixel threshold of c″ is calculated with the Otsu algorithm, the calculation formula being as follows:
p_0 + p_1 = 1; (9)
wherein σ² represents the between-class variance, m represents the mean gray value of c″, p_0 represents the proportion of the background in the picture, and p_1 represents the proportion of the target in the picture, the expressions being as follows:
wherein k = {1, …, T}, n denotes a gray-value category, p_n denotes the frequency of occurrence of each gray-value category, p_0 represents the sum of the occurrence probabilities of the gray-value categories from 0 to k, L represents the pixel level of the image, and p_1 represents the sum of the occurrence probabilities of the gray-value categories from k+1 to L;
m_0 represents the mean gray value of the background and m_1 represents the mean gray value of the target, the expressions being as follows:
s334: substituting equation (10) into equation (11) and simplifying to obtain equation (16):
σ² = p_0 · p_1 · (m_0 − m_1)²; (16)
s335: traversing all pixel points on c″ and evaluating equations (9)–(16) in turn for the pixel points of each gray-value category on c″ to obtain N values of σ²;
s336: sorting the N values of σ² in descending order and selecting the largest σ² as the optimal pixel threshold of c″;
s337: taking c″ after the processing of S336 as the edge feature enhancement map corresponding to the c-th picture.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211076541.3A CN115423779A (en) | 2022-09-05 | 2022-09-05 | Method for predicting bone age of children |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115423779A true CN115423779A (en) | 2022-12-02 |
Family
ID=84202426
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117351325A (en) * | 2023-12-06 | 2024-01-05 | 浙江省建筑设计研究院 | Model training method, building effect graph generation method, equipment and medium |
Legal Events
| Date | Code | Title | Description |
| --- | --- | --- | --- |
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |