CN115423779A - Method for predicting bone age of children - Google Patents

Method for predicting bone age of children

Info

Publication number
CN115423779A
CN115423779A
Authority
CN
China
Prior art keywords
picture
pixel
hand
value
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211076541.3A
Other languages
Chinese (zh)
Inventor
简坤元
杨梦宁
王思敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202211076541.3A priority Critical patent/CN115423779A/en
Publication of CN115423779A publication Critical patent/CN115423779A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/36 Applying a local operator, i.e. means to operate on image points situated in the vicinity of a given point; Non-linear local filtering operations, e.g. median filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10116 X-ray image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20028 Bilateral filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30008 Bone

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Nonlinear Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for predicting the bone age of children, which comprises the following steps: randomly selecting some hand X-ray films from a public hand X-ray data set with bone age labels to form a data set, and resizing all pictures to a specified size; establishing a TENet model and constructing its training set; and taking the training set as the input of the TENet model, training the TENet model with an Adam optimizer, and obtaining the trained TENet model when the maximum number of iterations is reached. The model of the invention can accurately predict the bone age of children.

Description

Method for predicting bone age of children
Technical Field
The invention relates to the field of computer image vision and big data medical image application, in particular to a method for predicting the bone age of children.
Background
Traditional bone age assessment is usually performed by taking radiographs of the subject's hand and wrist, which a doctor then interprets. Interpretation methods include the simple counting method, the atlas method, the scoring method and computerized bone age scoring systems; the most common are the G-P atlas method and the TW3 scoring method. Owing to ethnic differences and the secular trend of growth and development, the bone age standard best suited to contemporary Chinese children is CHN-05; in 2006 the China-05 bone age standard was compiled into an industry standard of the People's Republic of China and became the only industry bone age standard in China to date.
Recently, deep neural networks have been widely used in the medical field, because automatic bone age estimation with deep learning is much faster than manual bone age determination and its accuracy is far better than that of conventional methods. Hyunkwang Lee et al. used GoogLeNet as a backbone and determined final bone age by comparison with three to five reference images of the G&P atlas. Toan Duc Bui et al. used Faster R-CNN and Inception-v4 networks for region-of-interest (ROI) detection and classification, respectively. Chuanbin Liu et al. introduced an attention agent and a recognition agent for proposing discriminative skeletal parts and for feature learning and age assessment, respectively.
Although these methods achieve good performance in BAA, they cannot objectively and accurately evaluate the bone age of contemporary Chinese children in actual testing, because of secular changes over time. There is therefore a need for a method based on the China-05 standardized sample: the China-05 standard was compiled from normal urban Chinese children surveyed in 2005, it truly reflects the development trend of contemporary Chinese children, and it uses the metacarpophalangeal edge and metacarpophalangeal length of Chinese children as reference standards for obtaining the correct bone age.
Disclosure of Invention
Aiming at the problems in the prior art, the technical problems to be solved by the invention are as follows: how to accurately predict the bone age of children.
In order to solve the technical problems, the invention adopts the following technical scheme: a method for pediatric bone age prediction comprising the steps of:
s100: selecting a public data set, wherein the public data set comprises hand X-ray films with bone age labels; randomly selecting some hand X-ray films to form a data set A, and resizing all pictures in A to 560×560;
s200: establishing a TENet model, wherein the TENet model comprises a hand topology module, an edge feature enhancement module and a deep learning network improvement model D; the edge feature enhancement module is an improved Canny edge detection algorithm, and the improved Canny edge detection algorithm uses a bilateral filtering denoising algorithm and an Otsu algorithm;
s300: constructing a training set of a TENet model:
s310: preprocessing the pictures in A with an automatic color balance algorithm to obtain the preprocessed picture corresponding to each picture in A;
s320: performing feature extraction on the pictures in A with the hand topology module to obtain the hand topological graph corresponding to each picture in A;
s330: performing edge enhancement on the preprocessed picture corresponding to each picture in A with the edge feature enhancement module to obtain the edge feature enhancement map corresponding to each picture in A;
s340: repeating S310-S330 until all hand X-ray films in A have been traversed, obtaining the preprocessed picture, hand topological graph and edge feature enhancement map corresponding to each picture in A;
s350: all the preprocessed pictures, hand topological graphs and edge feature enhancement maps obtained in S340 jointly form the training set of the TENet model, wherein the preprocessed picture, hand topological graph and edge feature enhancement map corresponding to each picture form one training sample;
s400: presetting the maximum number of iterations and the initial learning rate μ of D, and training D:
s410: taking all training samples in the training set as the input of D and training D with an Adam optimizer to update the parameters of D, the Adam optimizer using an MAE target loss function;
stopping training when the maximum number of iterations is reached, obtaining the trained deep learning network improved model D';
s420: taking the TENet model that uses D' as the trained TENet model;
s500: inputting a hand X-ray film of a child whose bone age is unknown into the trained TENet model; the output is the predicted value of the child's bone age.
Preferably, the deep learning network improved model D in S200 is structured as follows: a deep learning network serves as the backbone network; the top-most neural network layers are removed in sequence from top to bottom; a 3×3 down-sampling layer, a 3×3 max-pooling layer and a fully connected layer with 32 neurons are then appended in sequence after the bottom layer of the original backbone, and the gender information corresponding to each hand X-ray film in A is blended into the image features through the fully connected layer.
Preferably, the specific steps of obtaining the preprocessed picture corresponding to each picture in A in S310 are as follows:
s311: selecting the c-th picture from A;
s312: calculating the adaptive filtering intermediate variable R_c(p) of any pixel point p in the c-th picture:

R_c(p) = Σ_{q∈Ω, q≠p} S_α(I_c(p) − I_c(q)) / d(p, q); (1)

wherein p and q represent two pixel points on the c-th picture, I_c(p) − I_c(q) represents the gray-level difference between p and q, d(p, q) represents the distance metric function between p and q, and S_α(·) represents the luminance expression function;
s313: calculating the L(p) value of the c-th picture:

L(p) = 255 · [R_c(p) − min R_c(p)] / [max R_c(p) − min R_c(p)]; (2)

wherein L(p) represents the preprocessing result of pixel point p, the mapping interval of pixel p is [0, 255], and [min R_c(p), max R_c(p)] is the full range of R_c(p);
s314: repeating S312-S313 until all pixel points of the c-th picture have been traversed and preprocessed, obtaining the preprocessed picture corresponding to the c-th picture;
preprocessing the pictures in this way alleviates the low contrast between the background area and the ROI of a hand X-ray film.
Preferably, the specific steps of obtaining the hand topological graph corresponding to each picture in A in S320 are as follows:
s321: selecting the c-th picture from A;
s322: performing plane convolution on the t-th pixel point of c with the horizontal direction matrix to obtain the luminance difference approximation G_xt of the t-th pixel point in the transverse direction:

G_xt = [ −1 0 +1; −2 0 +2; −1 0 +1 ] * c_t; (3)

wherein G_xt represents the luminance difference approximation of the t-th pixel point in the transverse direction and c_t represents the t-th pixel point (taken with its 3×3 neighborhood);
performing plane convolution on the t-th pixel point of c with the vertical direction matrix to obtain the luminance difference approximation G_yt of the t-th pixel point in the longitudinal direction:

G_yt = [ +1 +2 +1; 0 0 0; −1 −2 −1 ] * c_t; (4)

wherein G_yt represents the luminance difference approximation of the t-th pixel point in the longitudinal direction;
s323: calculating the gradient approximation G_t of the t-th pixel point of c:

G_t = √(G_xt² + G_yt²); (5)

and calculating the direction of the t-th pixel point of c:

α(x_t, y_t) = arctan(G_yt / G_xt); (6)

wherein α(x_t, y_t) represents the direction of the t-th pixel point of c;
s324: repeating S322-S323 until all pixel points of c have been traversed, obtaining the convolution value of each pixel point; the convolution values of all pixel points form a matrix B, which is the gradient map c';
s325: performing threshold binarization on c' to obtain the rough hand diaphysis foreground edge map of c;
s326: performing joint-point position prediction on the rough hand diaphysis foreground edge map corresponding to c with the MediaPipe application framework to obtain the hand topological graph of c.
The hand topological structure graph obtained from the rough hand diaphysis foreground edge map provides structured ROI information and expresses the corresponding semantic feature in CHN-05, namely the length of the hand phalanges, which benefits the subsequent bone age evaluation.
Preferably, the specific steps of obtaining the edge feature enhancement map corresponding to each picture in A in S330 are as follows:
s331: selecting the c-th picture from A;
s332: calculating the spatial proximity and the pixel-value similarity of each pixel point of the c-th picture and performing bilateral filtering denoising on it, the spatial proximity being expressed as:

w_d(i, j, k, l) = exp(−[(i − k)² + (j − l)²] / (2δ_d²)); (7)

wherein i, j and k, l represent the coordinates of pixel points a and b, i.e. a(i, j) and b(k, l), and δ_d is the Gaussian standard deviation;
the pixel-value similarity being expressed as:

w_r(i, j, k, l) = exp(−‖f(i, j) − f(k, l)‖² / (2δ_r²)); (8)

wherein f(i, j) represents the pixel value of the image at point a(i, j), f(k, l) represents the pixel value at point b(k, l), δ_r is the Gaussian standard deviation, and w_r measures the degree of difference between the neighboring point a and the central point b;
letting c″ denote the picture obtained after bilateral filtering denoising of the c-th picture;
s333: setting a threshold T that divides c″ into background and target;
the optimal pixel threshold of c″ is obtained with the Otsu algorithm, whose calculation formulas are as follows:

p_0 + p_1 = 1; (9)

m = p_0·m_0 + p_1·m_1; (10)

σ² = p_0·(m_0 − m)² + p_1·(m_1 − m)²; (11)

wherein σ² represents the between-class variance and m represents the mean gray value of c″; p_0 represents the proportion of the background in the picture and p_1 represents the proportion of the target:

p_0 = Σ_{n=0}^{k} p_n; (12)

p_1 = Σ_{n=k+1}^{L} p_n; (13)

wherein k = {1, …, T}, n denotes the gray level, p_n represents the frequency of occurrence of gray level n, p_0 represents the sum of the occurrence probabilities of the gray levels between 0 and k, L represents the pixel level of the image, and p_1 represents the sum of the occurrence probabilities of the gray levels between k + 1 and L;
m_0 represents the mean gray value of the background and m_1 represents the mean gray value of the target:

m_0 = (1/p_0) Σ_{n=0}^{k} n·p_n; (14)

m_1 = (1/p_1) Σ_{n=k+1}^{L} n·p_n; (15)

s334: substituting formula (10) into formula (11) and simplifying gives formula (16):

σ² = p_0·p_1·(m_0 − m_1)²; (16)

s335: traversing all pixel points of c″ and evaluating formulas (9)-(16) in turn for the pixel points of each gray level of c″ to obtain N values of σ²;
s336: sorting the N values of σ² in descending order and selecting the largest σ²; the gray level that attains it is taken as the optimal pixel threshold of c″;
s337: taking c″ after the processing of S336 as the edge feature enhancement map corresponding to the c-th picture.
The traditional Canny edge detection algorithm denoises with linear Gaussian filtering, convolving the image with a weighted kernel. Gaussian blurring is a weighted average in which nearer points receive larger weights and farther points smaller ones, so averaging blurs the edge and feature information in the image and many features are lost. Bilateral filtering is a nonlinear filtering method that compromises between the spatial proximity and the pixel-value similarity of an image, considering spatial information and gray-level similarity simultaneously by introducing a spatial-domain kernel and a range-domain kernel. A bilateral filter can therefore retain image edge features well while filtering out noise.
The traditional Canny edge detection algorithm classifies pixel points with a double-threshold scheme, but a fixed threshold cannot generalize to every image, which leads to the inherent problem that operators such as the traditional Canny (TC), Laplacian, Prewitt and Roberts cannot identify the boundary between the ossification center and the metacarpal end. Using the Otsu algorithm to compute the optimal threshold for each different picture, instead of selecting a threshold manually, minimizes the uncertainty of the algorithm's performance and makes the final result better conform to the semantics of CHN-05, namely that the ossification center is clearly visible and disc-shaped, the edges are smooth and continuous, and the degree of fusion between epiphysis and diaphysis can be observed.
Compared with the prior art, the invention has at least the following advantages:
1. The invention constructs a bone age prediction and evaluation model, the TENet model, from a hand topological graph, an edge feature enhancement module and an improved deep learning network. The hand topological graph in the TENet model enhances the extraction of information features from hand X-ray films, increasing the available information. The improved Canny edge detection algorithm, i.e. the edge feature enhancement module, increases the amount and accuracy of feature extraction at the edges of the hand X-ray film, while using the Otsu algorithm instead of manual threshold selection minimizes the uncertainty of the algorithm's performance and ensures that the final result better conforms to the semantics of CHN-05.
2. Impulse noise is often present in hand X-ray films, and traditional denoising methods such as Gaussian filtering, median filtering and box filtering cannot protect high-frequency information well, blurring the edge details of the image. The bilateral filtering denoising algorithm used by the invention balances the spatial proximity and pixel similarity of the image, considers spatial-domain information and gray-level similarity, and retains the edge features of the image while denoising, achieving edge-preserving denoising.
3. Computing the optimal threshold for each different picture with the Otsu algorithm, instead of selecting a threshold manually, minimizes the uncertainty of the algorithm's performance.
4. In data preprocessing, an automatic color balance algorithm is used to alleviate the low contrast between the background area and the ROI of a hand X-ray film.
Drawings
Fig. 1 is an overall framework diagram of the TENet model.
Fig. 2 is a network architecture used by the present invention.
Fig. 3 is a flow chart of a modified Canny algorithm.
FIG. 4 is a diagram of the effect of different algorithms on processing a hand X-ray film.
FIG. 5 is a graph of the processing results of a data set used by the method of the present invention. (a) - (d) are sample images using a private database, and (e) - (h) are sample images using an RSNA database.
Detailed Description
The present invention is described in further detail below. A novel automatic bone age evaluation model, TENet, is proposed; it achieves horizontal fusion of multi-feature expression by abstracting the semantic description of bone age evaluation in CHN-05 into algorithmic features. First, a hand topology module is designed to provide the length and position structure information of the metacarpal and phalangeal bones, matching the semantics of bone age evaluation in CHN-05; second, an enhanced edge feature module is developed to identify the hand structure and extract effective edge information, which meets the requirement of CHN-05 for bone edge definition and also enhances the accuracy of bone age assessment.
Referring to fig. 1-5, a method for pediatric bone age prediction comprising the steps of:
s100: selecting a public data set, wherein the public data set comprises hand X-ray films with bone age labels; randomly selecting some hand X-ray films to form a data set A, and resizing all pictures in A to 560×560;
s200: establishing a TENet model, wherein the TENet model comprises a hand topology module, an edge feature enhancement module and a deep learning network improvement model D, the deep learning network is the prior art, the edge feature enhancement module is an improved Canny edge detection algorithm, the Canny edge detection algorithm is the prior art, the improved Canny edge detection algorithm uses a bilateral filtering denoising algorithm and an Otsu algorithm, and the bilateral filtering denoising algorithm and the Otsu algorithm are the prior art;
The deep learning network improved model D in S200 is structured as follows: a deep learning network serves as the backbone network; the top-most neural network layers are removed in sequence from top to bottom; a 3×3 down-sampling layer, a 3×3 max-pooling layer and a fully connected layer with 32 neurons are then appended in sequence after the bottom layer of the original backbone, and the gender information corresponding to each hand X-ray film in A is blended into the image features through the fully connected layer.
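A minimal Keras sketch of this structure follows, assuming an Xception backbone (the backbone named in the ablation study below) and modern tf.keras; the filter count of the down-sampling layer, the 128-neuron fusion layer and the input/output names are illustrative assumptions, while the 3×3 down-sampling layer, the 3×3 max-pooling layer and the 32-neuron gender branch follow the text.

import tensorflow as tf
from tensorflow.keras import layers, Model

def build_model_d(input_shape=(560, 560, 3)):
    image_in = layers.Input(shape=input_shape, name="hand_xray")
    gender_in = layers.Input(shape=(1,), name="gender")

    # Backbone with its top-most (classification) layers removed.
    backbone = tf.keras.applications.Xception(
        include_top=False, weights=None, input_tensor=image_in)
    x = backbone.output

    # 3x3 down-sampling layer and 3x3 max-pooling layer appended
    # after the bottom layer of the original backbone.
    x = layers.Conv2D(256, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(pool_size=3, strides=1, padding="same")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Fully connected layer with 32 neurons that embeds the gender
    # information and blends it into the image features.
    g = layers.Dense(32, activation="relu")(gender_in)
    x = layers.Concatenate()([x, g])
    x = layers.Dense(128, activation="relu")(x)
    bone_age = layers.Dense(1, name="bone_age")(x)
    return Model(inputs=[image_in, gender_in], outputs=bone_age)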
S300: constructing a training set of the TENet model:
s310: preprocessing the pictures in A with an automatic color balance algorithm (prior art) to obtain the preprocessed picture corresponding to each picture in A; the algorithm performs region-adaptive filtering on the original image to correct color cast and obtain a spatial-domain reconstructed image, highlighting the features and valuable information in the image so that it better matches human visual perception;
The specific steps of obtaining the preprocessed picture corresponding to each picture in A in S310 are as follows:
s311: selecting the c-th picture from A;
s312: calculating the adaptive filtering intermediate variable R_c(p) of any pixel point p in the c-th picture:

R_c(p) = Σ_{q∈Ω, q≠p} S_α(I_c(p) − I_c(q)) / d(p, q); (1)

wherein p and q represent two pixel points on the c-th picture; I_c(p) − I_c(q) represents the gray-level difference between p and q, modeling quasi-biological lateral inhibition; d(p, q) represents the distance metric function between p and q, which controls the region adaptivity of the filtering; and S_α(·) represents the luminance expression function;
s313: calculating the L(p) value of the c-th picture:

L(p) = 255 · [R_c(p) − min R_c(p)] / [max R_c(p) − min R_c(p)]; (2)

wherein L(p) represents the preprocessing result of pixel point p, the mapping interval of pixel p is [0, 255], and [min R_c(p), max R_c(p)] is the full range of R_c(p); using the full range achieves a global white balance for the image;
s314: repeating S312-S313 until all pixel points of the c-th picture have been traversed and preprocessed, obtaining the preprocessed picture corresponding to the c-th picture;
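As an illustration, a minimal NumPy sketch of this preprocessing step (formulas (1) and (2)) is given below. It assumes S_α is a slope-α response clipped to [−1, 1] and d(p, q) is the Euclidean distance; the name ace_preprocess and the value α = 10 are illustrative, and the naive double loop is only practical for small crops; a production implementation would use a fast approximation.

import numpy as np

def s_alpha(diff, alpha=10.0):
    # Luminance expression function: quasi-biological lateral
    # inhibition, clipped to [-1, 1].
    return np.clip(alpha * diff, -1.0, 1.0)

def ace_preprocess(gray):
    # gray: 2-D uint8 array (a hand X-ray film in grayscale).
    h, w = gray.shape
    img = gray.astype(np.float64) / 255.0
    ys, xs = np.mgrid[0:h, 0:w]
    r = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            d = np.hypot(ys - i, xs - j)   # distance metric d(p, q)
            d[i, j] = np.inf               # exclude q == p
            # Formula (1): lateral-inhibition response weighted by distance.
            r[i, j] = np.sum(s_alpha(img[i, j] - img) / d)
    # Formula (2): stretch R_c over its full range onto [0, 255].
    out = 255.0 * (r - r.min()) / (r.max() - r.min() + 1e-12)
    return out.astype(np.uint8)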
s320: performing feature extraction on the pictures in A with the hand topology module to obtain the hand topological graph corresponding to each picture in A;
The specific steps of obtaining the hand topological graph corresponding to each picture in A in S320 are as follows:
s321: selecting the c-th picture from A;
s322: performing plane convolution on the t-th pixel point of c with the horizontal direction matrix to obtain the luminance difference approximation G_xt of the t-th pixel point in the transverse direction:

G_xt = [ −1 0 +1; −2 0 +2; −1 0 +1 ] * c_t; (3)

wherein G_xt represents the luminance difference approximation of the t-th pixel point in the transverse direction and c_t represents the t-th pixel point (taken with its 3×3 neighborhood);
performing plane convolution on the t-th pixel point of c with the vertical direction matrix to obtain the luminance difference approximation G_yt of the t-th pixel point in the longitudinal direction:

G_yt = [ +1 +2 +1; 0 0 0; −1 −2 −1 ] * c_t; (4)

wherein G_yt represents the luminance difference approximation of the t-th pixel point in the longitudinal direction;
s323: calculating the gradient approximation G_t of the t-th pixel point of c:

G_t = √(G_xt² + G_yt²); (5)

and calculating the direction of the t-th pixel point of c:

α(x_t, y_t) = arctan(G_yt / G_xt); (6)

wherein α(x_t, y_t) represents the direction of the t-th pixel point of c;
s324: repeating S322-S323 until all pixel points of c have been traversed, obtaining the convolution value of each pixel point; the convolution values of all pixel points form a matrix B, which is the gradient map c';
s325: performing threshold binarization (prior art) on c' to obtain the rough hand diaphysis foreground edge map of c;
s326: performing joint-point position prediction on the rough hand diaphysis foreground edge map corresponding to c with the MediaPipe application framework (prior art) to obtain the hand topological graph of c.
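A hedged sketch of S322-S326 follows, using OpenCV's Sobel operator for the plane convolutions of formulas (3)-(6) and the public mediapipe package for joint-point position prediction; the helper name hand_topology, the Otsu-based binarization threshold and the way the landmark detector is combined with the edge map are illustrative assumptions.

import cv2
import numpy as np
import mediapipe as mp

def hand_topology(gray):
    # Luminance difference approximations in the transverse and
    # longitudinal directions (formulas (3) and (4)).
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    # Gradient approximation (formula (5)); the direction of
    # formula (6) would be np.arctan2(gy, gx).
    grad = cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))

    # Threshold binarization -> rough hand diaphysis foreground edge map.
    _, edge_map = cv2.threshold(grad, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Joint-point position prediction with the MediaPipe framework.
    rgb = cv2.cvtColor(edge_map, cv2.COLOR_GRAY2RGB)
    with mp.solutions.hands.Hands(static_image_mode=True,
                                  max_num_hands=1) as hands:
        result = hands.process(rgb)
    landmarks = (result.multi_hand_landmarks[0]
                 if result.multi_hand_landmarks else None)
    return edge_map, landmarks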
S330: performing edge enhancement on the preprocessed image corresponding to the image in the A by using an edge feature enhancement module to obtain an edge feature enhancement image corresponding to the image in the A;
The specific steps of obtaining the edge feature enhancement map corresponding to each picture in A in S330 are as follows:
s331: selecting the c-th picture from A;
s332: calculating the spatial proximity and the pixel-value similarity of each pixel point of the c-th picture and performing bilateral filtering denoising on it, the spatial proximity being expressed as:

w_d(i, j, k, l) = exp(−[(i − k)² + (j − l)²] / (2δ_d²)); (7)

wherein i, j and k, l represent the coordinates of pixel points a and b, i.e. a(i, j) and b(k, l), and δ_d is the Gaussian standard deviation;
the pixel-value similarity being expressed as:

w_r(i, j, k, l) = exp(−‖f(i, j) − f(k, l)‖² / (2δ_r²)); (8)

wherein f(i, j) represents the pixel value of the image at point a(i, j), f(k, l) represents the pixel value at point b(k, l), δ_r is the Gaussian standard deviation, and w_r measures the degree of difference between the neighboring point a and the central point b;
letting c″ denote the picture obtained after bilateral filtering denoising of the c-th picture;
s333: setting a threshold T that divides c″ into background and target;
the optimal pixel threshold of c″ is obtained with the Otsu algorithm, whose calculation formulas are as follows:

p_0 + p_1 = 1; (9)

m = p_0·m_0 + p_1·m_1; (10)

σ² = p_0·(m_0 − m)² + p_1·(m_1 − m)²; (11)

wherein σ² represents the between-class variance and m represents the mean gray value of c″; p_0 represents the proportion of the background in the picture and p_1 represents the proportion of the target:

p_0 = Σ_{n=0}^{k} p_n; (12)

p_1 = Σ_{n=k+1}^{L} p_n; (13)

wherein k = {1, …, T}, n denotes the gray level, p_n represents the frequency of occurrence of gray level n, p_0 represents the sum of the occurrence probabilities of the gray levels between 0 and k, L represents the pixel level of the image (typically 255), and p_1 represents the sum of the occurrence probabilities of the gray levels between k + 1 and L;
m_0 represents the mean gray value of the background and m_1 represents the mean gray value of the target:

m_0 = (1/p_0) Σ_{n=0}^{k} n·p_n; (14)

m_1 = (1/p_1) Σ_{n=k+1}^{L} n·p_n; (15)

s334: substituting formula (10) into formula (11) and simplifying gives formula (16):

σ² = p_0·p_1·(m_0 − m_1)²; (16)

s335: traversing all pixel points of c″ and evaluating formulas (9)-(16) in turn for the pixel points of each gray level of c″ to obtain N values of σ², where the gray level refers to the gray level to which a pixel belongs;
s336: sorting the N values of σ² in descending order and selecting the largest σ²; in the calculation, all pixel points are traversed over the gray range 0-255, so the gray level that maximizes formula (16) is taken as the optimal pixel threshold of the picture;
s337: taking c″ after the processing of S336 as the edge feature enhancement map corresponding to the c-th picture.
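The sketch below pairs the bilateral denoising of formulas (7) and (8), delegated here to cv2.bilateralFilter with illustrative sigma values, with a direct implementation of the Otsu search of formulas (9)-(16); passing the resulting threshold to cv2.Canny as the high hysteresis threshold (with low = high/2) is an assumption, since the hysteresis choice is not stated above.

import cv2
import numpy as np

def otsu_threshold(gray):
    # Gray-level frequencies p_n over the 0-255 range.
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    best_k, best_var = 0, -1.0
    for k in range(1, 255):
        p0, p1 = p[:k + 1].sum(), p[k + 1:].sum()  # formulas (9), (12), (13)
        if p0 == 0 or p1 == 0:
            continue
        m0 = (np.arange(k + 1) * p[:k + 1]).sum() / p0        # formula (14)
        m1 = (np.arange(k + 1, 256) * p[k + 1:]).sum() / p1   # formula (15)
        var = p0 * p1 * (m0 - m1) ** 2                        # formula (16)
        if var > best_var:                 # keep the gray level that
            best_k, best_var = k, var      # maximizes formula (16)
    return best_k

def edge_feature_enhance(gray):
    # Edge-preserving denoising (formulas (7)-(8)); sigmas are illustrative.
    denoised = cv2.bilateralFilter(gray, 9, 75, 75)
    t = otsu_threshold(denoised)
    return cv2.Canny(denoised, t // 2, t)  # assumed hysteresis thresholds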
S340: repeating S310-S330, traversing all the hand X-ray films in A to obtain a preprocessing image corresponding to each picture in A, a hand topological image corresponding to each picture in A and an edge feature enhancement image corresponding to each picture in A;
s350: all the preprocessing graphs, the hand topological graph and the edge feature enhancement graph obtained in the step S340 jointly form a training set of a TENet model, wherein the preprocessing graph, the hand topological graph and the edge feature enhancement graph corresponding to each graph form a training sample;
s400: presetting the maximum number of iterations and the initial learning rate μ of D, and training D:
s410: taking all training samples in the training set as the input of D and training D with an Adam optimizer (prior art) to update the parameters of D, the Adam optimizer using an MAE target loss function;
stopping training when the maximum number of iterations is reached, obtaining the trained deep learning network improved model D';
s420: taking the TENet model that uses D' as the trained TENet model;
s500: inputting a hand X-ray film of a child whose bone age is unknown into the trained TENet model; the output is the predicted value of the child's bone age.
Experimental verification
Pediatric Bone Age Assessment (BAA) is an important physical examination that reflects the growth potential and sexual maturation trend of a child. In clinical practice, the Chinese carpal bone maturity assessment standard for children (CHN-05) is widely used by Chinese radiologists for BAA. CHN-05 uses the Metacarpophalangeal Length (ML) and the Metacarpophalangeal Edge (MM) as reference criteria to estimate bone age. Inspired by the semantic description of CHN-05, the invention proposes a new model, called TENet, for automatic bone age assessment. A hand topology module is designed in the TENet model to identify key positions of the hand, which are used to extract structural semantics; an edge feature enhancement module is designed to provide accurate bone edge information during training. The TENet model detects both global edge information and local topological structure information, thereby evaluating bone age through multi-feature horizontal fusion.
Experimental results show that the TENet model achieves state-of-the-art performance of 5.35 months mean absolute error (MAE) on the public RSNA data set, and, because the design of the TENet model follows the semantic logic of CHN-05, it has very good reliability and interpretability in clinical application.
The experiment evaluated the TENet model on a private data set and on the Radiological Society of North America bone age assessment (RSNA-BAA) public data set. The private data set comprises 4954 hand X-rays in the training set, with 275 hand X-rays in each of the validation and test sets; the RSNA-BAA public data set comprises 5611 hand X-rays in the training set, with 275 hand X-rays in each of the validation and test sets. All hand X-rays were resized to 560×560 before processing by the TENet model. The mean absolute error between the ground-truth bone age and the corresponding predicted bone age is reported.
TABLE 1 ablation study of the model of the invention
EXP.  Data set  Inputs        MAE (months)
1     private   OI            7.15
2     private   PHI           6.92
3     private   PHI + ImCAT   5.76
4     public    PHI + ImCAT   5.80
(EXP. is Experiment; OI is Original Image; PHI is Pre-Processed Hand Image; ImCAT is Improved Adaptive Thresholding Canny Edge Detection and denotes the hand edge enhancement map; MAE is mean absolute error; the input combinations follow the ablation discussion below. The L1 (Loss1, MAE) loss function is used in all experiments.)
Implementation details
TENet was implemented with TensorFlow 1.9, and training was completed in about 8 hours on a system with an NVIDIA TITAN RTX GPU and 32 GB of RAM. We trained the whole TENet for 100 epochs with a batch size of 32. The initial learning rate was 3×10⁻³; it was reduced to 10⁻³ after 50 epochs and dropped a further 10-fold after 80 epochs. The optimizer used was Adam.
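A hedged reconstruction of this setup, written against modern tf.keras for clarity (the original used TensorFlow 1.9), is sketched below: Adam, the MAE loss, batch size 32, 100 epochs and the stepped learning-rate schedule just described. build_model_d is the illustrative builder from the earlier sketch, and train_images, train_genders and train_ages are assumed arrays of resized radiographs, gender flags and labeled bone ages.

import tensorflow as tf

def lr_schedule(epoch, lr):
    # 3e-3 initially, 1e-3 after epoch 50, a further 10x drop after epoch 80.
    if epoch < 50:
        return 3e-3
    if epoch < 80:
        return 1e-3
    return 1e-4

model = build_model_d()
model.compile(optimizer=tf.keras.optimizers.Adam(3e-3), loss="mae")
model.fit([train_images, train_genders], train_ages,
          batch_size=32, epochs=100,
          callbacks=[tf.keras.callbacks.LearningRateScheduler(lr_schedule)])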
TENet model ablation experiment
The role of each module in the TENet model was studied first, i.e. the original image (OI), the Pre-Processed Hand Image (PHI) and the edge feature enhancement map (ImCAT). The experimental results are shown in Table 1. Without these two modules the network degenerates to Xception, giving an MAE of 7.19 months; applying the preprocessed image to the model instead of the original image gives an MAE of 6.92 months; further applying the edge feature enhancement module gives an MAE of 5.76 months, an improvement of 1.16 months. The experimental data illustrate the necessity of acquiring the edge features and the effectiveness of the ImCAT algorithm.
Combining these two components, the TENet model achieves a 5.80-month MAE on the public data set. Taken together, the results on the private and public data sets demonstrate the correctness of the TENet model design derived from the CHN-05 carpal bone standard. Note: smaller MAE values indicate better results.
TABLE 2 Ablation study of the ImCAT algorithm
(The body of Table 2 is rendered as images in the original publication; as described below, it compares PSNR, MSE, SSIM, RMSE and MAE for the Traditional Canny baseline, four edge operators and the ImCAT algorithm.)
Hand edge feature ablation experiment
The TENet model detects hand edge features and combines them with the topology for bone age assessment. By observation, ImCAT also benefits from its modular design. Peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity (SSIM), root mean square error (RMSE) and mean absolute error (MAE) are used to measure performance, with the results shown in Table 2. With the Traditional Canny (TC) algorithm and four edge operators as baselines, the ImCAT algorithm used in the method is clearly superior to the other algorithms and better captures the overall edge details of the hand.
The results on the RSNA public data set and the private data set show that the TENet model achieves good performance. Moreover, because the TENet model is designed from CHN-05 prior knowledge, it has good interpretability and reliability for clinical practice. In addition, the invention provides a reliable experimental basis for combining the feature acquisition module with a scoring regression module into an end-to-end automatic bone age assessment framework.
Finally, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. A method for predicting the bone age of children, characterized in that the method comprises the following steps:
s100: selecting a public data set, wherein the public data set comprises hand X-ray films with bone age labels; randomly selecting some hand X-ray films to form a data set A, and resizing all pictures in A to 560×560;
s200: establishing a TENet model, wherein the TENet model comprises a hand topology module, an edge feature enhancement module and a deep learning network improved model D; the edge feature enhancement module is an improved Canny edge detection algorithm, and the improved Canny edge detection algorithm uses a bilateral filtering denoising algorithm and an Otsu algorithm;
s300: constructing the training set of the TENet model:
s310: preprocessing the pictures in A with an automatic color balance algorithm to obtain the preprocessed picture corresponding to each picture in A;
s320: performing feature extraction on the pictures in A with the hand topology module to obtain the hand topological graph corresponding to each picture in A;
s330: performing edge enhancement on the preprocessed picture corresponding to each picture in A with the edge feature enhancement module to obtain the edge feature enhancement map corresponding to each picture in A;
s340: repeating S310-S330 until all hand X-ray films in A have been traversed, obtaining the preprocessed picture, hand topological graph and edge feature enhancement map corresponding to each picture in A;
s350: all the preprocessed pictures, hand topological graphs and edge feature enhancement maps obtained in S340 jointly form the training set of the TENet model, wherein the preprocessed picture, hand topological graph and edge feature enhancement map corresponding to each picture form one training sample;
s400: presetting the maximum number of iterations and the initial learning rate μ of D, and training D:
s410: taking all training samples in the training set as the input of D and training D with an Adam optimizer to update the parameters of D, the Adam optimizer using an MAE target loss function;
stopping training when the maximum number of iterations is reached, obtaining the trained deep learning network improved model D';
s420: taking the TENet model that uses D' as the trained TENet model;
s500: inputting a hand X-ray film of a child whose bone age is unknown into the trained TENet model; the output is the predicted value of the child's bone age.
2. The method for predicting the bone age of children according to claim 1, characterized in that: the deep learning network improved model D in S200 is structured as follows: a deep learning network serves as the backbone network; the top-most neural network layers are removed in sequence from top to bottom; a 3×3 down-sampling layer, a 3×3 max-pooling layer and a fully connected layer with 32 neurons are then appended in sequence after the bottom layer of the original backbone, and the gender information corresponding to each hand X-ray film in A is blended into the image features through the fully connected layer.
3. The method for predicting the bone age of children according to claim 2, characterized in that the specific steps of obtaining the preprocessed picture corresponding to each picture in A in S310 are as follows:
s311: selecting the c-th picture from A;
s312: calculating the adaptive filtering intermediate variable R_c(p) of any pixel point p in the c-th picture:

R_c(p) = Σ_{q∈Ω, q≠p} S_α(I_c(p) − I_c(q)) / d(p, q); (1)

wherein p and q represent two pixel points on the c-th picture, I_c(p) − I_c(q) represents the gray-level difference between p and q, d(p, q) represents the distance metric function between p and q, and S_α(·) represents the luminance expression function;
s313: calculating the L(p) value of the c-th picture:

L(p) = 255 · [R_c(p) − min R_c(p)] / [max R_c(p) − min R_c(p)]; (2)

wherein L(p) represents the preprocessing result of pixel point p, the mapping interval of pixel p is [0, 255], and [min R_c(p), max R_c(p)] is the full range of R_c(p);
s314: repeating S312-S313 until all pixel points of the c-th picture have been traversed and preprocessed, obtaining the preprocessed picture corresponding to the c-th picture.
4. The method for predicting the bone age of children according to claim 3, characterized in that the specific steps of obtaining the hand topological graph corresponding to each picture in A in S320 are as follows:
s321: selecting the c-th picture from A;
s322: performing plane convolution on the t-th pixel point of c with the horizontal direction matrix to obtain the luminance difference approximation G_xt of the t-th pixel point in the transverse direction:

G_xt = [ −1 0 +1; −2 0 +2; −1 0 +1 ] * c_t; (3)

wherein G_xt represents the luminance difference approximation of the t-th pixel point in the transverse direction and c_t represents the t-th pixel point;
performing plane convolution on the t-th pixel point of c with the vertical direction matrix to obtain the luminance difference approximation G_yt of the t-th pixel point in the longitudinal direction:

G_yt = [ +1 +2 +1; 0 0 0; −1 −2 −1 ] * c_t; (4)

wherein G_yt represents the luminance difference approximation of the t-th pixel point in the longitudinal direction;
s323: calculating the gradient approximation G_t of the t-th pixel point of c:

G_t = √(G_xt² + G_yt²); (5)

calculating the direction of the t-th pixel point of c:

α(x_t, y_t) = arctan(G_yt / G_xt); (6)

wherein α(x_t, y_t) represents the direction of the t-th pixel point of c;
s324: repeating S322-S323 until all pixel points of c have been traversed, obtaining the convolution value of each pixel point; the convolution values of all pixel points form a matrix B, which is the gradient map c';
s325: performing threshold binarization on c' to obtain the rough hand diaphysis foreground edge map of c;
s326: performing joint-point position prediction on the rough hand diaphysis foreground edge map corresponding to c with the MediaPipe application framework to obtain the hand topological graph of c.
5. The method for predicting the bone age of children according to claim 4, characterized in that the specific steps of obtaining the edge feature enhancement map corresponding to each picture in A in S330 are as follows:
s331: selecting the c-th picture from A;
s332: calculating the spatial proximity and the pixel-value similarity of each pixel point of the c-th picture and performing bilateral filtering denoising on it, the spatial proximity being expressed as:

w_d(i, j, k, l) = exp(−[(i − k)² + (j − l)²] / (2δ_d²)); (7)

wherein i, j and k, l represent the coordinates of pixel points a and b, i.e. a(i, j) and b(k, l), and δ_d is the Gaussian standard deviation;
the pixel-value similarity being expressed as:

w_r(i, j, k, l) = exp(−‖f(i, j) − f(k, l)‖² / (2δ_r²)); (8)

wherein f(i, j) represents the pixel value of the image at point a(i, j), f(k, l) represents the pixel value at point b(k, l), δ_r is the Gaussian standard deviation, and w_r measures the degree of difference between the neighboring point a and the central point b;
letting c″ denote the picture obtained after bilateral filtering denoising of the c-th picture;
s333: setting a threshold T that divides c″ into background and target;
the optimal pixel threshold of c″ is obtained with the Otsu algorithm, whose calculation formulas are as follows:

p_0 + p_1 = 1; (9)

m = p_0·m_0 + p_1·m_1; (10)

σ² = p_0·(m_0 − m)² + p_1·(m_1 − m)²; (11)

wherein σ² represents the between-class variance and m represents the mean gray value of c″; p_0 represents the proportion of the background in the picture and p_1 represents the proportion of the target:

p_0 = Σ_{n=0}^{k} p_n; (12)

p_1 = Σ_{n=k+1}^{L} p_n; (13)

wherein k = {1, …, T}, n denotes the gray level, p_n represents the frequency of occurrence of gray level n, p_0 represents the sum of the occurrence probabilities of the gray levels between 0 and k, L represents the pixel level of the image, and p_1 represents the sum of the occurrence probabilities of the gray levels between k + 1 and L;
m_0 represents the mean gray value of the background and m_1 represents the mean gray value of the target:

m_0 = (1/p_0) Σ_{n=0}^{k} n·p_n; (14)

m_1 = (1/p_1) Σ_{n=k+1}^{L} n·p_n; (15)

s334: substituting formula (10) into formula (11) and simplifying gives formula (16):

σ² = p_0·p_1·(m_0 − m_1)²; (16)

s335: traversing all pixel points of c″ and evaluating formulas (9)-(16) in turn for the pixel points of each gray level of c″ to obtain N values of σ²;
s336: sorting the N values of σ² in descending order and selecting the largest σ²; the gray level that attains it is taken as the optimal pixel threshold of c″;
s337: taking c″ after the processing of S336 as the edge feature enhancement map corresponding to the c-th picture.
CN202211076541.3A 2022-09-05 2022-09-05 Method for predicting bone age of children Pending CN115423779A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211076541.3A CN115423779A (en) 2022-09-05 2022-09-05 Method for predicting bone age of children

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211076541.3A CN115423779A (en) 2022-09-05 2022-09-05 Method for predicting bone age of children

Publications (1)

Publication Number Publication Date
CN115423779A (en) 2022-12-02

Family

ID=84202426

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211076541.3A Pending CN115423779A (en) 2022-09-05 2022-09-05 Method for predicting bone age of children

Country Status (1)

Country Link
CN (1) CN115423779A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351325A (en) * 2023-12-06 2024-01-05 浙江省建筑设计研究院 Model training method, building effect graph generation method, equipment and medium
CN117351325B (en) * 2023-12-06 2024-03-01 浙江省建筑设计研究院 Model training method, building effect graph generation method, equipment and medium

Similar Documents

Publication Publication Date Title
CN107369160B (en) Choroid neogenesis blood vessel segmentation algorithm in OCT image
CN107610087B (en) Tongue coating automatic segmentation method based on deep learning
Gao et al. A deep learning based approach to classification of CT brain images
CN106940816A (en) Connect the CT image Lung neoplasm detecting systems of convolutional neural networks entirely based on 3D
CN109035172B (en) Non-local mean ultrasonic image denoising method based on deep learning
CN109886965B (en) Retina layer segmentation method and system combining level set with deep learning
Koitka et al. Mimicking the radiologists’ workflow: Estimating pediatric hand bone age with stacked deep neural networks
JP7427080B2 (en) Weakly supervised multitask learning for cell detection and segmentation
Bibiloni et al. A real-time fuzzy morphological algorithm for retinal vessel segmentation
CN112465905A (en) Characteristic brain region positioning method of magnetic resonance imaging data based on deep learning
Pan et al. Prostate segmentation from 3d mri using a two-stage model and variable-input based uncertainty measure
Chen et al. Skin lesion segmentation using recurrent attentional convolutional networks
CN111062936B (en) Quantitative index evaluation method for facial deformation diagnosis and treatment effect
JP2006507579A (en) Histological evaluation of nuclear polymorphism
CN115423779A (en) Method for predicting bone age of children
Zhao et al. Attention residual convolution neural network based on U-net (AttentionResU-Net) for retina vessel segmentation
CN116579975A (en) Brain age prediction method and system of convolutional neural network
CN112036298A (en) Cell detection method based on double-segment block convolutional neural network
Wang et al. Automatic consecutive context perceived transformer GAN for serial sectioning image blind inpainting
CN117274278A (en) Retina image focus part segmentation method and system based on simulated receptive field
CN114862799B (en) Full-automatic brain volume segmentation method for FLAIR-MRI sequence
CN113177499A (en) Tongue crack shape identification method and system based on computer vision
CN112949585A (en) Identification method and device for blood vessels of fundus image, electronic equipment and storage medium
EP3905129A1 (en) Method for identifying bone images
CN112381767A (en) Cornea reflection image screening method and device, intelligent terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination