CN107871316B - Automatic X-ray film hand bone interest area extraction method based on deep neural network - Google Patents

Automatic X-ray film hand bone interest area extraction method based on deep neural network

Info

Publication number
CN107871316B
CN107871316B · CN201710975940.6A · CN201710975940A
Authority
CN
China
Prior art keywords
image
hand bone
layer
hand
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710975940.6A
Other languages
Chinese (zh)
Other versions
CN107871316A (en)
Inventor
郝鹏翼
陈易京
尤堃
吴福理
黄玉娇
白琮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Feitu Imaging Technology Co ltd
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN201710975940.6A priority Critical patent/CN107871316B/en
Publication of CN107871316A publication Critical patent/CN107871316A/en
Application granted granted Critical
Publication of CN107871316B publication Critical patent/CN107871316B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30008Bone

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

An automatic method, based on a deep neural network, for extracting the hand bone region of interest from X-ray films. For an original hand bone X-ray image, the character-bearing parts of the black background on both sides of the image are removed, and the image is uniformly brightened and denoised. A model M1 is sampled and trained to obtain a character-free hand bone X-ray image Output2; size normalization of Output2 gives Output3. A model M2 is sampled and trained to judge the hand bone, the background, and the hand bone/background intersection regions in Output3; a sliding window over Output3 is judged with model M2, and a hand bone marker map Output4 is obtained from the decision values. An image Output5 containing only the hand bones is obtained from Output3 and Output4, and Output5 is refined to obtain the final hand bone region of interest. The invention can automatically acquire the hand bone region of interest in an X-ray film.

Description

Automatic X-ray film hand bone interest area extraction method based on deep neural network
Technical Field
The invention relates to the fields of medical image analysis and machine learning, in particular to a method for extracting the region of interest from an X-ray image of a human hand bone, and belongs to the field of deep-learning-based medical image analysis.
Background
Skeletal age, commonly abbreviated as bone age, is determined by the degree of calcification of a child's bones. Bone age accurately reflects the level of development at each age stage from birth to full maturity, and plays an important role in the analysis and diagnosis of endocrine, developmental, nutritional, genetic and metabolic disorders. Radiologists measure a child's bone age by comparing an X-ray of the hand with the standard appearance for each age; this technique is stable and has been in use for decades. As parents' health awareness improves, the number of children undergoing bone age assessment grows day by day, yet the situation in medical institutions has not changed: pediatric imaging departments have few practitioners, and film reading efficiency is low.
With GPU-accelerated deep learning, automatic bone age assessment by artificial intelligence has become feasible. Deep learning requires a large number of hand bone X-ray images for training to achieve good detection performance. However, original hand bone X-ray images contain much useless information, such as embedded characters and noise; in addition, hand placement and machine illumination are not uniform across images, so the position and brightness of the hand in the X-ray are not fixed. These factors degrade the learning of the detection model and cause large prediction errors. It is therefore crucial for deep-learning-based bone age prediction to accurately extract the hand bone region from an X-ray film and process it into a sample the detection model can learn from well. Manual annotation of the hand region, however, is inefficient, and annotation results differ between annotators.
Disclosure of Invention
To overcome the low efficiency, large error and low precision of existing ways of extracting the hand bone region of interest from X-ray films, the invention provides an automatic, deep-neural-network-based extraction method with high efficiency, small error and high precision, which not only acquires the hand bone region in an X-ray automatically but also removes noise and adjusts brightness automatically.
To solve the above technical problems, the invention adopts the following technical solution:
An automatic method for extracting the hand bone region of interest from X-ray films based on a deep neural network comprises the following steps:
step one, removing the character-bearing parts of the black background on both sides of the original hand bone X-ray grayscale image, so that most of the characters are removed from the original image;
step two, brightening the original hand bone X-ray grayscale image: the overall brightness of the image is evaluated, images with insufficient brightness are brightened, and a denoising operation is then applied; the resulting image set is called Output1;
step three, sampling and training a model M1, which is used to remove the characters near and on the hand bones in the X-ray images of Output1, yielding the character-free hand bone X-ray grayscale image set Output2;
step four, normalizing the size of every image in Output2: to make height and width consistent, black padding is first added on both sides; when an image is wider than it is tall, both sides are cropped inward instead; the images are then scaled down to 512 × 512, and the new image set is called Output3;
step five, sampling and training a model M2, which is used to judge three kinds of regions in Output3: hand bone, background, and the boundary where hand bone and background intersect; the sampled patches are 16 × 16;
step six, judging sliding-window patches of the images in Output3 with model M2 and accumulating the decision of each patch on every pixel it covers; according to the counts of the different classes collected at each pixel, the pixel value of hand bone parts is set to 255 and that of background parts to 0, giving a binary hand bone marker image called Output4;
step seven, obtaining an image containing only the hand bone region from Output3 by masking it with the binary marker map Output4; background impurities still remain in this image, so a maximum-connected-region computation is performed once to remove them, yielding Output5;
step eight, because the light shadow (halo) around the image border is sometimes judged to be hand bone tissue in Output5 and is connected to the maximum connected region, and because the halo is far longer than the hand bone is wide, a difference comparison is performed on the bottom part of each image in Output5 to remove the halo and obtain the final hand bone region of interest.
This concludes the description of the operation flow.
Further, in step one, the method for removing the characters embedded in the black background is as follows: the image is first converted into a numerical array, and columns are scanned from the left and right edges toward the middle; because the border consists of pure white characters on a pure black background, the first column in which the number of non-pure-black pixels exceeds 10% of the column height is judged to be where the character-bearing black border ends, and everything before that column is cut away entirely.
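A minimal numpy sketch of this column-scanning rule follows; the function name and the simple non-zero test used for "non-pure-black" are illustrative assumptions rather than the patent's implementation.

```python
import numpy as np

def crop_text_borders(img, frac=0.10):
    """Scan columns from the left and right edges toward the middle and cut away
    everything before the first column whose count of non-pure-black pixels
    exceeds `frac` of the column height (a sketch of the rule described above)."""
    h, w = img.shape
    left, right = 0, w
    for x in range(w):                       # scan from the left edge
        if np.count_nonzero(img[:, x]) > frac * h:
            left = x
            break
    for x in range(w - 1, -1, -1):           # scan from the right edge
        if np.count_nonzero(img[:, x]) > frac * h:
            right = x + 1
            break
    return img[:, left:right]
```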
Still further, in step two, the overall brightness evaluation process is as follows: let the resolution of the original image O be M × N and the value of each pixel be t_ij; the overall brightness Aug of the image is calculated from these pixel values by a formula [reproduced only as an image in the original publication]. Only pixel points with a pixel value greater than 120 are considered, and different brightening parameters are used for different values of Aug.
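Because the brightness formula survives only as an image in the source, the sketch below assumes Aug to be the mean value of the pixels brighter than 120; the gain table mapping Aug to a brightening factor is likewise illustrative.

```python
import numpy as np

def evaluate_and_brighten(img, thresh=120):
    """Assumed reading of the brightness evaluation: Aug is taken here as the
    mean of the pixels whose value exceeds `thresh` (the exact formula appears
    only as an image in the patent), and the gain values are illustrative."""
    bright = img[img > thresh]
    aug = float(bright.mean()) if bright.size else 0.0
    gain = 1.4 if aug < 150 else 1.2 if aug < 180 else 1.0   # assumed parameters
    brightened = np.clip(img.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return aug, brightened
```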
Further, in step three, for training model M1 the sample collection process is as follows: 100 × 100 patches containing letters are cut from the original grayscale images as positive samples, and patches of the same size without letters are cut as negative samples. The two-dimensional convolutional neural network is constructed as follows:
step 3.1, local features of the input image are extracted by a Conv2D convolution layer with an input size of 100 × 100 × 1, followed by a relu activation function layer and a Maxpooling pooling layer;
step 3.2, three further Conv2D convolution layers of different sizes extract features, with the activation function layer and pooling layer structured as in step 3.1;
step 3.3, the four convolution layers above are connected to the following fully connected layers through a Flatten layer;
step 3.4, the features pass through the first fully connected block, whose internal sequence comprises a Dense layer, a relu activation layer and a Dropout layer to prevent overfitting; the second fully connected block follows, whose internal sequence comprises a Dense layer and a sigmoid activation layer, producing the output.
The letter L can be found by the selective search method and, owing to its distinctive shape, located precisely within a 100 × 100 patch. Because the other characters are concentrated mainly in the upper right corner of the image and do not affect the judgment of the hand bone region, the upper right corner region is filled with the average of the surrounding background values after the L has been found and removed.
In step five, the training process of model M2 is as follows. After the character interference has been removed and the images normalized, the model is used to distinguish the background, the hand bones, and the regions where hand bone and background intersect. The three classes are sampled with a sliding window and labeled 0, 1 and 2, corresponding to the background, the hand bone/background boundary region and the hand bone respectively. The two-dimensional convolutional neural network model is constructed as follows:
step 5.1, local features of the input image are extracted by a two-dimensional convolution layer with an input size of 16 × 16 × 1, followed by a relu activation function layer and a Maxpooling pooling layer;
step 5.2, a further Conv2D two-dimensional convolution layer of a different size extracts features, with the activation function layer and Maxpooling layer structured as in step 5.1;
step 5.3, the convolution layers are connected to the following fully connected layers through a Flatten layer;
step 5.4, the features pass through the first fully connected block, whose internal sequence comprises a Dense layer, a relu activation layer and a Dropout layer to prevent overfitting; the second fully connected block follows, whose internal sequence comprises a Dense layer and a softmax activation layer (the output has three classes), producing the output.
In step six, the sliding-window judging process is as follows:
step 6.1, a 16 × 16 sliding window is applied to the input image with a step size of 1;
step 6.2, each 16 × 16 patch is judged with model M2 to obtain a class value x; 1 is then added to the count for class x at every one of the 16 × 16 pixels covered by the patch; an array val[512][512][3] is defined so that every pixel keeps a count of how many times each value of x was obtained;
step 6.3, an output array result is defined; for each point of result, only the counts of class 0 and class 2 in the corresponding entry of val are compared: if class 2 is more frequent, the point is filled with 255 (white), otherwise with 0 (black);
Because the sliding window adds statistics to every pixel it covers, the naive complexity is O(n × n × m × m) (with n = 512 and m = 16). Since only single-point queries are needed, the cost is reduced with a tree array (binary indexed tree): the accumulation complexity drops to O(n × n × log²(n)), and a single point query costs O(log²(n)).
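As an illustration of this optimization, a minimal sketch of a two-dimensional tree array with rectangle updates and point queries follows; the class and the small demo are illustrative assumptions, and the random decision stands in for model M2's actual output.

```python
import numpy as np

class Fenwick2D:
    """2D binary indexed tree over a difference array: add a value to an
    axis-aligned rectangle and read back any single point, both in O(log^2 n)."""
    def __init__(self, n):
        self.n = n
        self.tree = np.zeros((n + 2, n + 2), dtype=np.int64)

    def _add(self, x, y, v):
        i = x
        while i <= self.n + 1:
            j = y
            while j <= self.n + 1:
                self.tree[i, j] += v
                j += j & (-j)
            i += i & (-i)

    def range_add(self, x1, y1, x2, y2, v=1):
        # add v to the inclusive, 1-indexed rectangle [x1..x2] x [y1..y2]
        self._add(x1, y1, v)
        self._add(x1, y2 + 1, -v)
        self._add(x2 + 1, y1, -v)
        self._add(x2 + 1, y2 + 1, v)

    def point_query(self, x, y):
        s, i = 0, x
        while i > 0:
            j = y
            while j > 0:
                s += self.tree[i, j]
                j -= j & (-j)
            i -= i & (-i)
        return int(s)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, W = 512, 16
    votes = [Fenwick2D(N) for _ in range(3)]   # class 0 background, 1 boundary, 2 hand bone
    for x in range(1, N - W + 2, 64):          # coarse demo stride; the real stride is 1
        for y in range(1, N - W + 2, 64):
            cls = int(rng.integers(0, 3))      # stand-in for model M2's decision on this window
            votes[cls].range_add(x, y, x + W - 1, y + W - 1)
    # per-pixel rule of step 6.3: white where class 2 outvotes class 0
    print(votes[2].point_query(10, 10) > votes[0].point_query(10, 10))
```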
In step eight, the difference comparison process is as follows:
step 8.1: the image is first converted into a numerical array; because the light shadow appears at the bottom, only a cross-section (patch) of a certain height is taken from the bottom, its height determined by the average height of the light shadow;
step 8.2: the number of white pixels in each row of the patch is counted, and the row with the most white pixels and the row with the fewest white pixels are kept;
step 8.3: the two kept rows are compared column by column, and the number t of columns in which one row is white and the other is black is recorded;
step 8.4: when t is larger than a set difference threshold, a light shadow is judged to be present and the part between the two rows is discarded; if t does not exceed the threshold, the image is left unchanged;
step 8.5: the maximum-connected-region computation is performed again; since the strip between the light shadow and the hand bone has been discarded, the hand bone region is retained this time and the light shadow part is removed as well.
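A minimal sketch of steps 8.1 to 8.5 applied to the binary marker image follows; the band height and the difference threshold are illustrative values, not parameters given in the patent.

```python
import numpy as np

def remove_bottom_shadow(mask, band_height=40, diff_thresh=30):
    """Difference comparison at the bottom of the binary marker image `mask`
    (255 = hand bone, 0 = background). Sketch of steps 8.1-8.5; `band_height`
    and `diff_thresh` are assumed values."""
    band = mask[-band_height:, :]
    whites = (band == 255).sum(axis=1)                        # white count per row (step 8.2)
    r_most, r_least = int(np.argmax(whites)), int(np.argmin(whites))
    t = int(np.count_nonzero(band[r_most] != band[r_least]))  # column-by-column difference (step 8.3)
    if t > diff_thresh:                                       # step 8.4: a light shadow is assumed
        lo, hi = sorted((r_most, r_least))
        start = mask.shape[0] - band_height
        mask = mask.copy()
        mask[start + lo:start + hi + 1, :] = 0                # discard the strip between the two rows
    return mask                                               # step 8.5 then recomputes the largest region
```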
The technical concept of the invention is as follows: a deep neural network is used to detect the character information in the X-ray image, a deep neural network is used to classify the contents of different areas of the image, and the resulting models can extract the hand bone region of interest automatically and quickly.
In the process provided by the invention, the first model M1 is trained by sampling to distinguish the most prominent letters L (left hand) and R (right hand) and residual character information in the initial X-ray image from the rest of the background, accomplishing the task of removing letter information from the original image and laying an important foundation for the later extraction of the hand bone region of interest. The purpose of the second model M2 is to obtain a hand bone marker map that distinguishes the interesting and non-interesting parts of the input picture by different labels; the invention marks them in white and black. Since the text has already been removed by model M1, at this stage the model is used mainly to distinguish three classes: background, hand bone, and the hand bone/background intersection. The input image is sampled with a sliding window, and the detection results are then counted at each pixel to obtain the marker map.
Compared with the traditional manual extraction of the hand bone region of interest, the method has the following advantages: 1. the efficiency of obtaining the hand bone region of interest is greatly improved; 2. uniform processing reduces the errors introduced by manual operation; 3. results more accurate than manual extraction can be obtained; 4. with different tasks at different stages, the layered operation flow reduces the risk that an extreme picture paralyzes the whole pipeline, and is easier to maintain and improve.
Drawings
FIG. 1 is a flow chart of X-ray hand bone region of interest extraction based on a deep neural network.
Fig. 2 is a schematic structural diagram of a model M1 for removing text.
Fig. 3 is a schematic structural diagram of a model M2 for constructing hand bone marker mapping.
FIG. 4 is a schematic diagram of the convolution module structures in model M1 and model M2.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to figs. 1 to 4, for the sampling operation on Output1, patches with text information, especially the large letter L, are sampled mainly through a sliding window. After training, model M1 can detect whether a patch contains text almost 100% of the time. When Output1 is fed to the model, selective search is used to detect the large letters as objects, while a uniform coverage strategy is used for the other characters in the corners. This saves the time of sliding over the whole large image while still removing the text information effectively.
After the text information has been removed, the image is normalized to a uniform size, which reduces the time complexity of the subsequent construction of the hand bone region-of-interest marker map. The construction of the marker map is described in detail here:
1. Model M2 classifies the 16 × 16 samples into three categories: background, hand bone, and hand bone/background intersection.
2. With a step size of 1, a total of 500 × 500 samples per image are detected by sliding a window over Output3.
3. The detection result of each sample is added to the statistics of all 16 × 16 pixels it covers, so that every pixel accumulates a count per category; a 512 × 512 × 3 array is therefore needed, where 0, 1 and 2 correspond to the background, the hand bone/background intersection area and the hand bone respectively. The statistics are necessary because, if one sample is detected as background and the sample shifted by one step is detected as the intersection area, the pixels that differ between them would otherwise all end up being treated as background, and the boundary inside the intersection area would be cut off. According to the per-pixel statistics of the three categories, a pixel is set to 0 (black) if category 0 dominates and to 255 (white) otherwise. In the end the hand bone region (ROI) is white and the background and noise are all black.
4. Because the time complexity of this algorithm is high, a tree array is used for optimization, with the conventional tree array extended to a two-dimensional tree array.
5. After the marker map is obtained there will inevitably be some white noise, which is removed by computing and retaining the maximum connected region. In addition, the hand bone region sometimes overlaps the light shadow at the bottom of the image; because the light shadow looks very similar to hand bone tissue, the model cannot distinguish them reliably, so a difference detection is performed on a cross-section taken upward from the bottom of the image. The specific steps are: the target area is converted into a data array; the row with the most white dots and the row with the fewest white dots are found and compared column by column, counting as a difference every column in which one row is white and the other black. If the difference count exceeds the threshold, it is attributed to the light shadow, the area between the two rows is discarded, and the maximum-connected-region calculation is performed again, which removes the noise caused by the light shadow.
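The maximum-connected-region step can be illustrated with component labelling from scipy; the patent does not name a library, so the sketch below is only one possible realization.

```python
import numpy as np
from scipy import ndimage

def keep_largest_component(mask):
    """Keep only the largest white connected region of the binary marker map;
    a sketch using scipy.ndimage.label (the patent does not specify a library)."""
    labeled, num = ndimage.label(mask > 0)
    if num == 0:
        return mask
    sizes = np.bincount(labeled.ravel())
    sizes[0] = 0                                  # ignore the background label
    largest = int(sizes.argmax())
    return np.where(labeled == largest, 255, 0).astype(np.uint8)
```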
Example: the hand bone data used were X-ray images covering ages 0 to 18 years, 944 samples in total. 632 of the samples were used as the training set to train model M1 and model M2, and the remaining 312 samples were used as the test set. The validation set coincides with the training set. The construction and testing of the two models are described below.
Model M1:
step 1.1, constructing a deep convolutional neural network, wherein the specific structure is shown in fig. 2.
Step 1.1.1: the convolutional neural network comprises 4 convolution modules, one Flatten layer and two fully connected layers.
Step 1.1.2: in the convolution layers the kernel size is 1 × 3, the input size is (1, 100, 100), and the number of convolution kernels grows with network depth: 6, 12, 24 and 48 in sequence. Each convolution module also includes a relu activation layer and a Maxpooling pooling layer with pool_size (1, 2).
Step 1.1.3: before the fully connected layers there is a Flatten layer that flattens the output of the convolution layers.
Step 1.1.4: of the fully connected layers, the first has a Dropout layer with parameter 0.5 to prevent overfitting, and the following fully connected layer produces the output through a sigmoid activation layer.
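A minimal Keras sketch that follows steps 1.1.1 to 1.1.4 is given below; the hidden Dense width (64) and the channels-last input ordering are assumptions (the patent lists the input as (1, 100, 100), i.e. channels first), and the loss and optimizer follow step 1.2.2 below.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Activation

def build_m1():
    """Sketch of model M1: four convolution modules (6, 12, 24, 48 kernels of
    size 1 x 3, each with relu and 1 x 2 max pooling), Flatten, and two fully
    connected blocks. The Dense width of 64 is an assumed value."""
    model = Sequential()
    model.add(Conv2D(6, (1, 3), input_shape=(100, 100, 1)))   # channels-last here; patent lists (1,100,100)
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(1, 2)))
    for filters in (12, 24, 48):
        model.add(Conv2D(filters, (1, 3)))
        model.add(Activation('relu'))
        model.add(MaxPooling2D(pool_size=(1, 2)))
    model.add(Flatten())
    model.add(Dense(64))                                      # hidden width: assumption
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    # logarithmic loss and rmsprop, as described in step 1.2.2
    model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model
```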
Step 1.2, data sampling and model training
Step 1.2.1: the hand bone X-ray images are all grayscale, with 1 channel. 500 images were sampled, each providing 99 random patches plus one patch containing the letter L. All samples are concatenated into a numpy array whose first dimension is the number of samples, giving a four-dimensional array (50000, 1, 100, 100) as the training set; the test set is produced in the same way as (10000, 1, 100, 100). The validation set coincides with the training set.
Step 1.2.2: the model is trained in batches; the training-set and validation-set generators use 120 samples per batch, training is performed 200 times in total, the logarithmic loss function is used, and the rmsprop algorithm is used for optimization. Only the model with the highest accuracy is kept.
Step 1.3, model testing
For the function of model M2, see the operation flow described in detail above; it is not repeated here.
Model M2:
Step 2.1, constructing a deep convolutional neural network, the specific structure of which is shown in fig. 3.
Step 2.1.1: the convolutional neural network comprises 2 convolution modules, one Flatten layer and two fully connected layers.
Step 2.1.2: in the convolution layers the kernel size is 1 × 3, the input size is (1, 16, 16), and the number of convolution kernels grows with network depth: 6 and 24 in sequence. Each convolution module also includes a relu activation layer and a Maxpooling layer with pool_size (1, 2).
Step 2.1.3: before the fully connected layers there is a Flatten layer that flattens the output of the convolution layers.
Step 2.1.4: of the fully connected layers, the first takes the 24 outputs of the previous layer as its 96 input points and has a Dropout layer with parameter 0.05 to prevent overfitting, and the following fully connected layer outputs the three categories through a softmax activation layer.
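A matching Keras sketch for steps 2.1.1 to 2.1.4 follows; the channels-last ordering and the hidden Dense width (96, suggested by the 96 input points mentioned in step 2.1.4) are assumptions, and the loss and optimizer follow step 2.2.2 below.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Activation

def build_m2():
    """Sketch of model M2: two convolution modules (6 and 24 kernels of size
    1 x 3, each with relu and 1 x 2 max pooling), Flatten, a Dense + Dropout(0.05)
    block, and a 3-way softmax output. The Dense width of 96 is an assumption."""
    model = Sequential()
    model.add(Conv2D(6, (1, 3), input_shape=(16, 16, 1)))   # channels-last here; patent lists (1,16,16)
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(1, 2)))
    model.add(Conv2D(24, (1, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(1, 2)))
    model.add(Flatten())
    model.add(Dense(96))                                    # hidden width: assumption
    model.add(Activation('relu'))
    model.add(Dropout(0.05))
    model.add(Dense(3))
    model.add(Activation('softmax'))
    # multi-class logarithmic loss and adam, as described in step 2.2.2
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
```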
Step 2.2, data sampling and model training
Step 2.2.1: the hand bone X-ray images are all grayscale, with 1 channel. The 500 images were sampled with a 16 × 16 sliding window, using a step size of 32 for the overly uniform background and a step size of 8 for the other regions; the latter portion of the samples was doubled to increase its weight in model training. This yields a four-dimensional array (100000, 1, 16, 16) as the training set, with the first dimension being the number of samples; the test set is produced in the same way as (20000, 1, 16, 16). The validation set coincides with the training set.
Step 2.2.2: the model is trained in batches; the training-set and validation-set generators use 120 samples per batch, training is carried out 2000 times, the multi-class logarithmic loss function is used, and adam is selected as the optimizer. Only the model with the highest accuracy is kept.
Step 2.3, model testing
Through the above steps, extraction of the hand bone region of interest from the hand bone X-ray image is achieved.
The above detailed description is intended to illustrate the objects, aspects and advantages of the present invention, and it should be understood that the above detailed description is only exemplary of the present invention, and is not intended to limit the scope of the present invention, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (7)

1. An automatic method for extracting the hand bone region of interest from X-ray films based on a deep neural network, characterized by comprising the following steps:
step one, removing the character-bearing parts of the black background on both sides of the original hand bone X-ray grayscale image, so that most of the characters are removed from the original image;
step two, brightening the original hand bone X-ray grayscale image: the overall brightness of the image is evaluated, images with insufficient brightness are brightened, and a denoising operation is then applied; the resulting image set is called Output1;
step three, sampling and training a model M1, which is used to remove the characters near and on the hand bones in the X-ray images of Output1, yielding the character-free hand bone X-ray grayscale image set Output2;
step four, normalizing the size of every image in Output2: to make height and width consistent, black padding is first added on both sides; when an image is wider than it is tall, both sides are cropped inward instead; the images are then scaled down to 512 × 512, and the new image set is called Output3;
step five, sampling and training a model M2, which is used to judge three kinds of regions in Output3: hand bone, background, and the boundary where hand bone and background intersect; the sampled patches are 16 × 16;
step six, judging sliding-window patches of the images in Output3 with model M2 and accumulating the decision of each patch on every pixel it covers; according to the counts of the different classes collected at each pixel, the pixel value of hand bone parts is set to 255 and that of background parts to 0, giving a binary hand bone marker image called Output4;
step seven, obtaining an image containing only the hand bone region from Output3 by masking it with the binary marker map Output4; background impurities still remain in this image, so a maximum-connected-region computation is performed once to remove them, yielding Output5;
step eight, because the light shadow (halo) around the image border is sometimes judged to be hand bone tissue in Output5 and is connected to the maximum connected region, and because the halo is far longer than the hand bone is wide, a difference comparison is performed on the bottom part of each image in Output5 to remove the halo and obtain the final hand bone region of interest.
2. The deep-neural-network-based automatic X-ray hand bone region-of-interest extraction method according to claim 1, characterized in that: in step one, the method for removing the characters embedded in the black background is as follows: the image is first converted into a numerical array, and columns are scanned from the left and right edges toward the middle; because the border consists of pure white characters on a pure black background, the first column in which the number of non-pure-black pixels exceeds 10% of the column height is judged to be where the character-bearing black border ends, and everything before that column is cut away entirely.
3. The deep-neural-network-based automatic X-ray hand bone region-of-interest extraction method according to claim 1 or 2, characterized in that: in step two, the overall brightness evaluation process is as follows: let the resolution of the original image O be M × N and the value of each pixel be t_ij; the overall brightness Aug of the image is calculated from these pixel values by a formula [reproduced only as an image in the original publication], wherein only pixel points with a pixel value greater than 120 are considered, and different brightening parameters are used for different values of Aug.
4. The deep-neural-network-based automatic X-ray hand bone region-of-interest extraction method according to claim 1 or 2, characterized in that: in step three, for training model M1 the sample collection process is as follows: 100 × 100 patches containing letters are cut from the original grayscale images as positive samples, and patches of the same size without letters are cut as negative samples; the two-dimensional convolutional neural network is constructed as follows:
step 3.1, local features of the input image are extracted by a Conv2D convolution layer with an input size of 100 × 100 × 1, followed by a relu activation function layer and a Maxpooling pooling layer;
step 3.2, three further Conv2D convolution layers of different sizes extract features, with the activation function layer and pooling layer structured as in step 3.1;
step 3.3, the four convolution layers above are connected to the following fully connected layers through a Flatten layer;
step 3.4, the output is obtained through the first fully connected block, whose internal sequence comprises a Dense layer, a relu activation layer and a Dropout layer to prevent overfitting, followed by the second fully connected block, whose internal sequence comprises a Dense layer and a sigmoid activation layer;
the letters are searched for in the image by the selective search method; owing to its distinctive shape, the letter L can be found in this way and located precisely within a 100 × 100 patch; because the other characters are concentrated mainly in the upper right corner of the image and do not affect the judgment of the hand bone region, the upper right corner region is filled with the average of the surrounding background values after the L has been found and removed.
5. The deep-neural-network-based automatic X-ray hand bone region-of-interest extraction method according to claim 1 or 2, characterized in that: in step five, model M2 is trained; after the character interference has been removed in step three, the model is used to distinguish the background, the hand bones, and the regions where hand bone and background intersect; the three classes are sampled with a sliding window and labeled 0, 1 and 2, corresponding to the background, the hand bone/background boundary region and the hand bone respectively; the two-dimensional convolutional neural network model is constructed as follows:
step 5.1, local features of the input image are extracted by a two-dimensional convolution layer with an input size of 16 × 16 × 1, followed by a relu activation function layer and a Maxpooling pooling layer;
step 5.2, a further Conv2D two-dimensional convolution layer of a different size extracts features, with the activation function layer and Maxpooling layer structured as in step 5.1;
step 5.3, the convolution layers are connected to the following fully connected layers through a Flatten layer;
step 5.4, the features pass through the first fully connected block, whose internal sequence comprises a Dense layer, a relu activation layer and a Dropout layer to prevent overfitting; the second fully connected block follows, whose internal sequence comprises a Dense layer and a softmax activation layer, producing the output.
6. The deep-neural-network-based automatic X-ray hand bone region-of-interest extraction method according to claim 5, characterized in that: in step six, the sliding-window judging process is as follows:
step 6.1, a 16 × 16 sliding window is applied to the input image with a step size of 1;
step 6.2, each 16 × 16 patch is judged with model M2 to obtain a class value x; 1 is then added to the count for class x at every one of the 16 × 16 pixels covered by the patch; an array val[512][512][3] is defined so that every pixel keeps a count of how many times each value of x was obtained;
step 6.3, an output array result is defined; for each point of result, only the counts of class 0 and class 2 in the corresponding entry of val are compared: if class 2 is more frequent, the point is filled with 255 (white), otherwise with 0 (black).
7. The deep-neural-network-based automatic X-ray hand bone region-of-interest extraction method according to claim 1 or 2, characterized in that: in step eight, the difference comparison process is as follows:
step 8.1: the image is first converted into a numerical array; because the light shadow appears at the bottom, only a cross-section (patch) of a certain height is taken from the bottom, its height determined by the average height of the light shadow;
step 8.2: the number of white pixels in each row of the patch is counted, and the row with the most white pixels and the row with the fewest white pixels are kept;
step 8.3: the two kept rows are compared column by column, and the number t of columns in which one row is white and the other is black is recorded;
step 8.4: when t is larger than a set difference threshold, a light shadow is judged to be present and the part between the two rows is discarded; if t does not exceed the threshold, the image is left unchanged;
step 8.5: the maximum-connected-region computation is performed again; since the strip between the light shadow and the hand bone has been discarded, the hand bone region is retained this time and the light shadow part is removed as well.
CN201710975940.6A 2017-10-19 2017-10-19 Automatic X-ray film hand bone interest area extraction method based on deep neural network Active CN107871316B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710975940.6A CN107871316B (en) 2017-10-19 2017-10-19 Automatic X-ray film hand bone interest area extraction method based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710975940.6A CN107871316B (en) 2017-10-19 2017-10-19 Automatic X-ray film hand bone interest area extraction method based on deep neural network

Publications (2)

Publication Number Publication Date
CN107871316A CN107871316A (en) 2018-04-03
CN107871316B true CN107871316B (en) 2020-10-27

Family

ID=61753117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710975940.6A Active CN107871316B (en) 2017-10-19 2017-10-19 Automatic X-ray film hand bone interest area extraction method based on deep neural network

Country Status (1)

Country Link
CN (1) CN107871316B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108968991B (en) * 2018-05-08 2022-10-11 平安科技(深圳)有限公司 Hand bone X-ray film bone age assessment method, device, computer equipment and storage medium
WO2020024127A1 (en) * 2018-08-01 2020-02-06 中国医药大学附设医院 Bone age assessment and height prediction model, system thereof and prediction method therefor
CN109948522B (en) * 2019-03-18 2020-12-01 浙江工业大学 X-ray hand bone maturity interpretation method based on deep neural network
CN109961044B (en) * 2019-03-22 2021-02-02 浙江工业大学 CHN method interest area extraction method based on shape information and convolutional neural network
CN111754414B (en) * 2019-03-29 2023-10-27 北京搜狗科技发展有限公司 Image processing method and device for image processing
CN110414562B (en) * 2019-06-26 2023-11-24 平安科技(深圳)有限公司 X-ray film classification method, device, terminal and storage medium
US11315228B2 (en) 2020-05-18 2022-04-26 Accenture Global Solutions Limited System and method for mineral exploration
KR102481112B1 (en) * 2020-11-02 2022-12-26 사회복지법인 삼성생명공익재단 Prediction method of adult height with deep-learning and system thereof


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6711282B1 (en) * 1999-10-29 2004-03-23 Compumed, Inc. Method for automatically segmenting a target bone from a digital image
CN102938055A (en) * 2012-10-09 2013-02-20 哈尔滨工程大学 Hand bone identification system
WO2015052710A1 (en) * 2013-10-09 2015-04-16 Yosibash Zohar Automated patient-specific method for biomechanical analysis of bone

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on ASM-based hand bone extraction method" ("ASM的手骨提取方法研究"); Huang Fei et al.; Computer Engineering and Applications (计算机工程与应用); 2014-12-11; vol. 52, no. 3; full text *

Also Published As

Publication number Publication date
CN107871316A (en) 2018-04-03

Similar Documents

Publication Publication Date Title
CN107871316B (en) Automatic X-ray film hand bone interest area extraction method based on deep neural network
CN107316307B (en) Automatic segmentation method of traditional Chinese medicine tongue image based on deep convolutional neural network
CN110097554B (en) Retina blood vessel segmentation method based on dense convolution and depth separable convolution
CN108154102B (en) Road traffic sign identification method
CN109886335B (en) Classification model training method and device
CN107038416B (en) Pedestrian detection method based on binary image improved HOG characteristics
CN111369563A (en) Semantic segmentation method based on pyramid void convolutional network
CN113822185A (en) Method for detecting daily behavior of group health pigs
CN109242826B (en) Mobile equipment end stick-shaped object root counting method and system based on target detection
CN112950780B (en) Intelligent network map generation method and system based on remote sensing image
CN111553438A (en) Image identification method based on convolutional neural network
CN109377487B (en) Fruit surface defect detection method based on deep learning segmentation
CN113627257B (en) Detection method, detection system, device and storage medium
CN115994907B (en) Intelligent processing system and method for comprehensive information of food detection mechanism
CN113256624A (en) Continuous casting round billet defect detection method and device, electronic equipment and readable storage medium
CN114612406A (en) Photovoltaic panel defect detection method based on visible light and infrared vision
CN115578615A (en) Night traffic sign image detection model establishing method based on deep learning
CN114648806A (en) Multi-mechanism self-adaptive fundus image segmentation method
CN112613428A (en) Resnet-3D convolution cattle video target detection method based on balance loss
CN113313678A (en) Automatic sperm morphology analysis method based on multi-scale feature fusion
CN116740758A (en) Bird image recognition method and system for preventing misjudgment
CN114612399A (en) Picture identification system and method for mobile phone appearance mark
CN110136098B (en) Cable sequence detection method based on deep learning
CN116030346A (en) Unpaired weak supervision cloud detection method and system based on Markov discriminator
CN115187878A (en) Unmanned aerial vehicle image analysis-based blade defect detection method for wind power generation device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210203

Address after: Room 1204, Hailiang building, 1508 Binsheng Road, Xixing street, Binjiang District, Hangzhou, Zhejiang 310000

Patentee after: ZHEJIANG FEITU IMAGING TECHNOLOGY Co.,Ltd.

Address before: 310014 Zhejiang University of Technology, 18, Chao Wang Road, Xiacheng District, Hangzhou, Zhejiang

Patentee before: ZHEJIANG University OF TECHNOLOGY

TR01 Transfer of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Automatic Extraction Method of Hand Bone Region of Interest from X-rays Based on Deep Neural Network

Effective date of registration: 20231012

Granted publication date: 20201027

Pledgee: Zhejiang Juzhou Commercial Bank Co.,Ltd. Hangzhou Branch

Pledgor: ZHEJIANG FEITU IMAGING TECHNOLOGY Co.,Ltd.

Registration number: Y2023980060761

PE01 Entry into force of the registration of the contract for pledge of patent right