CN111340760A - Knee joint positioning method based on multitask two-stage convolutional neural network - Google Patents

Knee joint positioning method based on multitask two-stage convolutional neural network

Info

Publication number
CN111340760A
Authority
CN
China
Prior art keywords
knee
knee joint
image
feature map
layer
Prior art date
Legal status
Granted
Application number
CN202010097868.3A
Other languages
Chinese (zh)
Other versions
CN111340760B (en)
Inventor
窦勇
王康
牛新
姜晶菲
熊运生
杨迪
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202010097868.3A priority Critical patent/CN111340760B/en
Publication of CN111340760A publication Critical patent/CN111340760A/en
Application granted granted Critical
Publication of CN111340760B publication Critical patent/CN111340760B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/0012 Biomedical image inspection
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06T7/11 Region-based segmentation
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T2207/10116 X-ray image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30008 Bone

Abstract

The invention discloses a knee joint positioning method based on a multitask two-stage convolutional neural network, and aims to improve the knee joint detection accuracy. The knee joint area positioning network based on the multitask two-stage convolutional neural network is constructed by two stages of networks, wherein the first stage of network is composed of a first feature extraction module and a preliminary target detection module, and the second stage of network is composed of a second feature extraction module and a final target detection module; training the knee joint area positioning network to obtain a trained knee joint area positioning network model; preprocessing a double-knee X-ray film to be detected, and dividing a double-knee image into single-knee images; and performing knee joint positioning on the single knee image based on the trained knee joint area positioning network to obtain a final boundary frame of the single knee image and a final key point of the knee joint area. The invention can improve the detection accuracy of knee joint positioning.

Description

Knee joint positioning method based on multitask two-stage convolutional neural network
Technical Field
The invention relates to the field of image processing, computer vision and target positioning, in particular to a knee joint positioning method based on a multitask two-stage convolutional neural network.
Background
Gonarthritis (knee osteoarthritis) is a very common joint disease and can even lead to disability. It has a high incidence among elderly, obese and sedentary populations; in severe cases it causes pain and may even require total joint replacement, and the annual knee treatment cost per patient can reach 19,000 Euros, so it is a disease that deserves close attention. Currently, physicians mainly diagnose gonarthritis clinically on the basis of X-ray films; although other medical imaging modalities exist, such as magnetic resonance imaging and ultrasound imaging, the X-ray film is a cheap and widely used medical image. Clinically, the X-ray image taken of a patient is a medical image that includes the legs, and gonarthritis mainly presents the pathological features of knee joint space narrowing, osteophyte formation and sclerosis, so only the knee joint region needs to be examined. Reading X-ray films is subjective, time-consuming and labor-intensive for doctors. With the development of computer technology, computer-aided methods have begun to enter the medical field, but for X-ray films containing both knees, whether a doctor performs visual diagnosis or computer-aided diagnosis is used, locating the knee joint region is a prerequisite for diagnosis and a crucial step.
At present, knee joint regions in X-ray films containing both knees are located by manual marking, i.e., each X-ray film is inspected by eye and the knee joint regions are marked by hand, which is time-consuming and labor-intensive. Computer-aided methods for locating the knee joint region appeared later. The document "Early detection of radiographic knee osteoarthritis using computer-aided analysis [J]. Osteoarthritis and Cartilage, 2009, 17(10): 1307-" proposes a template matching method to automatically detect the knee joint; this method is slow on large data sets and its knee joint detection accuracy is low. The document "Quantifying radiographic knee osteoarthritis severity using deep convolutional neural networks [C] // 2016 23rd International Conference on Pattern Recognition (ICPR). IEEE, 2016: 1195-" quantifies the severity of radiographic gonarthritis with a deep convolutional neural network and proposes a method based on Sobel horizontal image gradients and a support vector machine (SVM): the knee joint center is first detected automatically, and then a fixed-size region is extracted around the detected center as the knee joint part of interest. Its highest detection rate on 4496 knee X-ray films in the public database OAI (https://oai.epi-ucsf.org/datarelease/) is 81.8%, while the highest detection rate of the template matching method is 54.4%. The document "A novel method for automatic localization of joint area on knee plain radiographs [C] // Scandinavian Conference on Image Analysis. Springer, Cham, 2017: 290-" proposes a method that combines a histogram of oriented gradients (HOG) and a support vector machine (SVM) to locate the knee joint region; the mean IoU (intersection over union) measured on 473 knee X-ray films in the public data set MOST (http://most.ucsf.edu) is 0.84.
These methods locate the knee joint region by extracting traditional features and combining them with traditional classifiers; their missed-detection and false-detection rates are high, and the knee joint detection rate still needs to be improved. With the development of deep learning, deep learning technology can represent image features effectively and has been widely applied to fields such as target recognition and target detection in images.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in order to further improve the knee joint detection accuracy, a knee joint positioning method based on a multitask two-stage convolutional neural network is provided.
The technical scheme of the invention is as follows: the knee joint positioning method based on the multitask two-stage convolutional neural network mainly comprises the processes of network construction, network training, image preprocessing of an X-ray film to be detected and knee joint positioning in the X-ray film to be detected, and the overall steps are as shown in figure 1:
firstly, a knee joint area positioning network based on a multitask two-stage convolutional neural network is built.
The knee joint area positioning network based on the multitask two-stage convolution neural network comprises two stages of networks: the system comprises a first-level network and a second-level network, wherein the output of the first-level network is used as the input of the second-level network.
The first-level network is composed of a first feature extraction module and a preliminary target detection module. The first feature extraction module receives the single knee image I from the outside, extracts first image features of the single knee image I and sends the first image features to the preliminary target detection module; and the preliminary target detection module detects the first image characteristics and outputs a preliminary knee joint area in the single knee image I.
The first feature extraction module consists of 5 convolutional layers (4 3×3 convolutional layers and 1 2×2 convolutional layer) and 3 max pooling layers (2 2×2 max pooling layers and 1 3×3 max pooling layer).
The first 3×3 convolutional layer convolves the single knee image I, and the first 2×2 max pooling layer pools the convolved result to obtain feature map F1. The second 3×3 convolutional layer convolves feature map F1, and the 3×3 max pooling layer pools the convolved result to obtain feature map F2. The third 3×3 convolutional layer convolves feature map F2, and the second 2×2 max pooling layer pools the convolved result to obtain feature map F3. The fourth 3×3 convolutional layer convolves feature map F3 to obtain feature map F4. The fifth 2×2 convolutional layer convolves feature map F4 to obtain feature map F5; feature map F5 is the extracted first image feature.
The preliminary target detection module comprises a 1×1 convolutional layer, a knee joint bounding-box coordinate operation layer and a first non-maximum suppression screening layer. The 1×1 convolutional layer convolves feature map F5 to obtain two groups of vectors, namely a knee joint probability vector A1 and a knee joint bounding-box coordinate offset vector B1. The knee joint bounding-box coordinate operation layer determines knee joint bounding boxes from A1 and B1 to obtain the knee joint regions in the single knee image I. The first non-maximum suppression screening layer performs non-maximum suppression screening on the knee joint regions in the single knee image I to obtain the preliminary knee joint regions in the single knee image I, and sends the single knee image I and its preliminary knee joint regions to the second-level network.
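For readability, a minimal PyTorch sketch of the first-level network is given below. The layer order (four 3×3 convolutions, one 2×2 convolution, and 2×2 / 3×3 / 2×2 max pooling) follows the description above; the channel widths, the ReLU activations, and the use of two separate 1×1 heads for A1 and B1 (the text describes a single 1×1 convolution producing both groups of vectors) are illustrative assumptions, not details given in this text.

```python
import torch
import torch.nn as nn

class FirstLevelNet(nn.Module):
    """Sketch of the first-level network: feature extraction + preliminary detection heads."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 10, kernel_size=3), nn.ReLU(),   # 1st 3x3 conv
            nn.MaxPool2d(kernel_size=2, stride=2),        # 1st 2x2 max pool -> F1
            nn.Conv2d(10, 16, kernel_size=3), nn.ReLU(),  # 2nd 3x3 conv
            nn.MaxPool2d(kernel_size=3, stride=2),        # 3x3 max pool -> F2
            nn.Conv2d(16, 32, kernel_size=3), nn.ReLU(),  # 3rd 3x3 conv
            nn.MaxPool2d(kernel_size=2, stride=2),        # 2nd 2x2 max pool -> F3
            nn.Conv2d(32, 32, kernel_size=3), nn.ReLU(),  # 4th 3x3 conv -> F4
            nn.Conv2d(32, 32, kernel_size=2), nn.ReLU(),  # 2x2 conv -> F5
        )
        # Preliminary target detection: per feature-map position, a knee
        # probability (A1) and a 4-d bounding-box coordinate offset (B1).
        self.prob_head = nn.Conv2d(32, 1, kernel_size=1)
        self.bbox_head = nn.Conv2d(32, 4, kernel_size=1)

    def forward(self, x):
        f5 = self.features(x)                    # first image feature F5
        a1 = torch.sigmoid(self.prob_head(f5))   # knee probability map A1
        b1 = self.bbox_head(f5)                  # bounding-box offsets B1
        return a1, b1
```

Being fully convolutional, such a network can be applied to pyramid pictures of any size no smaller than the 48 × 48 training patches.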
The second-level network is composed of a second feature extraction module and a final target detection module. The second feature extraction module is used for carrying out feature extraction on the single knee image I received from the first-level network and the preliminary knee joint region in the single knee image I to obtain second image features; and the final target detection module performs target detection on the second image characteristics to obtain final knee joint boundary frame coordinates of the single knee image I and knee joint region key point coordinates of the single knee image I.
The second feature extraction module is composed of 4 convolutional layers (3 3×3 convolutional layers and 1 2×2 convolutional layer), 3 max pooling layers (2 2×2 max pooling layers and 1 3×3 max pooling layer) and 1 fully-connected layer.
In the second feature extraction module, the first 3×3 convolutional layer convolves the preliminary knee joint region of the single knee image I, and the first 2×2 max pooling layer pools the convolved result to obtain feature map F6. The second 3×3 convolutional layer convolves feature map F6, and the 3×3 max pooling layer pools the convolved result to obtain feature map F7. The third 3×3 convolutional layer convolves feature map F7, and the second 2×2 max pooling layer pools the convolved result to obtain feature map F8. The fourth convolutional layer convolves feature map F8 to obtain feature map F9. The first fully-connected layer performs a fully-connected operation on feature map F9 to obtain feature map F10; feature map F10 is the extracted second image feature.
The final target detection module comprises a second fully-connected layer, a knee joint bounding-box and knee joint region key-point coordinate operation layer, and a second non-maximum suppression screening layer. The second fully-connected layer performs a fully-connected operation on feature map F10 to obtain three groups of vectors: the knee joint probability vector A2, the knee joint bounding-box coordinate offset vector B2, and the coordinate offset vector C of the six key points of the knee joint region. The knee joint bounding-box and knee joint region key-point coordinate operation layer determines, for the single knee image I, the coordinates of the knee joint bounding box and the coordinates of the knee joint region key points from A2, B2 and C. The second non-maximum suppression screening layer performs non-maximum suppression screening on the knee joint regions and knee joint key-point coordinates in the single knee image I, and outputs the single knee image I, the final knee joint region of the single knee image I, and the knee joint region key-point coordinates of the single knee image I.
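A minimal PyTorch sketch of the second-level network is given below. The feature-map sizes (48×48×3 → 23×23×32 → 10×10×64 → 4×4×64 → 2×2×128 → 256-dimensional F10) follow the step-by-step description in section 4.4.3; the ReLU activations and the use of three separate fully-connected heads for A2, B2 and C are illustrative assumptions.

```python
import torch.nn as nn

class SecondLevelNet(nn.Module):
    """Sketch of the second-level network: feature extraction + final detection heads."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3), nn.ReLU(), nn.MaxPool2d(2, 2),   # F6: 23x23x32
            nn.Conv2d(32, 64, 3), nn.ReLU(), nn.MaxPool2d(3, 2),  # F7: 10x10x64
            nn.Conv2d(64, 64, 3), nn.ReLU(), nn.MaxPool2d(2, 2),  # F8: 4x4x64
            nn.Conv2d(64, 128, 3), nn.ReLU(),                     # F9: 2x2x128
            nn.Flatten(),
            nn.Linear(2 * 2 * 128, 256), nn.ReLU(),               # F10: 256-d vector
        )
        self.prob_head = nn.Linear(256, 1)   # A2: knee probability
        self.bbox_head = nn.Linear(256, 4)   # B2: bounding-box coordinate offsets
        self.kpt_head = nn.Linear(256, 12)   # C: (x, y) offsets of the 6 key points

    def forward(self, x):                    # x: 48x48x3 preliminary knee region
        f10 = self.features(x)
        return self.prob_head(f10), self.bbox_head(f10), self.kpt_head(f10)
```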
And secondly, training the knee joint area positioning network based on the multitask two-stage convolutional neural network.
2.1 prepare the data of the knee joint area location network based on the multitask two-stage convolution neural network.
2.1.1 preprocessing M (M is a positive integer and M is more than 2000) original images to obtain 2M single knee images subjected to histogram equalization processing, wherein the method comprises the following steps:
2.1.1.1 Randomly select M original images from the OAI baseline public database (https://oai.epi-ucsf.org/datarelease/, November 2008 release). Each original image is an X-ray medical image containing the left and right knees; owing to different illumination at acquisition time, it may be a double-knee image with a bright background and dark legs, or a double-knee image with a dark background and bright legs.
2.1.1.2 Uniformly transform the M original images into images with a dark background and bright legs. First select the double-knee images with a bright background and dark legs from the M original images, then invert their pixels, i.e., subtract each original pixel value from 255, so that the M original images are uniformly converted into images with a dark background and bright legs.
2.1.1.3 Convert the pixel values of the M dark-background, bright-leg images to [0, 255], i.e., process each dark-background, bright-leg image into a uint8 image, by applying the following to the pixel value of every pixel in each of the M images:

P_new = (P − P_min) / (P_max − P_min) × 255

where P is the pixel value of any pixel in the image before processing, P_max is the maximum pixel value in the image before processing, P_min is the minimum pixel value in the image before processing, and P_new is the pixel value of the corresponding pixel in the processed image.
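As an illustration, a minimal NumPy sketch of steps 2.1.1.2–2.1.1.3 is given below; the function name and the assumption that the incoming film is stored in an 8-bit range are hypothetical.

```python
import numpy as np

def to_dark_background_uint8(img, bright_background):
    """Invert a bright-background film (255 - P, step 2.1.1.2) and rescale to uint8 (step 2.1.1.3)."""
    img = img.astype(np.float64)
    if bright_background:
        img = 255.0 - img                 # invert so the background becomes dark
    p_min, p_max = img.min(), img.max()
    img = (img - p_min) / (p_max - p_min) * 255.0
    return img.astype(np.uint8)
```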
2.1.1.4 Convert the M uint8 images into 2M single knee images as follows: first find the width W of each uint8 image and half of the width, W/2; round W/2 and record it as int(W/2); finally cut each uint8 image according to the width coordinates [0, int(W/2)] and [int(W/2)+1, W−1], dividing each uint8 image into 2 single knee images, so that 2M single knee images are finally obtained.
2.1.1.5 Perform histogram equalization on each of the 2M single knee images, using the standard histogram equalization method described in the image processing literature.
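The splitting and equalization steps 2.1.1.4–2.1.1.5 can be sketched as follows; OpenCV's equalizeHist is used here as one possible implementation of standard histogram equalization, and the image is assumed to be a single-channel uint8 array.

```python
import cv2
import numpy as np

def split_and_equalize(double_knee_uint8):
    """Split a uint8 double-knee image at half its width and equalize each half (steps 2.1.1.4-2.1.1.5)."""
    w = double_knee_uint8.shape[1]
    half = int(w / 2)
    left = np.ascontiguousarray(double_knee_uint8[:, :half + 1])    # columns [0, int(W/2)]
    right = np.ascontiguousarray(double_knee_uint8[:, half + 1:])   # columns [int(W/2)+1, W-1]
    return cv2.equalizeHist(left), cv2.equalizeHist(right)
```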
2.1.2 annotate the knee joint real bounding box in the 2M single knee images after histogram equalization, the method is as follows:
2.1.2.1 Initialize variable m = 1;
2.1.2.2 Manually label the 6 key points of the knee joint region of the mth single knee image. Since doctors and computer-aided methods mainly focus on the knee joint space and the osteophyte parts, 6 points on the boundary of the knee joint space and the osteophytes are manually marked as the main key points of the knee joint region, such as the 6 key points of the right knee shown in FIG. 3(a): the femoral medial osteophyte point (FM), the femoral lateral osteophyte point (FL), the tibial medial osteophyte point (TM), the tibial lateral osteophyte point (TL), the joint space medial point (JSM) and the joint space lateral point (JSL). As shown in FIG. 3(b), the 6 key points of the left knee are likewise the femoral medial osteophyte point (FM), the femoral lateral osteophyte point (FL), the tibial medial osteophyte point (TM), the tibial lateral osteophyte point (TL), the joint space medial point (JSM) and the joint space lateral point (JSL). As shown, the 6 key points of the left and right knees are mirror images of each other. The subsequent steps process the 6 key points without distinguishing between the left and right knee.
2.1.2.3 Mark the bounding box of the knee joint in the mth single knee image according to the manually labeled key points, as follows:
2.1.2.3.1 Calculate the coordinates (x_mid, y_mid) of the center point of the 6 key points in the mth single knee image. Let the coordinates of the 6 key points be (x_1, y_1), (x_2, y_2), (x_3, y_3), (x_4, y_4), (x_5, y_5), (x_6, y_6); the center point coordinates (x_mid, y_mid) are:

x_mid = (x_1 + x_2 + x_3 + x_4 + x_5 + x_6) / 6
y_mid = (y_1 + y_2 + y_3 + y_4 + y_5 + y_6) / 6
2.1.2.3.2 Calculate the knee width w_knee as follows: find the maximum abscissa x_max and the minimum abscissa x_min of the 6 key points, and take their difference as the knee width w_knee, i.e.:

x_max = max(x_1, x_2, x_3, x_4, x_5, x_6)
x_min = min(x_1, x_2, x_3, x_4, x_5, x_6)
w_knee = x_max − x_min
2.1.2.3.3 Mark the knee joint region to obtain the ground-truth knee joint region bounding-box coordinates (x_gt1, y_gt1, x_gt2, y_gt2), which represent the ground-truth knee joint region bounding box as (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate). Taking the center point of the 6 key points as the center, expand by 0.65 times the knee width upward, downward, leftward and rightward (expanding by 0.65 times the knee width is enough to frame the knee joint region in every image); the resulting region is taken as the knee joint region of real interest, i.e., the labeled bounding box. The calculation formulas are:

x_gt1 = x_mid − 0.65 × w_knee
y_gt1 = y_mid − 0.65 × w_knee
x_gt2 = x_mid + 0.65 × w_knee
y_gt2 = y_mid + 0.65 × w_knee
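A small Python sketch of step 2.1.2.3 follows (the 0.65 expansion factor is the one stated above; the function name is hypothetical):

```python
def gt_box_from_keypoints(points):
    """points: six (x, y) key points (FM, FL, TM, TL, JSM, JSL) -> ground-truth box (x1, y1, x2, y2)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_mid = sum(xs) / 6.0                 # center of the 6 key points
    y_mid = sum(ys) / 6.0
    w_knee = max(xs) - min(xs)            # knee width
    d = 0.65 * w_knee                     # expand 0.65 x knee width in every direction
    return (x_mid - d, y_mid - d, x_mid + d, y_mid + d)
```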
2.1.2.4 If m ≥ 2M, go to 2.2; if m < 2M, let m = m + 1 and go to 2.1.2.2.
2.2 Prepare training samples for the first-level network in the knee joint area positioning network based on the multitask two-stage convolutional neural network, as follows:
2.2.1 Initialize variable m = 1;
2.2.2 Prepare training samples for the first-level network on the mth single knee image as follows:
2.2.2.1 Initialize variable k = 1;
2.2.2.2 Randomly take the kth point on the mth single knee image; let its coordinates be (x_k1, y_k1), and use the point (x_k1, y_k1) as the upper-left corner of the kth randomly selected bounding box. Then take the width w_k and the height h_k of the kth randomly selected bounding box: let the width of the mth single knee image be w_m and its height be h_m, take half of the minimum of w_m and h_m and round it, i.e., int(min(w_m, h_m)/2), where int(min(w_m, h_m)/2) denotes min(w_m, h_m)/2 rounded; then randomly take integers in the range [48, int(min(w_m, h_m)/2)] as the width w_k and the height h_k of the randomly selected bounding box.
2.2.2.3 Determine the coordinates of the kth randomly selected bounding box. Let the point (x_k2, y_k2) = (x_k1 + w_k, y_k1 + h_k) be the lower-right corner of the randomly selected bounding box; the coordinates of the kth randomly selected bounding box are then (x_k1, y_k1, x_k2, y_k2), representing the kth randomly selected bounding box as (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate).
2.2.2.4 Calculate the intersection-over-union (IoU) IoU_k between the kth randomly selected bounding box and the labeled bounding box on the mth single knee image. If IoU_k is not lower than the positive-sample threshold, use the kth randomly selected bounding box (x_k1, y_k1, x_k2, y_k2) as a positive training sample and go to 2.2.2.5; if IoU_k lies in the partial-sample range, use the kth randomly selected bounding box as a partial training sample and go to 2.2.2.5; if IoU_k is not higher than the negative-sample threshold, use the kth randomly selected bounding box as a negative training sample and go to 2.2.2.5.
2.2.2.5 If k < K (K > 50; K is the total number of randomly selected bounding boxes on the mth single knee image), let k = k + 1 and go to 2.2.2.2; if k ≥ K, go to 2.2.2.6.
2.2.2.6 If m ≥ 2M, go to 2.2.3; if m < 2M, let m = m + 1 and go to 2.2.2.
2.2.3 Scale the training samples (positive, partial and negative samples) generated in step 2.2.2 down to 48 × 48 × 3 (image height × image width × number of image channels); since they are RGB images, the number of image channels is 3.
2.2.4 Amplify the size-normalized training samples from step 2.2.3 by a mirror operation: flip each training sample horizontally (left-right), i.e., mirror each training sample, to generate the training samples of the first-level network, which comprise N11 positive samples, N12 partial samples and N13 negative samples; N11, N12 and N13 are all positive integers, determined by the bounding boxes randomly selected on the knee images and the computed intersection-over-union values.
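The random box sampling of step 2.2.2 can be sketched as below. The function and parameter names, and in particular the IoU thresholds pos_thr / part_thr / neg_thr, are placeholders; the patent's specific threshold values appear only in its formula images and are not reproduced in this text.

```python
import random

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def sample_box(img_w, img_h, gt_box, pos_thr, part_thr, neg_thr):
    """Randomly sample one box on a single-knee image and label it by IoU with the ground-truth box."""
    hi = max(48, int(min(img_w, img_h) / 2))
    w_box = random.randint(48, hi)                  # width in [48, int(min(w_m, h_m)/2)]
    h_box = random.randint(48, hi)                  # height in the same range
    x1 = random.randint(0, max(0, img_w - w_box))   # random upper-left corner
    y1 = random.randint(0, max(0, img_h - h_box))
    box = (x1, y1, x1 + w_box, y1 + h_box)
    score = iou(box, gt_box)
    if score >= pos_thr:
        label = "positive"
    elif score >= part_thr:
        label = "partial"
    elif score <= neg_thr:
        label = "negative"
    else:
        label = None                                # box is discarded
    return box, label
```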
2.3 training the first-stage network by adopting the first-stage network training sample to obtain the trained first-stage network, wherein the method comprises the following steps:
2.3.1 initializing variable q to 1, setting values of model parameters, including setting learning rate to 0.001 and batch size to 500;
2.3.2 Input the training samples of the first-level network into the first-level network and output the overall loss function L_q. L_q is the weighted sum, over all training samples and over the three tasks, of the per-sample task losses:

L_q = Σ_{i=1..N} Σ_{j=1..3} α_j · L_i^j

where N is the number of training samples of the first-level network, N = N11 + N12 + N13; L_i^j denotes the loss of the ith training sample on the jth task, 1 ≤ j ≤ 3, the 3 tasks being knee joint detection, knee joint bounding-box localization and knee joint key-point localization; α_j is the importance coefficient of the jth task, with the importance coefficient of the knee joint detection task α_1 = 1, the importance coefficient of the knee joint bounding-box localization task α_2 = 0.5, and the importance coefficient of the knee joint key-point localization task α_3 = 0. Under the model parameters set in step 2.3.1, the parameters in the first-level network are updated by back-propagating the L_q value, obtaining the first-level network NET_q with parameters updated for the qth time.
2.3.3 Let q = q + 1; if 2 ≤ q ≤ Q (Q is the number of network training iterations, Q = 10), go to 2.3.4; otherwise training is finished, the trained first-level network NET_Q is obtained, and the procedure goes to 2.4;
2.3.4 Input the training samples of the first-level network into NET_{q−1} and output the qth overall loss function L_q; under the model parameters set in step 2.3.1, update the parameters in NET_{q−1} by back-propagating the L_q value, obtaining the first-level network NET_q with parameters updated for the qth time; go to 2.3.3;
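A sketch of the weighted multi-task loss used to train both stages is shown below: the per-sample losses of the three tasks are weighted by the coefficients α_j (α = (1, 0.5, 0) for the first stage, α = (0.8, 0.6, 1.5) for the second stage). The choice of binary cross-entropy for detection and mean squared error for the regression tasks, and the target field names, are assumptions; the text only specifies the weighting scheme.

```python
import torch.nn.functional as F

def multitask_loss(prob, bbox, kpts, targets, alphas):
    """Weighted sum of detection, bounding-box and key-point losses for one batch."""
    l_det = F.binary_cross_entropy_with_logits(prob, targets["label"])
    l_box = F.mse_loss(bbox, targets["bbox_offsets"])
    l_kpt = F.mse_loss(kpts, targets["kpt_offsets"]) if kpts is not None else 0.0
    return alphas[0] * l_det + alphas[1] * l_box + alphas[2] * l_kpt
```

For the first-level network alphas = (1.0, 0.5, 0.0), so the key-point term has no effect; for the second-level network alphas = (0.8, 0.6, 1.5).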
2.4 prepare training samples for the second-stage network in the knee joint area positioning network based on the multitask two-stage convolutional neural network, the method comprises the following steps:
2.4.1 Initialize variable m = 1;
2.4.2 positioning the preliminary knee joint region of the mth single knee image I by using the trained first-level network, wherein the method comprises the following steps:
2.4.2.1 Generate a picture pyramid of the mth single knee image I: multiply the mth single knee image I by different scaling factors s (s is a positive real number, 0 < s ≤ 1) to scale it to different sizes, obtaining PP (PP is a positive integer, 0 < PP < 100) single knee images of different sizes, which form the picture pyramid of the single knee image I. The size of each image in the picture pyramid is recorded as H × W × 3 (image height × image width × number of channels; the images are RGB images, so there are 3 channels).
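A sketch of the pyramid construction follows (the particular value of s and the 48-pixel stopping size, chosen to match the network input side, are illustrative assumptions):

```python
import cv2

def image_pyramid(img, s=0.7, min_side=48):
    """Scale the single-knee image repeatedly by factor s (0 < s <= 1) to build the picture pyramid."""
    pyramid, scales = [], []
    scale = 1.0
    while min(img.shape[0] * scale, img.shape[1] * scale) >= min_side:
        pyramid.append(cv2.resize(img, None, fx=scale, fy=scale))
        scales.append(scale)
        scale *= s
    return pyramid, scales
```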
2.4.2.2 Initialize variable p = 1;
2.4.2.3 The trained first feature extraction module extracts the features of the pth picture in the picture pyramid to obtain the first image feature F5, whose size is (feature map height × feature map width × number of feature map channels).
2.4.2.4 the preliminary target detection module detects the first image feature F5 by using a knee joint region detection method to obtain a knee joint region in the single knee image I, wherein the knee joint region detection method is as follows:
2.4.2.4.1 The 1 × 1 convolutional layer in the preliminary target detection module performs a 1 × 1 convolution on the first image feature F5 of the pth picture in the picture pyramid and outputs two groups of vectors, namely the knee joint probability vector A1_p and the knee joint bounding-box coordinate offset vector B1_p. A1_p holds one 1-dimensional vector per feature-map position, storing the probability value (in the range [0, 1]) that the position is considered a knee joint; when the probability value is not lower than α, it is considered a knee joint, where α is the first threshold, α ∈ [0.5, 1], and α preferably takes the value 0.6. B1_p holds one 4-dimensional vector per feature-map position; the 4 dimensions (x_1, y_1, x_2, y_2) of each vector represent the bounding-box offsets (upper-left abscissa offset, upper-left ordinate offset, lower-right abscissa offset, lower-right ordinate offset).
2.4.2.4.2 The knee joint bounding-box coordinate operation layer of the preliminary target detection module screens and computes A1_p and B1_p to obtain the knee joint regions in the single knee image I. Find the N vector positions in A1_p whose probability values are not lower than α, find the corresponding N vector positions in B1_p, and take the 4-dimensional coordinate offset vectors at these N positions, where the nth offset vector gives the coordinate offsets of the nth bounding box (upper-left abscissa offset, upper-left ordinate offset, lower-right abscissa offset, lower-right ordinate offset). Then calculate the coordinates of the N knee joint region bounding boxes in the original single knee image I, the nth bounding box being expressed as (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate), and frame the N knee joint regions on the original image I according to these bounding-box coordinates. The coordinates in the original image are computed from the position of each selected vector in the feature map, the corresponding 4-dimensional coordinate offsets, and the scaling factor s (0 < s ≤ 1) of the pyramid picture relative to the original single knee image I, which maps each detection back to the original image.
2.4.2.5 If p ≥ PP, go to 2.4.2.6; if p < PP, let p = p + 1 and go to 2.4.2.3.
2.4.2.6 The first non-maximum suppression screening layer filters all knee joint regions of the mth single knee picture I with a non-maximum suppression (NMS) algorithm; for the NMS algorithm see the document "Efficient Non-Maximum Suppression [C] // 18th International Conference on Pattern Recognition (ICPR 2006), 20-24 August 2006, Hong Kong, China". The non-maximum suppression filtering threshold δ is set to 0.6, and the knee joint boundary regions are screened with this threshold to obtain the N1 preliminary knee joint regions of the mth single knee image I (N1 is a positive integer, 0 ≤ N1 ≤ 100).
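Both non-maximum suppression screening layers follow the same greedy logic; a minimal self-contained sketch is given below (δ = 0.6 here, 0.7 in the fourth step). Whether the patent uses exactly this greedy variant is not stated; the sketch follows the cited standard algorithm.

```python
def nms(boxes, scores, delta):
    """Greedy non-maximum suppression: keep boxes in descending score order, drop overlaps above delta."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
        return inter / union
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < delta for j in keep):
            keep.append(i)
    return keep   # indices of the retained knee joint regions
```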
2.4.3 prepare training samples for the second level network on the mth single knee image by:
2.4.3.1 Initialize variables n1 = 1 and k = 1;
2.4.3.2 Calculate the intersection-over-union between the n1th preliminary knee joint region of the mth single knee image I and the knee joint region labeled on the mth single knee image I. If the intersection-over-union is not lower than the positive-sample threshold, use the n1th preliminary knee joint region as a positive sample for second-level network training and go to 2.4.3.3; if it lies in the partial-sample range, use the n1th preliminary knee joint region as a partial sample for second-level network training and go to 2.4.3.3; if it is not higher than the negative-sample threshold, use the n1th preliminary knee joint region as a negative sample for second-level network training and go to 2.4.3.3.
2.4.3.3 If n1 ≥ N1 (N1 is the number of preliminary knee joint regions in the mth single knee image I, 0 ≤ N1 ≤ 100), go to 2.4.3.4; if n1 < N1, let n1 = n1 + 1 and go to 2.4.3.2.
2.4.3.4 Randomly select the kth point (x_k1, y_k1) on the mth single knee image and use the point (x_k1, y_k1) as the upper-left corner of the kth randomly selected bounding box.
2.4.3.5 Take the width w_k and the height h_k of the kth randomly selected bounding box: let the width of the mth single knee image be w_m and its height be h_m, take half of the minimum of w_m and h_m and round it, i.e., int(min(w_m, h_m)/2), and randomly take integers in the range [48, int(min(w_m, h_m)/2)] as the width w_k and the height h_k of the randomly selected bounding box.
2.4.3.6 Determine the coordinates of the kth randomly selected bounding box: mark the point (x_k2, y_k2) = (x_k1 + w_k, y_k1 + h_k) as the lower-right corner of the kth randomly selected bounding box; the coordinates of the kth randomly selected bounding box are then recorded as (x_k1, y_k1, x_k2, y_k2), representing the kth randomly selected bounding box as (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate).
2.4.3.7 Calculate the intersection-over-union between the kth randomly selected bounding box and the labeled bounding box on the mth single knee image. If the intersection-over-union is not lower than the key-point training threshold, use the kth randomly selected bounding box as training data for knee joint region key-point localization and go to 2.4.3.8; otherwise go directly to 2.4.3.8.
2.4.3.8 If k < K, let k = k + 1 and go to 2.4.3.4; if k ≥ K, go to 2.4.4.
2.4.4 If m ≥ 2M, go to 2.4.5; if m < 2M, let m = m + 1 and go to 2.4.2.
2.4.5 Scale the positive, partial and negative samples generated in steps 2.4.3-2.4.4 down to 48 × 48 × 3 (image height × image width × number of image channels); since they are RGB images, the number of image channels is 3.
2.4.6 Amplify the size-normalized positive, partial and negative samples from step 2.4.5 by a mirror operation: flip each training sample left-right, i.e., mirror each training sample, to obtain the training samples of the second-level network, which comprise N21 positive samples, N22 partial samples and N23 negative samples; N21, N22 and N23 are all positive integers, determined by the preliminary knee joint regions selected on the single knee images and the computed intersection-over-union values.
2.5 training the second-level network by adopting the second-level network training sample generated in the step 2.4.6 to obtain the trained second-level network, wherein the method comprises the following steps:
2.5.1 Initialize variable t = 1 and set the values of the model parameters, including setting the learning rate to 0.0001 and the batch size batch_size to 500;
2.5.2 Input the training samples of the second-level network into the second-level network and output the overall loss function L_t. L_t is computed in the same weighted multi-task form as L_q:

L_t = Σ_{i=1..N2} Σ_{j=1..3} α_j · L_i^j

where N2 is the number of training samples, N2 = N21 + N22 + N23; L_i^j denotes the loss of the ith training sample on the jth task, 1 ≤ j ≤ 3, the 3 tasks being knee joint detection, knee joint bounding-box localization and knee joint key-point localization; α_j is the importance coefficient of the jth task, with the importance coefficient of the knee joint detection task α_1 = 0.8, the importance coefficient of the knee joint bounding-box localization task α_2 = 0.6, and the importance coefficient of the knee joint key-point localization task α_3 = 1.5. Under the model parameters set in step 2.5.1, the parameters in the second-level network are updated by back-propagating the L_t value, obtaining the second-level network NET_t with parameters updated for the tth time.
2.5.3 Let t = t + 1; if 2 ≤ t ≤ T (T is the number of network training iterations, T = 10), go to 2.5.4; otherwise training is finished, the trained second-level network NET_T is obtained, and the procedure goes to the third step;
2.5.4 Input the training samples of the second-level network into NET_{t−1} and output the tth overall loss function L_t; under the model parameters set in step 2.5.1, update the parameters in NET_{t−1} by back-propagating the L_t value, obtaining the second-level network NET_t with parameters updated for the tth time; go to 2.5.3.
Thirdly, preprocessing the X-ray film of the double knees to be detected to obtain 2 single knee images to be detected which are processed by histogram equalization, wherein the method comprises the following steps:
3.1 If the X-ray film of the double knees to be detected has a bright background and dark legs, convert it into an image with a dark background and bright legs, i.e., subtract each original pixel value of the X-ray film from 255 and use the result as the new pixel value, then go to 3.2; if the X-ray film of the double knees to be detected already has a dark background and bright legs, go directly to 3.2.
3.2 Convert the pixel values of the double-knee X-ray film to be detected to [0, 255], i.e., process the double-knee X-ray film to be detected into a uint8 double-knee image. The new pixel value corresponding to each pixel of the uint8 double-knee image is

P_new = (P − P_min) / (P_max − P_min) × 255

where P is the pixel value of each pixel of the image before processing, P_max is the maximum pixel value of the image before processing, and P_min is the minimum pixel value of the image before processing.
3.3 Convert the uint8 double-knee image into single knee images as follows: find the width W of the uint8 double-knee image and half of the width, W/2; round W/2 and record it as int(W/2); finally cut the double-knee image according to the width coordinates [0, int(W/2)] and [int(W/2)+1, W−1], i.e., divide the uint8 double-knee image into 2 single knee images.
3.4, performing histogram equalization processing on 2 single knee images, wherein the 2 single knee images subjected to the histogram equalization processing are used as the single knee images to be detected.
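Tying the third step together, a usage sketch (reusing the hypothetical helpers to_dark_background_uint8 and split_and_equalize introduced in section 2.1.1) is:

```python
def preprocess_double_knee(xray, bright_background):
    """Third step: produce the two histogram-equalized single-knee images to be detected."""
    img = to_dark_background_uint8(xray, bright_background)   # steps 3.1-3.2
    return split_and_equalize(img)                            # steps 3.3-3.4
```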
Fourthly, performing knee joint positioning on the 2 to-be-detected single knee images obtained in the third step based on the trained knee joint area positioning network of the multitask two-stage convolutional neural network, wherein the method comprises the following steps of:
4.1 Initialize variable d = 1;
4.2 Generate a picture pyramid of the dth single knee image to be detected: scale the single knee image at different scales with scaling factor s (0 < s ≤ 1) to obtain PP (0 < PP < 100) single knee images of different sizes, each of size H × W × 3 (image height × image width × number of channels); the result is called the picture pyramid of the single knee image to be detected.
4.3 processing the image pyramid of the d-th single knee image to be detected by the first-stage network in the knee joint area positioning network based on the trained multitask two-stage convolutional neural network to obtain a preliminary knee joint area, wherein the method comprises the following steps:
4.3.1 Initialize variable p = 1;
4.3.2 The first feature extraction module extracts the features of the pth picture in the picture pyramid of the dth single knee image to be detected, as follows:
4.3.2.1 The first 3×3 convolutional layer in the first feature extraction module convolves the pth picture in the picture pyramid of the dth single knee image to be detected, and the first 2×2 max pooling layer pools the convolved result, outputting feature map F1. The convolution stride is 1 and the max pooling stride is 2. Let the size of the pth picture in the picture pyramid be H × W × 3 (image height × image width × number of channels; the pictures are RGB images, so there are 3 channels); after the first convolution and max pooling layers, feature map F1 (feature map height × feature map width × number of channels of F1) is obtained.
4.3.2.2 The second 3×3 convolutional layer in the first feature extraction module convolves feature map F1, and the second max pooling layer (3×3) pools the convolved result to obtain feature map F2. The convolution stride is 1 and the max pooling stride is 2; feature map F1 passes through the second convolution and max pooling layers to obtain feature map F2 (feature map height × feature map width × number of channels of F2).
4.3.2.3 The third 3×3 convolutional layer in the first feature extraction module convolves feature map F2, and the third max pooling layer (2×2) pools the convolved result to obtain feature map F3. The convolution stride is 1 and the max pooling stride is 2; feature map F2 passes through the third convolution and max pooling layers to obtain feature map F3 (feature map height × feature map width × number of channels of F3).
4.3.2.4 The fourth 3×3 convolutional layer in the first feature extraction module convolves feature map F3 to obtain feature map F4. The convolution stride is 1; feature map F3 passes through the fourth convolution layer to obtain feature map F4 (feature map height × feature map width × number of channels of F4).
4.3.2.5 The fifth 2×2 convolutional layer in the first feature extraction module convolves feature map F4 to obtain feature map F5. The convolution stride is 1; feature map F4 passes through the fifth convolution layer to obtain feature map F5 (feature map height × feature map width × number of channels of F5). Feature map F5 is the first image feature extracted by the first-level network from the pth picture of the picture pyramid of the dth single knee image to be detected.
4.3.3 the preliminary target detection module of the first-level network detects F5 by adopting the knee joint region detection method described in 2.4.2.4 to obtain the knee joint region of the single knee image to be detected;
4.3.4 If p < PP, let p = p + 1 and go to 4.3.2; if p ≥ PP, go to 4.3.5.
4.3.5 The first non-maximum suppression screening layer filters all knee joint bounding boxes in the dth single knee image to be detected with the non-maximum suppression (NMS) algorithm. The filtering threshold δ of the NMS algorithm is set to 0.7; the knee joint bounding boxes in the single knee image to be detected are screened with this threshold, the filtered bounding boxes frame the knee joint regions on the dth single knee image to be detected, and the first-level network outputs N1 preliminary knee joint regions.
4.4 The second-level network processes the N1 preliminary knee joint regions of the dth single knee image to be detected to obtain the final knee joint region and the key points of the region, as follows:
4.4.1 Initialize variable n1 = 1;
4.4.2 Normalize the size of the n1th preliminary knee joint region of the dth single knee image to be detected, output by the first-level network, to 48 × 48 × 3 (image height × image width × number of image channels).
4.4.3 The second-level network extracts the features of the n1th preliminary knee joint region image on the dth single knee image to be detected, as follows:
4.4.3.1 The first 3×3 convolutional layer in the second feature extraction module convolves the n1th preliminary knee joint region image, and the first 2×2 max pooling layer pools the convolved result, outputting feature map F6. The convolution stride is 1 and the max pooling stride is 2; the scale-normalized preliminary knee joint region image of size 48 × 48 × 3 (image height × image width × number of channels; 3 because it is an RGB image) passes through the first convolution and max pooling layers to obtain feature map F6 of size 23 × 23 × 32 (feature map height × feature map width × number of channels of F6).
4.4.3.2 The second 3×3 convolutional layer in the second feature extraction module convolves feature map F6, and the 3×3 max pooling layer pools the convolved result, outputting feature map F7. The convolution stride is 1 and the max pooling stride is 2; feature map F6 of size 23 × 23 × 32 passes through the second convolution and max pooling layers to obtain feature map F7 of size 10 × 10 × 64 (feature map height × feature map width × number of channels of F7).
4.4.3.3 The third 3×3 convolutional layer in the second feature extraction module convolves feature map F7, and the third max pooling layer (2×2) pools the convolved result, outputting feature map F8. The convolution stride is 1 and the max pooling stride is 2; feature map F7 of size 10 × 10 × 64 passes through the third convolution and max pooling layers to obtain feature map F8 of size 4 × 4 × 64 (feature map height × feature map width × number of channels of F8).
4.4.3.4 The fourth 3×3 convolutional layer in the second feature extraction module convolves feature map F8, outputting feature map F9. The convolution stride is 1; feature map F8 of size 4 × 4 × 64 passes through the fourth convolution layer to obtain feature map F9 of size 2 × 2 × 128 (feature map height × feature map width × number of channels of F9).
4.4.3.5 The first fully-connected layer in the second feature extraction module performs a fully-connected operation on feature map F9, outputting feature map F10. Feature map F9 of size 2 × 2 × 128 passes through the first fully-connected layer to obtain feature map F10, a 256-dimensional vector; feature map F10 is the extracted second image feature.
4.4.4 The final target detection module of the second-level network processes the second image feature, i.e., feature map F10, and outputs the final knee joint region and the key point coordinates of the dth single knee image to be detected, as follows:
4.4.4.1 The second fully-connected layer of the final target detection module of the second-level network performs a fully-connected operation on feature map F10 and outputs three groups of vectors: the knee joint probability vector (a 1-dimensional vector), the knee joint bounding-box coordinate offset vector (a 4-dimensional vector), and the knee joint key-point coordinate offset vector (a 12-dimensional vector).
4.4.4.2 The knee joint bounding-box coordinate and knee joint region key-point coordinate operation layer of the final target detection module of the second-level network screens and computes these vectors to obtain the knee joint region of the dth single knee image to be detected and the key points of the knee joint region. The probability vector stores the probability value (in the range [0, 1]) that the region is considered a knee joint; when the probability value is not lower than β, it is considered a knee joint, where β is the second threshold, β ∈ [0.5, 1], and β preferably takes the value 0.7. If the probability value of the n1th preliminary knee joint region is greater than β, the n1th preliminary knee joint region is retained; its bounding-box coordinates are recorded as (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate), and the corresponding knee joint bounding-box coordinate offset vector (upper-left abscissa offset, upper-left ordinate offset, lower-right abscissa offset, lower-right ordinate offset) and knee joint key-point coordinate offset vector are also retained. The key-point coordinate offset vector contains, for the n1th knee joint region, the offset coordinates (offset abscissa, offset ordinate) of the key points FM, FL, TM, TL, JSM and JSL. The final bounding-box coordinates of the n1th knee joint, expressed as (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate), and the coordinates of its 6 key points (the FM, FL, TM, TL, JSM and JSL points of the n1th knee joint region) are then computed from the coordinates of the n1th preliminary knee joint region bounding box and the corresponding offset vectors.
4.4.4.3 If n1 < N1, let n1 = n1 + 1 and go to 4.4.2; if n1 ≥ N1, go to 4.4.4.4.
4.4.4.4 The second non-maximum suppression screening layer screens the knee joint bounding boxes in the dth single knee image to be detected with the non-maximum suppression (NMS) algorithm to obtain the final bounding box of the dth single knee image to be detected and the final key points of the knee joint region; the filtering threshold δ of the NMS algorithm is 0.7.
4.4.4.5 If d < 2, let d = d + 1 and go to 4.2; if d ≥ 2, go to the fifth step.
And fifthly, ending.
The invention can achieve the following beneficial effects: the knee joint positioning method based on the multitask two-stage convolutional neural network improves the detection rate of knee joint positioning. The invention is compared with an HOG+SVM method (see "A novel method for automatic localization of joint area on knee plain radiographs" [C]//Scandinavian Conference on Image Analysis. Springer, Cham, 2017). The experimental environment is ubuntu16.04, a 2080Ti GPU, python3.6 and pytorch0.4. When tested on the OAI database (45110 single knee images), the average detection accuracy of the invention is 99.93%, while that of the HOG+SVM method is 91.73%; when tested on the MOST database (19383 single knee images), the average detection accuracy of the invention is 99.02%, while that of the HOG+SVM method is 98.20%. The knee joint positioning method of the invention therefore effectively improves the knee joint positioning detection rate.
Drawings
FIG. 1 is an overall flowchart of the present invention for positioning a knee joint based on a multitasking two-stage convolutional neural network;
FIG. 2 is a logical block diagram of the knee joint area positioning network of the present invention based on the multitask two-stage convolutional neural network;
FIG. 3 is an illustration of the present invention showing 6 key points of the left and right knees at step 2.1.2.2.
Detailed Description
FIG. 1 is an overall flowchart of the present invention for positioning a knee joint based on a multitasking two-stage convolutional neural network; as shown in fig. 1, the present invention comprises the steps of:
firstly, a knee joint area positioning network based on a multitask two-stage convolutional neural network is built.
The knee joint area positioning network based on the multitask two-stage convolutional neural network is shown in fig. 2 and comprises two stages of networks: the system comprises a first-level network and a second-level network, wherein the output of the first-level network is used as the input of the second-level network.
The first-level network is composed of a first feature extraction module and a preliminary target detection module. The first feature extraction module receives the single knee image I from the outside, extracts first image features of the single knee image I and sends the first image features to the preliminary target detection module; and the preliminary target detection module detects the first image characteristics and outputs a preliminary knee joint area in the single knee image I.
The first feature extraction module consists of 5 convolutional layers and 3 max pooling layers: the 5 convolutional layers comprise 4 3×3 convolutional layers and 1 2×2 convolutional layer, and the 3 max pooling layers comprise 2 2×2 max pooling layers and 1 3×3 max pooling layer.
The first 3×3 convolutional layer performs a convolution operation on the single knee image I, and the first 2×2 max pooling layer pools the convolved result to obtain feature map F1; the second 3×3 convolutional layer convolves feature map F1, and the first 3×3 max pooling layer pools the convolved result to obtain feature map F2; the third 3×3 convolutional layer convolves feature map F2, and the second 2×2 max pooling layer pools the convolved result to obtain feature map F3; the fourth 3×3 convolutional layer convolves feature map F3 to obtain feature map F4; the fifth convolutional layer, of size 2×2, convolves feature map F4 to obtain feature map F5. Feature map F5 is the extracted first image feature.
The preliminary target detection module comprises a 1×1 convolutional layer, a knee joint bounding box coordinate operation layer and a first non-maximum suppression screening layer. The 1×1 convolutional layer convolves feature map F5 to obtain two groups of vectors, namely the knee joint probability vector A1 and the knee joint bounding box coordinate offset vector B1; the knee joint bounding box coordinate operation layer determines the knee joint bounding boxes from A1 and B1 to obtain the knee joint regions in the single knee image I; the first non-maximum suppression screening layer performs non-maximum suppression screening on the knee joint regions in the single knee image I to obtain the preliminary knee joint regions in the single knee image I, and sends the single knee image I and its preliminary knee joint regions to the second-level network.
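As an aid to implementation, the following PyTorch sketch mirrors the first-level network described above (4 3×3 convolutions, 1 2×2 convolution, 3 max pooling layers, and 1×1 convolution heads producing the probability and bounding box offset outputs). The channel widths, activation functions and class name are illustrative assumptions and are not specified in the patent; with these strides a 48 × 48 × 3 training sample maps to a single output position.

import torch
import torch.nn as nn

class FirstLevelNet(nn.Module):
    # Sketch only: channel widths and ReLU activations are assumptions.
    def __init__(self):
        super(FirstLevelNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3), nn.ReLU(),   # first 3x3 convolutional layer
            nn.MaxPool2d(2, 2),               # first 2x2 max pooling layer
            nn.Conv2d(16, 32, 3), nn.ReLU(),  # second 3x3 convolutional layer
            nn.MaxPool2d(3, 2),               # 3x3 max pooling layer
            nn.Conv2d(32, 32, 3), nn.ReLU(),  # third 3x3 convolutional layer
            nn.MaxPool2d(2, 2),               # second 2x2 max pooling layer
            nn.Conv2d(32, 64, 3), nn.ReLU(),  # fourth 3x3 convolutional layer
            nn.Conv2d(64, 64, 2), nn.ReLU(),  # 2x2 convolutional layer -> feature map F5
        )
        self.prob_head = nn.Conv2d(64, 1, 1)  # probability map A1 (after sigmoid)
        self.box_head = nn.Conv2d(64, 4, 1)   # bounding box offset map B1

    def forward(self, x):
        f5 = self.features(x)
        return torch.sigmoid(self.prob_head(f5)), self.box_head(f5)

Because the network is fully convolutional, a larger test image produces a grid of probabilities and offsets, one per receptive-field position, which is what the knee joint bounding box coordinate operation layer then screens.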
The second-level network is composed of a second feature extraction module and a final target detection module. The second feature extraction module is used for carrying out feature extraction on the single knee image I received from the first-level network and the preliminary knee joint region in the single knee image I to obtain second image features; and the final target detection module performs target detection on the second image characteristics to obtain final knee joint boundary frame coordinates of the single knee image I and knee joint region key point coordinates of the single knee image I.
The second feature extraction module consists of 4 convolutional layers, 3 max pooling layers and 1 fully-connected layer: the 4 convolutional layers comprise 3 3×3 convolutional layers and 1 2×2 convolutional layer, and the 3 max pooling layers comprise 2 2×2 max pooling layers and 1 3×3 max pooling layer.
The first 3×3 convolutional layer in the second feature extraction module performs a convolution operation on the preliminary knee joint region in the single knee image I, and the first 2×2 max pooling layer pools the convolved result to obtain feature map F6; the second 3×3 convolutional layer convolves feature map F6, and the 3×3 max pooling layer pools the convolved result to obtain feature map F7; the third 3×3 convolutional layer convolves feature map F7, and the second 2×2 max pooling layer pools the convolved result to obtain feature map F8; the fourth convolutional layer convolves feature map F8 to obtain feature map F9; the first fully-connected layer performs a fully-connected operation on feature map F9 to obtain feature map F10. Feature map F10 is the extracted second image feature.
The final target detection module comprises a second fully-connected layer, a knee joint bounding box and knee joint region key point coordinate operation layer, and a second non-maximum suppression screening layer. The second fully-connected layer performs a fully-connected operation on feature map F10 to obtain three groups of vectors, namely the knee joint probability vector A2, the knee joint bounding box coordinate offset vector B2 and the coordinate offset vector C of the six key points of the knee joint region. The knee joint bounding box and knee joint region key point coordinate operation layer determines, for the single knee image I, the knee joint bounding box coordinates and the knee joint region key point coordinates from A2, B2 and C. The second non-maximum suppression screening layer performs non-maximum suppression screening on the knee joint regions and knee joint key point coordinates in the single knee image I, and outputs the single knee image I, its final knee joint region and its knee joint region key point coordinates.
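A matching sketch of the second-level network is given below. The 48 × 48 × 3 input size, the intermediate feature map sizes (23×23×32, 10×10×64, 4×4×64, 2×2×128), the 256-dimensional feature map F10 and the 1-, 4- and 12-dimensional output vectors follow the sizes stated in the detailed description (steps 4.4.3 and 4.4.4.1); the activation functions and class name are illustrative assumptions.

import torch
import torch.nn as nn

class SecondLevelNet(nn.Module):
    # Sketch only: activations are assumptions; sizes follow steps 4.4.3 and 4.4.4.1.
    def __init__(self):
        super(SecondLevelNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3), nn.ReLU(),    # 48x48x3 -> 46x46x32
            nn.MaxPool2d(2, 2),                # -> 23x23x32 (feature map F6)
            nn.Conv2d(32, 64, 3), nn.ReLU(),   # -> 21x21x64
            nn.MaxPool2d(3, 2),                # -> 10x10x64 (feature map F7)
            nn.Conv2d(64, 64, 3), nn.ReLU(),   # -> 8x8x64
            nn.MaxPool2d(2, 2),                # -> 4x4x64 (feature map F8)
            nn.Conv2d(64, 128, 3), nn.ReLU(),  # -> 2x2x128 (feature map F9)
        )
        self.fc = nn.Linear(2 * 2 * 128, 256)  # feature map F10 (256-dimensional)
        self.prob_head = nn.Linear(256, 1)     # knee joint probability A2
        self.box_head = nn.Linear(256, 4)      # bounding box offsets B2
        self.kpt_head = nn.Linear(256, 12)     # 6 key point offsets C (x and y per point)

    def forward(self, x):
        f9 = self.features(x)
        f10 = torch.relu(self.fc(f9.view(f9.size(0), -1)))
        return torch.sigmoid(self.prob_head(f10)), self.box_head(f10), self.kpt_head(f10)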
And secondly, training the knee joint area positioning network based on the multitask two-stage convolutional neural network.
2.1 prepare the data of the knee joint area location network based on the multitask two-stage convolution neural network.
2.1.1 preprocessing M (M is a positive integer and M is more than 2000) original images to obtain 2M single knee images subjected to histogram equalization processing, wherein the method comprises the following steps:
2.1.1.1 Randomly select M raw images from the OAI baseline public database (https://oai.epi-ucsf.org/datarelease/, version of November 2008). Each original image is an X-ray medical image containing the left and right knees.
2.1.1.2 Uniformly transform the M original images into images with a dark background and bright legs: first select, from the M original images, the double-knee images that have a bright background and dark legs, then invert their pixels, i.e., subtract each original pixel value from 255, so that all M original images become images with a dark background and bright legs.
2.1.1.3 Scale the pixel values of the M dark-background, bright-leg images to [0, 255], i.e., process them into uint8 images, by applying the following transformation to the pixel value of every pixel in each of the M images:
P_new = (P - P_min) / (P_max - P_min) × 255
where P is the pixel value of any pixel in the image before processing, P_max is the maximum pixel value in the image before processing, P_min is the minimum pixel value in the image before processing, and P_new is the pixel value of the corresponding pixel in the processed image.
2.1.1.4 Convert the M uint8 images into 2M single knee images as follows: first find the width W of each uint8 image and compute half of the width, W/2; then round W/2 down to an integer, denoted int(W/2); finally cut each uint8 image along the width coordinates [0, int(W/2)] and [int(W/2)+1, W-1], dividing each uint8 image into 2 single knee images, so that 2M single knee images are obtained in total.
2.1.1.5 Perform histogram equalization on each of the 2M single knee images. The histogram equalization method is described in the document "Application of histogram equalization to image processing", Science and Technology Information, 2007(04): 39-40.
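A compact NumPy/OpenCV sketch of the preprocessing in steps 2.1.1.2-2.1.1.5 follows; the function name is illustrative, and whether an image needs inversion is assumed to be decided by the caller.

import cv2
import numpy as np

def preprocess_double_knee(img, invert=False):
    # img: single-channel double-knee X-ray array.
    # Step 2.1.1.2: bright-background, dark-leg images are inverted (255 - pixel value).
    img = img.astype(np.float64)
    if invert:
        img = 255.0 - img
    # Step 2.1.1.3: min-max rescale to [0, 255] and convert to uint8.
    img = (img - img.min()) / (img.max() - img.min()) * 255.0
    img = img.astype(np.uint8)
    # Step 2.1.1.4: cut along the width into two single knee images.
    w = img.shape[1]
    half = int(w / 2)
    left, right = img[:, :half + 1], img[:, half + 1:]
    # Step 2.1.1.5: histogram equalization of each single knee image.
    return cv2.equalizeHist(left), cv2.equalizeHist(right)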
2.1.2 annotate the knee joint real bounding box in the 2M single knee images after histogram equalization, the method is as follows:
2.1.2.1 initializing variable m ═ 1;
2.1.2.2 Manually label 6 key points of the knee joint region of the m-th single knee image. Since doctors and computer-aided analysis are primarily concerned with the knee joint space and osteophyte locations, 6 points on the boundaries of the knee joint space and the osteophytes are manually marked as the main key points of the knee joint region. Fig. 3 illustrates the labeling of the 6 key points of the left and right knees in step 2.1.2.2: the 6 key points of the right knee shown in fig. 3(a) are the femoral medial osteophyte point (FM), the femoral lateral osteophyte point (FL), the tibial medial osteophyte point (TM), the tibial lateral osteophyte point (TL), the joint space medial point (JSM) and the joint space lateral point (JSL). As shown in fig. 3(b), the 6 key points of the left knee are likewise the femoral medial osteophyte point (FM), the femoral lateral osteophyte point (FL), the tibial medial osteophyte point (TM), the tibial lateral osteophyte point (TL), the joint space medial point (JSM) and the joint space lateral point (JSL). As shown, the 6 key points of the left and right knees are mirror images of each other. The subsequent steps process the 6 key points in the same way regardless of whether the image shows a left knee or a right knee.
2.1.2.3 marking the boundary box of the knee joint in the mth single knee image according to the key points marked manually, the method is:
2.1.2.3.1 Calculate the center point coordinates (x_mid, y_mid) of the 6 key points in the m-th single knee image. Let the coordinates of the 6 key points be (x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5), (x6, y6); the center point coordinates (x_mid, y_mid) of the 6 key points are:
x_mid = (x1 + x2 + x3 + x4 + x5 + x6) / 6
y_mid = (y1 + y2 + y3 + y4 + y5 + y6) / 6
2.1.2.3.2 Calculate the knee joint width w_knee as follows: compute the maximum abscissa x_max and the minimum abscissa x_min of the 6 key points; the difference between x_max and x_min is taken as the knee joint width w_knee, namely:
x_max = max(x1, x2, x3, x4, x5, x6)
x_min = min(x1, x2, x3, x4, x5, x6)
w_knee = x_max - x_min
2.1.2.3.3 Annotate the knee joint region to obtain the real knee joint region bounding box coordinates (x_tl, y_tl, x_br, y_br), which represent the real knee joint region bounding box (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate). Taking the center point of the 6 key points as the center, the region is extended upward, downward, leftward and rightward by 0.65 times the knee joint width (0.65 times is enough to frame the knee joint region in every image); this region is taken as the knee joint region of real interest, i.e., the labeled bounding box. The calculation formulas are:
x_tl = x_mid - 0.65 × w_knee
y_tl = y_mid - 0.65 × w_knee
x_br = x_mid + 0.65 × w_knee
y_br = y_mid + 0.65 × w_knee
2.1.2.4 If m ≥ 2M, go to 2.2; if m < 2M, let m = m + 1 and go to 2.1.2.2.
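The labeling rule of step 2.1.2.3 (center of the 6 key points, knee width from the extreme abscissas, and a 0.65 × width expansion in every direction) can be expressed in a few lines; the NumPy sketch below uses an illustrative function name and the corner ordering defined above.

import numpy as np

def bbox_from_keypoints(keypoints):
    # keypoints: array of shape (6, 2) holding (x, y) for FM, FL, TM, TL, JSM, JSL.
    pts = np.asarray(keypoints, dtype=np.float64)
    x_mid, y_mid = pts.mean(axis=0)             # center point of the 6 key points
    w_knee = pts[:, 0].max() - pts[:, 0].min()  # knee joint width = x_max - x_min
    r = 0.65 * w_knee                           # expand 0.65 x width in each direction
    return (x_mid - r, y_mid - r, x_mid + r, y_mid + r)  # (x_tl, y_tl, x_br, y_br)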
2.2 prepare training samples for the first stage network in the knee joint area positioning network based on the multitask two-stage convolution neural network, the method is as follows:
2.2.1 initializing variable m ═ 1;
2.2.2 prepare training samples for the first level network on the mth single knee image by:
2.2.2.1 initializing variable k ═ 1;
2.2.2.2 Randomly take the k-th point on the m-th single knee image, denote its coordinates (x_k, y_k), and use this point as the upper-left coordinate point of the randomly selected bounding box.
2.2.2.2 Take the width w_k and height h_k of the k-th randomly selected bounding box: let the width of the m-th single knee image be w_m and its height h_m; take half of the minimum of w_m and h_m and round it, i.e., int(min(w_m, h_m)/2), where int(min(w_m, h_m)/2) denotes min(w_m, h_m)/2 rounded to an integer; then randomly take integers in the value range [48, int(min(w_m, h_m)/2)] as the width w_k and height h_k of the randomly selected bounding box.
2.2.2.3 Determine the coordinates of the k-th randomly selected bounding box. Let the point (x_k + w_k, y_k + h_k) be the lower-right coordinate point of the randomly selected bounding box; the coordinates of the k-th randomly selected bounding box are then (x_k, y_k, x_k + w_k, y_k + h_k), representing the k-th randomly selected bounding box (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate).
2.2.2.4 Calculate the intersection-over-union (IoU) between the k-th randomly selected bounding box and the labeled bounding box on the m-th single knee image. If the IoU is not lower than the positive-sample threshold, use the k-th randomly selected bounding box as a positive training sample and go to 2.2.2.5; if the IoU lies in the intermediate range, use the k-th randomly selected bounding box as a part training sample and go to 2.2.2.5; if the IoU is lower than the negative-sample threshold, use the k-th randomly selected bounding box as a negative training sample and go to 2.2.2.5.
2.2.2.5 If k < K (K > 50; K is the total number of randomly selected bounding boxes on the m-th single knee image), let k = k + 1 and go to 2.2.2.2; if k ≥ K, go to 2.2.2.6.
2.2.2.6 If m ≥ 2M, go to 2.2.3; if m < 2M, let m = m + 1 and go to 2.2.2.
2.2.3 Rescale the training samples (positive, part and negative samples) generated in step 2.2.2 to 48 × 48 × 3 (image height × image width × number of image channels).
2.2.4 Augment the rescaled training samples from step 2.2.3 by a mirroring operation: flip each training sample horizontally (left-right), i.e., mirror it, to generate the training samples of the first-level network. The number of positive samples of the first-level network is N11, the number of part samples is N12 and the number of negative samples is N13, where N11, N12 and N13 are all positive integers.
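The sample generation of step 2.2.2 amounts to drawing random boxes and sorting them by their IoU with the labeled box. The sketch below is a minimal illustration; the IoU thresholds (0.65 / 0.4 / 0.3) are assumptions, since the patent gives its own thresholds only in the equations of step 2.2.2.4, and the function names are illustrative.

import random

def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def sample_boxes(gt_box, img_w, img_h, n_boxes, pos_thr=0.65, part_thr=0.4, neg_thr=0.3):
    # Draw n_boxes random boxes with side lengths in [48, min(w, h) / 2] and sort them
    # into positive / part / negative pools by IoU with the labeled box gt_box.
    positives, parts, negatives = [], [], []
    upper = int(min(img_w, img_h) / 2)
    for _ in range(n_boxes):
        w = random.randint(48, upper)
        h = random.randint(48, upper)
        x1 = random.randint(0, max(0, img_w - w))
        y1 = random.randint(0, max(0, img_h - h))
        box = (x1, y1, x1 + w, y1 + h)
        overlap = iou(box, gt_box)
        if overlap >= pos_thr:
            positives.append(box)
        elif overlap >= part_thr:
            parts.append(box)
        elif overlap < neg_thr:
            negatives.append(box)
    return positives, parts, negatives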
2.3 training the first-stage network by adopting the first-stage network training sample to obtain the trained first-stage network, wherein the method comprises the following steps:
2.3.1 initializing variable q to 1, setting values of model parameters, including setting learning rate to 0.001 and batch size to 500;
2.3.2 Input the training samples of the first-level network into the first-level network and compute the overall loss function L_q, calculated as follows. Under the model parameters set in step 2.3.1, the parameters of the first-level network are updated by back-propagating the L_q value, giving the first-level network NET_q whose parameters have been updated for the q-th time:
L_q = Σ_{i=1..N} Σ_{j=1..3} α_j · L_i^j
where N is the number of training samples of the first-level network, N = N11 + N12 + N13; L_i^j denotes the loss of the i-th training sample on the j-th task, 1 ≤ j ≤ 3, the 3 tasks being knee joint detection, knee joint bounding box localization and knee joint key point localization; α_j is the importance coefficient of the j-th task: the importance coefficient of the knee joint detection task is α_1 = 1, the importance coefficient of the knee joint bounding box localization task is α_2 = 0.5, and the importance coefficient of the knee joint key point localization task is α_3 = 0;
2.3.3 Let q = q + 1; if 2 ≤ q ≤ Q (Q is the number of training rounds, Q = 10), go to 2.3.4; otherwise training is complete, the trained first-level network NET_Q is obtained, go to 2.4;
2.3.4 Input the training samples of the first-level network into NET_{q-1} and compute the q-th overall loss function L_q; under the model parameters set in step 2.3.1, update the parameters of NET_{q-1} by back-propagating the L_q value to obtain the first-level network NET_q whose parameters have been updated for the q-th time; go to 2.3.3;
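The weighted multi-task objective used in 2.3.2 (and again, with different weights, in 2.5.2) can be written as a short PyTorch function. The per-task losses chosen below (binary cross-entropy for detection, mean squared error for the box and key point regressions) are assumptions; the patent only fixes the task weights (α1 = 1, α2 = 0.5, α3 = 0 for the first stage and α1 = 0.8, α2 = 0.6, α3 = 1.5 for the second stage).

import torch.nn.functional as F

def multitask_loss(prob_pred, prob_target, box_pred, box_target,
                   kpt_pred, kpt_target, alphas=(1.0, 0.5, 0.0)):
    # Weighted sum of the knee joint detection, bounding box and key point losses.
    det_loss = F.binary_cross_entropy(prob_pred, prob_target)  # detection task
    box_loss = F.mse_loss(box_pred, box_target)                # bounding box localization task
    kpt_loss = F.mse_loss(kpt_pred, kpt_target)                # key point localization task
    return alphas[0] * det_loss + alphas[1] * box_loss + alphas[2] * kpt_loss

For the second-level network the same function would be called with alphas=(0.8, 0.6, 1.5).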
2.4 prepare training samples for the second-stage network in the knee joint area positioning network based on the multitask two-stage convolutional neural network, the method comprises the following steps:
2.4.1 initializing variable m ═ 1;
2.4.2 positioning the preliminary knee joint region of the mth single knee image I by using the trained first-level network, wherein the method comprises the following steps:
2.4.2.1 Generate a picture pyramid of the m-th single knee image I: multiply the m-th single knee image I by different scaling factors s (s is a positive real number, 0 < s ≤ 1) so as to scale it to different sizes, obtaining PP (PP is a positive integer, 0 < PP < 100) single knee images of different sizes that together form the picture pyramid of the single knee image I; the size of each image in the picture pyramid is recorded as H × W × 3 (image height × image width × number of channels; the images are RGB images, so there are 3 channels).
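A short sketch of the picture pyramid construction in 2.4.2.1 is given below; the geometric scale step (0.709 per level) and the 48-pixel minimum size are assumptions, since the patent only requires factors 0 < s ≤ 1 and fewer than 100 pyramid levels.

import cv2

def build_pyramid(img, min_size=48, factor=0.709):
    # Returns a list of (scale, scaled image) pairs forming the picture pyramid.
    pyramid, s = [], 1.0
    h, w = img.shape[:2]
    while min(h * s, w * s) >= min_size and len(pyramid) < 100:
        scaled = cv2.resize(img, (int(w * s), int(h * s)))
        pyramid.append((s, scaled))
        s *= factor
    return pyramid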
2.4.2.2 initializing variable p ═ 1;
2.4.2.3 The trained first feature extraction module extracts the features of the p-th picture in the picture pyramid to obtain the first image feature F5 (of size feature map height × feature map width × number of feature map channels).
2.4.2.4 The preliminary target detection module detects the first image feature F5 with the knee joint region detection method to obtain the knee joint regions in the single knee image I. The knee joint region detection method is as follows:
2.4.2.4.1 The 1×1 convolutional layer in the preliminary target detection module performs a 1×1 convolution on the first image feature F5 of the p-th picture in the picture pyramid and outputs two groups of vectors. The first group is the knee joint probability vector A1_p, in which each 1-dimensional element stores the probability value (in the range [0, 1]) that the corresponding location is a knee joint; when the probability value is not lower than α, the location is considered a knee joint, where α is a first threshold, α ∈ [0.5, 1], preferably α = 0.6. The second group is the knee joint bounding box coordinate offset vector B1_p, in which each element is a 4-dimensional vector (x1, y1, x2, y2) representing the bounding box offsets (upper-left abscissa offset, upper-left ordinate offset, lower-right abscissa offset, lower-right ordinate offset).
2.4.2.4.2 The knee joint bounding box coordinate operation layer of the preliminary target detection module screens and computes A1_p and B1_p to obtain the knee joint regions in the single knee image I. Find the N vector positions in A1_p whose probability value is not lower than α, find the corresponding N vector positions in B1_p, and take the 4-dimensional coordinate offset vectors at these N positions, where the n-th offset vector holds the coordinate offsets (upper-left abscissa offset, upper-left ordinate offset, lower-right abscissa offset, lower-right ordinate offset) of the n-th bounding box. Then calculate the coordinates of the N knee joint region bounding boxes in the original single knee image I and frame the N knee joint regions on the original image I according to these bounding box coordinates; the coordinates (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate) of the n-th bounding box are computed from the position of the corresponding vector in the feature map, its 4-dimensional coordinate offsets, and the scaling factor s (0 < s ≤ 1) of the original single knee image I, which maps the detection back onto the original image.
2.4.2.5 If p ≥ PP, go to 2.4.2.6; if p < PP, let p = p + 1 and go to 2.4.2.3.
2.4.2.6 The first non-maximum suppression screening layer uses the non-maximum suppression (NMS) algorithm to filter all knee joint regions of the m-th single knee image I (for the NMS algorithm see the document "Efficient Non-Maximum Suppression" [C]//18th International Conference on Pattern Recognition (ICPR 2006), 20-24 August 2006, Hong Kong, China). The non-maximum suppression filtering threshold δ is set to 0.6; the knee joint bounding regions are screened by this threshold to obtain the N1 (N1 is a positive integer and 0 ≤ N1 ≤ 100) preliminary knee joint regions of the m-th single knee image I.
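For reference, a plain NumPy implementation of the greedy non-maximum suppression used here is sketched below; with delta = 0.6 it reproduces the screening of step 2.4.2.6, and the function name is illustrative.

import numpy as np

def nms(boxes, scores, delta=0.6):
    # boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,). Returns indices of kept boxes.
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        overlap = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][overlap <= delta]
    return keep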
2.4.3 prepare training samples for the second level network on the mth single knee image by:
2.4.3.1 Initialize variables n1 = 1 and k = 1;
2.4.3.2 Calculate the intersection-over-union between the n1-th preliminary knee joint region of the m-th single knee image I and the knee joint region labeled on the m-th single knee image I. If the IoU is not lower than the positive-sample threshold, take the n1-th preliminary knee joint region as a positive sample for second-level network training and go to 2.4.3.3; if the IoU lies in the intermediate range, take the n1-th preliminary knee joint region as a part sample for second-level network training and go to 2.4.3.3; if the IoU is lower than the negative-sample threshold, take the n1-th preliminary knee joint region as a negative sample for second-level network training and go to 2.4.3.3.
2.4.3.3 If n1 ≥ N1 (N1 is the number of preliminary knee joint regions in the m-th single knee image I, 0 ≤ N1 ≤ 100), go to 2.4.3.4; if n1 < N1, let n1 = n1 + 1 and go to 2.4.3.2.
2.4.3.4 Randomly select the k-th point (x_k, y_k) on the m-th single knee image and use it as the upper-left coordinate point of the randomly selected bounding box.
2.4.3.5 Take the width w_k and height h_k of the k-th randomly selected bounding box: let the width of the m-th single knee image be w_m and its height h_m; take half of the minimum of w_m and h_m and round it, i.e., int(min(w_m, h_m)/2); then randomly take integers in the range [48, int(min(w_m, h_m)/2)] as the width w_k and height h_k of the randomly selected bounding box.
2.4.3.6 Determine the coordinates of the k-th randomly selected bounding box: mark the point (x_k + w_k, y_k + h_k) as the lower-right coordinate point of the k-th randomly selected bounding box; the coordinates of the k-th randomly selected bounding box are then (x_k, y_k, x_k + w_k, y_k + h_k), representing the k-th randomly selected bounding box (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate).
2.4.3.7 Calculate the intersection-over-union between the k-th randomly selected bounding box and the labeled bounding box on the m-th single knee image. If the IoU is not lower than the key point threshold, take the k-th randomly selected bounding box as training data for knee joint region key point localization and go to 2.4.3.8; otherwise go directly to 2.4.3.8.
2.4.3.8 If k < K, let k = k + 1 and go to 2.4.3.4; if k ≥ K, go to 2.4.4.
2.4.4 If m ≥ 2M, go to 2.4.5; if m < 2M, let m = m + 1 and go to 2.4.2.
2.4.5 Rescale the positive, part and negative samples generated in steps 2.4.3-2.4.4 to 48 × 48 × 3 (image height × image width × number of image channels); since they are RGB images, the number of image channels is 3.
2.4.6 Augment the rescaled positive, part and negative samples from step 2.4.5 by a mirroring operation: flip each training sample horizontally (left-right), i.e., mirror it, to obtain the training samples of the second-level network. The number of positive samples for second-level network training is N21, the number of part samples is N22 and the number of negative samples is N23, where N21, N22 and N23 are all positive integers.
2.5 training the second-level network by adopting the second-level network training sample generated in the step 2.4.6 to obtain the trained second-level network, wherein the method comprises the following steps:
2.5.1 initializing variable t ═ 1, setting the values of the model parameters, including setting the learning rate to 0.0001 and the batch size batch _ size to 500;
2.5.2 Input the training samples of the second-level network into the second-level network and compute the overall loss function L_t, calculated as follows. Under the model parameters set in step 2.5.1, the parameters of the second-level network are updated by back-propagating the L_t value, giving the second-level network NET_t whose parameters have been updated for the t-th time:
L_t = Σ_{i=1..N_2} Σ_{j=1..3} α_j · L_i^j
where N_2 is the number of training samples of the second-level network, N_2 = N21 + N22 + N23; L_i^j denotes the loss of the i-th training sample on the j-th task, 1 ≤ j ≤ 3, the 3 tasks being knee joint detection, knee joint bounding box localization and knee joint key point localization; α_j is the importance coefficient of the j-th task: the importance coefficient of the knee joint detection task is α_1 = 0.8, the importance coefficient of the knee joint bounding box localization task is α_2 = 0.6, and the importance coefficient of the knee joint key point localization task is α_3 = 1.5;
2.5.3 Let t = t + 1; if 2 ≤ t ≤ T (T is the number of training rounds, T = 10), go to 2.5.4; otherwise training is complete, the trained second-level network NET_T is obtained, go to the third step;
2.5.4 Input the training samples of the second-level network into NET_{t-1} and compute the t-th overall loss function L_t; under the model parameters set in step 2.5.1, update the parameters of NET_{t-1} by back-propagating the L_t value to obtain the second-level network NET_t whose parameters have been updated for the t-th time; go to 2.5.3.
Thirdly, preprocessing the X-ray film of the double knees to be detected to obtain 2 single knee images to be detected which are processed by histogram equalization, wherein the method comprises the following steps:
3.1 If the double-knee X-ray film to be detected has a bright background and dark legs, convert it into an image with a dark background and bright legs: subtract each original pixel value of the double-knee X-ray film from 255 and use the result as the new pixel value, so that the film becomes an image with a dark background and bright legs, then go to 3.2; if the double-knee X-ray film to be detected already has a dark background and bright legs, go directly to 3.2.
3.2 Scale the pixel values of the double-knee X-ray film to be detected to [0, 255], i.e., process the double-knee X-ray film to be detected into a uint8 double-knee image. The new pixel value of each pixel in the uint8 double-knee image is
P_new = (P - P_min) / (P_max - P_min) × 255
where P is the pixel value of the pixel in the image before processing, P_max is the maximum pixel value of the image before processing, and P_min is the minimum pixel value of the image before processing.
3.3 Convert the uint8 double-knee image into single knee images as follows: find the width W of the uint8 double-knee image and compute half of the width, W/2; round W/2 down to an integer, denoted int(W/2); finally cut the double-knee image along the width coordinates [0, int(W/2)] and [int(W/2)+1, W-1], i.e., divide the uint8 double-knee image into 2 single knee images.
3.4, performing histogram equalization processing on 2 single knee images, wherein the 2 single knee images subjected to the histogram equalization processing are used as the single knee images to be detected.
Fourthly, performing knee joint positioning on the 2 to-be-detected single knee images obtained in the third step based on the trained knee joint area positioning network of the multitask two-stage convolutional neural network, wherein the method comprises the following steps of:
4.1 initializing variable d ═ 1;
4.2 Generate a picture pyramid of the d-th single knee image to be detected: scale the single knee image to different sizes with scaling factors s (0 < s ≤ 1), obtaining PP (0 < PP < 100) single knee images of different sizes, each of size H × W × 3 (image height × image width × number of image channels); these images form the picture pyramid of the single knee image to be detected.
4.3 processing the image pyramid of the d-th single knee image to be detected by the first-stage network in the knee joint area positioning network based on the trained multitask two-stage convolutional neural network to obtain a preliminary knee joint area, wherein the method comprises the following steps:
4.3.1 initializing variable p ═ 1;
4.3.2 the first feature extraction module extracts the features of the p image in the image pyramid of the d single knee image to be detected, and the method comprises the following steps:
4.3.2.1 The first 3×3 convolutional layer in the first feature extraction module performs a convolution operation on the p-th picture in the picture pyramid of the d-th single knee image to be detected, and the first 2×2 max pooling layer pools the convolved result, outputting feature map F1. The convolution stride is 1 and the max pooling stride is 2. With the size of the p-th picture in the picture pyramid recorded as H × W × 3 (picture height × picture width × number of picture channels; 3 because these are RGB images), the first convolution and max pooling layers yield feature map F1 (feature map height × feature map width × number of channels of F1).
4.3.2.2 The second 3×3 convolutional layer in the first feature extraction module convolves feature map F1, and the second, 3×3, max pooling layer pools the convolved result to obtain feature map F2 (feature map height × feature map width × number of channels of F2); the convolution stride is 1 and the max pooling stride is 2.
4.3.2.3 The third 3×3 convolutional layer in the first feature extraction module convolves feature map F2, and the third, 2×2, max pooling layer pools the convolved result to obtain feature map F3 (feature map height × feature map width × number of channels of F3); the convolution stride is 1 and the max pooling stride is 2.
4.3.2.4 The fourth 3×3 convolutional layer in the first feature extraction module convolves feature map F3 to obtain feature map F4 (feature map height × feature map width × number of channels of F4); the convolution stride is 1.
4.3.2.5 The fifth 2×2 convolutional layer in the first feature extraction module convolves feature map F4 to obtain feature map F5 (feature map height × feature map width × number of channels of F5); the convolution stride is 1. Feature map F5 is the first image feature extracted by the first-level network from the p-th picture of the picture pyramid of the d-th single knee image to be detected.
4.3.3 The preliminary target detection module of the first-level network detects F5 with the knee joint region detection method described in 2.4.2.4 to obtain the knee joint regions of the single knee image to be detected;
4.3.4 If p < PP, let p = p + 1 and go to 4.3.2; if p ≥ PP, go to 4.3.5.
4.3.5 The first non-maximum suppression screening layer filters all knee joint bounding boxes in the d-th single knee image to be detected with the non-maximum suppression (NMS) algorithm. The filtering threshold δ of the NMS algorithm is set to 0.7; the knee joint bounding boxes in the single knee image to be detected are screened by this threshold, the filtered bounding boxes frame the knee joint regions on the d-th single knee image to be detected, and the first-level network outputs N1 preliminary knee joint regions.
4.4 The second-level network processes the N1 preliminary knee joint regions of the d-th single knee image to be detected to obtain the final knee joint region and the key points of that region, as follows:
4.4.1 Initialize variable n1 = 1;
4.4.2 Normalize the size of the n1-th preliminary knee joint region of the d-th single knee image to be detected, output by the first-level network, to 48 × 48 × 3 (image height × image width × number of image channels).
4.4.3 The second-level network extracts the features of the n1-th preliminary knee joint region image on the d-th single knee image to be detected, as follows:
4.4.3.1 The first 3×3 convolutional layer in the second feature extraction module convolves the n1-th preliminary knee joint region image, and the first 2×2 max pooling layer pools the convolved result, outputting feature map F6. The convolution stride is 1 and the max pooling stride is 2. The scale-normalized preliminary knee joint region image of size 48 × 48 × 3 (picture height × picture width × number of picture channels; 3 because it is an RGB image) passes through the first convolution and max pooling layers to yield feature map F6 of size 23 × 23 × 32 (feature map height × feature map width × number of channels of F6).
4.4.3.2 The second 3×3 convolutional layer in the second feature extraction module convolves feature map F6, and the second, 3×3, max pooling layer pools the convolved result, outputting feature map F7. The convolution stride is 1 and the max pooling stride is 2. Feature map F6 of size 23 × 23 × 32 passes through the second convolution and max pooling layers to yield feature map F7 of size 10 × 10 × 64 (feature map height × feature map width × number of channels of F7).
4.4.3.3 The third 3×3 convolutional layer in the second feature extraction module convolves feature map F7, and the third, 2×2, max pooling layer pools the convolved result, outputting feature map F8. The convolution stride is 1 and the max pooling stride is 2. Feature map F7 of size 10 × 10 × 64 passes through the third convolution and max pooling layers to yield feature map F8 of size 4 × 4 × 64 (feature map height × feature map width × number of channels of F8).
4.4.3.4 The fourth 3×3 convolutional layer in the second feature extraction module convolves feature map F8, outputting feature map F9. The convolution stride is 1. Feature map F8 of size 4 × 4 × 64 passes through the fourth convolution layer to yield feature map F9 of size 2 × 2 × 128 (feature map height × feature map width × number of channels of F9).
4.4.3.5 The first fully-connected layer in the second feature extraction module performs a fully-connected operation on feature map F9, outputting feature map F10. Feature map F9 of size 2 × 2 × 128 passes through the first fully-connected layer to yield feature map F10, which contains a 256-dimensional vector; feature map F10 is the extracted second image feature.
4.4.4 The final target detection module of the second-level network processes the second image feature, i.e., feature map F10, and outputs the final knee joint region and key point coordinates of the d-th single knee image to be detected, as follows:
4.4.4.1 The second fully-connected layer of the final target detection module of the second-level network performs a fully-connected operation on feature map F10 and outputs three groups of vectors: the knee joint probability vector, the knee joint bounding box coordinate offset vector and the knee joint key point coordinate offset vector, where each knee joint probability entry is a 1-dimensional vector, each knee joint bounding box coordinate offset entry is a 4-dimensional vector, and each knee joint key point coordinate offset entry is a 12-dimensional vector.
4.4.4.2 The knee joint bounding box coordinate and knee joint region key point coordinate operation layer of the final target detection module of the second-level network screens and computes these vectors to obtain the knee joint region of the d-th single knee image to be detected and the key points of that region. The knee joint probability vector stores the probability value (in the range [0, 1]) that the region is a knee joint; when the probability value is not lower than β the region is considered a knee joint, where β is a second threshold, β ∈ [0.5, 1], preferably β = 0.7. If the probability value of the n1-th preliminary knee joint region is greater than β, the n1-th preliminary knee joint region is retained, together with its preliminary bounding box coordinates (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate), the corresponding knee joint bounding box coordinate offset vector and the corresponding knee joint key point coordinate offset vector. The bounding box offset vector holds the offset coordinates of the n1-th knee joint bounding box (upper-left abscissa offset, upper-left ordinate offset, lower-right abscissa offset, lower-right ordinate offset), and the key point offset vector holds the offset coordinates (abscissa offset, ordinate offset) of the key points FM, FL, TM, TL, JSM and JSL of the n1-th knee joint region. The final bounding box coordinates of the n1-th knee joint (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate) and the coordinates of its 6 key points (FM, FL, TM, TL, JSL, JSM) are then obtained by applying these offsets to the coordinates of the preliminary bounding box.
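The exact coordinate update equations of step 4.4.4.2 are given as images in the original filing; as an illustration only, the sketch below shows a common MTCNN-style decode in which each offset is scaled by the width or height of the preliminary box before being added. This is an assumption about the form of the update, not the patent's own formulas.

import numpy as np

def decode_box_and_keypoints(prelim_box, box_offsets, kpt_offsets):
    # prelim_box: (x1, y1, x2, y2); box_offsets: length-4; kpt_offsets: length-12
    # ordered as (FM, FL, TM, TL, JSM, JSL) x (dx, dy).
    x1, y1, x2, y2 = prelim_box
    w, h = x2 - x1, y2 - y1
    final_box = (x1 + box_offsets[0] * w, y1 + box_offsets[1] * h,
                 x2 + box_offsets[2] * w, y2 + box_offsets[3] * h)
    kpts = np.asarray(kpt_offsets, dtype=np.float64).reshape(6, 2).copy()
    kpts[:, 0] = x1 + kpts[:, 0] * w  # key point abscissas
    kpts[:, 1] = y1 + kpts[:, 1] * h  # key point ordinates
    return final_box, kpts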
4.4.4.3 If n1 < N1, let n1 = n1 + 1 and go to 4.4.2; if n1 ≥ N1, go to 4.4.4.4.
4.4.4.4 The second non-maximum suppression screening layer screens the knee joint bounding boxes in the d-th single knee image to be detected with the non-maximum suppression (NMS) algorithm to obtain the final bounding box and the final knee joint region key points of the d-th single knee image to be detected; the filtering threshold δ of the NMS algorithm is 0.7.
4.4.4.5 If d < 2, let d = d + 1 and go to 4.2; if d ≥ 2, go to the fifth step.
Fifthly, the process ends.

Claims (11)

1. A knee joint positioning method based on a multitask two-stage convolution neural network is characterized by comprising the following steps:
the method comprises the following steps that firstly, a knee joint area positioning network based on a multitask two-stage convolutional neural network is built, the knee joint area positioning network based on the multitask two-stage convolutional neural network comprises a first-stage network and a second-stage network, and the output of the first-stage network is used as the input of the second-stage network;
the first-level network consists of a first feature extraction module and a preliminary target detection module; the first feature extraction module receives the single knee image I from the outside, extracts first image features of the single knee image I and sends the first image features to the preliminary target detection module; the preliminary target detection module detects the first image characteristics and outputs a preliminary knee joint area in the single knee image I;
the first feature extraction module is composed of 5 convolutional layers and 3 max pooling layers, wherein the 5 convolutional layers comprise 4 3×3 convolutional layers and 1 2×2 convolutional layer, and the 3 max pooling layers comprise 2 2×2 max pooling layers and 1 3×3 max pooling layer;
the first 3×3 convolutional layer performs a convolution operation on the single knee image I, and the first 2×2 max pooling layer performs a pooling operation on the single knee image I after the convolution operation to obtain feature map F1; the second 3×3 convolutional layer performs a convolution operation on feature map F1, and the first 3×3 max pooling layer performs a pooling operation on feature map F1 after the convolution operation to obtain feature map F2; the third 3×3 convolutional layer performs a convolution operation on feature map F2, and the second 2×2 max pooling layer performs a pooling operation on feature map F2 after the convolution operation to obtain feature map F3; the fourth 3×3 convolutional layer performs a convolution operation on feature map F3 to obtain feature map F4; and the fifth 2×2 convolutional layer performs a convolution operation on feature map F4 to obtain feature map F5, wherein feature map F5 is the extracted first image feature;
the preliminary target detection module comprises a1 × 1 convolution layer, a knee joint boundary frame coordinate operation layer and a first non-maximum value inhibition screening layer, wherein the 1 × 1 convolution layer convolves a characteristic diagram F5 to obtain two groups of vectors, namely a probability vector A1 of a knee joint and a knee joint boundary frame coordinate offset vector B1, the knee joint boundary frame coordinate operation layer determines a knee joint boundary frame according to A1 and B1 to obtain a knee joint area in the single knee image I, the first non-maximum value inhibition screening layer performs non-maximum inhibition screening on the knee joint area in the single knee image I to obtain a preliminary knee joint area in the single knee image I, and the single knee image I and the preliminary knee joint area in the single knee image I are sent to a second-level network;
the second-level network consists of a second feature extraction module and a final target detection module; the second feature extraction module is used for carrying out feature extraction on the single knee image I received from the first-level network and the preliminary knee joint region in the single knee image I to obtain second image features; the final target detection module performs target detection on the second image characteristics to obtain final knee joint boundary frame coordinates of the single knee image I and knee joint area key point coordinates of the single knee image I;
the second feature extraction module is composed of 4 convolutional layers, 3 max pooling layers and 1 fully-connected layer, wherein the 4 convolutional layers comprise 3 3×3 convolutional layers and 1 2×2 convolutional layer, and the 3 max pooling layers comprise 2 2×2 max pooling layers and 1 3×3 max pooling layer;
the first 3×3 convolutional layer in the second feature extraction module performs a convolution operation on the preliminary knee joint region in the single knee image I, and the first 2×2 max pooling layer performs a pooling operation on the preliminary knee joint region in the single knee image I after the convolution operation to obtain feature map F6; the second 3×3 convolutional layer performs a convolution operation on feature map F6, and the 3×3 max pooling layer performs a pooling operation on feature map F6 after the convolution operation to obtain feature map F7; the third 3×3 convolutional layer performs a convolution operation on feature map F7, and the second 2×2 max pooling layer performs a pooling operation on feature map F7 after the convolution operation to obtain feature map F8; the fourth convolutional layer performs a convolution operation on feature map F8 to obtain feature map F9; and the first fully-connected layer performs a fully-connected operation on feature map F9 to obtain feature map F10, wherein feature map F10 is the extracted second image feature;
the final target detection module comprises a second full-connection layer, a knee joint boundary frame, a key point coordinate operation layer of a knee joint area and a second non-maximum value inhibition screening layer; the second full-connection layer performs full-connection on the feature map F10 to obtain three groups of vectors, namely a probability vector A2 of the knee joint, a coordinate offset vector B2 of a boundary frame of the knee joint and a coordinate offset vector C of six key points in a knee joint region; determining the coordinates of the knee joint boundary box and the key point coordinates of the knee joint region of the single knee image I according to A2, B2 and C by the key point coordinate operation layer of the knee joint boundary box and the knee joint region; the second non-maximum inhibition screening layer carries out non-maximum inhibition screening on the knee joint region and the knee joint key point coordinates in the single knee image I, and outputs the single knee image I, the final knee joint region of the single knee image I and the knee joint region key point coordinates of the single knee image I;
secondly, training a knee joint area positioning network based on a multitask two-stage convolutional neural network, wherein the method comprises the following steps:
2.1 prepare the data of the knee joint area location network based on the multitask two-stage convolution neural network:
2.1.1, preprocessing M original images, namely X-ray medical images containing left and right knees, to obtain 2M single knee images subjected to histogram equalization, wherein M is a positive integer;
2.1.2 annotate the knee joint real bounding box in the 2M single knee images after histogram equalization, the method is as follows:
2.1.2.1 initializing variable m ═ 1;
2.1.2.2 manually and manually labeling 6 key points of the knee joint area of the mth single knee image; the 6 key points refer to 6 points on the boundaries of the knee joint gaps and osteophytes, and the 6 points on the left knee and the right knee are respectively femoral medial osteophyte points FM, femoral lateral osteophyte points FL, tibial medial osteophyte points TM, tibial lateral osteophyte points TL, joint gap medial points JSM and joint gap lateral points JSL; the 6 key points of the left knee and the right knee are in mirror symmetry;
2.1.2.3 marking the boundary box of the knee joint in the mth single knee image according to the key points marked manually, the method is:
2.1.2.3.1 calculate the center point coordinates (x_mid, y_mid) of the 6 key points in the m-th single knee image; let the coordinates of the 6 key points be (x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5), (x6, y6); the center point coordinates (x_mid, y_mid) of the 6 key points are:
x_mid = (x1 + x2 + x3 + x4 + x5 + x6) / 6;
y_mid = (y1 + y2 + y3 + y4 + y5 + y6) / 6;
2.1.2.3.2 calculate the knee joint width w_knee as follows: compute the maximum abscissa x_max and the minimum abscissa x_min of the 6 key points; the knee joint width w_knee is the difference between x_max and x_min, namely:
x_max = max(x1, x2, x3, x4, x5, x6);
x_min = min(x1, x2, x3, x4, x5, x6);
w_knee = x_max - x_min;
2.1.2.3.3 annotate the knee joint region to obtain the real knee joint region bounding box coordinates (x_tl, y_tl, x_br, y_br), which represent the real knee joint region bounding box (upper-left abscissa, upper-left ordinate, lower-right abscissa, lower-right ordinate); taking the center point of the 6 key points as the center, extend the region by 0.65 times the knee joint width in each direction to obtain the knee joint region of real interest, i.e., the labeled bounding box; the calculation formulas are:
x_tl = x_mid - 0.65 × w_knee;
y_tl = y_mid - 0.65 × w_knee;
x_br = x_mid + 0.65 × w_knee;
y_br = y_mid + 0.65 × w_knee;
2.1.2.4 if m ≥ 2M, go to 2.2; if m < 2M, let m = m + 1 and go to 2.1.2.2;
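For illustration, a minimal Python sketch of the labeling in step 2.1.2.3 is given below. The 0.65 expansion from the center point follows the reading reconstructed above; the exact formulas exist only as images in the original, so the function is an assumption-laden sketch, not the claimed formula.

```python
from typing import List, Tuple

def knee_gt_box(keypoints: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Six key points in the order FM, FL, TM, TL, JSM, JSL (pixel coordinates)."""
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    x_mid, y_mid = sum(xs) / 6.0, sum(ys) / 6.0   # center of the 6 key points (step 2.1.2.3.1)
    w_knee = max(xs) - min(xs)                    # knee joint width (step 2.1.2.3.2)
    half = 0.65 * w_knee                          # 0.65 x knee width expansion (assumed reading)
    # (upper-left x, upper-left y, lower-right x, lower-right y)
    return (x_mid - half, y_mid - half, x_mid + half, y_mid + half)

# example with six dummy key points
print(knee_gt_box([(120, 200), (220, 205), (125, 260), (215, 262), (150, 230), (195, 232)]))
```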
2.2 prepare training samples for the first stage network in the knee joint area positioning network based on the multitask two-stage convolution neural network, the method is as follows:
2.2.1 initializing variable m = 1;
2.2.2 prepare training samples for the first level network on the mth single knee image by:
2.2.2.1 initializing variable k = 1;
2.2.2.2 randomly take the kth point on the mth single knee image, denoting its coordinates as (x1_k, y1_k); take this point as the upper-left coordinate point of the kth randomly selected bounding box;
2.2.2.2 take the width w_k and the height h_k of the kth randomly selected bounding box;
2.2.2.3 determine the coordinates of the kth randomly selected bounding box as follows: let the point (x2_k, y2_k) = (x1_k + w_k, y1_k + h_k) be the lower-right coordinate point of the randomly selected bounding box; the coordinates of the kth randomly selected bounding box are then (x1_k, y1_k, x2_k, y2_k), representing the kth randomly selected bounding box as (upper-left horizontal coordinate, upper-left vertical coordinate, lower-right horizontal coordinate, lower-right vertical coordinate);
2.2.2.4 calculate the intersection-over-union IoU_k between the kth randomly selected bounding box (x1_k, y1_k, x2_k, y2_k) and the labeled bounding box on the mth single knee image; if IoU_k is not less than the positive-sample threshold, take the kth randomly selected bounding box as a positive training sample and go to 2.2.2.5; if IoU_k lies in the partial-sample interval, take the kth randomly selected bounding box as a partial training sample and go to 2.2.2.5; if IoU_k is below the negative-sample threshold, take the kth randomly selected bounding box as a negative training sample and go to 2.2.2.5 (the IoU expression and the three thresholds are given only as formula images in the original);
2.2.2.5 if k < K, where K is a positive integer equal to the total number of randomly selected bounding boxes on the mth single knee image, let k = k + 1 and go to 2.2.2.2; if k ≥ K, go to 2.2.2.6;
2.2.2.6 if m ≥ 2M, go to 2.2.3; if m < 2M, let m = m + 1 and go to 2.2.2;
2.2.3 resize the training samples generated in step 2.2.2, namely the positive samples, partial samples and negative samples, to images of size 48 × 48 × 3 (length 48, width 48, 3 image channels);
2.2.4 perform the mirror operation on the training samples resized in step 2.2.3 for augmentation, i.e. flip each training sample horizontally (left-right), generating the first-level network training samples; the number of first-level network positive samples is N11, the number of partial samples is N12, and the number of negative samples is N13, where N11, N12 and N13 are all positive integers;
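A minimal Python sketch of the sample preparation in steps 2.2.2–2.2.4 follows. The IoU thresholds used here (0.65 / 0.4 / 0.3) are assumptions in the MTCNN style, since the patent gives its thresholds only as formula images; box sizes follow claim 3, and the image is assumed to be well over 96 px on each side.

```python
import random
import cv2  # opencv-python

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def sample_first_level(img, gt_box, K=100):
    """img: single knee image (numpy array); gt_box: labeled bounding box."""
    h, w = img.shape[:2]
    samples = []
    for _ in range(K):
        bw = random.randint(48, min(w, h) // 2)   # box width, per claim 3
        bh = random.randint(48, min(w, h) // 2)   # box height, per claim 3
        x1 = random.randint(0, w - bw - 1)
        y1 = random.randint(0, h - bh - 1)
        ov = iou((x1, y1, x1 + bw, y1 + bh), gt_box)
        if ov >= 0.65:
            label = "pos"                          # assumed positive threshold
        elif ov >= 0.4:
            label = "part"                         # assumed partial-sample interval
        elif ov < 0.3:
            label = "neg"                          # assumed negative threshold
        else:
            continue                               # ambiguous overlap: discard
        crop = cv2.resize(img[y1:y1 + bh, x1:x1 + bw], (48, 48))   # step 2.2.3
        samples.append((crop, label))
        samples.append((cv2.flip(crop, 1), label))                 # step 2.2.4 mirror
    return samples
```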
2.3, training the first-stage network by adopting a first-stage network training sample to obtain a trained first-stage network;
2.4 prepare training samples for the second-stage network in the knee joint area positioning network based on the multitask two-stage convolutional neural network, the method comprises the following steps:
2.4.1 initializing variable m = 1;
2.4.2 positioning the preliminary knee joint region of the mth single knee image I by using the trained first-level network, wherein the method comprises the following steps:
2.4.2.1, generating a picture pyramid of the mth single-knee image I, wherein the method comprises the steps of multiplying the mth single-knee image I by different scaling factors s in sequence, wherein s is a positive real number, scaling the mth single-knee image I in different scales to obtain PP single-knee images with different sizes to form the picture pyramid of the single-knee image I, and PP is a positive integer;
2.4.2.2 initializing variable p = 1;
2.4.2.3 the trained first feature extraction module extracts the features of the pth picture in the picture pyramid to obtain the first image feature F5 (the size of F5 is given as a formula image in the original);
2.4.2.4 the preliminary target detection module detects the first image characteristic F5 by adopting a knee joint region detection method to obtain a knee joint region in the single knee image I;
2.4.2.5 if p ≥ PP, go to 2.4.2.6; if p < PP, let p = p + 1 and go to 2.4.2.3;
2.4.2.6 the first non-maximum suppression screening layer filters all knee joint regions of the mth single knee image I with the non-maximum suppression (NMS) algorithm to obtain the N1 preliminary knee joint regions of the mth single knee image I, where N1 is a positive integer;
2.4.3 prepare training samples for the second level network on the mth single knee image by:
2.4.3.1 initializing variable n1=1,k=1;
2.4.3.2 calculate the intersection-over-union between the n1-th preliminary knee joint region of the mth single knee image I and the knee joint region labeled on the mth single knee image I; if it is not less than the positive-sample threshold, take the n1-th preliminary knee joint region as a positive sample for second-level network training and go to 2.4.3.3; if it lies in the partial-sample interval, take the n1-th preliminary knee joint region as a partial sample for second-level network training and go to 2.4.3.3; if it is below the negative-sample threshold, take the n1-th preliminary knee joint region as a negative sample for second-level network training and go to 2.4.3.3 (the thresholds are given only as formula images in the original);
2.4.3.3 if n1 ≥ N1, go to 2.4.3.4; if n1 < N1, let n1 = n1 + 1 and go to 2.4.3.2;
2.4.3.4 randomly select the kth point (x1_k, y1_k) on the mth single knee image and take it as the upper-left coordinate point of the randomly selected bounding box;
2.4.3.5 take the width w_k and the height h_k of the kth randomly selected bounding box;
2.4.3.6 determine the coordinates of the kth randomly selected bounding box: the point (x2_k, y2_k) = (x1_k + w_k, y1_k + h_k) is taken as the lower-right coordinate point of the kth randomly selected bounding box; the coordinates of the kth randomly selected bounding box are then (x1_k, y1_k, x2_k, y2_k), representing the kth randomly selected bounding box as (upper-left horizontal coordinate, upper-left vertical coordinate, lower-right horizontal coordinate, lower-right vertical coordinate);
2.4.3.7 calculate the intersection-over-union between the kth randomly selected bounding box and the labeled bounding box on the mth single knee image; if it is not less than the key-point training threshold, take the kth randomly selected bounding box as training data for knee joint region key point localization and go to 2.4.3.8; otherwise go directly to 2.4.3.8 (the threshold is given only as a formula image in the original);
2.4.3.8 if k < K, let k = k + 1 and go to 2.4.3.4; if k ≥ K, go to 2.4.4;
2.4.4 if m ≥ 2M, go to 2.4.5; if m < 2M, let m = m + 1 and go to 2.4.2;
2.4.5, reducing the sizes of the positive samples, the partial samples and the negative samples generated in the steps 2.4.3-2.4.4 to 48 × 48 × 3;
2.4.6 perform the mirror operation on the positive, partial and negative samples resized in step 2.4.5 for augmentation, i.e. flip each training sample horizontally (left-right), obtaining the training samples of the second-level network; the number of positive samples for second-level network training is N21, the number of partial samples is N22, and the number of negative samples is N23, where N21, N22 and N23 are all positive integers;
2.5 training the second-level network by adopting the second-level network training sample generated in the step 2.4.6 to obtain a trained second-level network;
thirdly, preprocessing the double-knee X-ray film to be detected to obtain 2 single-knee images to be detected, which are subjected to histogram equalization processing;
fourthly, performing knee joint positioning on the 2 single knee images to be detected based on the trained knee joint area positioning network of the multitask two-stage convolution neural network, wherein the method comprises the following steps:
4.1 initializing variable d = 1;
4.2 generate the picture pyramid of the dth single knee image to be detected by scaling the single knee image, of size H × W × 3, at different scales with scaling factor s to obtain PP single knee images of different sizes; these are called the picture pyramid of the single knee image to be detected;
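A short Python sketch of such pyramid generation follows. The patent only states that the image is scaled repeatedly by a factor s (0 < s ≤ 1) into PP levels; taking the scales as s, s², …, s^PP is an assumption made for illustration.

```python
import cv2

def picture_pyramid(img, s=0.8, pp=5):
    """Return [(scale, scaled image), ...] for pp pyramid levels (scales assumed as powers of s)."""
    levels, scale = [], 1.0
    h, w = img.shape[:2]
    for _ in range(pp):
        scale *= s
        levels.append((scale, cv2.resize(img, (int(w * scale), int(h * scale)))))
    return levels
```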
4.3 processing the image pyramid of the d-th single knee image to be detected by the first-stage network in the knee joint area positioning network based on the trained multitask two-stage convolutional neural network to obtain a preliminary knee joint area, wherein the method comprises the following steps:
4.3.1 initializing variable p = 1;
4.3.2 the first feature extraction module extracts the features of the p image in the image pyramid of the d single knee image to be detected, and the method comprises the following steps:
4.3.2.1 the first 3 × 3 convolutional layer in the first feature extraction module performs the convolution operation on the pth picture in the picture pyramid of the dth single knee image to be detected, and the first 2 × 2 max-pooling layer performs the pooling operation on the convolved picture, outputting a feature map F1; the convolution stride is 1 and the max-pooling stride is 2; the pth picture in the picture pyramid, of size H × W × 3, passes through the first convolution and max-pooling layers to give the feature map F1 (the intermediate feature-map sizes of the first-level network are given only as formula images in the original);
4.3.2.2 the second 3 × 3 convolutional layer in the first feature extraction module performs the convolution operation on the feature map F1, and the second 3 × 3 max-pooling layer performs the pooling operation on the convolved feature map F1 to obtain a feature map F2; the convolution stride is 1 and the max-pooling stride is 2;
4.3.2.3 the third 3 × 3 convolutional layer in the first feature extraction module performs the convolution operation on the feature map F2, and the third 2 × 2 max-pooling layer performs the pooling operation on the convolved feature map F2 to obtain a feature map F3; the convolution stride is 1 and the max-pooling stride is 2;
4.3.2.4 the fourth 3 × 3 convolutional layer in the first feature extraction module performs the convolution operation on the feature map F3 to obtain a feature map F4; the convolution stride is 1;
4.3.2.5 the fifth 2 × 2 convolutional layer in the first feature extraction module performs the convolution operation on the feature map F4 to obtain a feature map F5; the convolution stride is 1; the feature map F5 is the first image feature extracted by the first-level network from the pth picture of the picture pyramid of the dth single knee image to be detected;
4.3.3 the primary target detection module of the first-level network detects the F5 by adopting a knee joint region detection method to obtain the knee joint region of the single knee image to be detected;
4.3.4 if p < PP, let p = p + 1 and go to 4.3.2; if p ≥ PP, go to 4.3.5;
4.3.5 the second non-maximum suppression screening layer adopts the non-maximum suppression algorithm to filter all knee joint bounding boxes in the dth single knee image to be detected; the filtered bounding boxes frame the knee joint regions on the dth single knee image to be detected, and the first-level network outputs N1 preliminary knee joint regions;
4.4 the second-level network processes the N1 preliminary knee joint regions of the dth single knee image to be detected to obtain the final knee joint region and the key points of that region, the method is as follows:
4.4.1 initializing variable n1=1;
4.4.2 normalize the size of the n1-th preliminary knee joint region, output by the first-level network for the dth single knee image to be detected, to 48 × 48 × 3;
4.4.3 the second-level network extracts the features of the n1-th preliminary knee joint region image on the dth single knee image to be detected, the method is as follows:
4.4.3.1 the first 3 × 3 convolutional layer in the second feature extraction module performs the convolution operation on the n1-th preliminary knee joint region image, and the first 2 × 2 max-pooling layer performs the pooling operation on the convolved image, outputting a feature map F6; the convolution stride is 1 and the max-pooling stride is 2; the size-normalized preliminary knee joint region image of size 48 × 48 × 3 passes through the first convolution and max-pooling layers to give a feature map F6 of size 23 × 23 × 32;
4.4.3.2 the second 3 × 3 convolutional layer in the second feature extraction module performs convolution operation on the feature map F6, the second 3 × 3 maximal pooling layer performs pooling operation on the feature map F6 which completes the convolution operation, and the feature map F7 is output, wherein the convolution operation step is 1, the maximal pooling operation step is 2, and the feature map F6 with the size of 23 × 23 × 32 is subjected to second convolution and maximal pooling operation to obtain a feature map F7 with the size of 10 × 10 × 64;
4.4.3.3 the third 3 × 3 convolutional layer in the second feature extraction module performs convolution operation on the feature map F7, the third 2 × 2 maximal pooling layer performs pooling operation on the feature map F7 which completes the convolution operation, and the feature map F8 is output, wherein the convolution operation step is 1, the maximal pooling operation step is 2, and the feature map F7 with the size of 10 × 10 × 64 is subjected to third convolution and maximal pooling operation to obtain the feature map F8 with the size of 4 × 4 × 64;
4.4.3.4 the fourth 3 × 3 convolutional layer in the second feature extraction module performs the convolution operation on the feature map F8 and outputs a feature map F9; the convolution stride of the fourth layer is 1, and the feature map F8 of size 4 × 4 × 64 passes through the fourth convolution layer to give a feature map F9 of size 2 × 2 × 128;
4.4.3.5 the first fully-connected layer in the second feature extraction module performs the fully-connected operation on the feature map F9 and outputs a feature map F10; the feature map F9 of size 2 × 2 × 128 passes through the first fully-connected layer to give a feature map F10 containing a 256-dimensional vector; the feature map F10 is the extracted second image feature;
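The layer sizes of the second feature extraction module (48 × 48 × 3 → 23 × 23 × 32 → 10 × 10 × 64 → 4 × 4 × 64 → 2 × 2 × 128 → 256) are fully stated in 4.4.3.1–4.4.3.5, so a PyTorch sketch can reproduce them; the activation function (PReLU here) and the use of valid (no) padding are assumptions not stated in the text.

```python
import torch
import torch.nn as nn

class SecondFeatureExtractor(nn.Module):
    """Sketch of the second feature extraction module of steps 4.4.3.1-4.4.3.5."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3), nn.PReLU(),   # 48 -> 46
            nn.MaxPool2d(2, 2),                # 46 -> 23   (F6: 23x23x32)
            nn.Conv2d(32, 64, 3), nn.PReLU(),  # 23 -> 21
            nn.MaxPool2d(3, 2),                # 21 -> 10   (F7: 10x10x64)
            nn.Conv2d(64, 64, 3), nn.PReLU(),  # 10 -> 8
            nn.MaxPool2d(2, 2),                # 8  -> 4    (F8: 4x4x64)
            nn.Conv2d(64, 128, 3), nn.PReLU(), # 4  -> 2    (F9: 2x2x128)
        )
        self.fc = nn.Linear(2 * 2 * 128, 256)  # F10: 256-dimensional vector

    def forward(self, x):                      # x: (N, 3, 48, 48)
        f9 = self.features(x)
        return self.fc(f9.flatten(1))          # second image feature F10

print(SecondFeatureExtractor()(torch.zeros(1, 3, 48, 48)).shape)  # torch.Size([1, 256])
```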
4.4.4 the final target detection module of the second level network processes F10, outputs the final knee joint region and the key point coordinates of the d-th single knee image to be detected, the method is:
4.4.4.1 the second fully-connected layer of the final target detection module of the second-level network performs the fully-connected operation on F10 and outputs three groups of vectors, namely the knee joint probability vector A2 (a 1-dimensional vector), the knee joint bounding box coordinate offset vector B2 (a 4-dimensional vector), and the knee joint key point coordinate offset vector C (a 12-dimensional vector);
4.4.4.2 the knee joint bounding box coordinate and knee joint region key point coordinate operation layer of the final target detection module of the second-level network screens and computes the vectors A2, B2 and C obtained for the n1-th preliminary knee joint region, to obtain the knee joint area of the dth single knee image to be detected and the key points of the knee joint area; the vector A2 stores the probability value of the knee joint, and a knee joint is identified when the probability value is not lower than β, the second threshold, with β ∈ [0.5, 1]; if the probability value in A2 is greater than β, the n1-th preliminary knee joint region is retained, its bounding box coordinates being the n1-th preliminary knee joint region bounding box (upper-left horizontal coordinate, upper-left vertical coordinate, lower-right horizontal coordinate, lower-right vertical coordinate), and the knee joint bounding box coordinate offset vector B2 and the knee joint key point coordinate offset vector C are retained; B2 holds the offset coordinates of the n1-th knee joint bounding box (offset of the upper-left horizontal coordinate, offset of the upper-left vertical coordinate, offset of the lower-right horizontal coordinate, offset of the lower-right vertical coordinate), and C holds, for the n1-th knee joint region, the offset coordinates (offset abscissa, offset ordinate) of the key points FM, FL, TM, TL, JSM and JSL; the final bounding box coordinates of the n1-th knee joint (upper-left horizontal coordinate, upper-left vertical coordinate, lower-right horizontal coordinate, lower-right vertical coordinate) and the coordinates of the 6 key points FM, FL, TM, TL, JSM and JSL of the n1-th knee joint region are then computed from the retained preliminary bounding box coordinates and the offset vectors B2 and C (the corresponding update formulas are given only as formula images in the original);
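Because the update formulas of step 4.4.4.2 survive only as images, the following Python sketch shows one plausible, MTCNN-style decoding in which the offsets are expressed as fractions of the preliminary box width and height; it is an assumption for illustration, not the claimed formula.

```python
def decode(box, box_offset, kp_offsets):
    """box: preliminary (x1, y1, x2, y2); box_offset: 4 offsets; kp_offsets: 12 offsets."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    dx1, dy1, dx2, dy2 = box_offset
    final_box = (x1 + dx1 * w, y1 + dy1 * h, x2 + dx2 * w, y2 + dy2 * h)
    names = ("FM", "FL", "TM", "TL", "JSM", "JSL")
    keypoints = {}
    for i, name in enumerate(names):
        dx, dy = kp_offsets[2 * i], kp_offsets[2 * i + 1]
        keypoints[name] = (x1 + dx * w, y1 + dy * h)   # key point relative to the box corner (assumed)
    return final_box, keypoints
```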
4.4.4.3 if n1 < N1, let n1 = n1 + 1 and go to 4.4.2; if n1 ≥ N1, go to 4.4.4.4;
4.4.4.4 the second non-maximum suppression screening layer screens the knee joint bounding boxes in the dth single knee image to be detected with the non-maximum suppression (NMS) algorithm to obtain the final bounding box of the dth single knee image to be detected and the key points of the final knee joint area;
4.4.4.5 if d < 2, let d = d + 1 and go to 4.2; if d ≥ 2, go to the fifth step;
and fifthly, ending.
2. The knee joint positioning method based on the multitask two-stage convolution neural network as claimed in claim 1, wherein the method for preprocessing the original image in the step 2.1.1 is as follows:
2.1.1.1 randomly select M original images, i.e. X-ray medical images containing the left and right knees, from the OAI baseline public database;
2.1.1.2 uniformly transforms M original images into dark background and light leg images: firstly, selecting a double knee image with a bright background and dark legs from M original images, and then carrying out image pixel inversion, namely subtracting an original pixel value from 255 to uniformly convert the M original images into an image with a dark background and bright legs;
2.1.1.3 convert the pixels of the M dark-background, bright-legs images to [0, 255], i.e. process the dark-background, bright-legs images into uint8 images, by applying the following transformation to the pixel value of each pixel in the M dark-background, bright-legs images:
Pnew = 255 × (P − Pmin) / (Pmax − Pmin)
where P is the pixel value of any pixel in the pre-processed image, Pmax is the maximum pixel value in the pre-processed image, Pmin is the minimum pixel value in the pre-processed image, and Pnew is the pixel value of the corresponding pixel in the processed image;
2.1.1.4 convert the M uint8 images into 2M single knee images by: first finding, for each of the M uint8 images, the width W and half of the width W/2, and rounding half the width to int(W/2); then cropping each uint8 image according to the width coordinate ranges [0, int(W/2)] and [int(W/2) + 1, W − 1], splitting each uint8 picture into 2 single knee images and finally obtaining 2M single knee images;
2.1.1.5 histogram equalization is performed on each of the 2M single knee images to obtain 2M single knee images that have been histogram equalized.
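A minimal Python sketch of the preprocessing in claim 2 (inversion, min-max rescaling to uint8, splitting at the image mid-line, and histogram equalization) is given below; it assumes an 8-bit grayscale input and uses OpenCV's equalizeHist, which is one possible realization of the histogram equalization named in the claim.

```python
import numpy as np
import cv2

def preprocess_double_knee(img: np.ndarray, bright_background: bool):
    """img: 8-bit grayscale double-knee X-ray; returns two equalized single knee images."""
    img = img.astype(np.float64)
    if bright_background:
        img = 255.0 - img                                 # invert to dark background, bright legs
    p_min, p_max = img.min(), img.max()
    img = ((img - p_min) / (p_max - p_min) * 255.0).astype(np.uint8)  # rescale to uint8
    half = img.shape[1] // 2                              # int(W/2)
    left, right = img[:, :half + 1], img[:, half + 1:]    # split into 2 single knee images
    return cv2.equalizeHist(left), cv2.equalizeHist(right)
```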
3. The knee joint positioning method based on the multitask two-stage convolutional neural network as claimed in claim 1, wherein in steps 2.2.2.2 and 2.4.3.5 the width w_k and the height h_k of the kth randomly selected bounding box are taken as follows: let the width of the mth single knee image be wm and its height be hm, take half of the minimum of wm and hm and perform the rounding operation, i.e. int(min(wm, hm)/2), where int(min(wm, hm)/2) denotes rounding min(wm, hm)/2; randomly take an integer in the value range [48, int(min(wm, hm)/2)] as the width w_k and the height h_k of the randomly selected bounding box.
4. The method of claim 1, wherein M satisfies M > 2000, K satisfies K > 50, PP satisfies 0 < PP < 100, s satisfies 0 < s ≤ 1, and N1 satisfies 0 < N1 ≤ 100.
5. The knee joint positioning method based on the multitask two-stage convolutional neural network as claimed in claim 1, wherein the method for training the first-stage network in the 2.3 steps is:
2.3.1 initializing variable q to 1, setting values of model parameters, including setting learning rate to 0.001 and batch size to 500;
2.3.2 input the training samples of the first-level network into the first-level network and output the overall loss function Lq (the exact formula is given only as an image in the original); with the model parameters set in step 2.3.1, the parameters of the first-level network are updated by back-propagating the value of Lq, giving the first-level network NETq whose parameters have been updated for the qth time;
where N is the number of training samples of the first-level network, N = N11 + N12 + N13; the loss of the ith training sample on the jth task enters Lq, with 1 ≤ j ≤ 3 and 3 tasks in total: knee joint detection, knee joint bounding box positioning and knee joint key point positioning; αj denotes the importance coefficient of the jth task, the importance coefficient of the knee joint detection task being α1 = 1, the importance coefficient of the knee joint bounding box positioning task α2 = 0.5, and the importance coefficient of the knee joint key point positioning task α3 = 0;
2.3.3 let q = q + 1; if 2 ≤ q ≤ Q, where Q is the number of network training iterations and Q = 10, go to 2.3.4; otherwise training is finished, the trained first-level network NETQ is obtained, and the procedure ends;
2.3.4 input the training samples of the first-level network into NETq-1 and output the qth overall loss function Lq; with the model parameters set in step 2.3.1, update the parameters in NETq-1 by back-propagating the value of Lq, giving the first-level network NETq whose parameters have been updated for the qth time; go to 2.3.3;
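Since the overall loss formula of claims 5 and 8 exists only as an image, the Python sketch below shows one plausible weighted multi-task loss; the per-task losses (cross-entropy for detection, squared error for the two regression tasks) are assumptions, while the α values follow the claims (α = (1, 0.5, 0) for the first-level network, (0.8, 0.6, 1.5) for the second-level network).

```python
import torch
import torch.nn.functional as F

def overall_loss(cls_logits, cls_target, box_pred, box_target, kp_pred, kp_target,
                 alpha=(1.0, 0.5, 0.0)):
    """Weighted sum of the three task losses over a batch (assumed form)."""
    l_det = F.binary_cross_entropy_with_logits(cls_logits, cls_target)  # knee joint detection
    l_box = F.mse_loss(box_pred, box_target)                            # bounding box offsets
    l_kp = F.mse_loss(kp_pred, kp_target)                               # key point offsets
    return alpha[0] * l_det + alpha[1] * l_box + alpha[2] * l_kp
```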
6. the knee joint positioning method based on the multitask two-stage convolutional neural network as claimed in claim 1, wherein in step 2.4.2.4 said preliminary target detection module uses knee joint region detection method to detect the first image feature F5 to obtain the knee joint region in the single knee image I, the knee joint region detection method is:
2.4.2.4.1 the 1 × 1 convolutional layer in the preliminary target detection module performs a 1 × 1 convolution on the first image feature F5 of the pth picture in the picture pyramid and outputs two groups of vectors: the knee joint probability vector A1p, in which each entry is a 1-dimensional probability value of the knee joint, a knee joint being identified when the probability value is not less than the first threshold α, α ∈ [0.5, 1]; and the knee joint bounding box coordinate offset vector B1p, in which each entry is a 4-dimensional vector (x1, y1, x2, y2) representing the bounding box offsets (offset of the upper-left horizontal coordinate, offset of the upper-left vertical coordinate, offset of the lower-right horizontal coordinate, offset of the lower-right vertical coordinate);
2.4.2.4.2 the knee joint bounding box coordinate operation layer of the preliminary target detection module screens and computes A1p and B1p to obtain the knee joint regions in the single knee image I; the N vector positions in A1p whose probability value is not lower than α are found, where N ≥ 1 and 1 ≤ n ≤ N; the corresponding N vector positions are found in B1p, together with the 4-dimensional coordinate offset vectors at those N positions, each such vector giving the coordinate offsets of the nth bounding box (offset of the upper-left horizontal coordinate, offset of the upper-left vertical coordinate, offset of the lower-right horizontal coordinate, offset of the lower-right vertical coordinate); the coordinates of the N knee joint region bounding boxes in the original single knee image I (upper-left horizontal coordinate, upper-left vertical coordinate, lower-right horizontal coordinate, lower-right vertical coordinate) are then calculated from the vector positions, the coordinate offsets and the scaling factor s of the original single knee image I (the calculation formulas are given only as formula images in the original), and the N knee joint regions are framed on the original image I according to these bounding box coordinates.
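The mapping formulas of step 2.4.2.4.2 survive only as images; the Python sketch below shows the general idea of mapping a fully convolutional first-level detection back to the original single knee image, assuming an MTCNN-style layout with an assumed cell size of 48 and total stride of 8. It is illustrative only and not the claimed formula.

```python
def map_to_original(row, col, offsets, s, stride=8, cell=48):
    """row, col: position in F5; offsets: 4 box offsets; s: pyramid scaling factor."""
    dx1, dy1, dx2, dy2 = offsets
    x1, y1 = col * stride / s, row * stride / s              # window corner in the original image
    x2, y2 = (col * stride + cell) / s, (row * stride + cell) / s
    w, h = x2 - x1, y2 - y1
    return (x1 + dx1 * w, y1 + dy1 * h, x2 + dx2 * w, y2 + dy2 * h)
```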
7. The method of claim 1, wherein in step 2.4.2.4.1 the value of α is 0.6.
8. The knee joint positioning method based on the multitask two-stage convolutional neural network as claimed in claim 1, wherein the method for training the second-stage network in the 2.5 steps is as follows:
2.5.1 initializing variable t = 1, setting the values of the model parameters, including setting the learning rate to 0.0001 and the batch size batch_size to 500;
2.5.2 input the training samples of the second-level network into the second-level network and output the overall loss function Lt (the exact formula is given only as an image in the original); with the model parameters set in step 2.5.1, the parameters of the second-level network are updated by back-propagating the value of Lt, giving the second-level network NETt whose parameters have been updated for the tth time;
where N2 is the number of training samples, N2 = N21 + N22 + N23; the loss of the ith training sample on the jth task enters Lt, with 1 ≤ j ≤ 3 and 3 tasks in total: knee joint detection, knee joint bounding box positioning and knee joint key point positioning; αj denotes the importance coefficient of the jth task, the importance coefficient of the knee joint detection task being α1 = 0.8, the importance coefficient of the knee joint bounding box positioning task α2 = 0.6, and the importance coefficient of the knee joint key point positioning task α3 = 1.5;
2.5.3 let t = t + 1; if 2 ≤ t ≤ T, where T is the number of network training iterations and T = 10, go to 2.5.4; otherwise training is finished, the trained second-level network NETT is obtained, and the procedure ends;
2.5.4 input the training samples of the second-level network into NETt-1 and output the tth overall loss function Lt; with the model parameters set in step 2.5.1, update the parameters in NETt-1 by back-propagating the value of Lt, giving the second-level network NETt whose parameters have been updated for the tth time; go to 2.5.3.
9. The knee joint positioning method based on the multitask two-stage convolutional neural network as claimed in claim 1, wherein the third step of preprocessing the X-ray film of the double knees to be detected is as follows:
3.1 if the X-ray film of the double knees to be detected has a bright background and dark legs, convert it into an image with a dark background and bright legs, i.e. subtract each original pixel value of the X-ray film of the double knees to be detected from 255 and use the result as the new pixel value, then go to 3.2; if the X-ray film of the double knees to be detected already has a dark background and bright legs, go directly to 3.2;
3.2 convert the pixels of the double-knee X-ray film to be detected to [0, 255] and process the double-knee X-ray film into a uint8 double-knee image; the new pixel value corresponding to each pixel in the uint8 double-knee image is
Pnew = 255 × (P − Pmin) / (Pmax − Pmin)
where P is the pixel value of each pixel of the pre-processed image, Pmax is the maximum pixel value of the pre-processed image, and Pmin is the minimum pixel value of the pre-processed image;
3.3 convert the uint8 double-knee image into single knee images by: finding the width W of the uint8 double-knee image and half of the width W/2, rounding half the width to int(W/2); finally cropping the double-knee image according to the width coordinate ranges [0, int(W/2)] and [int(W/2) + 1, W − 1], splitting the uint8 double-knee image into 2 single knee images;
3.4, performing histogram equalization processing on 2 single knee images, wherein the 2 single knee images subjected to the histogram equalization processing are used as the single knee images to be detected.
10. The method of claim 1, wherein in step 2.4.2.6 the first non-maximum suppression screening layer filters all knee joint regions of the mth single knee image I with the non-maximum suppression algorithm using a filtering threshold δ of 0.6; in step 4.3.5 the second non-maximum suppression screening layer filters all knee joint bounding boxes in the dth single knee image to be detected with the non-maximum suppression algorithm using a filtering threshold δ of 0.7; and in step 4.4.4.4 the second non-maximum suppression screening layer screens the knee joint bounding boxes in the dth single knee image to be detected with the non-maximum suppression algorithm, using a filtering threshold δ of 0.7, to obtain the final knee joint bounding box and the final knee joint region key points.
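The greedy non-maximum suppression used by the screening layers can be sketched in a few lines of Python; the thresholds δ follow claim 10 (0.6 for the first screening layer, 0.7 for the later screening steps).

```python
def _iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def nms(boxes, delta=0.7):
    """boxes: list of (x1, y1, x2, y2, score); keep the highest-scoring box first and
    drop any remaining box whose IoU with an already-kept box reaches delta."""
    kept = []
    for b in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(_iou(b[:4], k[:4]) < delta for k in kept):
            kept.append(b)
    return kept
```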
11. The method of claim 1, wherein the value of β in 4.4.4.2 is 0.7.
GR01 Patent grant