CN112381806A - Double centromere aberration chromosome analysis and prediction method based on multi-scale fusion method - Google Patents

Info

Publication number
CN112381806A
Authority
CN
China
Prior art keywords
double
chromosome
centromere
neural network
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011293549.6A
Other languages
Chinese (zh)
Inventor
崔玉峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Beion Pharmaceutical Technology Co ltd
Original Assignee
Shanghai Beion Pharmaceutical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Beion Pharmaceutical Technology Co ltd
Priority to CN202011293549.6A
Publication of CN112381806A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Geometry (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention relates to chromosome analysis technology and discloses a double centromere aberration chromosome analysis and prediction method based on a multi-scale fusion method. A trained neural network model extracts deep features of an input chromosome image to be analyzed through the trunk neural network CSPN; refines key features through a Drop Block operation and a spatial pyramid pooling operation; outputs three feature tensors on three scales using a feature fusion strategy that combines a feature pyramid network and a path aggregation network; outputs three prediction tensors after a further Drop Block operation; and screens the predicted bounding boxes with the DIOU NMS algorithm to identify the double centromere aberrated chromosomes. The trained neural network model can rapidly mark and count the aberrated chromosomes in a chromosome image, has higher detection accuracy and stronger robustness, and assists doctors in completing biological dose estimation.

Description

Double centromere aberration chromosome analysis and prediction method based on multi-scale fusion method
Technical Field
The invention relates to biological dose estimation, in particular to a double centromere aberration chromosome analysis and prediction method based on a multi-scale fusion method.
Background
Chromosome aberration analysis has been used for biological dose estimation for over 50 years. Its reliability has been confirmed by a large body of data, and it is considered one of the most reliable radiation biodosimetry methods for accident dose estimation.
Currently, occupational health examinations of radiation workers mainly analyze and count double centromere chromosomes (dicentric chromosomes), centric rings, acentric chromosomes and the like. When the biological dose of a radiation accident is estimated, only the number of double centromere chromosomes, or the number of double centromere chromosomes plus centric rings, is counted, and acentric chromosomes serve only as an auxiliary index. The detection and counting of double centromere aberrated chromosomes is therefore of great significance for biological dose estimation.
However, in conventional aberrated chromosome analysis, the aberrated chromosomes in the chromosome images to be analyzed are found by manual observation and then counted. This requires a great deal of effort from doctors, and their working efficiency and accuracy drop noticeably as working time increases. At present, many institutions use automatic chromosome scanning and imaging systems to detect aberrated chromosomes, which places even greater demands on doctors' image reading. With the development of artificial intelligence and the progress of digital image technology in recent years, using a computer to analyze aberrated chromosomes automatically can greatly increase detection speed and achieve higher analysis accuracy.
At present, common automatic aberrated chromosome analysis systems usually adopt a two-step detection scheme: first, candidate regions that may contain aberrated chromosomes are segmented by digital image processing; then the chromosomes are classified by a support vector machine or a classification convolutional neural network; and finally it is determined whether aberrated chromosomes exist in the chromosome image. Such methods can speed up aberrated chromosome analysis to a certain extent, but they are extremely sensitive to the quality of the chromosome image to be analyzed, and they have low detection accuracy and poor robustness for images with many impurities, overlapping chromosomes or dense chromosomes.
Disclosure of Invention
Aiming at the defects of poor detection accuracy and poor robustness of the detection method in the prior art, the invention provides the double centromere aberration chromosome analysis and prediction method based on the multi-scale fusion method, which can accurately and quickly carry out automatic analysis and prediction on the input chromosome image to be analyzed, mark the double centromere aberration chromosomes existing in the chromosome image to be analyzed, count the number of the double centromere aberration chromosomes in the chromosome image to be analyzed and assist a doctor in finishing biological dose estimation.
To solve the technical problem, the invention adopts the following technical scheme:
The double centromere aberration chromosome analysis and prediction method based on the multi-scale fusion method comprises the following steps:
S1: reading a chromosome image to be analyzed with the trained neural network model, and extracting deep features of the input chromosome image through the trunk neural network CSPN;
S2: refining key features from the deep features through a Drop Block operation and a spatial pyramid pooling operation;
S3: using a top-down feature pyramid network to propagate and fuse, by up-sampling, the high-level feature information obtained after the spatial pyramid pooling operation; then adding a bottom-up path aggregation network at the output of the feature pyramid network, which supplements the position features of the double centromere aberrated chromosomes by propagating the strong localization features of the low layers upward; and outputting three feature tensors on three scales;
S4: applying a Drop Block operation to the three feature tensors extracted by the neural network model and outputting three prediction tensors;
S5: merging the regression-predicted double centromere aberrated chromosome bounding boxes from the three prediction tensors, and screening the predicted bounding boxes with the DIOU NMS algorithm;
S6: finally, counting the number of double centromere aberrated chromosomes predicted by the neural network model together with their bounding box information, marking the identified double centromere aberrated chromosomes on the input chromosome image to be analyzed, and outputting the marked image.
Further, the neural network model is trained as follows:
S01: manually labeling the double centromere aberrated chromosomes in a plurality of sample images to generate a label file, wherein the label file mainly stores the center coordinates, width and height of each double centromere aberrated chromosome bounding box in the sample images and serves as the training sample of the neural network model for automatic analysis of double centromere aberrated chromosomes;
S02: combining the original sample images with the label file to prepare a training set for training the neural network model;
S03: passing the training samples through the designed neural network model to output three feature tensors, calculating the loss value between the output feature tensors and the label file of the sample image with the loss function Loss, back-propagating the loss value to each node of the neural network model, and updating the weight parameters of the neural network model; after multiple training iterations, a neural network model capable of accurate regression prediction of double centromere aberrated chromosome bounding boxes is obtained.
Further, in step S1, the chromosome image to be analyzed is divided into S × S grid cells in advance, and each grid cell is responsible for predicting the probability of the existence of the double centromere aberration chromosome at the position.
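The grid assignment described above can be illustrated with a minimal sketch. The function name `responsible_cell` and the 608 × 608 input size in the usage line are assumptions made for illustration only; the source states S × S grid cells but not the image resolution.

```python
def responsible_cell(cx, cy, img_w, img_h, s):
    """Return (row, col) of the S x S grid cell that contains a box centre.

    cx, cy are the centre coordinates in pixels; img_w, img_h are the image
    size in pixels; s is the number of grid cells per side.
    """
    # Scale the centre into grid units and clamp to the last cell so that a
    # centre lying exactly on the right or bottom border stays in range.
    col = min(int(cx / img_w * s), s - 1)
    row = min(int(cy / img_h * s), s - 1)
    return row, col


# Hypothetical example: a 608 x 608 image divided into 19 x 19 cells.
print(responsible_cell(304.0, 100.0, 608, 608, 19))  # (3, 9)
```

Each cell is 32 pixels wide in this example, so a centre at x = 304 falls in column 9 and a centre at y = 100 falls in row 3.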
Further, in step S4, the sizes (width × height × depth) of the three prediction tensors are 19 × 19 × 15, 38 × 38 × 15, and 76 × 76 × 15, respectively. The width and height of each prediction tensor correspond to the number of grid cells along the width and height on which the neural network model performs bounding box regression prediction. The depth of 15 arises because each prediction tensor predicts 3 bounding boxes per grid cell according to the anchor box sizes, and each bounding box predicts 5 values: a confidence value c, the bounding box width, the bounding box height, and the center coordinates (x, y). Each prediction tensor therefore outputs 3 × 5 = 15 predicted values per grid cell.
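The tensor sizes can be checked with a short calculation. This is a sketch under stated assumptions: the input size of 608 pixels and the strides 32, 16 and 8 are not given in the source and are chosen only because they reproduce the stated 19, 38 and 76 grid sizes.

```python
NUM_ANCHORS = 3      # bounding boxes predicted per grid cell (from the text)
VALUES_PER_BOX = 5   # confidence c, centre x, centre y, width w, height h


def prediction_shapes(input_size=608, strides=(32, 16, 8)):
    """Return (width, height, depth) of the prediction tensor at each scale.

    input_size and strides are assumed values, not taken from the source.
    """
    depth = NUM_ANCHORS * VALUES_PER_BOX
    return [(input_size // s, input_size // s, depth) for s in strides]


print(prediction_shapes())  # [(19, 19, 15), (38, 38, 15), (76, 76, 15)]
```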
Further, in step S5, the DIOU NMS algorithm ranks all regression prediction bounding boxes according to their confidence values c, labels the bounding box with the largest confidence value c, and then calculates the DIOU values of all the bounding boxes obtained by prediction and the bounding box with the largest confidence value c; and deleting the bounding boxes with the DIOU values exceeding the threshold range, then continuing to sequence the bounding boxes according to the confidence values c and repeating the steps, and finally taking all the marked bounding boxes as the bounding boxes of the double centromere aberration chromosome targets output by the neural network model.
Further, the DIOU value is calculated by the following formula:
Figure BDA0002784694240000031
in the formula, DC represents a diagonal distance of a minimum closure area including both the prediction bounding box and the actual bounding box, and DP represents a distance between center points of the prediction bounding box and the actual bounding box.
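The screening procedure of step S5 can be sketched as follows. The sketch assumes the standard distance-IoU definition, DIOU = IOU − DP² / DC², which is consistent with the definitions of DC and DP given above; during screening the two boxes compared are both predicted boxes. The corner box format (x1, y1, x2, y2), the function names and the threshold of 0.5 are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def diou(a, b):
    """Assumed form: DIOU = IOU - DP^2 / DC^2."""
    # DP^2: squared distance between the two box centres.
    dp2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # DC^2: squared diagonal of the smallest box enclosing both boxes.
    dc2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou(a, b) - (dp2 / dc2 if dc2 > 0 else 0.0)


def diou_nms(boxes, scores, threshold=0.5):
    """Return indices of the kept boxes, highest confidence first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # mark the most confident remaining box
        keep.append(best)
        # Delete boxes whose DIOU with the marked box exceeds the threshold.
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # [0, 2]
```

In the example, the second box overlaps the first heavily and is removed, while the distant third box is kept as a separate detection.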
Further, in step S03, the loss function Loss for training the neural network model mainly comprises two parts: a confidence loss Loss_c for judging whether a double centromere aberrated chromosome exists, and a bounding box regression loss Loss_b for the aberrated chromosome. The loss function is Loss = Loss_c + λ·Loss_b, where the weight λ balances the proportions of the confidence loss function Loss_c and the bounding box regression loss function Loss_b in the total loss function Loss.
Further, a confidence loss function
Figure BDA0002784694240000032
In the formula, a factor γ is added to reduce the confidence loss Loss_c of grid cells that contain no double centromere aberrated chromosome; a balance factor α is added to balance the confidence loss proportions of grid cells with and without double centromere aberrated chromosomes in the chromosome image to be analyzed; p represents the probability, predicted by the neural network model, that a double centromere aberrated chromosome exists in each grid cell; and y represents the corresponding label from the label file.
Further, bounding box regression loss function
Figure BDA0002784694240000033
In the formula, DC represents a diagonal distance of a minimum closure area including both the prediction bounding box and the actual bounding box, DP represents a distance between center points of the prediction bounding box and the actual bounding box, and the influence factor δ is a parameter for measuring the aspect ratio consistency of the prediction bounding box and the actual bounding box.
Further, an influence factor
Figure BDA0002784694240000034
In the formula, w and h respectively represent the actual width and height of the bounding box of the double centromere aberration chromosome tag file used for the training of the neural network model, and w 'and h' respectively represent the predicted values of the neural network model for the width and height of the bounding box of the double centromere aberration chromosome.
Due to the adoption of the technical scheme, the invention has the remarkable technical effects that:
1. The detection method provided by the invention does not depend on manual analysis. It builds a neural network model using deep learning, trains the model to detect double centromere aberrated chromosomes on a large number of sample images, and completes automatic detection and analysis quickly and accurately, greatly improving the efficiency of aberrated chromosome analysis. This automatic detection method for dose estimation can replace the conventional manual analysis and dose estimation method; the error introduced by automatic analysis is smaller, and the analysis is 30 times faster.
2. The method quantifies the position characteristics of the double centromere distorted chromosomes, divides the chromosome image to be analyzed into S multiplied by S grid units, quickly analyzes whether the distorted chromosomes exist in the chromosome image to be analyzed by predicting the probability of the double centromere distorted chromosomes existing in each grid unit, and then predicts the central coordinates and width and height information of the distorted chromosomes relative to the grid units to further determine the boundary frame of the distorted chromosomes.
3. According to the method, the CSP module is adopted to construct the trunk neural network to extract the characteristics, so that the deep characteristics of the double centromere aberrance chromosome can be more effectively extracted, meanwhile, the gradient flow is cut off to prevent excessive repeated gradient information from being used for training the neural network model, and the extraction capability of the trunk neural network on the double centromere aberrance chromosome characteristics is improved by adopting a hierarchical characteristic fusion strategy.
4. Traditional Dropout does not take the spatial structure of image features into account when discarding features, so its effect on improving the robustness of a neural network model is limited.
5. Because the features are extracted by adopting the spatial pyramid pooling operation, the receiving range of the subsequent network structure to the trunk features is expanded, and the features of the distorted chromosomes with different sizes are fused, the trained neural network model can have higher detection accuracy on the double centromere distorted chromosomes with different sizes.
6. The neural network model designed by the invention adopts a feature pyramid network FPN to transmit and fuse feature information of a high layer in an up-sampling mode from top to bottom, then adds a bottom-up path aggregation network PAN at the output end of the FPN to supplement the position feature of the distorted chromosome, and transmits the strong positioning feature of a low layer upwards. By adopting a feature fusion strategy combining FPN and PAN, the accuracy of the neural network model for detecting the double centromere aberration chromosome is greatly improved.
7. The method divides a loss function used for training a neural network model into a confidence coefficient loss function and a boundary frame regression loss function, strengthens the learning capacity of the neural network model to a characteristic complex region in an input image by adding influence factors gamma and alpha in the confidence coefficient loss function, and improves the accuracy of the neural network model to the regression prediction of the double centromere distortion chromosome boundary frame according to the width-height ratio consistency of a real boundary frame and a predicted boundary frame by adding the influence factor delta in the boundary frame regression loss function. And finally, the proportion of the two loss values in the total loss value is balanced through a weight coefficient lambda, and the designed loss function can train the neural network model more effectively.
8. The invention can realize the end-to-end automatic analysis of the chromosome image to be analyzed by inputting the chromosome image to be analyzed, directly mark the chromosome with distortion on the input chromosome image to be analyzed, and count the number of the distorted chromosomes. The analysis speed is faster, the detection accuracy is higher, and the robustness is stronger.
Drawings
FIG. 1 is an analytical flow chart of an embodiment of the present invention;
FIG. 2 is a network architecture diagram of a neural network model of the present invention;
FIG. 3 is a CSP module architecture of the present invention;
FIG. 4 is a block diagram of an SPP of the present invention;
FIG. 5 is an image of a chromosome to be analyzed according to the present invention;
FIG. 6 is an image of a chromosome after analysis according to the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
Examples
As shown in FIGS. 1 and 2, the network structure of the designed neural network model mainly comprises three parts: a trunk neural network CSPN, an FPN + PAN feature fusion network, and an output prediction end. The trunk neural network CSPN consists of five Cross Stage Partial neural network modules (CSPs). The structure of the CSP module is shown in FIG. 3; it is composed of several CBM modules and n repeated residual units, where a CBM module performs a convolution operation on the input feature map and then processes the features with the Mish activation function.
The invention provides a double centromere aberration chromosome analysis and prediction method based on a multi-scale fusion method, wherein a neural network model is trained before regression prediction is carried out on double centromere aberration chromosomes, and the training steps are as follows:
S01: manually labeling the double centromere aberrated chromosomes in a plurality of sample images to generate a label file, wherein the label file mainly stores the center coordinates, width and height of each double centromere aberrated chromosome bounding box in the sample images and serves as the training sample of the neural network model for automatic analysis of double centromere aberrated chromosomes; the number of sample images is more than 3.
S02: combining the original sample images with the label file to prepare a training set for training the neural network model;
S03: during training, passing the training samples through the designed neural network model to output three feature tensors, calculating the loss value between the output feature tensors and the label file of the sample image with the loss function Loss, back-propagating the loss value to each node of the neural network model, and updating the weight parameters of the neural network model with the Adam optimization algorithm. After multiple training iterations, a neural network model capable of accurate regression prediction of double centromere aberrated chromosome bounding boxes is obtained.
The loss function Loss for training the neural network model mainly comprises two parts: a confidence loss Loss_c for judging whether a double centromere aberrated chromosome exists, and a bounding box regression loss Loss_b for the aberrated chromosome.
Training the neural network model with a plain cross-entropy loss function can make the model converge slowly, and possibly fail to reach the optimum, because the iterations are dominated by a large number of easy samples. The invention therefore adds a factor γ (γ > 0) to the original cross-entropy loss function to reduce the confidence loss Loss_c of grid cells that contain no double centromere aberrated chromosome, so that the neural network model focuses on the hard-to-distinguish double centromere aberrated chromosome positions in the sample image. In addition, a balance factor α is added to balance the confidence loss proportions of grid cells with and without double centromere aberrated chromosomes in the image. The confidence loss function Loss_c is shown below, where p represents the probability, predicted by the neural network model, that a double centromere aberrated chromosome exists in each grid cell, and y represents the corresponding label from the label file:
Figure BDA0002784694240000061
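The role of γ and α can be illustrated with a per-cell sketch. It assumes the commonly used focal-loss form, Loss_c = −α·y·(1 − p)^γ·log(p) − (1 − α)·(1 − y)·p^γ·log(1 − p), which matches the description of γ, α, p and y but may differ in detail from the formula in the original figure. The default values α = 0.25 and γ = 2.0 and the function name are assumptions, not values from the source.

```python
import math


def confidence_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal-style confidence loss for one grid cell (assumed form).

    p: predicted probability that the cell contains an aberrated chromosome.
    y: label, 1 if the cell contains one and 0 otherwise.
    """
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    if y == 1:
        # Positive cell: the loss shrinks as p approaches 1.
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    # Negative cell: the factor p**gamma suppresses easy, confident negatives.
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)


easy_negative = confidence_loss(0.05, 0)
hard_negative = confidence_loss(0.80, 0)
print(easy_negative < hard_negative)  # True
```

With γ = 0 and α = 0.5 the expression reduces to half of the ordinary cross-entropy, which shows that γ is what down-weights the easy cells.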
To improve regression prediction of the double centromere aberrated chromosome bounding box, the invention uses the Complete Intersection over Union loss function (CIoU loss) to calculate the loss between the regression-predicted bounding box and the actual bounding box. It measures the accuracy of the bounding box regression by considering the overlap area, the center point distance and the aspect ratio of the predicted and actual bounding boxes. The bounding box regression loss function Loss_b is shown below, where DC represents the diagonal distance of the minimum enclosing area containing both the predicted bounding box and the actual bounding box, DP represents the distance between the center points of the predicted bounding box and the actual bounding box, and the influence factor δ is a parameter that measures the aspect ratio consistency of the predicted and actual bounding boxes. IOU is the intersection over union, calculated as the ratio of the intersection to the union of the predicted bounding box and the actual bounding box.
Figure BDA0002784694240000062
The formula of the influence factor delta is shown as follows, wherein w and h respectively represent the actual width and height of the boundary box of the double centromere aberration chromosome tag file used for training the neural network model, w 'and h' respectively represent the predicted values of the width and height of the boundary box of the double centromere aberration chromosome tag file by the neural network model, and the influence factor delta is set by calculating the difference of the width-height ratio of the boundary box of the tag file and the width-height ratio of the regression prediction boundary box. The neural network model learns the aspect ratio of the double centromere aberration chromosome through delta, and the accuracy of regression prediction of the neural network model on the double centromere aberration chromosome bounding box is improved.
Figure BDA0002784694240000071
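The three ingredients of the bounding box loss (overlap, center distance, aspect ratio) can be sketched together. The sketch assumes the standard CIoU form: δ = (4/π²)·(arctan(w/h) − arctan(w'/h'))² and Loss_b = 1 − IOU + DP²/DC² + δ²/((1 − IOU) + δ). These expressions agree with the symbol definitions in the text but are an assumption about the formulas in the original figures. Boxes are given as (center x, center y, width, height), and the function names are hypothetical.

```python
import math


def corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def bbox_regression_loss(pred, actual):
    """CIoU-style loss between a predicted and an actual box (assumed form).

    Both boxes must have positive width and height.
    """
    px1, py1, px2, py2 = corners(pred)
    ax1, ay1, ax2, ay2 = corners(actual)
    inter = max(0.0, min(px2, ax2) - max(px1, ax1)) \
        * max(0.0, min(py2, ay2) - max(py1, ay1))
    union = pred[2] * pred[3] + actual[2] * actual[3] - inter
    iou = inter / union
    # DP^2: squared distance between the centre points.
    dp2 = (pred[0] - actual[0]) ** 2 + (pred[1] - actual[1]) ** 2
    # DC^2: squared diagonal of the minimum enclosing box.
    dc2 = (max(px2, ax2) - min(px1, ax1)) ** 2 \
        + (max(py2, ay2) - min(py1, ay1)) ** 2
    # delta: aspect-ratio consistency between label (w, h) and prediction (w', h').
    delta = 4 / math.pi ** 2 * (
        math.atan(actual[2] / actual[3]) - math.atan(pred[2] / pred[3])
    ) ** 2
    weight = delta / (1.0 - iou + delta) if delta > 0 else 0.0
    return 1.0 - iou + dp2 / dc2 + weight * delta


print(bbox_regression_loss((10, 10, 4, 6), (10, 10, 4, 6)))  # 0.0
```

A perfect prediction gives zero loss; two boxes with the same center but swapped width and height are penalized beyond 1 − IOU because δ is non-zero.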
To address the problem that the confidence loss Loss_c and the bounding box regression loss Loss_b of the aberrated chromosome carry different weights in the total loss value, the invention adds a weight λ to balance the proportions of the confidence loss function Loss_c and the bounding box regression loss function Loss_b in the total loss function Loss. The final loss function Loss of the neural network model is:
$$\mathrm{Loss} = \mathrm{Loss}_c + \lambda\,\mathrm{Loss}_b$$
In order to improve the training speed and accuracy of the neural network model, the method carries out cluster analysis on the width and height values of the double centromere distorted chromosome target boundary box in the training sample, counts 9 cluster centers with the width and the height of the double centromere distorted chromosome target boundary box in the training sample, and sets the size of an anchor frame adopted by the neural network model in prediction on each scale according to the width and height values of the 9 cluster centers so as to strengthen the accuracy of the regression prediction of the training neural network model on the double centromere distorted chromosome boundary box.
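The anchor size selection can be sketched as a small clustering routine. The source specifies cluster analysis with 9 centers but not the algorithm or distance measure; the sketch below assumes plain k-means with Euclidean distance on (width, height) pairs and a deterministic initialization, all of which are illustrative choices.

```python
def kmeans_anchors(sizes, k=9, iterations=50):
    """Cluster (width, height) pairs into k anchor sizes, sorted by area.

    Requires len(sizes) >= k. Euclidean k-means is an assumed choice.
    """
    pts = sorted(sizes, key=lambda s: s[0] * s[1])
    # Deterministic start: k samples spread evenly over the area ranking.
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w, h in pts:
            nearest = min(
                range(k),
                key=lambda c: (w - centers[c][0]) ** 2 + (h - centers[c][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        # New centre = mean of its members; an empty cluster keeps its centre.
        new_centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda s: s[0] * s[1])


# Toy data with two obvious size groups, clustered with k=2 for brevity.
sizes = [(10, 10), (11, 9), (9, 11), (50, 60), (52, 58), (48, 62)]
print(kmeans_anchors(sizes, k=2))  # [(10.0, 10.0), (50.0, 60.0)]
```

With real label files, `sizes` would hold the width and height of every labeled bounding box, and the 9 resulting centers would be divided among the three prediction scales.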
Drop Block, a regularization method for convolutional networks, is adopted during training to prevent the neural network model from over-fitting. With this method, the features of adjacent region units in the image are dropped together in a spatially contiguous manner. Applying Drop Block to the convolutional layers and the skip connection layers improves the model's ability to screen double centromere aberrated chromosome features and improves its robustness. In addition, gradually increasing the number of Drop Block units during training improves both the accuracy of the neural network model in detecting double centromere aberrated chromosomes and its robustness to hyperparameter selection.
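The difference from element-wise dropout can be seen in a minimal single-channel sketch, in which contiguous square blocks are zeroed rather than isolated positions. The block size, drop probability, seeding by top-left corner and rescaling step follow the general Drop Block idea and are assumptions, since the source gives no parameters.

```python
import random


def drop_block(feature, block_size=3, drop_prob=0.1, rng=None):
    """Zero out square blocks of a 2-D feature map and rescale the rest.

    feature is a list of rows; it must be at least block_size in each
    dimension. All parameter values are assumed for illustration.
    """
    if rng is None:
        rng = random.Random(0)
    h, w = len(feature), len(feature[0])
    valid_h, valid_w = h - block_size + 1, w - block_size + 1
    # Seed rate chosen so that roughly drop_prob of the units are dropped.
    gamma = drop_prob * h * w / (block_size ** 2 * valid_h * valid_w)
    keep = [[1] * w for _ in range(h)]
    for top in range(valid_h):
        for left in range(valid_w):
            if rng.random() < gamma:
                # Drop the whole block whose top-left corner is (top, left).
                for r in range(top, top + block_size):
                    for c in range(left, left + block_size):
                        keep[r][c] = 0
    kept = sum(map(sum, keep))
    scale = h * w / kept if kept else 0.0  # keep the overall activation level
    return [[feature[r][c] * keep[r][c] * scale for c in range(w)]
            for r in range(h)]


ones = [[1.0] * 5 for _ in range(5)]
print(drop_block(ones, drop_prob=0.0) == ones)  # True
```

The gradual increase mentioned above could be realized by raising `drop_prob` from zero toward its target value over the course of training.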
Training the designed neural network model on a training set, calculating the model Loss value by adopting a Loss function Loss, reversely transmitting and updating the weight value of the model, and training for multiple times until the neural network model is completely converged to obtain the neural network model for detecting the double centromere distortion chromosome.
After the neural network model is trained, the following steps are carried out to carry out prediction analysis on the double centromere aberration chromosome:
S1: the chromosome image to be analyzed is divided in advance into S × S grid cells, and each grid cell is responsible for predicting the probability that a double centromere aberrated chromosome exists at its position. The trained neural network model reads the chromosome image to be analyzed and extracts its deep features through the trunk neural network CSPN; the input chromosome image to be analyzed is shown in FIG. 5. The trunk neural network CSPN can reuse the extracted double centromere aberrated chromosome features while truncating the gradient flow, which prevents excessive repeated gradient information from being used to train the neural network model, and it adopts a hierarchical feature fusion strategy to improve its ability to extract double centromere aberrated chromosome features.
S2: after the trunk neural network CSPN extracts the deep features, key features are refined through a Drop Block operation followed by a Spatial Pyramid Pooling (SPP) operation.
Drop Block discards the image features of double centromere aberrated chromosomes in a spatially contiguous manner, which effectively prevents the over-fitting caused by redundant features and enhances the robustness of the neural network model. The SPP operation refines the double centromere aberrated chromosome features extracted by the trunk neural network CSPN more effectively. In addition, spatial pyramid pooling fuses the features of double centromere aberrated chromosomes of different sizes, which improves the detection accuracy of the neural network model for aberrated chromosomes of different sizes. The structure of the SPP module is shown in FIG. 4: the SPP applies padded pooling operations at three scales to the input feature map, concatenates the pooling results with the input feature map, and outputs a feature tensor whose depth is four times that of the input. This feature tensor integrates the pooling results of the aberrated chromosome features at different scales and widens the feature range received by the subsequent feature fusion network.
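The fourfold depth expansion can be reproduced with a small sketch that pools each channel at three kernel sizes and concatenates the results with the input. The kernel sizes 5, 9 and 13, stride 1 and max pooling are assumptions; the source states only that three padded pooling operations are merged with the input.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling on a 2-D map, padded so the size is unchanged.

    Positions outside the map are ignored, which is equivalent to padding
    with minus infinity.
    """
    h, w, r = len(fmap), len(fmap[0]), k // 2
    return [[max(fmap[i][j]
                 for i in range(max(0, y - r), min(h, y + r + 1))
                 for j in range(max(0, x - r), min(w, x + r + 1)))
             for x in range(w)]
            for y in range(h)]


def spp(channels, kernel_sizes=(5, 9, 13)):
    """Return the input channels followed by their pooled versions.

    channels is a list of 2-D maps; kernel_sizes are assumed values.
    """
    out = list(channels)
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


fmap = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
result = spp([fmap])
print(len(result))  # 4
```

One input channel yields four output channels of the same width and height, matching the stated expansion of the depth to four times the input.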
S3: A Feature Pyramid Network (FPN) transmits and fuses, from top to bottom by up-sampling, the high-level feature information of the chromosome image to be analyzed obtained after the spatial pyramid pooling operation. The FPN transmits downwards only the strong semantic features of the double centromere aberration chromosome extracted by the high-level layers of the network; it enhances semantic information but does not transmit the localization information of the detected double centromere aberration chromosome, which is unfavourable for the regression prediction of the double centromere aberration chromosome bounding boxes by the neural network model. Therefore, the invention adds a bottom-up Path Aggregation Network (PAN) after the FPN. The PAN supplements the FPN with the position information of the double centromere aberration chromosome and transmits the strong low-level localization features upwards, which improves the detection accuracy of the neural network model for the double centromere aberration chromosome. The semantic features and position features of the double centromere aberration chromosome are fused through the FPN and the PAN, and three feature tensors are output, one at each of three scales.
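The two fusion paths in S3 can be sketched at the level of tensor shapes. This is a simplified illustration, not the network described in the text: element-wise addition stands in for the concatenation and convolution layers of a real FPN/PAN, 2 × 2 max pooling stands in for a strided convolution, and all function names are hypothetical.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x up-sampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def downsample2x(x):
    """2x down-sampling by 2 x 2 max pooling (stand-in for a stride-2 convolution)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))


def fuse(a, b):
    """Fuse two same-sized maps; addition stands in for concatenation + convolution."""
    return a + b


def fpn_pan(c_small, c_mid, c_large):
    """Inputs are feature maps at 19 x 19, 38 x 38 and 76 x 76 with equal depth."""
    # Top-down path (FPN): push high-level semantic features to finer scales.
    p_small = c_small
    p_mid = fuse(c_mid, upsample2x(p_small))
    p_large = fuse(c_large, upsample2x(p_mid))
    # Bottom-up path (PAN): push low-level localization features to coarser scales.
    n_large = p_large
    n_mid = fuse(p_mid, downsample2x(n_large))
    n_small = fuse(p_small, downsample2x(n_mid))
    return n_small, n_mid, n_large


maps = [np.ones((s, s, 4), dtype=np.float32) for s in (19, 38, 76)]
print([m.shape for m in fpn_pan(*maps)])  # [(19, 19, 4), (38, 38, 4), (76, 76, 4)]
```

The sketch makes the data flow visible: every output scale receives information from both coarser and finer scales.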
S4: The three feature tensors extracted by the neural network model pass through a Drop Block (convolution) operation, which outputs three prediction tensors of sizes (width × height × depth) 19 × 19 × 15, 38 × 38 × 15 and 76 × 76 × 15. The width and height of a prediction tensor correspond to the grid on which the neural network model performs bounding box regression prediction. The depth of 15 arises because the prediction tensor predicts 3 bounding boxes in each grid cell according to the anchor box sizes, and each bounding box predicts five values: a confidence value c, the bounding box width w, the bounding box height h, and the center coordinates (x, y). Each prediction tensor therefore outputs 3 × 5 = 15 predicted values in each grid cell. The anchor sizes used for bounding box prediction at the three scales are listed below: anchor1 is used for prediction at the 19 × 19 scale, anchor2 at the 38 × 38 scale, and anchor3 at the 76 × 76 scale.
anchor1=[97,116,80,170,145,120]
anchor2=[78,87,110,66,156,61]
anchor3=[48,73,76,50,53,116]
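The relationship between the depth of 15, the three boxes per grid cell and the anchor lists can be sketched as follows. Two points are assumptions, not statements from the text: each flat anchor list is read as three consecutive (width, height) pairs, and the 15 values of a cell are taken to be stored as three consecutive groups of five. The helper names are hypothetical.

```python
import numpy as np

# Anchor lists from the text, keyed by the grid size they are used with.
ANCHORS = {
    19: [97, 116, 80, 170, 145, 120],
    38: [78, 87, 110, 66, 156, 61],
    76: [48, 73, 76, 50, 53, 116],
}


def anchor_pairs(flat):
    """Group a flat anchor list into (width, height) pairs (assumed ordering)."""
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


def split_prediction(pred):
    """Reshape an (S, S, 15) prediction tensor to (S, S, 3, 5): 3 boxes x 5 values."""
    s = pred.shape[0]
    assert pred.shape == (s, s, 15)
    return pred.reshape(s, s, 3, 5)


pred = np.zeros((19, 19, 15), dtype=np.float32)
print(split_prediction(pred).shape)       # (19, 19, 3, 5)
print(anchor_pairs(ANCHORS[19]))          # [(97, 116), (80, 170), (145, 120)]
print(sum(3 * s * s for s in ANCHORS))    # 22743 candidate boxes over all three scales
```

The last line shows why a screening step is needed afterwards: the three scales together produce 22743 candidate bounding boxes per image.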
S5: The regression-predicted double centromere aberration chromosome bounding boxes in the three prediction tensors are merged, and the predicted bounding boxes are screened with the distance intersection-over-union non-maximum suppression algorithm (DIOU NMS). The algorithm first sorts all regression-predicted bounding boxes by their confidence values c and marks the bounding box with the largest confidence value c; it then calculates the DIOU value between each remaining predicted bounding box and the marked bounding box. Bounding boxes whose DIOU value exceeds the threshold are deleted, the remaining bounding boxes are again sorted by their confidence values c, and the steps are repeated. Finally, all marked bounding boxes are taken as the bounding boxes of the double centromere aberration chromosome targets output by the neural network model. The DIOU is calculated as follows:
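$$\mathrm{DIOU} = \mathrm{IOU} - \frac{DP^{2}}{DC^{2}}$$
where IOU is the intersection over union of the two bounding boxes, DC is the diagonal distance of the minimum closure area containing both bounding boxes, and DP is the distance between the center points of the two bounding boxes.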
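The screening procedure of S5 can be sketched in plain Python. This is a minimal sketch under stated assumptions: DIOU is computed with the commonly used definition (IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box), boxes are given as (cx, cy, w, h), and the threshold of 0.5 is a hypothetical value, since the text does not state one.

```python
def diou(a, b):
    """DIoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    # Intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    iou = inter / union if union > 0 else 0.0
    # Squared center distance and squared diagonal of the smallest enclosing box.
    dp2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    dc2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - dp2 / dc2 if dc2 > 0 else iou


def diou_nms(boxes, scores, threshold=0.5):
    """Return the indices of the boxes kept after DIoU-based non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # mark the most confident remaining box
        keep.append(best)
        # Delete every remaining box whose DIoU with the marked box exceeds the threshold.
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(50, 50, 40, 40), (52, 50, 40, 40), (200, 200, 40, 40)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # [0, 2]
```

In the example, the second box almost coincides with the first and is suppressed, while the distant third box is kept as a separate detection.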
S6: Finally, the number of double centromere aberration chromosomes predicted by the neural network model and their bounding box information are counted, the recognized double centromere aberration chromosomes are marked on the input chromosome image to be analyzed, and the marked image is output, completing the automatic analysis and prediction of the double centromere aberration chromosomes. The output image is shown in fig. 6.
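The counting and marking of S6 can be sketched with NumPy alone. The function name `mark_boxes`, the (cx, cy, w, h) box format in pixels, the grayscale image and the 1-pixel outline are all assumptions made for illustration; the text does not describe how the marks are drawn.

```python
import numpy as np


def mark_boxes(image, boxes, value=255):
    """Draw 1-pixel box outlines on a copy of a 2-D grayscale image.

    boxes holds (cx, cy, w, h) tuples in pixels; returns (marked_image, count).
    """
    marked = image.copy()
    h, w = marked.shape
    for cx, cy, bw, bh in boxes:
        # Convert center/size to clamped corner coordinates.
        x1 = int(max(0, round(cx - bw / 2)))
        y1 = int(max(0, round(cy - bh / 2)))
        x2 = int(min(w - 1, round(cx + bw / 2)))
        y2 = int(min(h - 1, round(cy + bh / 2)))
        marked[y1, x1:x2 + 1] = value      # top edge
        marked[y2, x1:x2 + 1] = value      # bottom edge
        marked[y1:y2 + 1, x1] = value      # left edge
        marked[y1:y2 + 1, x2] = value      # right edge
    return marked, len(boxes)


image = np.zeros((100, 100), dtype=np.uint8)
marked, count = mark_boxes(image, [(50, 50, 20, 10)])
print(count)  # 1
```

The input image is left unchanged, so the original and the marked copy can be stored side by side.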
In this embodiment, a neural network model for the automatic analysis of double centromere aberration chromosomes is trained on a manually labeled double centromere aberration chromosome data set, and the trained neural network model performs regression prediction of the bounding boxes of double centromere aberration chromosome targets, so that double centromere aberration chromosomes that may be present in a chromosome image to be analyzed are predicted and analyzed automatically.
In summary, the above are only preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall be covered by the claims of the present invention.

Claims (10)

1. A double centromere aberration chromosome analysis and prediction method based on a multi-scale fusion method, characterized by comprising the following steps:
S1: reading the chromosome image to be analyzed with the trained neural network model and extracting deep features of the input chromosome image to be analyzed through the trunk neural network CSPN;
S2: refining key features of the deep features through a Drop Block operation and a spatial pyramid pooling operation;
S3: transmitting and fusing, from top to bottom by up-sampling, the high-level feature information of the chromosome image to be analyzed after the spatial pyramid pooling operation by means of a feature pyramid network, then adding a bottom-up path aggregation network at the output end of the feature pyramid network to supplement the position features of the double centromere aberration chromosome, and, after the strong low-level localization features have been transmitted upwards and fused, outputting three feature tensors, one at each of three scales;
S4: outputting three prediction tensors after the three feature tensors extracted by the neural network model have passed through a Drop Block operation;
S5: merging the regression-predicted double centromere aberration chromosome bounding boxes in the three prediction tensors, and screening the predicted bounding boxes with the DIOU NMS algorithm;
S6: finally, counting the number of double centromere aberration chromosomes predicted by the neural network model and their bounding box information, marking the identified double centromere aberration chromosomes on the input chromosome image to be analyzed, and outputting the marked image.
2. The method for analyzing and predicting the chromosome with double centromere aberration based on the multi-scale fusion method according to claim 1, wherein the training step of the neural network model is as follows,
S01: manually labeling the double centromere aberration chromosomes in a plurality of sample images to generate label files, wherein a label file mainly stores the center coordinates and the width and height of each double centromere aberration chromosome bounding box in the sample image and serves as a training sample of the neural network model for the automatic analysis of double centromere aberration chromosomes;
S02: combining the original sample images and the label files into a training set for training the neural network model;
S03: passing the training samples through the designed neural network model to output three feature tensors, calculating a loss value between the output feature tensors and the label file of the sample image with a loss function Loss, back-propagating the loss value to each node of the neural network model, and updating the weight parameters of the neural network model; and, after multiple training iterations, obtaining a neural network model capable of accurate regression prediction of the double centromere aberration chromosome bounding boxes.
3. The method for analyzing and predicting double centromere aberration chromosomes based on the multi-scale fusion method according to claim 1, wherein in step S1, the chromosome image to be analyzed is divided in advance into S × S grid cells, and each grid cell is responsible for predicting the probability that a double centromere aberration chromosome exists at its position.
4. The method for analyzing and predicting double centromere aberration chromosomes based on the multi-scale fusion method according to claim 3, wherein in step S4, the sizes (width × height × depth) of the three prediction tensors are 19 × 19 × 15, 38 × 38 × 15 and 76 × 76 × 15 respectively, wherein the width and the height of the prediction tensors respectively represent the width and the height of the bounding box regression prediction of the neural network model, and the depth 15 represents that the prediction tensor predicts 3 bounding boxes in each grid cell according to the anchor box sizes, each bounding box predicting a confidence value c, the width of the bounding box, the height of the bounding box and the center coordinates (x, y), so that each prediction tensor outputs 15 predicted values in each grid cell.
5. The method for analyzing and predicting double centromere aberration chromosomes based on the multi-scale fusion method according to claim 1, wherein in step S5, the DIOU NMS algorithm first sorts all regression-predicted bounding boxes by their confidence values c, marks the bounding box with the highest confidence value c, and then calculates the DIOU values between all other predicted bounding boxes and the bounding box with the highest confidence value c; the bounding boxes whose DIOU values exceed the threshold are deleted, the remaining bounding boxes are again sorted by their confidence values c and the steps are repeated, and finally all marked bounding boxes are taken as the bounding boxes of the double centromere aberration chromosome targets output by the neural network model.
6. The method for analyzing and predicting double centromere aberration chromosomes based on the multi-scale fusion method according to claim 5, wherein
$$\mathrm{DIOU} = \mathrm{IOU} - \frac{DP^{2}}{DC^{2}}$$
in the formula, DC represents the diagonal distance of the minimum closure area including both the prediction bounding box and the actual bounding box, and DP represents the distance between the center points of the prediction bounding box and the actual bounding box.
7. The method for analyzing and predicting double centromere aberration chromosomes based on the multi-scale fusion method according to claim 2, wherein in step S03, the loss function Loss for training the neural network model mainly comprises two parts: a confidence loss Loss_c for determining whether a double centromere aberration chromosome exists, and a bounding box regression loss Loss_b for the aberration chromosome; the loss function is Loss = Loss_c + λ·Loss_b, in which the weight λ is added to balance the proportions of the confidence loss function Loss_c and the bounding box regression loss function Loss_b in the total loss function Loss.
8. The method for analyzing and predicting double centromere aberration chromosome based on multi-scale fusion method as claimed in claim 7, wherein the confidence coefficient loss function
Figure FDA0002784694230000022
In the formula, a factor γ is added to reduce the confidence loss Loss_c of grid cells that contain no double centromere aberration chromosome; a balance factor α is added to balance the confidence loss ratio between grid cells of the chromosome image to be analyzed that contain a double centromere aberration chromosome and grid cells that do not; p represents the probability, predicted by the neural network model, that a double centromere aberration chromosome exists in each grid cell, and y represents the corresponding label in the label file.
9. The method for analyzing and predicting double centromere aberration chromosome based on multi-scale fusion method as claimed in claim 7, wherein the bounding box regression loss function
Figure FDA0002784694230000031
In the formula, DC represents a diagonal distance of a minimum closure area including both the prediction bounding box and the actual bounding box, DP represents a distance between center points of the prediction bounding box and the actual bounding box, and the influence factor δ is a parameter for measuring the aspect ratio consistency of the prediction bounding box and the actual bounding box.
10. The method for analyzing and predicting double centromere aberration chromosome based on multi-scale fusion method as claimed in claim 9, wherein the influence factor
Figure FDA0002784694230000032
In the formula, w and h respectively represent the actual width and height of the bounding box of the double centromere aberration chromosome tag file used for the training of the neural network model, and w 'and h' respectively represent the predicted values of the neural network model for the width and height of the bounding box of the double centromere aberration chromosome.
CN202011293549.6A 2020-11-18 2020-11-18 Double centromere aberration chromosome analysis and prediction method based on multi-scale fusion method Pending CN112381806A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011293549.6A CN112381806A (en) 2020-11-18 2020-11-18 Double centromere aberration chromosome analysis and prediction method based on multi-scale fusion method

Publications (1)

Publication Number Publication Date
CN112381806A true CN112381806A (en) 2021-02-19

Family

ID=74585095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011293549.6A Pending CN112381806A (en) 2020-11-18 2020-11-18 Double centromere aberration chromosome analysis and prediction method based on multi-scale fusion method

Country Status (1)

Country Link
CN (1) CN112381806A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110879996A (en) * 2019-12-03 2020-03-13 上海北昂医药科技股份有限公司 Chromosome split phase positioning and sequencing method
CN111832513A (en) * 2020-07-21 2020-10-27 西安电子科技大学 Real-time football target detection method based on neural network
CN111951266A (en) * 2020-09-01 2020-11-17 厦门汉舒捷医疗科技有限公司 Artificial intelligence recognition analysis method for chromosome aberration

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
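智能算法 (Intelligent Algorithm): "目标检测算法YOLOv4详解" (Detailed explanation of the YOLOv4 object detection algorithm), WeChat official account *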

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990373A (en) * 2021-04-28 2021-06-18 四川大学 Convolution twin point network blade profile splicing system based on multi-scale feature fusion
CN112990373B (en) * 2021-04-28 2021-08-03 四川大学 Convolution twin point network blade profile splicing system based on multi-scale feature fusion
JP2023527615A (en) * 2021-04-28 2023-06-30 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Target object detection model training method, target object detection method, device, electronic device, storage medium and computer program
CN113537182A (en) * 2021-09-17 2021-10-22 北京慧荣和科技有限公司 Automatic identification method and system for metaphase mitosis microscopic image of chromosome
CN113807259A (en) * 2021-09-18 2021-12-17 上海北昂医药科技股份有限公司 Chromosome division facies positioning and sequencing method based on multi-scale feature fusion

Similar Documents

Publication Publication Date Title
CN108830188B (en) Vehicle detection method based on deep learning
CN109977812B (en) Vehicle-mounted video target detection method based on deep learning
CN109583425B (en) Remote sensing image ship integrated recognition method based on deep learning
CN112381806A (en) Double centromere aberration chromosome analysis and prediction method based on multi-scale fusion method
CN112288706B (en) Automatic chromosome karyotype analysis and abnormality detection method
CN111259930A (en) General target detection method of self-adaptive attention guidance mechanism
CN110018524B (en) X-ray security inspection contraband identification method based on vision-attribute
CN111695482A (en) Pipeline defect identification method
CN108564085B (en) Method for automatically reading of pointer type instrument
CN111310756B (en) Damaged corn particle detection and classification method based on deep learning
CN108009518A (en) A kind of stratification traffic mark recognition methods based on quick two points of convolutional neural networks
CN108711148B (en) Tire defect intelligent detection method based on deep learning
CN112365497A (en) High-speed target detection method and system based on Trident Net and Cascade-RCNN structures
CN111783819B (en) Improved target detection method based on region of interest training on small-scale data set
CN112819821B (en) Cell nucleus image detection method
CN116342894B (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN115439458A (en) Industrial image defect target detection algorithm based on depth map attention
CN106845458A (en) A kind of rapid transit label detection method of the learning machine that transfinited based on core
CN115359264A (en) Intensive distribution adhesion cell deep learning identification method
CN115719475A (en) Three-stage trackside equipment fault automatic detection method based on deep learning
CN115439654A (en) Method and system for finely dividing weakly supervised farmland plots under dynamic constraint
CN117315380B (en) Deep learning-based pneumonia CT image classification method and system
CN116630301A (en) Strip steel surface small target defect detection method and system based on super resolution and YOLOv8
CN110889418A (en) Gas contour identification method
CN115830302A (en) Multi-scale feature extraction and fusion power distribution network equipment positioning identification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210219