CN114998230A - Pharynx swab oral cavity nucleic acid sampling area image identification method - Google Patents

Pharynx swab oral cavity nucleic acid sampling area image identification method

Info

Publication number
CN114998230A
Authority
CN
China
Prior art keywords
oral cavity
convolution
layer
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210563500.0A
Other languages
Chinese (zh)
Inventor
朱天军
张闯
梁建国
李伟豪
韩诗婷
陈悦任
蔡淼纯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhaoqing University
Original Assignee
Zhaoqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhaoqing University filed Critical Zhaoqing University
Priority to CN202210563500.0A priority Critical patent/CN114998230A/en
Publication of CN114998230A publication Critical patent/CN114998230A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image

Abstract

The invention discloses a pharyngeal swab oral cavity nucleic acid sampling area image identification method, relating to the technical field of image segmentation. The technical scheme is as follows: S1: a robot acquires oral cavity images of people of different ages in different environments, and the acquired images are used as a training set, a verification set and a test set; S2: a Deeplab V3+ network model is trained on the acquired images; S3: the trained oral cavity M-region segmentation model is verified and analyzed. Experimental results show that the method can effectively distinguish the oral cavity M region.

Description

Pharyngeal swab oral nucleic acid sampling area image identification method
Technical Field
The invention relates to the technical field of image segmentation, and in particular to a pharyngeal swab oral cavity nucleic acid sampling area image identification method.
Background
At present, COVID-19 testing mainly uses the pharyngeal swab method for sample collection. During sampling, medical personnel must be in close contact with the patient, and when a patient coughs, a large amount of droplets or aerosol is produced, so medical staff face a high risk of infection. In addition, because medical staff differ in skill level and sampling technique, the quality of the collected pharyngeal swabs varies; false negatives cannot be avoided, and there is a risk of misdiagnosis.
To avoid prolonged exposure of medical personnel to high-risk areas, pharyngeal swab sample collection can be completed by a sampling robot. When a robot performs pharyngeal swab sampling, accurately identifying the sampling area of the oral cavity (the M region) is extremely important and plays a decisive role in the sampling process.
The sampled picture must be segmented; the purpose of image segmentation is to separate the object from the background. However, with existing image identification methods, the boundary of the extracted oral cavity M region is discontinuous or blurred.
Disclosure of Invention
The invention aims to provide a pharyngeal swab oral cavity nucleic acid sampling area image identification method that can effectively distinguish the M region from the background region and has good segmentation performance.
The technical purpose of the invention is achieved by the following technical scheme: a pharyngeal swab oral cavity nucleic acid sampling area image identification method, specifically comprising the following steps:
S1: acquiring oral cavity images of people of different ages with a robot in different environments, and using the acquired images as a training set, a verification set and a test set;
S2: training a Deeplab V3+ network model on the acquired images;
S3: carrying out verification analysis on the trained oral cavity M-region segmentation model;
the specific steps for training the Deeplab V3+ model in S2 are as follows (a code sketch of the overall data flow follows these steps):
1) the training set, verification set and test set are divided in the ratio 8:1:1; the image is first input into the Encoder and, after processing by a deep convolutional neural network (DCNN), two feature layers are obtained: a shallow effective feature layer and a deep effective feature layer;
2) after a 1×1 convolution, the shallow effective feature layer enters the Decoder and is stacked with the result of 4× up-sampling of the high-semantic-information feature layer;
3) after a 3×3 convolution of the stacked feature layers, up-sampling by a factor of four yields the final effective feature layer, i.e. a condensed feature representation of the whole picture;
4) resize is used to make the height and width of the final output layer the same as the size of the original picture;
the specific steps in step 2) for obtaining the high-semantic-information feature layer are as follows:
(1) using the ASPP structure, the deep effective feature layer obtained in step 1) is given a 1×1 convolution, 3×3 convolutions with dilation rates of 6, 12 and 18 respectively, and image pooling, yielding 5 effective feature layers;
(2) the 5 effective feature layers are stacked and the number of channels is adjusted with a 1×1 convolution to obtain the high-semantic-information feature layer.
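As a rough illustration of the data flow in steps 1)-4), the following is a minimal PyTorch-style sketch. The `backbone` and `aspp` modules are hypothetical placeholders, and the channel sizes (256 shallow channels, 48 after projection) are typical DeepLabV3+ values assumed here, not values taken from the patent:

```python
import torch
import torch.nn.functional as F
from torch import nn

class DeepLabV3PlusSketch(nn.Module):
    """Minimal sketch of the Encoder/Decoder data flow in steps 1)-4).

    `backbone` must return (shallow, deep) feature maps and `aspp` is the
    ASPP module described in steps (1)-(2); both are hypothetical
    placeholders, as are the channel sizes.
    """

    def __init__(self, backbone: nn.Module, aspp: nn.Module, num_classes: int = 2):
        super().__init__()
        self.backbone = backbone
        self.aspp = aspp
        self.shallow_proj = nn.Conv2d(256, 48, kernel_size=1)            # step 2): 1x1 conv on shallow layer
        self.fuse = nn.Conv2d(48 + 256, 256, kernel_size=3, padding=1)   # step 3): 3x3 conv on the stack
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        shallow, deep = self.backbone(x)        # step 1): shallow and deep effective feature layers
        sem = self.aspp(deep)                   # high-semantic-information feature layer
        sem = F.interpolate(sem, size=shallow.shape[-2:], mode="bilinear",
                            align_corners=False)                        # step 2): 4x up-sampling
        y = self.fuse(torch.cat([self.shallow_proj(shallow), sem], dim=1))  # stack, then 3x3 conv
        y = F.interpolate(y, size=(h, w), mode="bilinear",
                          align_corners=False)  # steps 3)-4): up-sample/resize to the input size
        return self.classifier(y)
```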
The backbone network of the Deeplab V3+ model is Xception; Xception is an extreme version of Inception.
In conclusion, the invention has the following beneficial effects:
1. Based on depthwise separable convolution, a Deeplab V3+ model is provided; dilated convolutions with different dilation rates are used for feature extraction, which enlarges the receptive field of the network and gives it different feature-perception conditions, so that M regions with discontinuous or blurred boundaries in pharyngeal swab images can be segmented effectively, with higher segmentation precision than other networks;
2. The Deeplab V3+ model uses an Xception network; Xception is decoupled entirely into depthwise separable convolutions, so that the mapping of cross-channel correlations and the mapping of spatial correlations in the feature maps of the convolutional neural network are completely decoupled.
Drawings
FIG. 1 is a schematic flow chart of a method for identifying an image of a pharyngeal swab oral nucleic acid sampling area according to an embodiment of the present invention;
FIG. 2 is a network architecture diagram of Xception in an embodiment of the present invention;
FIG. 3 is a graph of the loss function obtained from training in an embodiment of the present invention;
FIG. 4 is a comparison of a distorted input image and an image padded with gray bars in an embodiment of the present invention;
FIG. 5 is a comparison of the M-region segmentation results of the MobileNetV2 and Xception networks of the Deeplab V3+ model in an embodiment of the present invention;
FIG. 6 is a graph of the test results for a standard oral cavity in an embodiment of the invention;
FIG. 7 is a graph of the test results for non-standard oral cavities in an embodiment of the present invention, in which the M region is not visible;
FIG. 8 is a training set picture in an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to FIGS. 1 to 8.
Example: a pharyngeal swab oral cavity nucleic acid sampling area image identification method, as shown in FIGS. 1 to 8, specifically comprising the following steps:
S1: acquiring oral cavity images of people of different ages with a robot in different environments, and using the acquired images as a training set, a verification set and a test set;
S2: training a Deeplab V3+ network model on the acquired images; the Deeplab V3+ model is shown in FIG. 1 and comprises an Encoder part and a Decoder part;
S3: carrying out verification analysis on the trained oral cavity M-region segmentation model;
the specific steps for training the Deeplab V3+ model in S2 are as follows (a sketch of the ASPP branch follows these steps):
1) the training set, verification set and test set are divided in the ratio 8:1:1; the image is first input into the Encoder and, after processing by a deep convolutional neural network (DCNN), two feature layers are obtained: a shallow effective feature layer and a deep effective feature layer;
2) after a 1×1 convolution, the shallow effective feature layer enters the Decoder and is stacked with the result of 4× up-sampling of the high-semantic-information feature layer;
3) after a 3×3 convolution of the stacked feature layers, up-sampling by a factor of four yields the final effective feature layer, i.e. a condensed feature representation of the whole picture;
4) resize is used to make the height and width of the final output layer the same as the size of the original picture;
the specific steps in step 2) for obtaining the high-semantic-information feature layer are as follows:
(1) using the ASPP structure, the deep effective feature layer obtained in step 1) is given a 1×1 convolution, 3×3 convolutions with dilation rates of 6, 12 and 18 respectively, and image pooling, yielding 5 effective feature layers;
(2) the 5 effective feature layers are stacked and the number of channels is adjusted with a 1×1 convolution to obtain the high-semantic-information feature layer.
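A minimal PyTorch sketch of the ASPP branch just described: a 1×1 convolution, three 3×3 convolutions with dilation rates 6, 12 and 18, and image pooling, stacked and reduced with a 1×1 convolution. The channel sizes (2048 in, 256 out) are illustrative assumptions, not values from the patent:

```python
import torch
import torch.nn.functional as F
from torch import nn

class ASPPSketch(nn.Module):
    """Sketch of the ASPP branch in steps (1)-(2); channel sizes are assumptions."""

    def __init__(self, in_ch: int = 2048, out_ch: int = 256):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, out_ch, kernel_size=1)                           # 1x1 convolution
        self.branch2 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=6, dilation=6)    # 3x3, dilation 6
        self.branch3 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=12, dilation=12)  # 3x3, dilation 12
        self.branch4 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=18, dilation=18)  # 3x3, dilation 18
        self.image_pool = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                        nn.Conv2d(in_ch, out_ch, kernel_size=1))         # image pooling
        self.project = nn.Conv2d(5 * out_ch, out_ch, kernel_size=1)    # step (2): adjust channel count

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        size = x.shape[-2:]
        pooled = F.interpolate(self.image_pool(x), size=size, mode="bilinear", align_corners=False)
        feats = [self.branch1(x), self.branch2(x), self.branch3(x), self.branch4(x), pooled]
        return self.project(torch.cat(feats, dim=1))   # stack the 5 effective feature layers, 1x1 conv
```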
The Deeplab V3+ model uses an Xception backbone network. Xception is an extreme version of Inception: the Inception structure is an intermediate form between conventional convolution and depthwise separable convolution, whereas Xception is decoupled entirely into depthwise separable convolutions, so that the mapping of cross-channel correlations and the mapping of spatial correlations in the feature maps of the convolutional neural network are completely decoupled. The Xception architecture has 36 convolutional layers forming the feature-extraction base of the network; these 36 layers are organized into 14 modules, all of which, except the first and the last, have linear residual connections around them. Data first passes through the entry flow, then through the middle flow, which is repeated 8 times, and finally through the exit flow.
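The unit Xception repeats inside these modules is the depthwise separable convolution, which performs the decoupling described above: a per-channel spatial convolution followed by a 1×1 pointwise convolution across channels. A minimal sketch, with illustrative channel counts:

```python
from torch import nn

class SeparableConv2d(nn.Module):
    """Depthwise separable convolution, the building block of Xception:
    a per-channel spatial convolution (spatial correlations) followed by
    a 1x1 pointwise convolution (cross-channel correlations)."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```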
Experimental data:
Oral cavity images of 81 volunteers, including children, young adults, middle-aged and elderly people, were collected in different environments, 1569 oral cavity images in total; the data set was expanded to 7845 pharyngeal swab images through various horizontal flips and rotations. A professional pharyngeal swab nucleic acid sampling doctor labeled each picture with the labelme labeling tool to generate a corresponding label file; the resulting training set is shown in FIG. 8. 80% of the images are used as the training set, 10% as the verification set, and 10% as the test set.
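As an illustration of the data-set expansion and the 80/10/10 split, here is a sketch under the assumption that each original image yields itself, one horizontal flip and three rotations (1569 × 5 = 7845, consistent with the counts above); the specific rotation angles are assumptions:

```python
import random
from PIL import Image

def expand_pair(img: Image.Image, mask: Image.Image):
    """Expand one labeled picture into 5 variants: the original, one
    horizontal flip, and rotations by the assumed angles 90/180/270,
    applying the identical transform to image and label mask."""
    pairs = [(img, mask),
             (img.transpose(Image.FLIP_LEFT_RIGHT),
              mask.transpose(Image.FLIP_LEFT_RIGHT))]
    for angle in (90, 180, 270):
        pairs.append((img.rotate(angle), mask.rotate(angle)))
    return pairs

def split_8_1_1(samples: list, seed: int = 0):
    """Shuffle and split the expanded data set 80%/10%/10% into
    training, verification and test sets."""
    rng = random.Random(seed)
    rng.shuffle(samples)
    n_train = int(0.8 * len(samples))
    n_val = int(0.1 * len(samples))
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```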
The environment configuration for model training is shown in the following table:
[Table: environment configuration for model training]
The backbone feature-extraction network used is Xception, and model training is divided into a freezing stage and a thawing stage. Freezing stage: the backbone of the model is frozen and the feature-extraction network does not change; the occupied video memory is small, and only the rest of the network is fine-tuned. Thawing stage: the backbone of the model is no longer frozen and the feature-extraction network changes; the occupied video memory is large, and all parameters of the network are updated. The model is trained in a pytorch-gpu environment, with the training set, verification set and test set in the ratio 8:1:1. The LOSS function used for training consists of two parts: the common cross-entropy loss function (Cross Entropy Loss) and the set similarity measurement function (Dice Loss). The cross-entropy loss function formula is:
Cross Entropy Loss = -Σ_{i=1}^{C} p_i log(q_i)
where C represents the number of classes (here the number of classes is 1), p_i is the true value, and q_i is the predicted value. The set similarity measurement function is generally used to calculate the similarity between two samples; its formula is:
Dice = 2|X ∩ Y| / (|X| + |Y|)
where |X ∩ Y| is the intersection of X and Y, and |X| and |Y| denote the numbers of elements of X and Y, respectively. The coefficient 2 in the numerator is there because the denominator counts the elements common to X and Y twice. The larger the Dice coefficient, the greater the overlap between the predicted result and the true result; used as a loss, Dice Loss = 1 - Dice, so a smaller value is better, and this is taken as the semantic-segmentation loss. The loss curves obtained from training are shown in FIG. 3. The curves show that the loss value of the training set (final value 0.08613) and the loss value of the test set (final value 0.10026) gradually decrease and converge, so the trained model is satisfactory.
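The following sketch assembles the training loss just described (Cross Entropy Loss plus Dice Loss, with Dice Loss = 1 - Dice) and the freeze/thaw switch for the backbone. It assumes a two-class output (M region vs. background) and a model with a `backbone` attribute, neither of which is specified in the patent:

```python
import torch
import torch.nn.functional as F
from torch import nn

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Dice Loss = 1 - 2|X ∩ Y| / (|X| + |Y|), on the M-region probabilities."""
    prob = pred.softmax(dim=1)[:, 1]          # assumed: channel 1 is the M region
    inter = (prob * target).sum()
    return 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)

def training_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """LOSS = Cross Entropy Loss + Dice Loss, as described above."""
    return F.cross_entropy(pred, target) + dice_loss(pred, target.float())

def set_backbone_frozen(model: nn.Module, frozen: bool) -> None:
    """Freezing stage: backbone fixed, only the head is fine-tuned;
    thawing stage: all network parameters are updated."""
    for p in model.backbone.parameters():
        p.requires_grad = not frozen
```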
Image processing:
For the trained model, the input image is first resized. This processing can distort the image; to ensure that the image is not distorted during resizing, gray bars are padded into the vacant regions of the image. A comparison of the distorted image and the image padded with gray bars is shown in FIG. 4. The image with the added gray bars must then be normalized, its channels moved to the first dimension, and a batch-size dimension added. Because the prediction result includes the gray-bar region, the gray bars must be cropped off, and seg-img is used to create a new image the same size as the original.
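A sketch of this pre- and post-processing: pad the resized image with gray bars to avoid distortion, normalize, move the channels to the first dimension, add the batch-size dimension, and crop the gray bars back out of the prediction. The 512×512 target size and the gray value 128 are assumptions:

```python
import numpy as np
from PIL import Image

def letterbox(img: Image.Image, size: int = 512):
    """Resize while keeping the aspect ratio; fill the vacancy with gray bars."""
    w, h = img.size
    scale = min(size / w, size / h)
    nw, nh = int(w * scale), int(h * scale)
    canvas = Image.new("RGB", (size, size), (128, 128, 128))   # gray bars
    offset = ((size - nw) // 2, (size - nh) // 2)
    canvas.paste(img.resize((nw, nh), Image.BILINEAR), offset)
    return canvas, offset, (nw, nh)

def to_batch(img: Image.Image) -> np.ndarray:
    """Normalize, move channels to the first dimension, add the batch-size dimension."""
    x = np.asarray(img, dtype=np.float32) / 255.0   # normalization
    x = x.transpose(2, 0, 1)                        # channels first
    return x[None, ...]                             # batch dimension

def crop_gray_bars(pred: np.ndarray, offset, content_size) -> np.ndarray:
    """Cut the gray-bar region back off an HxW prediction map."""
    (ox, oy), (nw, nh) = offset, content_size
    return pred[oy:oy + nh, ox:ox + nw]
```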
Evaluation indexes:
To evaluate the segmentation effect on the M region, Accuracy, Recall and Precision are used as evaluation indexes; the calculation formulas are as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
where TP (true positive) denotes a positive class predicted as positive, i.e. a true positive; FP (false positive) denotes a negative class predicted as positive, i.e. a false positive; TN (true negative) denotes a negative class predicted as negative, i.e. a true negative; FN (false negative) denotes a positive class predicted as negative, i.e. a false negative. The segmentation results of the images were judged and classified by a professional pharyngeal swab nucleic acid sampling doctor to determine the corresponding values.
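The three indexes computed from the confusion counts; as a usage example, the counts reported below (169 TP, 4 FP, 19 TN, 13 FN) are plugged in:

```python
def segmentation_metrics(tp: int, fp: int, tn: int, fn: int):
    """Accuracy, Recall and Precision from the confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return accuracy, recall, precision

# Usage with the counts reported below (169 TP, 4 FP, 19 TN, 13 FN):
acc, rec, prec = segmentation_metrics(tp=169, fp=4, tn=19, fn=13)
print(f"Accuracy={acc:.2%}  Recall={rec:.2%}  Precision={prec:.2%}")
# Recall -> 92.86%, Precision -> 97.69%
```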
For further test analysis of the model, 205 new oral cavity images were collected as a test set, comprising 182 standard oral cavity images and 23 non-standard oral cavity images; a professional pharyngeal swab nucleic acid sampling doctor judged whether each prediction met the standard. The effect of predicting the M region and the background region for a standard oral cavity is shown in FIG. 6; for a non-standard oral cavity, in FIG. 7. The detection classification results are shown in Table 1. The model's classification results are 169 TP, 4 FP, 19 TN and 13 FN, and the calculated Accuracy, Recall and Precision are 92.12%, 92.86% and 97.69%, respectively.
          N          P
F        13 (FN)    4 (FP)
T        19 (TN)  169 (TP)
TABLE 1. Classification results
Experimental comparison: to verify the segmentation effect of the Deeplab V3+ model, U-Net and the MobileNetV2 and Xception networks of the Deeplab V3+ model are selected for comparison; the Accuracy, Recall and Precision values of the different networks on the test set are shown in Table 2:
              Accuracy   Recall    Precision
U-Net         75.45%     79.56%    89.67%
MobileNetV2   77.07%     81.87%    91.41%
Xception      92.12%     92.86%    97.69%
TABLE 2
The segmentation results of MobileNetV2 and Xception are shown in FIG. 5. In the first column, the first two rows are pharyngeal swab images containing the M region and the last two rows are pharyngeal swab images without the M region; in the second column, the first two rows show M-region contours labeled by doctors, while the non-standard acquisition regions in the last two rows are unlabeled; the third and fourth columns show the M region and background region segmented by MobileNetV2 and Xception, respectively. The experimental results show that the improved Xception effectively distinguishes the M region from the background region and predicts the M-region contour more accurately.
The working principle is as follows: according to the depth separable convolution, a Deeplab V3+ model is provided, the expansion convolution with different expansion rates is adopted for feature extraction, the receptive field of the network is improved, the network has different feature receptive conditions, and therefore M areas with discontinuous or fuzzy pharyngeal swab image boundaries can be effectively segmented, and the segmentation progress is higher than that of other networks; the Deeplab V3+ model adopts an Xception network, the Xception is completely decoupled into depth separable convolution, and the mapping of cross channel correlation and spatial correlation in the convolutional neural network feature mapping can be completely decoupled.
This embodiment merely explains the present invention and does not limit it; those skilled in the art may, after reading this specification, modify the embodiment as needed without making an inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the present invention.

Claims (2)

1. A pharyngeal swab oral cavity nucleic acid sampling area image identification method, characterized in that the method specifically comprises the following steps:
S1: acquiring oral cavity images of people of different ages with a robot in different environments, and using the acquired images as a training set, a verification set and a test set;
S2: training a Deeplab V3+ network model on the acquired images;
S3: carrying out verification analysis on the trained oral cavity M-region segmentation model;
the specific steps for training the Deeplab V3+ model in S2 are as follows:
1) dividing the training set, the verification set and the test set in the ratio 8:1:1; inputting the image into the Encoder first, and obtaining, after processing by a deep convolutional neural network (DCNN), two feature layers: a shallow effective feature layer and a deep effective feature layer;
2) after a 1×1 convolution, passing the shallow effective feature layer into the Decoder and stacking it with the result of 4× up-sampling of the high-semantic-information feature layer;
3) after a 3×3 convolution of the stacked feature layers, up-sampling by a factor of four to obtain the final effective feature layer, i.e. a condensed feature representation of the whole picture;
4) using resize to make the height and width of the final output layer the same as the size of the original picture;
the specific steps in step 2) for obtaining the high-semantic-information feature layer are as follows:
(1) using the ASPP structure, applying to the deep effective feature layer obtained in step 1) a 1×1 convolution, 3×3 convolutions with dilation rates of 6, 12 and 18 respectively, and image pooling, to obtain 5 effective feature layers;
(2) stacking the 5 effective feature layers and adjusting the number of channels with a 1×1 convolution to obtain the high-semantic-information feature layer.
2. The method according to claim 1, characterized in that: the backbone network of the Deeplab V3+ model is Xception, and Xception is an extreme version of Inception.
CN202210563500.0A 2022-05-23 2022-05-23 Pharynx swab oral cavity nucleic acid sampling area image identification method Pending CN114998230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210563500.0A CN114998230A (en) 2022-05-23 2022-05-23 Pharynx swab oral cavity nucleic acid sampling area image identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210563500.0A CN114998230A (en) 2022-05-23 2022-05-23 Pharynx swab oral cavity nucleic acid sampling area image identification method

Publications (1)

Publication Number Publication Date
CN114998230A true CN114998230A (en) 2022-09-02

Family

ID=83027697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210563500.0A Pending CN114998230A (en) 2022-05-23 2022-05-23 Pharynx swab oral cavity nucleic acid sampling area image identification method

Country Status (1)

Country Link
CN (1) CN114998230A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116109982A (en) * 2023-02-16 2023-05-12 哈尔滨星云智造科技有限公司 Biological sample collection validity checking method based on artificial intelligence
CN116129112A (en) * 2022-12-28 2023-05-16 深圳市人工智能与机器人研究院 Oral cavity three-dimensional point cloud segmentation method of nucleic acid detection robot and robot

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112489019A (en) * 2020-12-01 2021-03-12 合肥工业大学 Method for rapidly identifying chopped fibers in GFRC image based on deep learning
CN112508977A (en) * 2020-12-29 2021-03-16 天津科技大学 Deep learning-based semantic segmentation method for automatic driving scene

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112489019A (en) * 2020-12-01 2021-03-12 合肥工业大学 Method for rapidly identifying chopped fibers in GFRC image based on deep learning
CN112508977A (en) * 2020-12-29 2021-03-16 天津科技大学 Deep learning-based semantic segmentation method for automatic driving scene

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129112A (en) * 2022-12-28 2023-05-16 深圳市人工智能与机器人研究院 Oral cavity three-dimensional point cloud segmentation method of nucleic acid detection robot and robot
CN116109982A (en) * 2023-02-16 2023-05-12 哈尔滨星云智造科技有限公司 Biological sample collection validity checking method based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN114998230A (en) Pharynx swab oral cavity nucleic acid sampling area image identification method
CN111985536B (en) Based on weak supervised learning gastroscopic pathology image Classification method
CN102096819B (en) Method for segmenting images by utilizing sparse representation and dictionary learning
CN110021425B (en) Comparison detector, construction method thereof and cervical cancer cell detection method
CN108596038B (en) Method for identifying red blood cells in excrement by combining morphological segmentation and neural network
CN110097974A (en) A kind of nasopharyngeal carcinoma far-end transfer forecasting system based on deep learning algorithm
CN111524137A (en) Cell identification counting method and device based on image identification and computer equipment
CN110111895A (en) A kind of method for building up of nasopharyngeal carcinoma far-end transfer prediction model
CN113069080A (en) Difficult airway assessment method and device based on artificial intelligence
CN115984850A (en) Lightweight remote sensing image semantic segmentation method based on improved Deeplabv3+
CN114782753A (en) Lung cancer histopathology full-section classification method based on weak supervision learning and converter
CN114201632A (en) Label noisy data set amplification method for multi-label target detection task
CN112085742B (en) NAFLD ultrasonic video diagnosis method based on context attention
CN114140437A (en) Fundus hard exudate segmentation method based on deep learning
CN112927215A (en) Automatic analysis method for digestive tract biopsy pathological section
CN115690704B (en) LG-CenterNet model-based complex road scene target detection method and device
CN115908421A (en) Active learning medical image segmentation method based on superpixels and diversity
CN116563205A (en) Wheat spike counting detection method based on small target detection and improved YOLOv5
CN111259914B (en) Hyperspectral extraction method for characteristic information of tea leaves
CN114565762A (en) Weakly supervised liver tumor segmentation based on ROI and split fusion strategy
CN111783571A (en) Cervical cell automatic classification model establishment and cervical cell automatic classification method
CN110992309A (en) Fundus image segmentation method based on deep information transfer network
CN116386857B (en) Pathological analysis system and method
CN115841847B (en) Microorganism information determination and extraction system and method
CN116758068B (en) Marrow picture cell morphology analysis method based on artificial intelligence

Legal Events

Code  Description
PB01  Publication
SE01  Entry into force of request for substantive examination
RJ01  Rejection of invention patent application after publication (application publication date: 20220902)