CN112686329A - Electronic laryngoscope image classification method based on dual-core convolution feature extraction - Google Patents
- Publication number
- CN112686329A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses an electronic laryngoscope image classification method based on dual-core convolution feature extraction, belonging to the field of computer vision and pattern recognition. The method comprises the following steps. First, the laryngoscope video is preprocessed: it is split into frames, each image is cropped to retain the effective image information, and the image is resized to 224 × 224. Second, a deep convolutional neural network capable of capturing detailed image information is designed; the image is input into the network and high-order image features carrying detail information are extracted. Third, the extracted image features are used to train an ensemble classifier with the extreme gradient boosting (XGBoost) ensemble method, which yields the electronic laryngoscope image classification result. The image features extracted by the invention carry rich detail such as texture features, shape features and position information and, combined with the XGBoost ensemble classification method, effectively improve the accuracy of electronic laryngoscope image classification.
Description
Technical Field
The invention belongs to the field of computer vision and pattern recognition. Specifically, it relates to a deep convolutional neural network (CNN) capable of extracting rich detail features, combined with an extreme gradient boosting (XGBoost) ensemble classification method, for classifying and identifying electronic laryngoscope images. By using low-order texture features, shape features, position information and similar cues, the method can classify and identify parts of the electronic laryngoscope image whose signs are not obvious, thereby improving the classification and identification of the nasopharynx and vocal cord closure classes and the overall classification accuracy.
Background
The electronic laryngoscope is the main auxiliary tool used by otolaryngologists to diagnose diseases of the ear, nose and throat, and the analysis of electronic laryngoscope images is a direct reference for judging these diseases. An electronic laryngoscopy examination generates a large number of video images, among which clear, high-quality pictures of the major anatomical sites are the primary basis on which technicians write examination reports and doctors diagnose disease. At present, pictures of the different sites are captured manually, by eye, during the examination, which leads to missed selections, low efficiency and poor reliability. Automatically classifying electronic laryngoscope images with computer-aided methods is therefore an important means of improving diagnostic efficiency and accuracy, and is also a mainstream direction of intelligent medicine.
Current research on electronic laryngoscope image classification falls roughly into two types. The first uses traditional image classification: low-order image features of the laryngeal image, such as color features, texture features, geometric shape features, image intensity gradient direction and frequency content, are extracted, and a support vector machine (SVM) is then trained as the classification model to classify and identify laryngeal sites in electronic laryngoscope images. This approach extracts only low-order image features, not deeper ones, and the model is prone to overfitting. The second type uses deep learning, applying an existing classical convolutional neural network or a transfer learning method to classify and identify the electronic laryngoscope images. However, this approach often ignores the particularity and detail features of electronic laryngoscope images, which easily leads to low classification efficiency and accuracy.
Disclosure of Invention
To address these problems in existing electronic laryngoscope image classification and identification, the invention provides a deep CNN that extracts high-order image features carrying detail information, with low-order feature enhancement. The network extracts image features rich in detail, and the XGBoost ensemble method is then used for classification, which effectively improves the efficiency and accuracy with which the deep learning model classifies and identifies electronic laryngoscope images. To achieve this purpose, the technical scheme of the invention is as follows:
Step 1: Data preprocessing. Specifically:
The acquired electronic laryngoscope video is separated frame by frame to obtain all image frames of the video, giving the electronic laryngoscope images. The image frames fall into 6 classes: nasal cavity, nasopharynx, epiglottis, vocal cord closure, vocal cord opening, and outside the body / blurred. The redundant part of each image is cropped: the black area around the image that contains no laryngoscope content is removed, and the useful image information is kept. In addition, the retained image data is flipped, color-adjusted and randomly cropped using TensorFlow data augmentation methods to expand the training data set, and each image is resized to 224 × 224 using the resize method.
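The border-cropping part of Step 1 can be sketched as follows. This is a minimal pure-Python illustration, not the patent's implementation: the function name `crop_black_border`, the grayscale list-of-lists image format and the intensity `threshold` of 10 are all assumptions made for the example.

```python
def crop_black_border(image, threshold=10):
    """Crop rows and columns whose pixels are all at or below `threshold`.

    `image` is a 2D list of grayscale intensities (0-255); this format and the
    threshold value are assumptions for illustration only.
    Returns the cropped 2D list, or an empty list if no pixel exceeds the threshold.
    """
    rows = [r for r, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [c for c in range(len(image[0])) if any(row[c] > threshold for row in image)]
    if not rows or not cols:
        return []
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 4 x 5 frame with a black border around a 2 x 2 bright region.
frame = [
    [0, 0, 0, 0, 0],
    [0, 90, 120, 0, 0],
    [0, 80, 200, 0, 0],
    [0, 0, 0, 0, 0],
]
cropped = crop_black_border(frame)
print(cropped)
```

In practice the cropped image would then be augmented and resized to 224 × 224 with the framework's own image utilities, as the text describes.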
Step 2: Extract image features using the dual-core convolution method. Specifically:
Step 2.1: The processed electronic laryngoscope image is input into a convolution layer with a 3 × 3 convolution kernel to construct feature sequence 1. To provide richer features for electronic laryngoscope image classification, image features are then extracted using convolution layers with kernel sizes of 3 and 5 respectively, each performing a two-dimensional convolution. A ReLU activation function is used in each convolution layer. Besides acting as the activation, ReLU sets the output of some neurons to 0, i.e. those neurons are not activated and have no effect, which makes the network sparse and improves its computational efficiency. A batch normalization (BN) layer is also used in the network to normalize the features obtained by the convolution layers, which speeds up convergence during model training and improves model accuracy. Every convolution kernel is two-dimensional and is computed in the same way, according to the following formula:
where $\mathrm{Conv}_d(i,j)$ denotes the two-dimensional convolution; $d$ denotes the convolution kernel size; $X_k$ denotes the $k$-th input matrix; $W_k$ denotes the $k$-th weight matrix; $b$ denotes the bias term; $K$ denotes the number of input filters; $i$ denotes the abscissa of the image matrix; and $j$ denotes the ordinate of the image matrix.
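The body of Equation (1) is not reproduced in this text. One form consistent with the symbol definitions above, offered as an assumption rather than as the patent's exact expression, sums the two-dimensional convolutions of the input matrices with their weight matrices and adds the bias. The window offsets $m$ and $n$ are introduced here only for illustration, and the sliding-window product is written in the cross-correlation form commonly used by CNN frameworks.

```latex
\mathrm{Conv}_d(i,j) \;=\; \sum_{k=1}^{K} \bigl(X_k * W_k\bigr)(i,j) \;+\; b,
\qquad
\bigl(X_k * W_k\bigr)(i,j) \;=\; \sum_{m=0}^{d-1}\sum_{n=0}^{d-1} X_k(i+m,\; j+n)\, W_k(m,n)
```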
The ReLU activation function is used in the convolution layer, with the following formula:
$\mathrm{Relu}_d(i,j)=\max\bigl(0,\ \mathrm{Conv}_d(i,j)\bigr)$ (2)
where $\mathrm{Relu}_d(i,j)$ denotes the ReLU activation function and $\max$ denotes the maximum over the set of elements. The batch normalization formula is as follows:
where $\mathrm{BN}_d(i,j)$ denotes batch normalization; $E[\cdot]$ denotes the mean of the input matrix; and $\mathrm{Var}[\cdot]$ denotes the variance of the input matrix.
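The three operations of Step 2.1 can be illustrated on a single small matrix. The sketch below is a simplification under stated assumptions: one input channel, stride 1, no padding, and a normalization that only subtracts the mean and divides by the square root of the variance, without the learned scale and shift that framework BN layers usually add. The function names, the `eps` stabilizer and the example values are hypothetical. Normalization is applied after ReLU here because that is the order in which the formulas are introduced; the layer listing (4) to (6) writes the order as Conv, BN, Relu.

```python
import math


def conv2d_valid(x, w, b=0.0):
    """Slide the d x d kernel `w` over the 2D input `x` (stride 1, no padding)."""
    d = len(w)
    out_h = len(x) - d + 1
    out_w = len(x[0]) - d + 1
    return [
        [
            sum(x[i + m][j + n] * w[m][n] for m in range(d) for n in range(d)) + b
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(x):
    """Equation (2): element-wise max(0, value)."""
    return [[max(0.0, v) for v in row] for row in x]


def standardize(x, eps=1e-5):
    """Subtract the mean and divide by sqrt(variance + eps) over the whole matrix.

    `eps` is an assumed numerical stabilizer, not a value taken from the text.
    """
    values = [v for row in x for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = math.sqrt(var + eps)
    return [[(v - mean) / scale for v in row] for row in x]


# Hypothetical 4 x 4 input and 3 x 3 kernel (d = 3).
image = [
    [1.0, 2.0, 0.0, 1.0],
    [0.0, 1.0, 3.0, 1.0],
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 2.0],
]
kernel = [
    [1.0, 0.0, -1.0],
    [1.0, 0.0, -1.0],
    [1.0, 0.0, -1.0],
]
conv = conv2d_valid(image, kernel)
activated = relu(conv)
normalized = standardize(activated)
print(conv, activated)
```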
The networks used were as follows:
Conv_layer(kernel_size=5)+BN+Relu (4)
Conv_layer(kernel_size=3)+BN+Relu (5)
Conv_layer(kernel_size=1)+BN+Relu (6)
MaxPooling_layer(pool_size=2) (7)
Step 2.2: The convolution features of the previous layer and the features extracted by the dual-core CNN are fused into a new feature. Passing the low-order features of the image to the next layer provides the next unit with low-order feature information such as texture, position and shape, and improves the transfer of low-order features. The model thus learns high-order features that contain detail information, which improves its perception of image details. A pooling operation of size 2 is then applied to the high-dimensional fused features to reduce their dimension and compress them, which speeds up training and improves the fault tolerance of the model. The feature fusion formula is as follows:
where $\mathrm{Output}(i,j)$ denotes the fused feature output obtained from the different convolutions; $\mathrm{BN}_3(i,j)$ denotes the matrix obtained by convolution with kernel size 3 followed by normalization; $\mathrm{BN}_5(i,j)$ denotes the matrix obtained by convolution with kernel size 5 followed by normalization; Concatenate denotes feature concatenation, i.e. the splicing operator; and $\mathrm{Conv}_d(i,j)$ denotes the two-dimensional convolution. The pooling layer formula is as follows:
$\mathrm{MaxPooling}(i,j)=\max\bigl(\mathrm{Output}(i,j)\bigr)$ (9)
where $\mathrm{MaxPooling}(i,j)$ denotes the max-pooled output.
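The fusion and pooling of Step 2.2 can be sketched as channel-wise concatenation followed by 2 × 2 max pooling. This is an illustrative reading, assuming that the three inputs (the kernel-size-3 branch, the kernel-size-5 branch and the previous layer's convolution features) have the same spatial size and are joined along the channel axis. The function names and the 4 × 4 example values are hypothetical.

```python
def concatenate_channels(*feature_maps):
    """Join several lists of 2D channels into one list of channels."""
    fused = []
    for channels in feature_maps:
        fused.extend(channels)
    return fused


def max_pool_2x2(channel):
    """Take the maximum over non-overlapping 2 x 2 windows of one 2D channel."""
    h = len(channel) // 2 * 2
    w = len(channel[0]) // 2 * 2
    return [
        [
            max(channel[i][j], channel[i][j + 1], channel[i + 1][j], channel[i + 1][j + 1])
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


# Hypothetical single-channel 4 x 4 outputs of the three sources being fused.
branch_3x3 = [[[1, 2, 0, 1], [0, 5, 1, 1], [2, 2, 3, 0], [1, 0, 0, 4]]]
branch_5x5 = [[[0, 1, 1, 0], [1, 0, 0, 2], [6, 1, 0, 0], [0, 0, 1, 1]]]
previous = [[[3, 0, 0, 0], [0, 0, 0, 7], [0, 1, 2, 0], [0, 0, 0, 0]]]

fused = concatenate_channels(branch_3x3, branch_5x5, previous)
pooled = [max_pool_2x2(channel) for channel in fused]
print(len(fused), pooled)
```

Concatenation keeps every branch's features side by side, so the number of channels grows while pooling halves each spatial dimension.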
Step 2.3: Feature sequence 1 and feature sequence 2 are further fused and enhanced. The fused features are then passed through a convolution layer with a 1 × 1 kernel and a pooling layer with a 2 × 2 window to obtain the further image features of Step 2.3, which provides more high-order image features for the subsequent training of the classification model and yields features with strong semantics.
Step 2.2 and Step 2.3 are repeated four times. The features of each layer are fused and enhanced and then input into the next module, providing richer features for electronic laryngoscope image classification. In the final repetition, the pooling layer operation is replaced by a fully connected layer, which provides texture and position information features for the next module.
Step 3: Train the classification network model. Specifically:
An XGBoost ensemble learner is used as the classifier of the model and is trained to obtain an electronic endoscope image classification model with high accuracy and good generalization. The image features obtained in Step 2 are input into XGBoost to train the classifier. A random mini-batch training strategy is adopted: before each training pass, images are drawn at random to form a mini-batch of size 60. In the experiments, the raw data is divided into training, validation and test sets at a ratio of 7:2:1, and 10-fold cross-validation is used to evaluate the predictive performance of the model. The ReLU activation function is used in the feature extraction stage, and parameters are optimized with the Adam optimization method during training. To avoid overfitting, Dropout is applied before the fully connected layer of the network, and L2 regularization is applied to all weight parameters. Throughout training, the Dropout rate and the L2 regularization weight are set to 0.5 and 0.0005 respectively, the learning rate is initialized to 0.0001, and the number of iterations is 500, which yields the final fused features.
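The 7:2:1 split and the random mini-batches of size 60 described in Step 3 can be sketched with the standard library alone. The function names, the fixed random seed and the sample count of 1000 are assumptions chosen for the example; the text does not say how the split is shuffled or how a remainder is handled.

```python
import random


def split_indices(n_samples, ratios=(7, 2, 1), seed=0):
    """Shuffle sample indices and split them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


def mini_batches(indices, batch_size=60, seed=0):
    """Yield randomly ordered batches of at most `batch_size` indices."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


# Hypothetical data set of 1000 samples.
train, val, test = split_indices(1000)
batches = list(mini_batches(train))
print(len(train), len(val), len(test), len(batches))
```

With 700 training samples, the last batch holds the 40 samples left over after eleven full batches of 60.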
Drawings
FIG. 1 is an image of a data pre-processing stage in an embodiment of the present invention;
FIG. 2 is a view showing the nasopharyngeal area in an electronic laryngoscope image according to an embodiment of the invention;
FIG. 3 is a diagram of a network architecture of the present invention in an embodiment of the present invention;
FIG. 4 is a view showing the visualization of the feature of the nasopharyngeal area in the embodiment of the present invention;
FIG. 5 is a confusion matrix of recognition results in an embodiment of the present invention;
FIG. 6 is an overall process of the present invention for classifying electronic laryngoscope video images;
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Step 1: The data preprocessing stage, as shown in FIG. 1.
Step 2: First, a picture to be processed is taken from Step 1, as shown in FIG. 2, and input into the dual-core convolution feature extraction network of FIG. 3. Second, to show the effect of the network's feature extraction, the features of the layer preceding the fully connected layer are obtained and visualized, giving the result shown in FIG. 4. Finally, the image is passed through the fully connected layer to obtain a 1 × 512-dimensional image feature F.
F=[0,0,0.24433999,0,0.7739287,2.735432,1.2492355,0,0,5.9589076,…, 2.735432,1.2492355,0,0,0,5.9589076]
Step 3: F obtained in Step 2 is input into the trained XGBoost classifier to obtain a vector P of predicted class probabilities.
P=[1.0717337e-02,9.7289306e-01,1.3869060e-02,2.4896192e-03, 3.8891599e-06,2.7013968e-05]
The maximum value in P is 9.7289306e-01. Combined with the order of the labels, FIG. 2 is therefore predicted to be nasopharynx, in full agreement with the true label.
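Reading the prediction off P amounts to taking the index of the largest probability and looking up its label. In the sketch below the values of P are copied from the text, while the label order in `CLASS_NAMES` is an assumption taken from the order in which the six classes are listed in Step 1.

```python
# Assumed label order: the order in which the six classes are listed in the text.
CLASS_NAMES = [
    "nasal cavity",
    "nasopharynx",
    "epiglottis",
    "vocal cord closure",
    "vocal cord opening",
    "outside the body / blurred",
]

# Probability vector P as given in the text.
P = [1.0717337e-02, 9.7289306e-01, 1.3869060e-02,
     2.4896192e-03, 3.8891599e-06, 2.7013968e-05]

# Index of the largest probability, then its label.
best_index = max(range(len(P)), key=lambda i: P[i])
predicted_label = CLASS_NAMES[best_index]
print(predicted_label, P[best_index])
```

Under that assumed order, the largest entry sits at index 1, which matches the nasopharynx prediction stated above.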
Step 4: To verify the performance of the model, 3382 images from the 6 classes (nasal cavity, nasopharynx, epiglottis, vocal cord closure, vocal cord opening, and blurred) were input into the trained model for prediction, and a confusion matrix was drawn to evaluate the performance of the model, as shown in FIG. 5. The accuracies for nasal cavity, nasopharynx, epiglottis, vocal cord closure, vocal cord opening and blurred are 96.180%, 84.399%, 97.104%, 87.742%, 87.290% and 99.572%, respectively.
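Per-class figures of this kind are typically read off a confusion matrix by dividing each diagonal entry by its row total. The sketch below assumes that interpretation (rows are true classes, columns are predicted classes), which the text does not state explicitly. The 3-class counts are hypothetical and are not the values behind FIG. 5.

```python
def per_class_accuracy(confusion):
    """Return, for each true class, the fraction of its samples predicted correctly.

    `confusion[t][p]` counts samples of true class t that were predicted as class p.
    """
    return [
        row[t] / sum(row) if sum(row) else 0.0
        for t, row in enumerate(confusion)
    ]


# Hypothetical 3-class counts, used only to illustrate the calculation.
toy_confusion = [
    [48, 2, 0],
    [5, 40, 5],
    [0, 1, 99],
]
accuracies = per_class_accuracy(toy_confusion)
print(accuracies)
```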
Claims (1)
1. The electronic laryngoscope image classification method based on the dual-core convolution feature extraction comprises the following steps:
step 1: preprocessing data, specifically;
separating the collected electronic laryngoscope video frame by frame to obtain all image frames of the laryngoscope video to obtain an electronic laryngoscope image, wherein the electronic laryngoscope image comprises 6 types of nasal cavity, nasopharynx, epiglottis, vocal cords closure, vocal cords opening and fuzziness; cutting the redundant part of the image, removing a black area without a laryngoscope part around the image, and keeping useful information of the image; in addition, reserved image data is subjected to turning, color adjustment and random cropping processing by using a TensorFlow data enhancement method, a training data set is expanded, and an image is adjusted to 224 × 224 by using a resize method;
step 2: extracting image features by using a dual-core convolution method, specifically;
step 2.1: inputting the processed electronic laryngoscope image into a convolution layer with a 3 × 3 convolution kernel to construct feature sequence 1; to provide richer features for electronic laryngoscope image classification, extracting image features using convolution layers with kernel sizes of 3 and 5 respectively, each performing a two-dimensional convolution; a ReLU activation function is used in each convolution layer, which besides acting as the activation sets the output of some neurons to 0, i.e. those neurons are not activated and have no effect, so that the network becomes sparse and its computational efficiency is improved; a batch normalization (BN) layer is used in the network to normalize the features obtained by the convolution layers, which speeds up convergence during model training and improves model accuracy; every convolution kernel is two-dimensional and is computed in the same way, according to the following formula:
wherein $\mathrm{Conv}_d(i,j)$ denotes the two-dimensional convolution, $d$ denotes the convolution kernel size, $X_k$ denotes the $k$-th input matrix, $W_k$ denotes the $k$-th weight matrix, $b$ denotes the bias term, $K$ denotes the number of input filters, $i$ denotes the abscissa of the image matrix, and $j$ denotes the ordinate of the image matrix;
the ReLU activation function is used in the convolution layer, with the following formula:
$\mathrm{Relu}_d(i,j)=\max\bigl(0,\ \mathrm{Conv}_d(i,j)\bigr)$ (2)
wherein $\mathrm{Relu}_d(i,j)$ denotes the ReLU activation function; $\max$ denotes the maximum over the set of elements; the batch normalization formula is as follows:
wherein $\mathrm{BN}_d(i,j)$ denotes batch normalization; $E[\cdot]$ denotes the mean of the input matrix; $\mathrm{Var}[\cdot]$ denotes the variance of the input matrix;
the networks used were as follows:
Conv_layer(kernel_size=5)+BN+Relu (4)
Conv_layer(kernel_size=3)+BN+Relu (5)
Conv_layer(kernel_size=1)+BN+Relu (6)
MaxPooling_layer(pool_size=2) (7)
step 2.2: mutually fusing the convolution characteristic of the previous layer and the characteristic extracted by the dual-core CNN into a new characteristic; the low-order features of the image are transmitted to the next layer, so that various low-order feature information such as textures, positions and shapes are provided for the next unit, and the low-order feature transmissibility is improved; the model learns high-order features containing detail information, and the perception capability of the model to image details is improved; then, pooling operation with the size of 2 is further performed on the high-dimensional features obtained through fusion, dimension reduction and compression are performed on the features, the training speed is increased, and meanwhile the fault tolerance of the model is improved; the feature fusion formula is as follows:
wherein $\mathrm{Output}(i,j)$ denotes the fused feature output obtained from the different convolutions; $\mathrm{BN}_3(i,j)$ denotes the matrix obtained by convolution with kernel size 3 followed by normalization; $\mathrm{BN}_5(i,j)$ denotes the matrix obtained by convolution with kernel size 5 followed by normalization; Concatenate denotes feature concatenation, i.e. the splicing operator; $\mathrm{Conv}_d(i,j)$ denotes the two-dimensional convolution; the pooling layer formula is as follows:
$\mathrm{MaxPooling}(i,j)=\max\bigl(\mathrm{Output}(i,j)\bigr)$ (9)
wherein $\mathrm{MaxPooling}(i,j)$ denotes the max-pooled output;
step 2.3: further fusing and enhancing feature sequence 1 and feature sequence 2, then passing the fused features through a convolution layer with a 1 × 1 kernel and a pooling layer with a 2 × 2 window to obtain the further image features of step 2.3, providing more high-order image features for the subsequent training of the classification model and obtaining features with strong semantics;
repeating the step 2.2 and the step 2.3 for four times, fusing and reinforcing the features of each layer, inputting the fused and reinforced features into the next module, providing more sufficient features for image classification of the electronic laryngoscope, changing the last operation of the pooling layer into a full connection layer, and providing various texture and position information features for the next module;
step 3: training a classification network model, specifically;
an extreme gradient boosting (XGBoost) ensemble learner is used as the classifier of the model and is trained to obtain an electronic endoscope image classification model with high accuracy and good generalization; the image features obtained in step 2 are input into XGBoost to train the classifier, using a random mini-batch training strategy in which images are drawn at random to form a mini-batch of size 60 before each training pass; in the experiments, the raw data is divided into a training set, a validation set and a test set at a ratio of 7:2:1; 10-fold cross-validation is used to evaluate the predictive performance of the model; the ReLU activation function is used in the feature extraction stage, and parameters are optimized with the Adam optimization method during training; to avoid overfitting, Dropout is applied before the fully connected layer of the network and L2 regularization is applied to all weight parameters; throughout training, the Dropout rate and the L2 regularization weight are set to 0.5 and 0.0005 respectively, the learning rate is initialized to 0.0001, and the number of iterations is 500, so as to obtain the final fused features.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110013954.6A CN112686329A (en) | 2021-01-06 | 2021-01-06 | Electronic laryngoscope image classification method based on dual-core convolution feature extraction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112686329A true CN112686329A (en) | 2021-04-20 |
Family
ID=75456025
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110013954.6A Pending CN112686329A (en) | 2021-01-06 | 2021-01-06 | Electronic laryngoscope image classification method based on dual-core convolution feature extraction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112686329A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210420 |