CN113555087A - Artificial intelligence film reading method based on convolutional neural network algorithm - Google Patents

Artificial intelligence film reading method based on convolutional neural network algorithm

Info

Publication number
CN113555087A
Authority
CN
China
Prior art keywords
target
network
neural network
model
artificial intelligence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110813909.9A
Other languages
Chinese (zh)
Inventor
薛帅
张丽
陈向
左万利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
First Hospital of Jilin University
Original Assignee
First Hospital of Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by First Hospital of Jilin University
Priority to CN202110813909.9A
Publication of CN113555087A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10132 Ultrasound image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Radiology & Medical Imaging (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Quality & Reliability (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of artificial intelligence, and particularly relates to an artificial intelligence film reading method based on a convolutional neural network algorithm, which comprises the following steps: step one: acquiring a target image to be detected; step two: selecting an image area; step three: extracting target features; step four: classifying the target according to the features; step five: conducting regression on the target bounding box; step six: optimizing the structure; step seven: completing target detection. The method has a reasonable structure, improves the accuracy of thyroid color Doppler ultrasound diagnosis, and prevents misdiagnosis on preoperative thyroid color Doppler ultrasound. Since input and output are handled entirely by program operations, the influence of human factors on image diagnosis is eliminated, and image features can be identified better than with conventional methods. Moreover, the algorithm's ability to learn and improve step by step and layer by layer lends itself well to further refinement of the program, and it is currently the AI method with the highest diagnostic accuracy for medical images.

Description

Artificial intelligence film reading method based on convolutional neural network algorithm
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an artificial intelligence film reading method based on a convolutional neural network algorithm.
Background
The incidence of thyroid cancer (TC) is rising rapidly worldwide, and in China surgery remains physicians' first treatment option for TC. However, most TCs grow slowly and have a good prognosis; their fatality rate is not reduced by aggressive surgical treatment, whereas surgery greatly reduces the quality of life of TC patients, so overdiagnosis and overtreatment of TC have become key points of clinical attention. The preoperative diagnosis of TC relies mostly on thyroid color Doppler ultrasound; however, the experience and skill of ultrasound physicians vary, which directly affects the accuracy of thyroid color Doppler ultrasound diagnosis. Studies have shown that misdiagnosis on preoperative thyroid color Doppler ultrasound is the primary cause of erroneous punctures and surgical overtreatment.
With the development of computer technology, artificial intelligence (AI) plays an increasingly important role in medical image recognition and disease diagnosis. Deep learning based on the convolutional neural network (CNN) algorithm shows remarkable application prospects, especially in visual structure and language recognition tasks. A number of studies have shown that, in the learning and diagnostic tasks of medical images, CNNs can provide more accurate diagnostic information than conventional methods. Some advanced clinical medicine centers have already used deep-learning AI algorithms to diagnose breast, lung, brain and liver diseases. However, to date, no AI software for thyroid color Doppler ultrasound has been developed.
Therefore, an artificial intelligence film reading method based on a convolutional neural network algorithm is provided to solve the problems.
Disclosure of Invention
This section is intended to summarize some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. Simplifications or omissions may be made in this section, as well as in the abstract and the title of this application, to avoid obscuring their purpose; such simplifications or omissions are not intended to limit the scope of the invention.
Therefore, the invention aims to provide an artificial intelligence film reading method based on a convolutional neural network algorithm, so as to improve the accuracy of thyroid color Doppler ultrasound diagnosis and prevent misdiagnosis on preoperative thyroid color Doppler ultrasound.
To solve the above technical problem, according to an aspect of the present invention, the present invention provides the following technical solutions:
an artificial intelligence film reading method based on a convolutional neural network algorithm comprises the following steps:
step one: acquiring a target image to be detected;
step two: selecting an image area;
step three: extracting target features;
step four: classifying the target according to the characteristics;
step five: conducting regression on the target bounding box;
step six: optimizing the structure;
step seven: and finishing target detection.
As a preferred scheme of the artificial intelligence film reading method based on the convolutional neural network algorithm of the invention, wherein: the algorithm is implemented mainly by means of the NanoDet model, and NanoDet comprises a feature extraction backbone network, a feature fusion network and a detection head.
As a preferred scheme of the artificial intelligence film reading method based on the convolutional neural network algorithm of the invention, wherein: in a neural network, particularly in the field of computer vision (CV), features of an image are generally extracted first; this part is the foundation of the whole CV task, so this part of the network structure is called the backbone. NanoDet selects ShuffleNetV2 1.0x as the backbone. ShuffleNetV2 1.0x is a modified version of ShuffleNetV1, and the modification follows the following 4 guidelines:
(1) keeping the input and output channel numbers equal minimizes the memory access cost (MAC), at which point the model is fastest;
(2) excessive use of group convolution increases the MAC and slows the model down;
(3) the fewer the model branches, the simpler the model and the faster it runs;
(4) element-wise operations also negatively impact model speed.
As a preferred scheme of the artificial intelligence film reading method based on the convolutional neural network algorithm of the invention, wherein: the feature fusion layer selects PAN (Path Aggregation Network), an improved version of the FPN (Feature Pyramid Network), and makes lightweight modifications on that basis; the Feature Pyramid Network (FPN) is an efficient CNN feature extraction method. A conventional convolutional neural network proceeds from bottom to top, with scale and semantic information changing continuously; the FPN enhances this by adding feature supplementation along top-down paths, so that the finally output features better represent the multi-dimensional information of the input picture.
As a preferred scheme of the artificial intelligence film reading method based on the convolutional neural network algorithm of the invention, wherein: NanoDet selects the detection head of the FCOS (Fully Convolutional One-Stage Object Detection) model; the idea of the FCOS model is to predict, for each point in the input image, the target class and target box to which it belongs; the overall architecture of the FCOS model is similar to the FPN (Feature Pyramid Network) structure, and prediction is performed on 5 fused feature layers.
Compared with the prior art, the invention has the beneficial effects that: the accuracy of thyroid color Doppler ultrasound diagnosis is improved and misdiagnosis on preoperative thyroid color Doppler ultrasound is prevented; since input and output are handled entirely by program operations, the influence of human factors on image diagnosis is eliminated, and image features can be identified better than with conventional methods. Moreover, the algorithm's ability to learn and improve step by step and layer by layer lends itself well to further refinement of the program, and it is currently the AI method with the highest diagnostic accuracy for medical images.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that other drawings can be obtained from them by those of ordinary skill in the art without inventive effort. Wherein:
FIG. 1 is a schematic structural view of the present invention;
FIG. 2 is a view of the construction of the nanoDet model of the present invention;
FIG. 3 is a diagram of a backbone network model architecture according to the present invention: (a) the basic ShuffleNet unit of ShuffleNetV1; (b) the ShuffleNetV1 unit with spatial down-sampling; (c) the basic ShuffleNetV2 unit; (d) the ShuffleNetV2 unit with spatial down-sampling;
FIG. 4 is a diagram of a characteristic pyramid network of the present invention;
FIG. 5 is a graph of FPN calculations according to the present invention;
FIG. 6 is a PAN calculation graph in accordance with the invention;
FIG. 7 is a diagram of an ultra lightweight PAN configuration of the present invention;
FIG. 8 is a diagram of the FCOS model architecture of the present invention;
FIG. 9 is a schematic view of the nanoDet detection head of the present invention;
FIG. 10 is a flowchart illustrating android app target detection according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, the present invention may be practiced in ways other than those specifically described herein, as will be apparent to those of ordinary skill in the art, without departing from its spirit and scope; therefore the present invention is not limited to the specific embodiments disclosed below.
Next, the present invention will be described in detail with reference to the drawings. For convenience of illustration, the schematic views of the structure are not partially enlarged to a uniform scale, and the drawings are only examples, which should not limit the scope of the present invention. In addition, the three dimensions of length, width and depth should be taken into account in actual implementation.
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Example 1
An artificial intelligence film reading method based on a convolutional neural network algorithm comprises the following steps:
step one: acquiring a target image to be detected;
step two: selecting an image area;
step three: extracting target features;
step four: classifying the target according to the characteristics;
step five: conducting regression on the target bounding box;
step six: optimizing the structure;
step seven: and finishing target detection.
Specifically, the algorithm of the invention is implemented mainly by means of the NanoDet model, and NanoDet comprises a feature extraction backbone network, a feature fusion network and a detection head.
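For illustration, a minimal structural sketch of such a three-part detector, assuming PyTorch, is given below; the class and argument names are placeholders introduced for this description rather than the actual NanoDet source code.

```python
import torch
import torch.nn as nn

class NanoDetStyleDetector(nn.Module):
    """Skeleton of a NanoDet-style detector: backbone -> feature fusion -> detection head.

    The three sub-modules are passed in, so this class only expresses the composition.
    """
    def __init__(self, backbone: nn.Module, fusion: nn.Module, head: nn.Module):
        super().__init__()
        self.backbone = backbone  # e.g. ShuffleNetV2 1.0x returning 8x/16x/32x feature maps
        self.fusion = fusion      # e.g. a lightweight PAN
        self.head = head          # e.g. an FCOS-style detection head

    def forward(self, image: torch.Tensor):
        multi_scale_feats = self.backbone(image)   # list of feature maps at several strides
        fused_feats = self.fusion(multi_scale_feats)
        return self.head(fused_feats)              # per-level class scores and box regressions
```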
Specifically, in a neural network, especially in the field of computer vision (CV), features of an image are generally extracted first; this part is the foundation of the whole CV task, so this part of the network structure is called the backbone. NanoDet selects ShuffleNetV2 1.0x as the backbone. ShuffleNetV2 1.0x is a modified version of ShuffleNetV1, and the modification follows the following 4 guidelines:
(1) keeping the input and output channel numbers equal minimizes the memory access cost (MAC), at which point the model is fastest;
(2) excessive use of group convolution increases the MAC and slows the model down;
(3) the fewer the model branches, the simpler the model and the faster it runs;
(4) element-wise operations also negatively impact model speed.
Based on the 4 guidelines obtained from experimental verification and theoretical analysis, it was found that the V1 module makes extensive use of 1x1 convolutions with group operations, which violates guideline (2). In addition, V1 adopts a bottleneck layer similar to that of ResNet, in which the numbers of input and output channels differ, violating guideline (1); and there are many element-wise operations in the shortcut connections, violating guideline (4). Based on these findings, the authors of ShuffleNetV2 1.0x introduced a channel split operation: the input feature map is divided along the channel dimension into two branches of c' and c - c' channels, with c' = c/2 in the concrete implementation. The left branch is an identity mapping; the right branch contains 3 consecutive convolutions whose input and output channel numbers are equal, following guideline (1). The group operation in the two 1x1 convolutional layers is eliminated, following guideline (2). In addition, the outputs of the two branches are combined by a concat operation rather than an element-wise operation, following guideline (4). The structure of ShuffleNetV2 is shown in Table 1; it is basically similar to ResNet and is divided into several stages, with each stage replacing the residual block by the ShuffleNet unit. In Table 1, the number of output channels is varied by changing the number of group operations under a given complexity budget; in general, more output channels allow more features to be extracted.
TABLE 1 Structure Table of ShuffleNet V2
[Table 1 is provided as an image in the original publication and is not reproduced here.]
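To make the channel split concrete, the following is a minimal PyTorch sketch of a stride-1 ShuffleNetV2-style basic unit that follows the four guidelines above; the layer arrangement and names are illustrative assumptions, not code from the patent or from the official ShuffleNetV2 release.

```python
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int = 2) -> torch.Tensor:
    # Standard channel shuffle: reshape to (groups, c // groups), transpose, flatten back.
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

class ShuffleV2BasicUnit(nn.Module):
    """Stride-1 unit: channel split -> identity / (1x1, 3x3 depthwise, 1x1) -> concat -> shuffle."""
    def __init__(self, channels: int):
        super().__init__()
        assert channels % 2 == 0
        c = channels // 2  # c' = c/2 split; equal in/out channels per guideline (1)
        self.branch = nn.Sequential(
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU(inplace=True),   # 1x1, no groups (guideline 2)
            nn.Conv2d(c, c, 3, padding=1, groups=c, bias=False), nn.BatchNorm2d(c),     # 3x3 depthwise
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU(inplace=True),   # 1x1, no groups
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        left, right = x.chunk(2, dim=1)                       # channel split into c' and c - c'
        out = torch.cat([left, self.branch(right)], dim=1)    # concat instead of element-wise add (guideline 4)
        return channel_shuffle(out, groups=2)
```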
In NanoDet, ShuffleNetV2 1.0x is further lightened: the last layer of convolution is removed, and the 8x, 16x and 32x down-sampled features are extracted and fed into the PAN for multi-scale feature fusion.
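A sketch of this multi-scale extraction, assuming the ShuffleNetV2 1.0x implementation shipped with torchvision as a stand-in backbone (NanoDet uses its own implementation), could look as follows; the wrapper simply drops the final convolution stage and returns the stride-8/16/32 feature maps.

```python
import torch
import torch.nn as nn
from torchvision.models import shufflenet_v2_x1_0

class ShuffleV2Backbone(nn.Module):
    """Wraps ShuffleNetV2 1.0x, discards conv5/fc, and returns the 8x/16x/32x features."""
    def __init__(self):
        super().__init__()
        net = shufflenet_v2_x1_0(weights=None)
        self.stem = nn.Sequential(net.conv1, net.maxpool)  # 4x down-sampling
        self.stage2 = net.stage2   # -> stride 8
        self.stage3 = net.stage3   # -> stride 16
        self.stage4 = net.stage4   # -> stride 32
        # net.conv5 and net.fc are intentionally dropped (the "last layer" removed in the text)

    def forward(self, x: torch.Tensor):
        x = self.stem(x)
        c3 = self.stage2(x)
        c4 = self.stage3(c3)
        c5 = self.stage4(c4)
        return [c3, c4, c5]

feats = ShuffleV2Backbone()(torch.randn(1, 3, 320, 320))
print([f.shape for f in feats])  # spatial sizes 40x40, 20x20, 10x10 for a 320x320 input
```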
Specifically, the feature fusion layer selects PAN (Path Aggregation Network), an improved version of the FPN (Feature Pyramid Network), and makes lightweight modifications on that basis; the Feature Pyramid Network (FPN) is an efficient CNN feature extraction method. A conventional convolutional neural network proceeds from bottom to top, with scale and semantic information changing continuously; the FPN enhances this by adding feature supplementation along top-down paths, so that the finally output features better represent the multi-dimensional information of the input picture.
The left side of the FPN computation graph is an ordinary bottom-up ResNet network used to extract semantic information; it is a process of condensing and expressing features layer by layer. Lower layers reflect shallow image information, while higher layers reflect deeper contour or category information of the objects in the image.
The top-down path on the right side of the FPN computation graph takes into account the key role of higher-layer information in the subsequent target detection task: the output of an upper layer is up-sampled (linear interpolation is used here) and serves as input to the adjacent lower layer. First, a 1x1 convolution is applied to the topmost feature map to reduce the channel dimension and obtain P5; P5 is then up-sampled successively on the way to obtaining P4 and P3. The lateral connections in the middle of the graph fuse the up-sampled high-level semantic information with the localization details of the corresponding size from before down-sampling: C3 and C4 are adjusted by 1x1 convolutions to match the channel numbers of P3 and P4, element-wise addition is performed to obtain P3 and P4, and finally these are output together with P5 for the subsequent tasks.
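The top-down computation just described can be sketched, assuming three input levels, bilinear interpolation, and illustrative channel counts, roughly as follows.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Minimal FPN top-down pathway: 1x1 lateral convolutions, up-sampling, element-wise addition."""
    def __init__(self, in_channels=(116, 232, 464), out_channels=96):
        super().__init__()
        self.laterals = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])

    def forward(self, feats):                       # feats = [C3, C4, C5] at strides 8/16/32
        c3, c4, c5 = [lat(f) for lat, f in zip(self.laterals, feats)]
        p5 = c5
        p4 = c4 + F.interpolate(p5, size=c4.shape[-2:], mode="bilinear", align_corners=False)
        p3 = c3 + F.interpolate(p4, size=c3.shape[-2:], mode="bilinear", align_corners=False)
        return [p3, p4, p5]
```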
The FPN layer in the NanoDet model selects the PANet network. PAN is an improved version of the FPN: on the basis of the FPN, it adds bottom-up path augmentation, which improves the utilization of low-level information.
In addition, in order to lighten the model, all convolution operations inside the PAN are removed; only the 1x1 convolutions applied after backbone feature extraction are retained to align the feature channel dimensions, and up-sampling and down-sampling are done by interpolation. Furthermore, the multi-scale feature maps are not combined by concatenation; direct addition is chosen instead, so the computation of the whole feature fusion module is greatly reduced.
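A minimal sketch in that spirit is given below: only 1x1 convolutions for channel alignment, interpolation for resizing, and element-wise addition for fusion; the exact fusion order and channel counts are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightPAN(nn.Module):
    """Convolution-free PAN sketch: 1x1 channel alignment, interpolation, element-wise addition."""
    def __init__(self, in_channels=(116, 232, 464), out_channels=96):
        super().__init__()
        self.align = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])

    @staticmethod
    def _resize_add(src, dst):
        # Interpolate src to dst's spatial size (up or down) and add element-wise.
        return dst + F.interpolate(src, size=dst.shape[-2:], mode="bilinear", align_corners=False)

    def forward(self, feats):                       # [C3, C4, C5] from the backbone
        c3, c4, c5 = [a(f) for a, f in zip(self.align, feats)]
        # top-down (FPN) pass
        p4 = self._resize_add(c5, c4)
        p3 = self._resize_add(p4, c3)
        # bottom-up (PAN) pass
        n3 = p3
        n4 = self._resize_add(n3, p4)
        n5 = self._resize_add(n4, c5)
        return [n3, n4, n5]
```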
Specifically, NanoDet selects the detection head of the FCOS (Fully Convolutional One-Stage Object Detection) model. The idea of the FCOS model is to predict, for each point in the input image, the target class and target box to which it belongs; the overall architecture of the model is similar to the FPN (Feature Pyramid Network) structure, and prediction is performed on 5 fused feature layers.
In the figure, the 3 output layers are the classification branch, the Center-ness branch and the regression branch (a brief sketch of these three branches follows the list below).
1. H × W in the classification branch indicates the size of the feature map, and C indicates the number of classes.
2. The Center-ness branch is used to calculate the distance between each point and the target center point, excluding prediction points that are far from the target center.
3. The regression branch outputs 4 values (l, t, r, b), representing the distances from the point to the 4 edges of the target box, respectively.
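A simplified sketch of such a per-level head with the three branches, assuming PyTorch and illustrative channel counts (this is not the exact FCOS or NanoDet implementation), is as follows.

```python
import torch
import torch.nn as nn

class FCOSStyleHead(nn.Module):
    """Per-level head with classification, center-ness and box-regression outputs."""
    def __init__(self, in_channels=96, num_classes=2, tower_depth=2):
        super().__init__()
        layers = []
        for _ in range(tower_depth):
            layers += [nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.ReLU(inplace=True)]
        self.tower = nn.Sequential(*layers)
        self.cls_out = nn.Conv2d(in_channels, num_classes, 3, padding=1)   # H x W x C class scores
        self.ctr_out = nn.Conv2d(in_channels, 1, 3, padding=1)             # H x W x 1 center-ness
        self.reg_out = nn.Conv2d(in_channels, 4, 3, padding=1)             # H x W x 4: (l, t, r, b)

    def forward(self, feat):
        t = self.tower(feat)
        # distances to the box edges are non-negative, hence the ReLU on the regression output
        return self.cls_out(t), self.ctr_out(t), torch.relu(self.reg_out(t))
```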
The labeled box and category information of a target in an image are defined as
B = (x0, y0, x1, y1, c)
where the first 4 values are the coordinates of the upper-left and lower-right corner points respectively and the last value is the integer category; each pixel point on the input image can then determine its regression target according to whether it falls inside the labeled box, calculated as in formula (1):
l* = x - x0, t* = y - y0, r* = x1 - x, b* = y1 - y    (1)
In the above formula, (x, y) are the coordinates of a pixel point. Points falling outside the labeled box are negative samples and their category is set to 0; a pixel point inside the labeled box is a positive sample, and its category is the target category of that box (a non-zero integer). In addition, in order to solve the problem of recognizing overlapping image regions in FIG. 8, the FPN structure is introduced into the FCOS model so that target boxes of different scales are predicted on different feature layers, which separates most of the overlapping target boxes. Whether the maximum of the 4 values (l*, t*, r*, b*) of a pixel point lies within a preset range is used to decide which feature layer the point belongs to, so that each feature layer has a preset scale range; for example, the maximum-value range corresponding to the P3 layer is [0, 64] and that corresponding to the P4 layer is [64, 128]. For target boxes that still cannot be separated, the training target is computed from the box with the smallest area among the overlapping target boxes.
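A compact sketch of this per-point assignment for a single ground-truth box, assuming PyTorch, is shown below; the function name and signature are illustrative, and the smallest-area tie-break for overlapping boxes is only noted in a comment.

```python
import torch

def fcos_style_targets(points, gt_box, gt_label, regress_range):
    """Per-point targets for one ground-truth box, following formula (1) and the scale ranges above.

    points: (N, 2) tensor of pixel coordinates (x, y) on one feature level
    gt_box: (4,) tensor (x0, y0, x1, y1); gt_label: positive integer class id
    regress_range: (lo, hi) bounds on max(l*, t*, r*, b*) handled by this level, e.g. (0, 64) for P3
    """
    x, y = points[:, 0], points[:, 1]
    ltrb = torch.stack([x - gt_box[0], y - gt_box[1], gt_box[2] - x, gt_box[3] - y], dim=1)
    inside_box = ltrb.min(dim=1).values > 0                      # inside the labeled box -> candidate positive
    max_dist = ltrb.max(dim=1).values
    in_range = (max_dist >= regress_range[0]) & (max_dist <= regress_range[1])
    positive = inside_box & in_range
    cls_target = torch.zeros(points.shape[0], dtype=torch.long)  # 0 = negative sample
    cls_target[positive] = gt_label
    # With several overlapping boxes, the box with the smallest area would be kept for each point.
    return cls_target, ltrb
```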
In addition, FCOS introduces the concept of "center-ness" to address the problem that some false detection boxes lie far from the center point of the real box. The idea is to multiply the center-ness by the corresponding classification score to compute a final score (used for ranking the detected bounding boxes). Concretely, a Center-ness branch is set in parallel with the classification branch; the center-ness value lies between 0 and 1, and the closer a point inside the target box is to the center point, the larger its weight. Finally, low-quality bounding boxes are filtered out by Non-Maximum Suppression (NMS). The center-ness is calculated as in formula (2):
centerness* = sqrt( min(l*, r*)/max(l*, r*) × min(t*, b*)/max(t*, b*) )    (2)
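A small sketch of formula (2), and of how the ranking score would be formed from it, assuming PyTorch:

```python
import torch

def centerness(ltrb: torch.Tensor) -> torch.Tensor:
    """Formula (2) for (N, 4) regression targets (l*, t*, r*, b*); the result lies in [0, 1]."""
    l, t, r, b = ltrb.unbind(dim=1)
    lr = torch.minimum(l, r) / torch.maximum(l, r).clamp(min=1e-6)
    tb = torch.minimum(t, b) / torch.maximum(t, b).clamp(min=1e-6)
    return torch.sqrt(lr * tb)

# Ranking score used before NMS: classification score weighted by center-ness, e.g.
# final_score = cls_score * centerness(ltrb)
```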
In NanoDet, the FCOS detection head is transformed for light weight. First, the Center-ness branch, which is difficult to converge during training, is removed. In addition, whereas the FCOS detection head uses 4 convolutions with 256 channels per branch, NanoDet compresses this to 2 convolutions with 96 channels, replaces the ordinary convolutions with depthwise separable convolutions, computes box regression and classification with the same group of convolutions, and finally splits the output into the two parts. Furthermore, since sharing weights between detection heads in FCOS reduces the number of parameters but weakens the detection capability of the model, each detection head in NanoDet independently uses its own set of convolutions. NanoDet also changes the GN (group normalization) used in FCOS to BN (batch normalization), because the normalization parameters of BN can be fused directly into the convolution at inference time, reducing the amount of computation.
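Two of these ingredients can be sketched concretely, assuming PyTorch and illustrative channel counts: a depthwise separable convolution block with BN, and the folding of BN statistics into the preceding convolution at inference time, which is the reason BN is preferred over GN here.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 + pointwise 1x1 with BatchNorm, standing in for an ordinary 3x3 convolution."""
    def __init__(self, channels: int = 96):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False)
        self.pw = nn.Conv2d(channels, channels, 1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pw(self.dw(x))))

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold trained BN statistics into the preceding convolution for inference."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, conv.dilation, conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)          # per-output-channel scale
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias.data if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.data = (bias - bn.running_mean) * scale + bn.bias.data
    return fused
```

For example, fuse_conv_bn(block.pw, block.bn) would replace the pointwise convolution and its BN in the block above with a single convolution for inference.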
While the invention has been described above with reference to an embodiment, various modifications may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In particular, the various features of the disclosed embodiments may be used in any combination provided that no structural conflict exists; such combinations are not described exhaustively in this specification merely for the sake of brevity and economy. Therefore, the invention is not limited to the particular embodiments disclosed, but includes all embodiments falling within the scope of the appended claims.

Claims (5)

1. An artificial intelligence film reading method based on a convolutional neural network algorithm is characterized by comprising the following steps:
step one: acquiring a target image to be detected;
step two: selecting an image area;
step three: extracting target features;
step four: classifying the target according to the characteristics;
step five: conducting regression on the target bounding box;
step six: optimizing the structure;
step seven: and finishing target detection.
2. The artificial intelligence film reading method based on the convolutional neural network algorithm as claimed in claim 1, wherein: the algorithm is implemented mainly by means of the NanoDet model, and NanoDet comprises a feature extraction backbone network, a feature fusion network and a detection head.
3. The artificial intelligence film reading method based on the convolutional neural network algorithm as claimed in claim 2, wherein: in a neural network, especially in the field of computer vision (CV), features of an image are generally extracted first; this part is the foundation of the whole CV task, so this part of the network structure is called the backbone; NanoDet selects ShuffleNetV2 1.0x as the backbone, ShuffleNetV2 1.0x being a modified version of ShuffleNetV1, and the modification follows the following 4 guidelines:
(1) keeping the input and output channel numbers equal minimizes the memory access cost (MAC), at which point the model is fastest;
(2) excessive use of group convolution increases the MAC and slows the model down;
(3) the fewer the model branches, the simpler the model and the faster it runs;
(4) element-wise operations also negatively impact model speed.
4. The artificial intelligence film reading method based on the convolutional neural network algorithm as claimed in claim 2, wherein: the feature fusion layer selects PAN (Path Aggregation Network), an improved version of the FPN (Feature Pyramid Network), and makes lightweight modifications on that basis; the Feature Pyramid Network (FPN) is an efficient CNN feature extraction method; a conventional convolutional neural network proceeds from bottom to top, with scale and semantic information changing continuously, and the FPN enhances this by adding feature supplementation along top-down paths, so that the finally output features better represent the multi-dimensional information of the input picture.
5. The artificial intelligence film reading method based on the convolutional neural network algorithm as claimed in claim 2, wherein: NanoDet selects the detection head of the FCOS (Fully Convolutional One-Stage Object Detection) model; the idea of the FCOS model is to predict, for each point in the input image, the target class and target box to which it belongs; the overall architecture of the FCOS model is similar to the FPN (Feature Pyramid Network) structure, and prediction is performed on 5 fused feature layers.
CN202110813909.9A 2021-07-19 2021-07-19 Artificial intelligence film reading method based on convolutional neural network algorithm Pending CN113555087A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110813909.9A CN113555087A (en) 2021-07-19 2021-07-19 Artificial intelligence film reading method based on convolutional neural network algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110813909.9A CN113555087A (en) 2021-07-19 2021-07-19 Artificial intelligence film reading method based on convolutional neural network algorithm

Publications (1)

Publication Number Publication Date
CN113555087A true CN113555087A (en) 2021-10-26

Family

ID=78132096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110813909.9A Pending CN113555087A (en) 2021-07-19 2021-07-19 Artificial intelligence film reading method based on convolutional neural network algorithm

Country Status (1)

Country Link
CN (1) CN113555087A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113689472A (en) * 2021-10-26 2021-11-23 城云科技(中国)有限公司 Moving target detection method, device and application
CN114463759A (en) * 2022-04-14 2022-05-10 浙江霖研精密科技有限公司 Lightweight character detection method and device based on anchor-frame-free algorithm
CN114495109A (en) * 2022-01-24 2022-05-13 山东大学 Grabbing robot based on matching of target and scene characters and grabbing method and system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110490892A (en) * 2019-07-03 2019-11-22 中山大学 A kind of Thyroid ultrasound image tubercle automatic positioning recognition methods based on USFaster R-CNN
CN112613508A (en) * 2020-12-24 2021-04-06 深圳市杉川机器人有限公司 Object identification method, device and equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
NINGNING MA et al.: "ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design", arXiv:1807.11164v1 [cs.CV] *
RANGILYU: "An alternative beyond YOLO: NanoDet, an anchor-free object detection model running at 97 FPS on mobile, is now open source", Zhihu, https://zhuanlan.zhihu.com/p/306530300 *
TSUNG-YI LIN et al.: "Feature Pyramid Networks for Object Detection", 2017 IEEE Conference on Computer Vision and Pattern Recognition *
ZHI TIAN et al.: "FCOS: Fully Convolutional One-Stage Object Detection", 2019 IEEE/CVF International Conference on Computer Vision *


Similar Documents

Publication Publication Date Title
Dong et al. Classification of cataract fundus image based on deep learning
CN113555087A (en) Artificial intelligence film reading method based on convolutional neural network algorithm
CN110211087B (en) Sharable semiautomatic marking method for diabetic fundus lesions
CN109635846A (en) A kind of multiclass medical image judgment method and system
CN111767952B (en) Interpretable lung nodule benign and malignant classification method
CN109544507A (en) A kind of pathological image processing method and system, equipment, storage medium
CN109858429A (en) A kind of identification of eye fundus image lesion degree and visualization system based on convolutional neural networks
Cao et al. Gastric cancer diagnosis with mask R-CNN
CN113724206B (en) Fundus image blood vessel segmentation method and system based on self-supervision learning
CN114004811A (en) Image segmentation method and system based on multi-scale residual error coding and decoding network
CN110570419A (en) Method and device for acquiring characteristic information and storage medium
CN113160120A (en) Liver blood vessel segmentation method and system based on multi-mode fusion and deep learning
CN117036288A (en) Tumor subtype diagnosis method for full-slice pathological image
CN112233085A (en) Cervical cell image segmentation method based on pixel prediction enhancement
CN111784713A (en) Attention mechanism-introduced U-shaped heart segmentation method
CN115147640A (en) Brain tumor image classification method based on improved capsule network
Magpantay et al. A transfer learning-based deep CNN approach for classification and diagnosis of acute lymphocytic leukemia cells
CN112508827A (en) Deep learning-based multi-scene fusion endangered organ segmentation method
CN113393445B (en) Breast cancer image determination method and system
CN114359308A (en) Aortic dissection method based on edge response and nonlinear loss
CN112967269A (en) Pulmonary nodule identification method based on CT image
Zheng et al. WPNet: Wide Pyramid Network for Recognition of HER2 Expression Levels in Breast Cancer Evaluation
Wu et al. Mscan: Multi-scale channel attention for fundus retinal vessel segmentation
CN112668668B (en) Postoperative medical image evaluation method and device, computer equipment and storage medium
Essaf et al. Review on deep learning methods used for computer-aided lung cancer detection and diagnosis

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 2021-10-26