CN112232448B - Image classification method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112232448B
CN112232448B (application number CN202011462359.2A)
Authority
CN
China
Prior art keywords
convolution
feature map
feature
generate
image set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011462359.2A
Other languages
Chinese (zh)
Other versions
CN112232448A (en)
Inventor
李卫超
赵雷
唐轶
李博超
钟利伟
金蒙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Daheng Prust Medical Technology Co ltd
Original Assignee
Beijing Daheng Prust Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Daheng Prust Medical Technology Co ltd filed Critical Beijing Daheng Prust Medical Technology Co ltd
Priority to CN202011462359.2A priority Critical patent/CN112232448B/en
Publication of CN112232448A publication Critical patent/CN112232448A/en
Application granted granted Critical
Publication of CN112232448B publication Critical patent/CN112232448B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image classification method, an image classification apparatus, an electronic device, and a storage medium. The method includes: obtaining an original image set containing a plurality of original images; cleaning, cropping, augmenting, and normalizing the original image set to generate a preprocessed image set; inputting the preprocessed image set into a backbone network for training to generate a feature map; and inputting the extracted feature map into a fully connected layer, classifying, and outputting a classification result. A new network is designed that fuses image features at different granularities and different scales, which can further improve the detection performance.

Description

Image classification method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image recognition, and in particular, to an image classification method, apparatus, electronic device, and storage medium.
Background
Diabetic macular edema is a vision-threatening form of diabetic retinopathy, and optical coherence tomography (OCT) can be used to diagnose and guide the treatment of age-related macular degeneration and diabetic macular edema. Existing automatic detection methods fall into two categories: traditional image classification methods and deep-learning-based methods. Traditional image classification methods require manual feature extraction, which is time-consuming and labor-intensive. Most current deep-learning detection schemes rely on transfer learning: a model is initialized with weights pre-trained on the ImageNet competition data and training continues from there. However, ImageNet targets natural images, whose semantics differ greatly from those of medical images. Fine-tuning ImageNet pre-trained weights is appropriate when medical image data is scarce; when sufficient medical image data is available, training from scratch is preferable to transfer learning.
Disclosure of Invention
An object of the embodiments of the present application is to provide an image classification method, an image classification apparatus, an electronic device, and a non-transitory electronic-device-readable storage medium, so as to solve the technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides an image classification method, including: obtaining an original image set containing a plurality of original images; cleaning, cropping, augmenting, and normalizing the original image set to generate a preprocessed image set; inputting the preprocessed image set into a backbone network for training to generate a feature map; and inputting the extracted feature map into a fully connected layer, classifying, and outputting a classification result.
In one embodiment, cleaning, cropping, augmenting, and normalizing the original image set to generate a preprocessed image set includes: applying data augmentation and data balancing to the target region to obtain augmented data in which every class contains the same number of images; and normalizing the augmented data to generate the preprocessed image set.
In one embodiment, inputting the preprocessed image set into a backbone network for training to generate a feature map includes: building a plurality of convolution blocks, where convolution kernels of different sizes are used inside each block to extract features at different scales, and the feature maps output by convolutions with different kernel sizes are concatenated and summed, increasing the number of channels describing the input features and enriching the semantic information carried by each channel; and merging the feature maps output by the convolution blocks to generate the final feature map, where the feature maps output by adjacent convolution blocks are successively halved in size and doubled in channel count.
In an embodiment, inputting the extracted feature map into the fully connected layer, classifying, and outputting a classification result includes: inputting the extracted feature map into the fully connected layer to generate a probability value for each preset category; and outputting the classification result corresponding to the maximum probability value.
In a second aspect, an embodiment of the present invention provides an image classification apparatus, including: a first acquisition module configured to obtain an original image set containing a plurality of original images; a first generation module configured to clean, crop, augment, and normalize the original image set to generate a preprocessed image set; a second generation module configured to input the preprocessed image set into a backbone network for training to generate a feature map; and a first output module configured to input the extracted feature map into a fully connected layer, classify it, and output a classification result.
In an embodiment, the first generation module is further configured to: apply data augmentation and data balancing to the target region to obtain augmented data in which every class contains the same number of images; and normalize the augmented data to generate the preprocessed image set.
In an embodiment, the second generation module is further configured to: build a plurality of convolution blocks, where convolution kernels of different sizes are used inside each block to extract features at different scales, and the feature maps output by convolutions with different kernel sizes are concatenated and summed, increasing the number of channels describing the input features and enriching the semantic information carried by each channel; and merge the feature maps output by the convolution blocks to generate the final feature map, where the feature maps output by adjacent convolution blocks are successively halved in size and doubled in channel count.
In an embodiment, the first output module is further configured to: input the extracted feature map into the fully connected layer to generate a probability value for each preset category; and output the classification result corresponding to the maximum probability value.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory to store a computer program; a processor configured to perform the method of any of the preceding embodiments.
In a fourth aspect, an embodiment of the present invention provides a non-transitory electronic device readable storage medium, including: a program which, when run by an electronic device, causes the electronic device to perform the method of any of the preceding embodiments.
The embodiments of the image classification method, the image classification apparatus, the electronic device, and the non-transitory electronic-device-readable storage medium provided by the application design a new network. The network is implemented with a single branch, without additionally constructing an image pyramid; moreover, for feature fusion, features of the image at different granularities and different scales are fused simultaneously, which can further improve the detection performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered limiting of the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
fig. 2 is a schematic view of an application scenario of an image classification method according to an embodiment of the present application;
fig. 3 is a flowchart of an image classification method according to an embodiment of the present application;
FIG. 4 is a flowchart of another image classification method provided in the embodiments of the present application;
fig. 5 is a structural diagram of an image classification apparatus according to an embodiment of the present application.
Icon: the system comprises an electronic device 1, a bus 10, a processor 11, a memory 12, a user terminal 100, a server 200, an image classification device 500, a first acquisition module 501, a first generation module 502, a second generation module 503 and a first output module 504.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
As shown in fig. 1, the present embodiment provides an electronic device 1 including at least one processor 11 and a memory 12; fig. 1 takes one processor as an example. The processor 11 and the memory 12 are connected by a bus 10; the memory 12 stores instructions executable by the processor 11, and the instructions are executed by the processor 11.
In an embodiment, the electronic device 1 may be a mobile phone, a tablet computer, or a personal computer. The electronic device 1 may receive an externally transmitted image; preprocess the received image through data cleaning, image cropping, data balancing, and image normalization; scale the preprocessed image; perform convolution, normalization, linear rectification, and downsampling; merge channels to output a feature map; and finally classify the feature map through a convolution layer, a fully connected layer, and Dropout and output a classification result.
The memory 12 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The present application also provides a computer-readable storage medium storing a computer program executable by the processor 11 to perform the image classification method provided by the present application.
Fig. 2 is a schematic diagram of an application scenario of the image classification method according to an embodiment of the present application. As shown in fig. 2, the scenario includes a user terminal 100 and a server 200. Image information can be exchanged between the user terminal 100 and the server 200 over a wired connection, or over wireless links such as WIFI, 2.4 GHz, 433 MHz, or GPRS (General Packet Radio Service).
The user terminal 100 may be a Personal Computer (PC) having an application installed therein, a tablet PC, a smart phone, a Personal Digital Assistant (PDA), or the like. The server 200 may be a server, a server cluster, or a cloud computing center. The user terminal 100 and the server 200 are connected through a wired or wireless network.
Please refer to fig. 3, a flowchart of an image classification method according to an embodiment of the present application; the method can be executed by the electronic device 1 shown in fig. 1 and used in the interactive scenario shown in fig. 2. The method comprises the following steps:
step 301: an original image set is acquired.
In this step, the original image set contains several original images, which may be OCT images that have already been annotated; the OCT images are used to examine diabetic macular edema and age-related macular degeneration. Annotation can be performed by an ophthalmologist. Because OCT works by detecting light reflected from the refractive tissues of the eye through to the retina, acquiring tissue thickness and distance information from reflections at different tissue interfaces in the eye, and reconstructing that information into images and data, any occlusion or obstruction in the optical path (intraocular floaters, corneal or lens opacity, intraocular fillers, and the like) can interfere with reception of the optical signal and reduce signal strength and image quality. Consequently, not every manually annotated original image can be used for subsequent learning.
Step 302: and cleaning, cutting, enhancing data and normalizing the original image set to generate a preprocessed image set.
In this step, because each original image is labeled manually, labeling errors can occur. The images therefore need to be screened according to their label information, filtering out mislabeled images and images used for measurement.
In an embodiment, the image sizes may be inconsistent, and the imaged region may shift even when the same area is photographed, so the original image needs to be cropped to extract the target region. For example, an original OCT image is roughly divided into three parts: the left side is an en-face OCT image that generally shows the position of the cross-section, the right side is an OCT B-scan image showing the cross-sectional information, and the bottom contains basic information. Here, the target region may be the OCT B-scan image.
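As an illustrative sketch of this cropping step, the following numpy snippet keeps only the right-hand B-scan area of a composite page and drops the bottom information strip; the page layout and all pixel coordinates are invented for the example and would need to match the actual scanner output.

```python
import numpy as np

# hypothetical composite OCT page: en-face image on the left, B-scan on the
# right, basic-information strip along the bottom (coordinates are made up)
oct_page = np.zeros((600, 1000), dtype=np.uint8)

def crop_b_scan(img, left=500, bottom=550):
    # keep only the right-hand OCT B-scan area, dropping the info strip
    return img[:bottom, left:]

b_scan = crop_b_scan(oct_page)
```

In a real pipeline the crop boundaries would be fixed per scanner model or detected automatically rather than hard-coded.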
In one embodiment, classification accuracy is further improved by data augmentation, which includes random rotation of the image, horizontal flipping, and brightness changes. In addition, during model training, an imbalance among the drusen, CNV (choroidal neovascularization), DME (diabetic macular edema), and normal samples would bias the model's predictions toward the most frequent class and degrade its performance, so in this application data balancing is performed after augmentation to ensure that every class contains the same number of images.
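A minimal numpy-only sketch of the augmentation and balancing described above; the flip probability, rotation, and brightness ranges are assumptions, `np.rot90` stands in for arbitrary-angle rotation (which a real pipeline would do with an image library), and the tiny class dictionary is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, rng):
    # random horizontal flip, random 90-degree-step rotation (a stand-in for
    # arbitrary-angle rotation), and a random brightness change
    if rng.random() < 0.5:
        img = img[:, ::-1]
    img = np.rot90(img, k=int(rng.integers(0, 4)))
    return np.clip(img * rng.uniform(0.8, 1.2), 0, 255)

def balance(classes):
    # oversample every class up to the size of the largest one
    target = max(len(v) for v in classes.values())
    return {k: v + [v[i % len(v)] for i in range(target - len(v))]
            for k, v in classes.items()}

aug = augment(np.full((8, 8), 100.0), rng)
data = {"CNV": [1, 2, 3], "DME": [4], "Drusen": [5, 6], "Normal": [7, 8, 9]}
balanced = balance(data)
```

After balancing, every class holds as many samples as the largest class, which is the condition the text requires before training.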
In an embodiment, the target image after augmentation and balancing may be normalized; the normalization method may be mean-variance normalization.
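Mean-variance (z-score) normalization can be sketched as follows; the small epsilon guarding against a zero standard deviation is an implementation detail added here, and the sample image is invented.

```python
import numpy as np

def mean_var_normalize(img):
    # mean-variance (z-score) normalization of a single image
    return (img - img.mean()) / (img.std() + 1e-8)

img = np.array([[0.0, 50.0], [100.0, 150.0]])
out = mean_var_normalize(img)
```

The result has (approximately) zero mean and unit standard deviation, which stabilizes subsequent network training.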
Step 303: and inputting the preprocessed image set into a backbone network for training to generate a characteristic diagram.
In this step, the preprocessed images are passed through the neural network to learn features of different sizes at the current granularity. Batch normalization (BN) is used during training to accelerate convergence, followed by ReLU (Rectified Linear Unit) activation. Convolution kernels of different sizes are applied to the same feature map, and the resulting feature maps are merged, which increases the number of channels describing the features and enriches the semantic information of the image.
In an embodiment, the backbone network may use convolution kernels 1, 2, and 3, with sizes 3x3, 5x5, and 1x1 respectively, to perform convolution operations denoted convolution operations 1, 2, and 3. The features output by operations 1 and 2 are merged to obtain feature map 1; feature map 1 and the feature map 2 output by operation 3 are each adjusted by a 1x1 convolution to feature maps 3 and 4, which have the same number of channels; finally, feature maps 3 and 4 are summed, and the resulting feature map is the output of the convolution block.
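The multi-kernel convolution block can be sketched in plain numpy as follows; the naive `conv2d` loop, the random weights, and the channel counts are illustrative stand-ins for a trained deep-learning-framework implementation, and BN, ReLU placement, and the stride-2 downsampling mentioned later are simplified for brevity.

```python
import numpy as np

def conv2d(x, w):
    # naive "same"-padded convolution; x: (C_in, H, W), w: (C_out, C_in, k, k)
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    h, wd = x.shape[1], x.shape[2]
    out = np.zeros((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            out[:, i, j] = np.tensordot(w, xp[:, i:i + k, j:j + k], axes=3)
    return out

def conv_block(x, c_out, rng):
    # kernels 1-3 (3x3, 5x5, 1x1) -> operations 1-3; concat ops 1+2 into
    # feature map 1, op 3 gives feature map 2; 1x1 convs adjust both to c_out
    # channels (feature maps 3 and 4), which are summed and passed through ReLU
    c_in, half = x.shape[0], c_out // 2
    f1 = np.concatenate([conv2d(x, rng.standard_normal((half, c_in, 3, 3)) * 0.1),
                         conv2d(x, rng.standard_normal((half, c_in, 5, 5)) * 0.1)],
                        axis=0)
    f2 = conv2d(x, rng.standard_normal((half, c_in, 1, 1)) * 0.1)
    f3 = conv2d(f1, rng.standard_normal((c_out, f1.shape[0], 1, 1)) * 0.1)
    f4 = conv2d(f2, rng.standard_normal((c_out, f2.shape[0], 1, 1)) * 0.1)
    return np.maximum(f3 + f4, 0.0)

rng = np.random.default_rng(0)
block_out = conv_block(rng.standard_normal((4, 12, 12)), 16, rng)
```

The key point the sketch demonstrates is the channel arithmetic: concatenation adds the channel counts of the 3x3 and 5x5 branches, and the 1x1 adjustments make the two paths summable.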
Step 304: and inputting the extracted feature map into a full connection layer, classifying and outputting a classification result.
In this step, integration produces a new feature map that already contains features at every scale and every granularity. The features can then be classified through the fully connected layer and/or Dropout, and a classification result is output. For example, the extracted feature map is flattened into one dimension and passed through a fully connected layer whose output is the probability that the input image belongs to each of the 4 classes.
Please refer to fig. 4, a flowchart of another image classification method according to an embodiment of the present application; the method can be executed by the electronic device 1 shown in fig. 1 and used in the interactive scenario shown in fig. 2. The method comprises the following steps:
step 401: an original image set is acquired. Please refer to the description of step 301 in the above embodiments.
Step 402: and performing data enhancement and data equalization on the target area by adopting a data enhancement method to obtain data enhancement data.
In this step, the sizes of the original images may differ, and even when the same region is photographed, the region finally shown in different images may be shifted. The original image therefore needs to be cropped to keep only the required portion; in an embodiment, the OCT B-scan region is cropped out.
Step 403: and normalizing the data enhancement data to generate a preprocessing image set.
To further improve accuracy, data augmentation is applied to the target-region image; augmentation includes random rotation of the image, horizontal flipping, and brightness changes. In addition, during model training, an imbalance among the drusen, CNV (choroidal neovascularization), DME (diabetic macular edema), and normal samples would bias the model's predictions toward the most frequent class and degrade its performance, so in this application data balancing is performed on top of augmentation to ensure that every class contains the same number of images.
In an embodiment, the target-region image after augmentation and balancing may be further normalized; the normalization method may be mean-variance normalization.
Step 404: and establishing a plurality of volume blocks and extracting different scale features.
In this step, the convolution blocks convolve the images in the preprocessed image set and extract feature vectors. The feature maps output by adjacent convolution blocks are successively halved in size, and their channel counts are successively doubled.
In an embodiment, the convolution block performs convolution operations with different kernel sizes, extracting features of different scales at the current granularity.
In this step, convolution kernels of different sizes are used inside the convolution block to extract features at different scales under the current granularity. The feature maps output by convolutions with different kernel sizes are concatenated and summed, increasing the number of channels describing the input features and enriching the semantic information carried by each channel. In an embodiment, since kernels of different sizes extract features at different scales, the emphasis is on convolving with several kernel sizes; and the most direct way to halve the feature map output by a convolution is to set the convolution stride to 2.
Specifically: convolution kernels 1, 2, and 3, with sizes 3x3, 5x5, and 1x1 respectively, perform convolution operations denoted convolution operations 1, 2, and 3. The features output by operations 1 and 2 are merged to obtain feature map 1. Feature map 1 and the feature map 2 output by operation 3 are each adjusted by a 1x1 convolution to feature maps 3 and 4, which have the same number of channels. Finally, feature maps 3 and 4 are summed, and the resulting feature map is the output of the convolution block.
Step 405: and merging the channels of the feature maps extracted from the volume blocks.
In an embodiment, the number of convolution blocks in the above step may be 4: a first, second, third, and fourth convolution block with identical structure. Each block convolves its input features with kernels of different sizes to extract features at different scales under the current granularity, applies BN and ReLU after each convolution, and channel-merges the extracted features. The feature maps output by the 4 convolution blocks are halved in size step by step, while their channel counts increase by multiples.
In an embodiment, channel merging may combine the feature results of adjacent convolution blocks: the feature result of the first convolution block is downsampled by a factor of 2 and channel-merged with the feature result of the second convolution block to generate a first merged result; the first merged result is downsampled by a factor of 2 and merged with the feature result of the third convolution block to generate a second merged result; and the second merged result is downsampled by a factor of 2 and merged with the fourth convolution block to generate the final merged result. At this point, every feature map already contains features at every scale and every granularity, and the final merged result is passed through one convolution layer, BN, and ReLU to generate the final feature map.
In an embodiment, if the feature map size is halved step by step, the output of the first convolution block must be downsampled by a factor of 8, the output of the second by a factor of 4, and the output of the third by a factor of 2 before channel-merging with the feature map of the fourth convolution block.
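A shape-level numpy sketch of this merging schedule, assuming 4 blocks whose outputs halve in spatial size and double in channel count from block to block; the 16-channel, 64x64 starting point is invented, and average pooling stands in for whatever downsampling operation the network actually uses.

```python
import numpy as np

def avg_pool(x, factor=2):
    # downsample a (C, H, W) feature map by an integer factor via average pooling
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

# hypothetical block outputs: sizes halve and channel counts double block to block
block_outs = [np.random.rand(16 * 2 ** i, 64 // 2 ** i, 64 // 2 ** i) for i in range(4)]

merged = block_outs[0]
for feat in block_outs[1:]:
    # downsample the running result 2x, then channel-merge with the next block
    merged = np.concatenate([avg_pool(merged), feat], axis=0)
```

Chaining the three 2x downsamplings reproduces the cumulative factors in the text: block 1 ends up downsampled 8x, block 2 by 4x, and block 3 by 2x relative to the fourth block's feature map.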
Step 406: and inputting the extracted characteristic diagram into the full-connection layer to generate a probability value corresponding to a preset category.
In this step, the final feature map is flattened into one dimension and then classified through the fully connected layer and Dropout, and the classification result is output. In an embodiment, there may be 4 classes: for each image, the network finally outputs 4 values corresponding to the probabilities that the image belongs to the 4 classes, the class with the highest probability is selected as the final classification result, and softmax is used as the activation function.
Step 407: and outputting the classification result corresponding to the maximum probability value according to the probability value.
Please refer to fig. 5, a structural diagram of an image classification apparatus 500 provided in an embodiment of the present application. The apparatus is implemented by the electronic device 1 shown in fig. 1 and used in the interactive scene shown in fig. 2 to receive externally transmitted images; preprocess them through data cleaning, image cropping, data balancing, and image normalization; scale the preprocessed images; perform convolution, normalization, linear rectification, and downsampling; merge channels to output a feature map; and classify the features through a fully connected layer and Dropout, outputting a classification result. The image classification apparatus 500 includes a first acquisition module 501, a first generation module 502, a second generation module 503, and a first output module 504, whose relationships are as follows:
a first obtaining module 501 is configured to obtain an original image set, where the original image set includes a plurality of original images. Please refer to the description of step 301 in the above embodiments.
The first generation module 502 is configured to clean, crop, augment, and normalize the original image set to generate a preprocessed image set. Please refer to the description of step 302 in the above embodiment.
In an embodiment, the first generation module 502 is further configured to: apply data augmentation and data balancing to the target region to obtain augmented data in which every class contains the same number of images; and normalize the augmented data to generate the preprocessed image set. Please refer to the description of steps 402-403 in the above embodiment.
The second generation module 503 is configured to input the preprocessed image set into the backbone network for training to generate a feature map. Please refer to the description of step 303 in the above embodiments.
In an embodiment, the second generation module 503 is further configured to: build a plurality of convolution blocks, where convolution kernels of different sizes are used inside each block to extract features at different scales, and the feature maps output by convolutions with different kernel sizes are concatenated and summed, increasing the number of channels describing the input features and enriching the semantic information carried by each channel; and merge the feature maps output by the convolution blocks to generate the final feature map, where the feature maps output by adjacent convolution blocks are successively halved in size and doubled in channel count. Please refer to the description of steps 404 and 405 in the above embodiments.
The first output module 504 is configured to input the extracted feature map into the fully connected layer, perform classification, and output a classification result. Please refer to the description of step 304 in the above embodiment.
In one embodiment, the first output module 504 is further configured to: input the extracted feature map into the fully connected layer to generate a probability value for each preset category; and output the classification result corresponding to the maximum probability value. Please refer to the description of steps 406 and 407 in the above embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of units is merely a division by logical function, and other divisions are possible in actual implementation; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through communication interfaces, and may be electrical, mechanical, or in another form.
In addition, units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiments.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
It should be noted that the functions, if implemented in the form of software functional modules and sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the part of the technical solution of the present application that in essence contributes to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
The above embodiments are merely examples of the present application and are not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (8)

1. An image classification method, comprising:
obtaining an original image set, wherein the original image set comprises a plurality of original images;
cleaning, cutting, data enhancing and normalizing the original image set to generate a preprocessed image set;
inputting the preprocessed image set into a backbone network for training to generate a feature map;
inputting the extracted feature map into a full connection layer, classifying and outputting a classification result;
inputting the preprocessed image set into a backbone network for training to generate a feature map, wherein the feature map comprises:
establishing a plurality of convolution blocks, wherein convolution kernels of different sizes are adopted within each convolution block to extract features at different scales, merging and adding operations are performed on the feature maps output by the convolution operations with different kernel sizes, and both the number of channels describing the features input to the convolution block and the semantic information carried by a single channel are increased;
merging the feature maps output by each convolution block to generate the feature map; wherein,
the sizes of the feature maps output by adjacent convolution blocks are halved in sequence, and the numbers of channels of the feature maps are doubled in sequence;
wherein adopting convolution kernels of different sizes in the convolution block to extract features at different scales, and merging and adding the feature maps output by the convolution operations with different kernel sizes, comprises:
performing convolution operations with a first convolution kernel, a second convolution kernel and a third convolution kernel of sizes 3x3, 5x5 and 1x1, respectively, denoted as a first convolution operation, a second convolution operation and a third convolution operation;
combining the features output by the first convolution operation and the second convolution operation to obtain a first feature map;
adjusting, through 1x1 convolutions, the first feature map and the second feature map output by the third convolution operation into a third feature map and a fourth feature map having the same number of channels;
adding the third feature map and the fourth feature map to generate a feature map as the output of the convolution block;
wherein the number of the convolution blocks is four, and merging the feature maps output by the convolution blocks comprises:
down-sampling the feature result of the first convolution block by a factor of two, and performing channel combination with the feature result of the second convolution block to generate a first combination result;
down-sampling the first combination result by a factor of two, and combining it with the feature result of the third convolution block to generate a second combination result;
down-sampling the second combination result by a factor of two, and combining it with the feature result of the fourth convolution block to generate a final combination result;
and passing the final combination result through a convolution layer, batch normalization (BN) and ReLU to generate the feature map.
2. The method of claim 1, wherein the cleaning, cropping, data enhancement, and normalization of the original image set to generate a pre-processed image set comprises:
performing data enhancement and data equalization on a target area by a data enhancement method to obtain enhanced data, wherein every class in the enhanced data contains the same number of pictures;
and normalizing the enhanced data to generate the preprocessed image set.
3. The method according to claim 1, wherein the inputting the extracted feature map into a full connection layer, classifying and outputting a classification result comprises:
inputting the extracted feature map into a full connection layer to generate a probability value corresponding to a preset category;
and outputting the classification result corresponding to the maximum probability value according to the probability value.
4. An image classification apparatus, comprising:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring an original image set, and the original image set comprises a plurality of original images;
the first generation module is used for generating a preprocessing image set after cleaning, cutting, data enhancement and normalization processing are carried out on an original image set;
the second generation module is used for inputting the preprocessed image set into a backbone network for training to generate a feature map;
the first output module is used for inputting the extracted feature map into a full connection layer, classifying and outputting a classification result;
the second generation module is further to:
establishing a plurality of convolution blocks, wherein convolution kernels of different sizes are adopted within each convolution block to extract features at different scales, merging and adding operations are performed on the feature maps output by the convolution operations with different kernel sizes, and both the number of channels describing the features input to the convolution block and the semantic information carried by a single channel are increased;
merging the feature maps output by each convolution block to generate the feature map; wherein,
the sizes of the feature maps output by adjacent convolution blocks are halved in sequence, and the numbers of channels of the feature maps are doubled in sequence;
wherein adopting convolution kernels of different sizes in the convolution block to extract features at different scales, and merging and adding the feature maps output by the convolution operations with different kernel sizes, comprises:
performing convolution operations with a first convolution kernel, a second convolution kernel and a third convolution kernel of sizes 3x3, 5x5 and 1x1, respectively, denoted as a first convolution operation, a second convolution operation and a third convolution operation;
combining the features output by the first convolution operation and the second convolution operation to obtain a first feature map;
adjusting, through 1x1 convolutions, the first feature map and the second feature map output by the third convolution operation into a third feature map and a fourth feature map having the same number of channels;
adding the third feature map and the fourth feature map to generate a feature map as the output of the convolution block;
wherein the number of the convolution blocks is four, and merging the feature maps output by the convolution blocks comprises:
down-sampling the feature result of the first convolution block by a factor of two, and performing channel combination with the feature result of the second convolution block to generate a first combination result;
down-sampling the first combination result by a factor of two, and combining it with the feature result of the third convolution block to generate a second combination result;
down-sampling the second combination result by a factor of two, and combining it with the feature result of the fourth convolution block to generate a final combination result;
and passing the final combination result through a convolution layer, batch normalization (BN) and ReLU to generate the feature map.
5. The apparatus of claim 4, wherein the first generating module is further configured to:
performing data enhancement and data equalization on a target area by a data enhancement method to obtain enhanced data, wherein every class in the enhanced data contains the same number of pictures;
and normalizing the enhanced data to generate the preprocessed image set.
6. The apparatus of claim 4, wherein the first output module is further configured to:
inputting the extracted feature map into a full connection layer to generate a probability value corresponding to a preset category;
and outputting the classification result corresponding to the maximum probability value according to the probability value.
7. An electronic device, comprising:
a memory to store a computer program;
a processor to perform the method of any one of claims 1 to 3.
8. A non-transitory electronic device readable storage medium, comprising: program which, when run by an electronic device, causes the electronic device to perform the method of any one of claims 1 to 3.
CN202011462359.2A 2020-12-14 2020-12-14 Image classification method and device, electronic equipment and storage medium Active CN112232448B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011462359.2A CN112232448B (en) 2020-12-14 2020-12-14 Image classification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011462359.2A CN112232448B (en) 2020-12-14 2020-12-14 Image classification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112232448A CN112232448A (en) 2021-01-15
CN112232448B true CN112232448B (en) 2021-04-23

Family

ID=74124610

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011462359.2A Active CN112232448B (en) 2020-12-14 2020-12-14 Image classification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112232448B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113435454B (en) * 2021-05-21 2023-07-25 厦门紫光展锐科技有限公司 Data processing method, device and equipment
WO2023044612A1 (en) * 2021-09-22 2023-03-30 深圳先进技术研究院 Image classification method and apparatus
CN114140637B (en) * 2021-10-21 2023-09-12 阿里巴巴达摩院(杭州)科技有限公司 Image classification method, storage medium and electronic device

Citations (1)

Publication number Priority date Publication date Assignee Title
CN106530295A (en) * 2016-11-07 2017-03-22 首都医科大学 Fundus image classification method and device of retinopathy

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN110728224B (en) * 2019-10-08 2022-03-11 西安电子科技大学 Remote sensing image classification method based on attention mechanism depth Contourlet network

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN106530295A (en) * 2016-11-07 2017-03-22 首都医科大学 Fundus image classification method and device of retinopathy

Non-Patent Citations (1)

Title
Deep neural network classification method for diabetic retinopathy images; Ding Pengli et al.; Journal of Computer Applications (《计算机应用》); 2017-03-10; Vol. 37, No. 3; pp. 699-704 *

Also Published As

Publication number Publication date
CN112232448A (en) 2021-01-15

Similar Documents

Publication Publication Date Title
CN112232448B (en) Image classification method and device, electronic equipment and storage medium
CN110662484B (en) System and method for whole body measurement extraction
US20220076420A1 (en) Retinopathy recognition system
CN109670532B (en) Method, device and system for identifying abnormality of biological organ tissue image
CN107437092B (en) The classification method of retina OCT image based on Three dimensional convolution neural network
Bilal et al. Diabetic retinopathy detection and classification using mixed models for a disease grading database
US9779492B1 (en) Retinal image quality assessment, error identification and automatic quality correction
CN110211087B (en) Sharable semiautomatic marking method for diabetic fundus lesions
KR20200004841A (en) System and method for guiding a user to take a selfie
JP2022521844A (en) Systems and methods for measuring weight from user photos using deep learning networks
US11967181B2 (en) Method and device for retinal image recognition, electronic equipment, and storage medium
CN112017185B (en) Focus segmentation method, device and storage medium
CN111860169B (en) Skin analysis method, device, storage medium and electronic equipment
CN111402217B (en) Image grading method, device, equipment and storage medium
CN109344864B (en) Image processing method and device for dense object
US20220254134A1 (en) Region recognition method, apparatus and device, and readable storage medium
CN110197474B (en) Image processing method and device and training method of neural network model
CN107958453A (en) Detection method, device and the computer-readable storage medium of galactophore image lesion region
CN113011450B (en) Training method, training device, recognition method and recognition system for glaucoma recognition
US20240112329A1 (en) Distinguishing a Disease State from a Non-Disease State in an Image
CN113763348A (en) Image quality determination method and device, electronic equipment and storage medium
Jana et al. A semi-supervised approach for automatic detection and segmentation of optic disc from retinal fundus image
Perez et al. A new method for online retinal optic-disc detection based on cascade classifiers
Yang et al. Blood vessel segmentation of fundus images via cross-modality dictionary learning
Mohammedhasan et al. A new deeply convolutional neural network architecture for retinal blood vessel segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant