WO2021003938A1 - Image classification method, device, computer equipment and storage medium - Google Patents

Image classification method, device, computer equipment and storage medium

Info

Publication number
WO2021003938A1
WO2021003938A1, PCT/CN2019/118339, CN2019118339W
Authority
WO
WIPO (PCT)
Prior art keywords
image
sub
images
region
interest
Prior art date
Application number
PCT/CN2019/118339
Other languages
English (en)
French (fr)
Inventor
王健宗
魏文琦
贾雪丽
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021003938A1 publication Critical patent/WO2021003938A1/zh


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]

Definitions

  • This application relates to an image classification method, device, computer equipment and storage medium.
  • Computer vision refers to using cameras and computers in place of human eyes to identify, track, and measure targets, and to further process the resulting images so that they are better suited to human observation or to transmission to detection instruments. It has broad application prospects in security, risk control, medicine, and the military. For example, in the medical field, medical images such as MRI scan images can be recognized and classified based on computer vision to assist medical treatment.
  • An image classification method, apparatus, computer equipment, and storage medium are provided.
  • An image classification method including: obtaining an image to be classified, and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; connecting the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
  • An image classification device including:
  • the ROI determination module is used to obtain the image to be classified and determine the region of interest in the image to be classified;
  • the scale sub-image module is used to generate multiple scale sub-images according to the region of interest;
  • the feature extraction module is used to perform feature extraction on multiple scale sub-images to obtain feature sub-vectors corresponding to each scale sub-image;
  • the feature connection module is used to connect feature sub-vectors to obtain the image feature vector corresponding to the region of interest.
  • the image classification processing module is used to classify the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result.
  • A computer device including a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: obtain the image to be classified and determine the region of interest in the image to be classified; generate multiple scale sub-images according to the region of interest; perform feature extraction on the multiple scale sub-images to obtain the feature sub-vector corresponding to each scale sub-image; connect the feature sub-vectors to obtain the image feature vector corresponding to the region of interest; and classify the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: obtain the image to be classified and determine the region of interest in the image to be classified; generate multiple scale sub-images according to the region of interest; perform feature extraction on the multiple scale sub-images to obtain the feature sub-vector corresponding to each scale sub-image; connect the feature sub-vectors to obtain the image feature vector corresponding to the region of interest; and classify the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result.
  • Fig. 1 is an application scenario diagram of an image classification method according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of an image classification method according to one or more embodiments.
  • Fig. 3 is a schematic diagram of a process of generating multiple scale sub-images according to one or more embodiments.
  • Fig. 4 is a schematic flowchart of an image classification method in another embodiment.
  • Fig. 5 is a block diagram of an image classification device according to one or more embodiments.
  • Figure 6 is a block diagram of a computer device according to one or more embodiments.
  • the image classification method provided in this application can be applied to the application environment as shown in FIG. 1.
  • the terminal 102 communicates with the server 104 through the network.
  • the terminal 102 sends the image to be classified to the server 104; the server 104 generates multiple scale sub-images according to the region of interest determined from the image to be classified, performs feature extraction on the multiple scale sub-images, connects the feature sub-vectors corresponding to each scale sub-image to obtain the image feature vector, and combines the preset classifier to classify the image to be classified to obtain the image classification result.
  • the server 104 may also feed back the image classification result to the terminal 102.
  • the terminal 102 can also directly generate multiple scale sub-images according to the region of interest determined from the image to be classified, perform feature extraction on the multiple scale sub-images, connect the feature sub-vectors corresponding to the sub-images to obtain the image feature vector, and combine the preset classifier to classify the image to be classified to obtain the image classification result.
  • the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
  • an image classification method is provided. Taking the method applied to the server or terminal in FIG. 1 as an example, the method includes the following steps:
  • Step S201 Obtain an image to be classified, and determine a region of interest in the image to be classified.
  • the image to be classified is an image that needs to be recognized and classified.
  • the image to be classified may be a medical scan image in the medical field, such as a pathological slice image.
  • the region of interest may be a region with obvious image characteristics in the image to be classified.
  • when the image to be classified is a face image, the region of interest may be a partial region of the human face; and when the image to be classified is a pathological slice image, the region of interest may be an area with obvious pathological features.
  • For example, when the pathological slice image is a whole-tumor slice image, the region of interest may be the region where tumor proliferation is most severe.
  • Step S203 Generate multiple scale sub-images according to the region of interest.
  • the region of interest is segmented, and scale transformation is performed to generate multiple scale sub-images.
  • Multiple scale sub-images can better reflect the characteristics of the region of interest, which is beneficial to improve the accuracy of image classification.
  • Step S205 Perform feature extraction on multiple scale sub-images to obtain feature sub-vectors corresponding to each scale sub-image.
  • Multiple scale sub-images contain different image features of the region of interest, and feature extraction is performed on multiple scale sub-images respectively to obtain feature sub-vectors corresponding to each scale sub-image.
  • the feature sub-vector reflects the image feature of the corresponding scale sub-image.
  • the residual network model may specifically be ResNet101 (Residual Neural Network), which is composed of five BLOCK modules; each BLOCK module contains three bottleneck blocks, and each bottleneck block is composed of three convolution layers.
  • the residual network uses a large convolution (7*7) at the beginning to extract rough features, and then uses a stack of 1*1, 3*3, 1*1 convolution kernels to extract finer features.
  • Step S207 Connect the feature sub-vectors to obtain the image feature vector corresponding to the region of interest.
  • After obtaining the feature sub-vectors corresponding to the sub-images of each scale, the feature sub-vectors are connected to obtain the image feature vector corresponding to the region of interest.
  • the image feature vector represents the image feature of the region of interest.
  • Specifically, the feature sub-vectors can be pooled through a P-norm pooling operation, where P can be set according to actual needs (for example, P can be 3); the pooled feature sub-vectors are then connected to obtain the image feature vector corresponding to the region of interest.
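The P-norm pooling and concatenation described above can be sketched as follows. This is a minimal NumPy illustration; the function names and the choice of pooling over rows of a per-scale feature matrix are assumptions for illustration, not the patent's exact formulation.

```python
import numpy as np

def p_norm_pool(features, p=3):
    # Pool a (num_patches, dim) feature matrix into one dim-length
    # vector with the generalized P-norm mean: (mean(|x|^p))^(1/p).
    features = np.asarray(features, dtype=float)
    return (np.abs(features) ** p).mean(axis=0) ** (1.0 / p)

def build_image_feature_vector(scale_features, p=3):
    # Pool the feature sub-vectors of each scale sub-image, then
    # concatenate the pooled vectors into the ROI's feature vector.
    return np.concatenate([p_norm_pool(f, p) for f in scale_features])

rng = np.random.default_rng(0)
scales = [rng.random((4, 8)) for _ in range(3)]  # 3 scales, 4 patches, 8-dim
vec = build_image_feature_vector(scales, p=3)
print(vec.shape)  # (24,)
```

With three scales and 8-dimensional pooled sub-vectors, the concatenated image feature vector has 24 dimensions.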
  • Step S209 Perform classification processing on the image to be classified according to the image feature vector and the preset classifier to obtain an image classification result.
  • After the image feature vector corresponding to the region of interest is obtained, it is input into a preset classifier, and the classifier classifies the image to be classified according to the image feature vector to obtain the image classification result.
  • For example, when the classification criterion is whether the image shows a particular person, the image feature vector of the region of interest in the image to be classified is input into the classifier, and the classifier outputs a yes-or-no image classification result.
  • the classification criteria can be pathological severity.
  • After the image feature vector of the region of interest in the image to be classified is input into the classifier, the classifier outputs the image classification result of the pathological severity level, and the image classification result can be used for auxiliary medical care.
  • multiple scale sub-images are generated according to the region of interest determined from the image to be classified, feature extraction is performed on the multiple scale sub-images, the feature sub-vectors corresponding to each scale sub-image are connected to obtain the image feature vector, and the preset classifier is combined to classify the image to be classified to obtain the image classification result.
  • multiple scale sub-images are generated according to the region of interest determined in the image to be classified, and feature sub-vectors are extracted respectively, which can refine the feature details of the image to be classified and improve the accuracy of image classification.
  • determining the region of interest in the image to be classified includes: dividing the image to be classified according to preset region division parameters to obtain the image of each region; determining the confidence probability corresponding to each region image; and sorting the region images according to the confidence probability and determining a preset number of regions of interest from the region images according to the sorting result.
  • the image to be classified can be divided, and a certain number of regions of interest can be determined from each region according to the confidence probability of each region.
  • When determining the regions of interest, the preset area division parameter and the preset number are queried. The area division parameter may be the size of the region of interest, and the preset number is the number of desired regions of interest. The image to be classified is divided according to the area division parameter to obtain the image of each region, and the confidence probability corresponding to each region image is then determined.
  • the confidence probability reflects the extent to which the region image can be used as the region of interest. The higher the confidence probability, the more likely the corresponding region image is to be the region of interest.
  • the confidence probability corresponding to each region image can be obtained by analyzing each region image with a preset convolutional recognition network, for example the VGG16 (Visual Geometry Group Network) convolutional neural network model.
  • After the confidence probability corresponding to each region image is obtained, the region images are sorted according to the confidence probability, for example in descending order, and a preset number of regions of interest are determined from the region images according to the sorting result. For example, when the preset number is 3, the three region images with the highest confidence probabilities are taken as the regions of interest.
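The division-and-ranking procedure above can be sketched as follows. This is a simplified NumPy version: the non-overlapping grid and the dummy confidence function are placeholders for the trained convolutional recognition network.

```python
import numpy as np

def divide_into_regions(image, region_size):
    # Split an H*W image into non-overlapping square tiles of side
    # region_size, dropping incomplete border tiles.
    h, w = image.shape[:2]
    return [image[r:r + region_size, c:c + region_size]
            for r in range(0, h - region_size + 1, region_size)
            for c in range(0, w - region_size + 1, region_size)]

def select_rois(regions, confidence_fn, k=3):
    # Score every region, sort in descending confidence order, and
    # keep the top k regions as the regions of interest.
    order = np.argsort([confidence_fn(r) for r in regions])[::-1]
    return [regions[i] for i in order[:k]]

image = np.arange(36.0).reshape(6, 6)     # toy "image to be classified"
regions = divide_into_regions(image, 2)   # 9 regions of size 2*2
rois = select_rois(regions, confidence_fn=np.mean, k=3)
print(len(rois), rois[0].shape)  # 3 (2, 2)
```

Here mean intensity stands in for the network's confidence probability; in the patent's setting the scores would come from the trained region-of-interest convolution recognition network.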
  • determining the confidence probability corresponding to each region image includes: recognizing each region image through a preset region-of-interest convolution recognition network to obtain the confidence probability corresponding to each region image. The region-of-interest convolution recognition network is trained through the following steps: obtain historical region-of-interest data, which includes historical region images and the historical confidence probabilities corresponding to them; train the convolutional recognition network model with the historical region images and historical confidence probabilities to obtain the model training output; and when the model training output meets the training end condition, end the training to obtain the region-of-interest convolution recognition network.
  • the image of each region is processed through a preset region of interest convolution recognition network to determine the confidence probability corresponding to the image of each region.
  • the region-of-interest convolution recognition network is trained based on historical region-of-interest data including historical region images and historical confidence probabilities corresponding to historical region images.
  • When determining the confidence probability corresponding to each region image, the preset region-of-interest convolution recognition network is queried, each region image is input into the network, and the network recognizes each region image and outputs the corresponding confidence probability.
  • To train the region-of-interest convolutional recognition network, historical region-of-interest data is obtained first.
  • the historical region-of-interest data includes the historical region image and the historical confidence probability corresponding to the historical region image.
  • The historical region image is used as the input of the convolutional recognition network model, and the historical confidence probability is used as the reference output of the model.
  • Train the convolutional recognition network model through historical region images and historical confidence probabilities to obtain the model training output.
  • When the model training output meets the training end conditions, the training ends, and the region-of-interest convolution recognition network is obtained.
  • The trained region-of-interest convolution recognition network can then output, for an input region image, its corresponding confidence probability.
  • For example, suppose the image to be classified is a pathological slice image, the preset number of regions of interest is 3, and the region division parameter is 6000*6000px, that is, the required size of each region of interest is 6000*6000px.
  • The image to be classified, that is, the pathological slice image, carries a label, and processing of the pathological slice image can be controlled according to its label.
  • The pathological slice image is divided according to 6000*6000px to obtain the image of each area; each area image is then recognized through the VGG16 convolutional neural network model to obtain its confidence probability, and the three area images with the highest confidence probabilities are taken as the regions of interest (ROI).
  • the VGG16 convolutional neural network model contains five convolutional layers.
  • the size of the convolution kernel is 3*3.
  • After each convolutional layer there is a pooling layer.
  • the size of the pooling layer is 2*2, and the step size is 2.
  • After passing through such a convolutional layer, the size of the image does not change. After passing through a pooling layer, the length and width of the image each become half of the original, so the area becomes a quarter of the original.
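The size arithmetic above can be checked with a small worked example, assuming "same" padding for the 3*3 convolutions (the padding choice is an assumption; the patent does not state it):

```python
def vgg_feature_map_size(side, num_pool_layers=5):
    # Each 3*3 'same'-padded convolution keeps the side length; each
    # 2*2 stride-2 pooling layer halves it (area drops to a quarter).
    for _ in range(num_pool_layers):
        side //= 2
    return side

print(vgg_feature_map_size(224))  # 7: 224 -> 112 -> 56 -> 28 -> 14 -> 7
```

So after five conv/pool stages a side is divided by 2^5 = 32, which is why a 224px input ends up as a 7*7 feature map.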
  • When the regions of interest are input into the VGG16 convolutional neural network model for training, a certain transformation is required to obtain outputs of the same size, since the regions of interest differ in size.
  • The VGG16 convolutional neural network model divides the input image into 7*7 parts, performs a pooling operation on each part, and keeps the maximum value in each of the 49 regions as the output of the model. Because the regions of interest vary in size, the division is not always square; irregular grids such as 2*7 can appear. A side of length 2 is still divided into 7 parts (each 2/7), with rounding applied to the non-integer parts, while a side of length 7 is divided normally. After this procedure, the output of the VGG16 convolutional neural network model is largely consistent in size.
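The fixed 7*7 pooling described above resembles adaptive max pooling. A rough NumPy sketch under that assumption follows; the exact handling of rounded, non-integer part boundaries is an illustrative choice, not a verified detail of the patent.

```python
import numpy as np

def adaptive_max_pool(feat, out=7):
    # Max-pool a 2D map of any size into a fixed out*out grid. Part
    # boundaries at multiples of h/out and w/out are rounded to
    # integers; short sides reuse cells so that no bin is empty.
    h, w = feat.shape
    hb = [round(i * h / out) for i in range(out + 1)]
    wb = [round(j * w / out) for j in range(out + 1)]
    pooled = np.empty((out, out))
    for i in range(out):
        r0 = min(hb[i], h - 1)
        r1 = min(max(hb[i + 1], r0 + 1), h)
        for j in range(out):
            c0 = min(wb[j], w - 1)
            c1 = min(max(wb[j + 1], c0 + 1), w)
            pooled[i, j] = feat[r0:r1, c0:c1].max()
    return pooled

print(adaptive_max_pool(np.random.rand(2, 7)).shape)    # (7, 7)
print(adaptive_max_pool(np.random.rand(15, 13)).shape)  # (7, 7)
```

Whatever the input size, including the irregular 2*7 case mentioned in the text, the output is always a 7*7 grid.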
  • generating multiple scale sub-images according to the region of interest includes:
  • Step S301 Acquire sub-image parameters and image scale parameters.
  • the sub-image parameter may be a size parameter of the sub-image block
  • the image scale parameter may be a transformation parameter for performing scale transformation on the sub-image block.
  • the sub-image parameter can be 1000*1000px
  • the image scale parameter can be 1x, 1.5x, and 2x.
  • Step S303 Segment the region of interest according to the sub-image parameters to obtain each sub-image block.
  • the region of interest is segmented according to the sub-image parameters, and each sub-image block that represents the different image details of the region of interest is obtained. For example, a region of interest with a size of 6000*6000px is divided into sub-image blocks with a size of 1000*1000px.
  • Step S305 Scale each sub-image block according to the image scale parameter to obtain multiple scale sub-images.
  • After each sub-image block is obtained by segmenting the region of interest, scale transformation is performed on each sub-image block to obtain multiple scale sub-images. For example, each 1000*1000px sub-image block is enlarged at magnification ratios of 1, 1.5, and 2 to obtain multiple scale sub-images. Specifically, this uses the concept of a receptive field. For a 1000*1000px sub-image block, the center of the sub-image block remains the center of the new image, and the length and width each become 0.712 times the original, so that the resulting image area is about half of the original; the length and width are then stretched back to the original size. The multiple scale sub-images can thus be regarded as three sub-images at magnified proportions.
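The tiling-then-zoom idea above can be sketched roughly as follows. Nearest-neighbor resizing and the helper names are illustrative assumptions; the 0.712 crop factor is taken from the text (a crop factor f followed by stretching back acts as a 1/f magnification).

```python
import numpy as np

def nn_resize(img, new_h, new_w):
    # Nearest-neighbor resize of a 2D image.
    h, w = img.shape
    rows = (np.arange(new_h) * h // new_h).astype(int)
    cols = (np.arange(new_w) * w // new_w).astype(int)
    return img[rows[:, None], cols]

def center_crop(img, factor):
    # Keep a centered window whose sides are `factor` * original side.
    h, w = img.shape
    ch, cw = max(1, int(h * factor)), max(1, int(w * factor))
    top, left = (h - ch) // 2, (w - cw) // 2
    return img[top:top + ch, left:left + cw]

def multi_scale_sub_images(block, crop_factors=(1.0, 0.712, 0.5)):
    # For each factor, crop the center of the block and stretch it
    # back to the block's size: smaller crops act as higher zoom.
    h, w = block.shape
    return [nn_resize(center_crop(block, f), h, w) for f in crop_factors]

block = np.random.rand(100, 100)   # stand-in for a 1000*1000px block
subs = multi_scale_sub_images(block)
print([s.shape for s in subs])  # [(100, 100), (100, 100), (100, 100)]
```

All scale sub-images share the block's original size, so they can be fed to the same feature-extraction network.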
  • performing feature extraction on the multiple scale sub-images to obtain the feature sub-vector corresponding to each scale sub-image includes: querying a preset image recognition residual network model, which is trained based on historical scale sub-image data; and performing feature extraction on the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image.
  • feature extraction is performed on multiple scale sub-images through a preset image recognition residual network model to obtain feature sub-vectors corresponding to each scale sub-image.
  • the preset image recognition residual network model is queried.
  • the residual network model may specifically be ResNet101, and the image recognition residual network model is trained based on historical scale sub-image data.
  • the image recognition residual network model is used to extract the features of multiple scale sub-images.
  • The multiple scale sub-images are input into the image recognition residual network model in turn, and the model outputs the feature sub-vector corresponding to each scale sub-image, which reflects the image characteristics of that sub-image.
  • classifying the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result includes: querying the preset classifier, which is obtained by training based on historical image data carrying type labels; and inputting the image feature vector into the classifier to obtain the image classification result.
  • After the feature sub-vectors corresponding to the sub-images of each scale are obtained, the feature sub-vectors can be pooled first, and the pooled feature sub-vectors can then be connected to obtain the image feature vector corresponding to the region of interest. For example, if the number of image scale parameters is 3, the feature sub-vectors corresponding to the three scale sub-images of different scales are connected to obtain the image feature vector, and the image to be classified is classified through the preset classifier to obtain the image classification result.
  • the preset classifier is queried, and the classifier is obtained by training based on historical image data carrying type labels, for example, it may be a support vector machine (SVM) classifier.
  • the image feature vector is input into the classifier to obtain the image classification result.
  • the classifier can classify the image to be classified into different categories according to the input image feature vector, so as to realize the classification processing of the image to be classified.
  • the classification criterion can be the severity of pathology.
  • After the image feature vector of the region of interest in the image to be classified is input into the classifier, the classifier outputs an image classification result indicating the pathological severity level. The image classification result can be used to assist medical treatment; for example, prognosis can be based on the image classification result, that is, predicting the likely course and outcome of the disease, such as predicting the survival time of cancer patients.
  • Specifically, the classifier is RankSVM, an SVM with a ranking structure added.
  • SVM itself is a classifier whose core operation is to find the maximum margin between two categories, and it handles linearly separable problems, linearly inseparable problems, nonlinear problems, and so on by constructing Laplacian operators.
  • RankSVM uses this method to convert a numerical ranking problem into a classification problem, so as to achieve accurate classification of the image to be classified.
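RankSVM's reduction of ranking to classification can be illustrated with the standard pairwise-difference transform. This is a minimal sketch of the general technique, not the patent's implementation; a real system would feed these pairs to a linear SVM solver.

```python
import numpy as np

def pairwise_transform(X, y):
    # Turn ranking data (features X, relevance scores y) into a binary
    # classification problem: for each pair with y[i] != y[j], the
    # example is x_i - x_j with label +1 if y[i] > y[j] else -1.
    # A linear classifier w on these pairs induces the score w @ x.
    X, y = np.asarray(X, float), np.asarray(y, float)
    diffs, labels = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] == y[j]:
                continue
            diffs.append(X[i] - X[j])
            labels.append(1.0 if y[i] > y[j] else -1.0)
    return np.array(diffs), np.array(labels)

X = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])   # e.g. survival-time ranks
D, L = pairwise_transform(X, y)
print(D.shape, L.shape)  # (3, 2) (3,)
```

Each row of D is a "which of the two ranks higher" question, which is exactly how a ranking problem becomes a two-class classification problem.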
  • the method further includes: performing statistics on the image classification result to obtain image classification accuracy.
  • statistics can be performed on the obtained image classification results to obtain image classification accuracy.
  • Since the image classification results can reflect the predicted survival time, the Spearman coefficient is used to characterize the image classification accuracy of the image classification results, that is, to analyze the degree of difference between the actual patient survival time and the predicted time.
  • Spearman's correlation coefficient is a non-parametric measure for evaluating rank correlation, that is, the statistical correlation between the rankings of two variables. Spearman's formula can be: r_s = 1 - (6 * Σ d_i²) / (n * (n² - 1)), where r_s is the Spearman score, d_i is the difference between the rank of the actual patient survival time and the rank of the predicted time for the i-th prediction, and n is the number of predictions; a larger total rank difference Σ d_i² indicates a greater degree of difference between the actual and predicted times and yields a lower r_s.
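The Spearman formula can be checked with a small pure-Python implementation (assuming distinct values, i.e. no ties, which is what the simple form of the formula requires):

```python
def spearman(actual, predicted):
    # Spearman rank correlation r_s = 1 - 6*sum(d_i^2) / (n*(n^2-1)),
    # where d_i is the rank difference for item i (no ties assumed).
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    ra, rp = ranks(actual), ranks(predicted)
    n = len(actual)
    d2 = sum((a - b) ** 2 for a, b in zip(ra, rp))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Perfectly concordant predictions give r_s = 1; reversed give -1.
print(spearman([10, 20, 30, 40], [1, 2, 3, 4]))  # 1.0
print(spearman([10, 20, 30, 40], [4, 3, 2, 1]))  # -1.0
```

In the patent's setting, `actual` would hold the true survival times and `predicted` the model's predicted times.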
  • an image classification method including:
  • Step S401 Divide the image to be classified according to preset area division parameters to obtain images of each area;
  • Step S402 Recognize the images of each region through a preset region of interest convolution recognition network, and obtain the confidence probability corresponding to the image of each region;
  • Step S403 Sort the images of each region according to the confidence probability, and determine a preset number of regions of interest from the images of each region according to the sorting result.
  • the image to be classified is a pathological slice image
  • the region of interest may be an area with obvious pathological features.
  • Step S404 Generate multiple scale sub-images according to the region of interest
  • Step S405 Query a preset image recognition residual network model, and the image recognition residual network model is trained based on historical scale sub-image data;
  • Step S406 Perform feature extraction on multiple scale sub-images respectively through the image recognition residual network model to obtain feature sub-vectors corresponding to each scale sub-image;
  • Step S407 Connect the feature sub-vectors to obtain the image feature vector corresponding to the region of interest.
  • generating multiple scale sub-images according to the region of interest includes: obtaining sub-image parameters and image scale parameters; segmenting the region of interest according to the sub-image parameters to obtain each sub-image block; according to the image scale parameter, dividing each sub-image block Perform scale transformation to obtain multiple scale sub-images. Then, through the preset image recognition residual network model, that is, the ResNet101 model, feature extraction is performed on multiple scale sub-images to obtain feature sub-vectors corresponding to each scale sub-image. After obtaining the feature sub-vectors corresponding to the sub-images of each scale, the feature sub-vectors can be pooled first, and then the pooled feature sub-vectors can be connected to obtain the image feature vector corresponding to the region of interest.
  • Step S408 Query a preset classifier, which is obtained by training based on historical image data carrying type labels;
  • Step S409 Input the image feature vector into the classifier to obtain an image classification result
  • Step S410 Perform statistics on the image classification results to obtain image classification accuracy.
  • the image to be classified is classified through the preset classifier, RankSVM, to obtain the image classification result, and the obtained image classification result is counted to obtain the accuracy of image classification.
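Steps S401 to S410 can be tied together in a pipeline skeleton. All component functions below are hypothetical stand-ins for the trained recognition network, the residual network, and the classifier described above.

```python
import numpy as np

def classify_image(image, region_size, k, scale_factors,
                   confidence_fn, feature_fn, classifier_fn):
    # S401-S403: divide into regions, score each, keep top-k as ROIs.
    h, w = image.shape
    regions = [image[r:r + region_size, c:c + region_size]
               for r in range(0, h - region_size + 1, region_size)
               for c in range(0, w - region_size + 1, region_size)]
    rois = sorted(regions, key=confidence_fn, reverse=True)[:k]
    # S404-S407: per-ROI multi-scale features, concatenated.
    feats = [feature_fn(roi, f) for roi in rois for f in scale_factors]
    vector = np.concatenate(feats)
    # S408-S409: the classifier produces the result from the vector.
    return classifier_fn(vector)

# Toy stand-ins for the trained components.
result = classify_image(
    np.full((30, 30), 0.4), region_size=10, k=3,
    scale_factors=(1.0, 1.5, 2.0),
    confidence_fn=np.mean,
    feature_fn=lambda roi, f: np.array([roi.mean() * f, roi.std()]),
    classifier_fn=lambda v: "severe" if v.mean() > 0.5 else "mild",
)
print(result)  # mild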
  • an image classification device including: an ROI determination module 501, a scale sub-image module 503, a feature extraction module 505, a feature connection module 507, and an image classification processing module 509:
  • the ROI determination module 501 is used to obtain the image to be classified and determine the region of interest in the image to be classified;
  • the scale sub-image module 503 is used to generate multiple scale sub-images according to the region of interest;
  • the feature extraction module 505 is configured to perform feature extraction on multiple scale sub-images to obtain feature sub-vectors corresponding to each scale sub-image;
  • the feature connection module 507 is used to connect feature sub-vectors to obtain the image feature vector corresponding to the region of interest;
  • the image classification processing module 509 is configured to perform classification processing on the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result.
  • the ROI determination module 501 includes a region division unit, a confidence probability unit, and an ROI determination unit; the region division unit is used to divide the image to be classified according to preset region division parameters to obtain images of each region; the confidence probability unit , Used to determine the confidence probability corresponding to each region image; the ROI determination unit, used to sort each region image according to the confidence probability, and determine a preset number of regions of interest from each region image according to the sorting result.
  • the confidence probability unit includes a convolutional network processing unit for recognizing images of each region through a preset region of interest convolution recognition network to obtain the confidence probability corresponding to the image of each region;
  • the product recognition network is trained through the following steps: obtain historical area of interest data, which includes historical area images and historical confidence probabilities corresponding to historical area images; train convolutional recognition network models through historical area images and historical confidence probabilities, Obtain the model training output; when the model training output meets the training end condition, the training ends, and the convolutional recognition network for the region of interest is obtained.
  • the scale sub-image module 503 includes a parameter acquisition unit, an ROI segmentation unit, and a scale transformation unit; the parameter acquisition unit is used to acquire sub-image parameters and image scale parameters; the ROI segmentation unit is used to follow the sub-image parameters The region of interest is segmented to obtain each sub-image block; the scale transformation unit is used to scale each sub-image block according to the image scale parameter to obtain multiple scale sub-images.
  • the feature extraction module 505 includes a residual network query unit and a feature extraction unit; the residual network query unit is used to query a preset image recognition residual network model, and the image recognition residual network model is based on historical scales The sub-image data is obtained through training; the feature extraction unit is used to perform feature extraction on multiple scale sub-images through the image recognition residual network model to obtain feature sub-vectors corresponding to each scale sub-image.
  • the image classification processing module 509 includes a classifier query unit and a classification processing unit; the classifier query unit is used to query a preset classifier, and the classifier is trained based on historical image data carrying type labels; The classification processing unit is used to input the image feature vector into the classifier to obtain the image classification result.
  • a result statistics module is further included, which is used to perform statistics on the image classification results to obtain the accuracy of the image classification.
  • Each module in the above-mentioned image classification device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the foregoing modules may be embedded in, or independent of, the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the foregoing modules.
  • a computer device is provided.
  • the computer device may be a server or a terminal, and its internal structure diagram may be as shown in FIG. 6.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer readable instructions.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • when executed by the processor, the computer-readable instructions implement an image classification method.
  • FIG. 6 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, or combine some components, or have a different arrangement of components.
  • a computer device includes a memory and one or more processors.
  • the memory stores computer-readable instructions.
  • the one or more processors perform the following steps: acquiring an image to be classified and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; concatenating the feature sub-vectors to obtain the image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions.
  • the one or more processors perform the following steps: acquiring an image to be classified and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; concatenating the feature sub-vectors to obtain the image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An image classification method, comprising: acquiring an image to be classified and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.

Description

Image classification method, apparatus, computer device and storage medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 2019106033497, titled "Image classification method, apparatus, computer device and storage medium", filed with the China National Intellectual Property Administration on July 5, 2019, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to an image classification method, apparatus, computer device and storage medium.
Background
With the development of computer technology, computer vision has received increasing attention. Computer vision refers to machine vision in which cameras and computers replace human eyes to identify, track and measure targets, with further image processing so that the processed image is better suited for human observation or for transmission to instruments for inspection. It has broad application prospects in security, risk control, medicine and the military. For example, in the medical field, medical images such as magnetic resonance (MRI) scan images can be identified and classified based on computer vision to assist medical treatment.
However, the inventors have realized that current image classification mostly relies on statistical methods, and the accuracy of classification based on statistical data is limited.
Summary
According to various embodiments disclosed in this application, an image classification method, apparatus, computer device and storage medium are provided.
An image classification method, comprising:
acquiring an image to be classified and determining a region of interest in the image to be classified;
generating multiple scale sub-images according to the region of interest;
performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image;
concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and
classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
An image classification apparatus, comprising:
an ROI determination module, configured to acquire an image to be classified and determine a region of interest in the image to be classified;
a scale sub-image module, configured to generate multiple scale sub-images according to the region of interest;
a feature extraction module, configured to perform feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image;
a feature concatenation module, configured to concatenate the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and
an image classification module, configured to classify the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the following steps: acquiring an image to be classified and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring an image to be classified and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
Details of one or more embodiments of this application are set forth in the accompanying drawings and the description below. Other features and advantages of this application will become apparent from the description, the drawings and the claims.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings required in the embodiments are briefly introduced below. Apparently, the drawings described below are only some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a diagram of an application scenario of an image classification method according to one or more embodiments.
FIG. 2 is a schematic flowchart of an image classification method according to one or more embodiments.
FIG. 3 is a schematic flowchart of generating multiple scale sub-images according to one or more embodiments.
FIG. 4 is a schematic flowchart of an image classification method in another embodiment.
FIG. 5 is a block diagram of an image classification apparatus according to one or more embodiments.
FIG. 6 is a block diagram of a computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of this application clearer, this application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
The image classification method provided by this application can be applied in the application environment shown in FIG. 1. The terminal 102 communicates with the server 104 through a network. The terminal 102 sends the image to be classified to the server 104; the server 104 generates multiple scale sub-images from the region of interest determined in the image to be classified, performs feature extraction on each scale sub-image, concatenates the feature sub-vectors corresponding to the scale sub-images to obtain an image feature vector, and classifies the image to be classified with a preset classifier to obtain an image classification result. The server 104 may also feed the image classification result back to the terminal 102. Alternatively, the terminal 102 may itself generate multiple scale sub-images from the region of interest determined in the image to be classified, perform feature extraction on each scale sub-image, concatenate the corresponding feature sub-vectors to obtain an image feature vector, and classify the image to be classified with a preset classifier to obtain an image classification result.
The terminal 102 may be, but is not limited to, a personal computer, a laptop, a smartphone, a tablet or a portable wearable device; the server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, an image classification method is provided. Taking the method applied to the server or terminal in FIG. 1 as an example, it comprises the following steps:
Step S201: acquire an image to be classified and determine a region of interest in the image to be classified.
The image to be classified is an image that needs to be identified and classified. For example, it may be a medical scan image in the medical field, such as a pathology slide image. The region of interest may be a region of the image to be classified with relatively salient image features. For example, when the image to be classified is a photo of a person, the region of interest may be the face region; when the image to be classified is a pathology slide image, the region of interest may be a region with obvious pathological features. In a specific application, if the pathology slide image is a whole-slide tumor image, the region of interest may be the region where tumor proliferation is most severe.
Step S203: generate multiple scale sub-images according to the region of interest.
After the region of interest in the image to be classified is determined, the region of interest is segmented and scale-transformed to generate multiple scale sub-images. Multiple scale sub-images can better reflect the features of the region of interest, which helps improve classification accuracy.
Specifically, for a region of interest of size 6000*6000px, multiple scale sub-images of size 1000*1000px can be generated.
Step S205: perform feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image.
The multiple scale sub-images contain different image features of the region of interest. Performing feature extraction on each of them yields the feature sub-vector corresponding to each scale sub-image; the feature sub-vector reflects the image features of the corresponding scale sub-image.
In a specific implementation, feature extraction can be performed by a pre-trained residual network model. The residual network model may specifically be ResNet101 (Residual Neural Network), which consists of five BLOCK modules; each BLOCK module contains three bottleneck blocks, and each bottleneck block consists of three convolutional layers. The residual network first uses a large convolution (7*7) to extract coarse features, and then uses stacks of 1*1, 3*3 and 1*1 convolution kernels to extract finer features.
Step S207: concatenate the feature sub-vectors to obtain the image feature vector corresponding to the region of interest.
After the feature sub-vector corresponding to each scale sub-image is obtained, the feature sub-vectors are concatenated to obtain the image feature vector corresponding to the region of interest; the image feature vector characterizes the image features of the region of interest. In a specific implementation, the feature sub-vectors may first be pooled by a P-norm pooling operation, where P can be set according to actual needs (e.g. P may be 3), and the pooled feature sub-vectors are then concatenated to obtain the image feature vector corresponding to the region of interest.
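The P-norm pooling and concatenation step described above can be sketched as follows. This is a minimal plain-Python illustration, not the application's trained pipeline: the per-channel activation values, the nesting (one list of channels per scale sub-image) and P=3 are assumptions made for demonstration.

```python
def p_norm_pool(values, p=3):
    # P-norm pooling over a sequence of activations: (mean(|x|**p)) ** (1/p).
    # p = 1 averages magnitudes; large p approaches max pooling.
    n = len(values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

def concat_pooled(feature_sub_vectors, p=3):
    # Pool each channel of each scale sub-image's feature sub-vector, then
    # concatenate the pooled sub-vectors into the ROI's image feature vector.
    pooled = [[p_norm_pool(channel, p) for channel in sub]
              for sub in feature_sub_vectors]
    return [x for vec in pooled for x in vec]

# Two scale sub-images with toy per-channel activations:
subs = [[[2.0, 2.0], [0.0, 0.0]], [[1.0, 1.0]]]
print(concat_pooled(subs))  # ≈ [2.0, 0.0, 1.0]
```

The pooled vector has one entry per channel per scale sub-image, so sub-images of different spatial sizes still contribute fixed-length pieces to the concatenated feature vector.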
Step S209: classify the image to be classified according to the image feature vector and the preset classifier to obtain an image classification result.
After the image feature vector corresponding to the region of interest is obtained, it is input into a preset classifier, which classifies the image to be classified according to the image feature vector to obtain the image classification result. For example, when the classification criterion is whether the image shows a certain person, after the image feature vector of the region of interest is input into the classifier, the classifier outputs a yes/no classification result. As another example, for a pathology slide image the classification criterion may be pathological severity: after the image feature vector of the region of interest is input into the classifier, the classifier outputs a severity-grade classification result, which can be used to assist medical treatment.
In the above image classification method, multiple scale sub-images are generated from the region of interest determined in the image to be classified, feature extraction is performed on each scale sub-image, the feature sub-vectors corresponding to the scale sub-images are concatenated to obtain the image feature vector, and the image to be classified is classified with a preset classifier to obtain the image classification result. During classification, generating multiple scale sub-images from the determined region of interest and extracting feature sub-vectors from each of them refines the feature details of the image to be classified and improves classification accuracy.
In one embodiment, determining the region of interest in the image to be classified comprises: dividing the image to be classified according to preset region division parameters to obtain region images; determining the confidence probability corresponding to each region image; and sorting the region images by confidence probability and determining a preset number of regions of interest from the region images according to the sorting result.
In this embodiment, the image to be classified may be divided, and a certain number of regions of interest determined from the regions according to their confidence probabilities. Specifically, after the image to be classified is obtained, the preset region division parameter and preset number are queried; the region division parameter may be the size of the region of interest, and the preset number is the number of regions of interest required. The image to be classified is divided according to the region division parameter to obtain region images, and the confidence probability corresponding to each region image is determined. The confidence probability reflects the extent to which a region image can serve as a region of interest: the higher the confidence probability, the more likely the corresponding region image is a region of interest. In a specific implementation, the confidence probability of each region image can be obtained by analyzing it with a preset convolutional recognition network, for example a VGG16 (Visual Geometry Group Network) convolutional neural network model. After the confidence probabilities are obtained, the region images are sorted by confidence probability, for example in descending order, and a preset number of regions of interest are determined from the region images according to the sorting result; for example, when the preset number is 3, the three region images with the highest confidence probabilities in the sorting result are taken as regions of interest.
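The sort-and-select step above amounts to a top-k selection by confidence. A minimal sketch in plain Python (the region names and confidence values are made-up placeholders; the real confidences would come from the convolutional recognition network):

```python
def select_rois(region_images, confidences, k=3):
    # Rank region images by confidence probability (descending) and keep
    # the top-k as regions of interest.
    order = sorted(range(len(region_images)),
                   key=lambda i: confidences[i], reverse=True)
    return [region_images[i] for i in order[:k]]

# Toy region images identified only by name; confidences are illustrative.
rois = select_rois(["r1", "r2", "r3", "r4"], [0.20, 0.90, 0.50, 0.70])
print(rois)  # ['r2', 'r4', 'r3']
```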
In some embodiments, determining the confidence probability corresponding to each region image comprises: identifying each region image through a preset region-of-interest convolutional recognition network to obtain the confidence probability corresponding to each region image. The region-of-interest convolutional recognition network is trained through the following steps: obtaining historical region-of-interest data, which includes historical region images and the historical confidence probabilities corresponding to them; training a convolutional recognition network model with the historical region images and historical confidence probabilities to obtain a model training output; and ending training when the model training output meets the training end condition, obtaining the region-of-interest convolutional recognition network.
In this embodiment, each region image is processed through a preset region-of-interest convolutional recognition network to determine its confidence probability. The region-of-interest convolutional recognition network is trained on historical region-of-interest data comprising historical region images and their corresponding historical confidence probabilities.
Specifically, when determining the confidence probability of each region image, the preset region-of-interest convolutional recognition network is queried, and each region image is input into it; the network identifies each region image and outputs its confidence probability. When training the network, historical region-of-interest data is obtained, comprising historical region images and their historical confidence probabilities; the historical region images serve as the input of the convolutional recognition network model, and the historical confidence probabilities serve as the reference output. The model is trained with the historical region images and confidence probabilities to obtain a model training output; training ends when the output meets the training end condition, yielding a region-of-interest convolutional recognition network that outputs the confidence probability corresponding to an input region image.
In a specific application, the image to be classified is a pathology slide image, the preset number of regions of interest is 3, and the region division parameter is 6000*6000px, i.e. the required size of each region of interest is 6000*6000px. The pathology slide image carries a label, and the pathology slide images can be managed according to their labels. The pathology slide image is divided into 6000*6000px region images, each region image is identified by the VGG16 convolutional neural network model to obtain its confidence probability, and the 3 region images with the highest confidence probabilities are taken as regions of interest (ROI, Region of Interest).
The VGG16 convolutional neural network model contains five convolutional stages, all with 3*3 kernels, each followed by a 2*2 pooling layer with stride 2. The image size is unchanged by such convolutional layers, while after a pooling layer the image's height and width are halved and its area becomes a quarter of what it was. The final feature map is input into the VGG16 convolutional neural network model for training. Since the regions of interest differ in size, a transformation is needed to obtain outputs of the same size: the model divides an input image into 7*7 parts, applies a pooling operation to each part, and keeps the maximum value in each of the 49 regions; this serves as the model's output. Because regions of interest vary in size, they are not always square — an irregular rectangle such as 2*7 can occur. Along the direction of length 2, the region is still divided into 7 parts (each of length 2/7), with non-integer boundaries rounded, while a direction of length 7 is divided normally. With this method, the outputs of the VGG16 convolutional neural network model are mostly consistent.
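The fixed 7*7 max pooling described above behaves like adaptive max pooling over a variable-size feature map. A minimal sketch in plain Python, with a nested list standing in for the feature map; the exact bin-boundary clamping is one reasonable reading of the rounding rule described, not the application's verified implementation:

```python
def adaptive_max_pool(grid, out_h=7, out_w=7):
    # Split a 2D grid into out_h x out_w bins (boundaries rounded to
    # integers, never empty) and keep the maximum value in each bin.
    h, w = len(grid), len(grid[0])
    rows = [round(i * h / out_h) for i in range(out_h + 1)]
    cols = [round(j * w / out_w) for j in range(out_w + 1)]
    pooled = []
    for i in range(out_h):
        r0 = min(rows[i], h - 1)
        r1 = min(max(rows[i + 1], r0 + 1), h)
        row = []
        for j in range(out_w):
            c0 = min(cols[j], w - 1)
            c1 = min(max(cols[j + 1], c0 + 1), w)
            row.append(max(grid[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled

# A 14x14 toy feature map pools into a 7x7 output (2x2 bins here).
feature_map = [[r * 14 + c for c in range(14)] for r in range(14)]
pooled = adaptive_max_pool(feature_map)
```

Irregular inputs such as a 2*14 grid are handled the same way: the short side is still split into 7 (overlapping, clamped) bins, matching the rounding behavior in the text.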
In one embodiment, as shown in FIG. 3, generating multiple scale sub-images according to the region of interest comprises:
Step S301: acquire sub-image parameters and image scale parameters.
Multiple scale sub-images can better reflect the features of the region of interest and thereby help improve classification accuracy. In this embodiment, when generating multiple scale sub-images, the sub-image parameters and image scale parameters are acquired. The sub-image parameter may be the size of the sub-image blocks, and the image scale parameter may be the transformation parameter for scale-transforming the sub-image blocks.
For example, for a pathology slide image, the sub-image parameter may be 1000*1000px, and the image scale parameters may be 1x, 1.5x and 2x.
Step S303: segment the region of interest according to the sub-image parameters to obtain sub-image blocks.
The region of interest is segmented according to the sub-image parameters, obtaining sub-image blocks that represent different image details of the region of interest. For example, a 6000*6000px region of interest is divided into 1000*1000px sub-image blocks.
Step S305: scale-transform each sub-image block according to the image scale parameters to obtain multiple scale sub-images.
After the sub-image blocks of the segmented region of interest are obtained, each sub-image block is scale-transformed to obtain multiple scale sub-images. For example, each 1000*1000px sub-image block is magnified at ratios of 1x, 1.5x and 2x in turn to obtain multiple scale sub-images. Specifically, this magnification is a receptive-field notion: for a 1000*1000px sub-image block and a magnification ratio of 2, the center of the sub-image block remains the center of the new image while its height and width each become 0.712 times their previous values, so that the image area (height times width) becomes half of what it was; the height and width are then stretched back to the original size. The multi-scale sub-images can thus be the sub-images at the three magnification ratios.
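The receptive-field style magnification above can be sketched as a center crop whose side shrinks by 1/sqrt(zoom) (about 0.707 for a zoom of 2, consistent with the "area becomes half" statement and close to the 0.712 factor mentioned), followed by resizing back to the original side. A minimal sketch under those assumptions; the function name and the 1000px block size are illustrative:

```python
import math

def center_crop_box(side, zoom):
    # Return the (left, top, right, bottom) crop box for a square block of
    # the given side so that the cropped area is 1/zoom of the original,
    # keeping the same center. The crop is later resized back to `side`.
    new_side = side / math.sqrt(zoom)   # side shrinks by 1/sqrt(zoom)
    offset = (side - new_side) / 2.0
    return (offset, offset, side - offset, side - offset)

box = center_crop_box(1000, 2)
# crop side ~707px, centered in the 1000px sub-image block
```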
In one embodiment, performing feature extraction on each of the multiple scale sub-images to obtain the feature sub-vector corresponding to each scale sub-image comprises: querying a preset image recognition residual network model, which is trained on historical scale sub-image data; and performing feature extraction on the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image.
In this embodiment, feature extraction is performed on the multiple scale sub-images through the preset image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image. Specifically, the preset image recognition residual network model is queried; the residual network model may specifically be ResNet101, and the image recognition residual network model is trained on historical scale sub-image data. Feature extraction is performed on the multiple scale sub-images through the model: the scale sub-images are input into the image recognition residual network model in turn, and the model outputs the feature sub-vector corresponding to each scale sub-image, which reflects the image features of that scale sub-image.
In one embodiment, classifying the image to be classified according to the image feature vector and the preset classifier to obtain an image classification result comprises: querying a preset classifier trained on historical image data carrying type labels; and inputting the image feature vector into the classifier to obtain the image classification result.
After the feature sub-vectors corresponding to the scale sub-images are obtained, they may first be pooled, and the pooled feature sub-vectors are then concatenated to obtain the image feature vector corresponding to the region of interest. For example, if there are 3 image scale parameters, the feature sub-vectors corresponding to the 3 scale sub-images at different scales are concatenated to obtain the image feature vector, and the image to be classified is classified with the preset classifier to obtain the image classification result.
Specifically, the preset classifier, trained on historical image data carrying type labels, is queried; it may be, for example, an SVM (Support Vector Machine) classifier. The image feature vector is input into the classifier to obtain the image classification result; the classifier can assign the image to be classified to different categories according to the input image feature vector, thereby classifying the image to be classified. For example, the classification criterion may be pathological severity: after the image feature vector of the region of interest is input into the classifier, the classifier outputs a severity-grade classification result, which can be used to assist medical treatment, for example for prognosis — predicting the likely course and outcome of a disease, such as the survival time of a cancer patient.
In a specific application, the classifier is a RankSVM, i.e. an SVM augmented with a Rank structure. Specifically, the SVM itself is a classifier whose structure seeks the maximum margin between two classes, and it handles linearly separable, linearly inseparable and nonlinear problems by constructing a Laplacian operator. RankSVM adds a Rank structure on top of the SVM, so that a ranking problem is converted into a classification problem. For example, if x is greater than y and the transformation function is f, then f(x)>f(y), which gives F=f(x)-f(y); F is either greater than 0 or less than 0, representing two different classes. In this way, RankSVM converts every numeric ranking problem into a classification problem, enabling accurate classification of the image to be classified.
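The pairwise reduction from ranking to classification that RankSVM relies on can be sketched as follows. This is a plain-Python illustration: the toy feature vectors and ranks are assumptions, and the downstream linear SVM that would be fit on the transformed pairs is omitted.

```python
def pairwise_transform(samples):
    # Turn ranked samples into a binary classification set: for each ordered
    # pair (x_i, x_j) with rank(x_i) > rank(x_j), emit the difference vector
    # x_i - x_j with label +1 and the mirrored vector with label -1.
    pairs = []
    for xi, ri in samples:
        for xj, rj in samples:
            if ri > rj:
                diff = [a - b for a, b in zip(xi, xj)]
                pairs.append((diff, +1))
                pairs.append(([-d for d in diff], -1))
    return pairs

# Each sample is (feature_vector, rank); values are illustrative only.
data = [([1.0, 0.0], 2), ([0.0, 1.0], 1)]
transformed = pairwise_transform(data)
# one ordered pair yields two signed difference samples
```

A standard binary SVM trained on the transformed set then learns a weight vector whose dot product orders new samples, which is exactly the f(x)-f(y) > 0 criterion described above.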
In one embodiment, after the image classification result is obtained, the method further comprises: performing statistics on the image classification results to obtain the image classification accuracy.
In this embodiment, statistics can be performed on the obtained image classification results to obtain the image classification accuracy. In a specific implementation, for example when prognosis is performed according to the image classification result, the result can reflect the predicted survival time, and the Spearman coefficient is used to characterize the image classification accuracy, i.e. the degree of difference between the actual patient survival time and the predicted time. The Spearman rank correlation coefficient is a non-parametric measure of rank correlation, i.e. it evaluates the statistical dependence between the rankings of two variables. The Spearman formula can be:

$$r_s = 1 - \frac{6\sum_i d_i^2}{n(n^2 - 1)}$$

where $r_s$ is the Spearman score, $d_i$ is the difference between the rank of the actual patient survival time and the rank of the predicted time, and $n$ is the number of predictions; the larger the rank differences, the greater the degree of difference (and the lower $r_s$). For example, if the actual survival time is 1 year and the predicted survival time is 3 years, $d_i$ is 2.
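The formula above can be evaluated directly from the rank differences $d_i$; a minimal sketch in plain Python (the rank differences below are made-up illustration values, not data from the application):

```python
def spearman_from_rank_diffs(rank_diffs):
    # Spearman rank correlation from precomputed rank differences d_i:
    # r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)); assumes no tied ranks.
    n = len(rank_diffs)
    return 1.0 - 6.0 * sum(d * d for d in rank_diffs) / (n * (n * n - 1))

# Perfect agreement between actual and predicted ranks: all d_i = 0.
print(spearman_from_rank_diffs([0, 0, 0, 0]))  # 1.0
```

With perfectly reversed rankings (e.g. actual ranks 1,2,3 predicted as 3,2,1, so d = -2, 0, 2) the score reaches -1, the other extreme of the scale.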
In one embodiment, as shown in FIG. 4, an image classification method is provided, comprising:
Step S401: divide the image to be classified according to preset region division parameters to obtain region images;
Step S402: identify each region image through a preset region-of-interest convolutional recognition network to obtain the confidence probability corresponding to each region image;
Step S403: sort the region images by confidence probability, and determine a preset number of regions of interest from the region images according to the sorting result.
In this embodiment, the image to be classified is a pathology slide image, and the region of interest may be a region with obvious pathological features. The image to be classified is divided according to the preset region division parameters and preset number, each region image is processed through the preset region-of-interest convolutional recognition network, i.e. the VGG16 convolutional neural network model, to determine its confidence probability, and finally a certain number of regions of interest are determined from the regions according to their confidence probabilities.
Step S404: generate multiple scale sub-images according to the region of interest;
Step S405: query a preset image recognition residual network model trained on historical scale sub-image data;
Step S406: perform feature extraction on each of the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image;
Step S407: concatenate the feature sub-vectors to obtain the image feature vector corresponding to the region of interest.
Specifically, generating multiple scale sub-images according to the region of interest comprises: acquiring sub-image parameters and image scale parameters; segmenting the region of interest according to the sub-image parameters to obtain sub-image blocks; and scale-transforming each sub-image block according to the image scale parameters to obtain multiple scale sub-images. Feature extraction is then performed on the multiple scale sub-images through the preset image recognition residual network model, i.e. the ResNet101 model, to obtain the feature sub-vector corresponding to each scale sub-image. After the feature sub-vectors corresponding to the scale sub-images are obtained, they may first be pooled, and the pooled feature sub-vectors concatenated to obtain the image feature vector corresponding to the region of interest.
Step S408: query a preset classifier trained on historical image data carrying type labels;
Step S409: input the image feature vector into the classifier to obtain the image classification result;
Step S410: perform statistics on the image classification results to obtain the image classification accuracy.
After the image feature vector corresponding to the region of interest is obtained, the image to be classified is classified with the preset classifier, i.e. RankSVM, to obtain the image classification result, and statistics are performed on the obtained image classification results to obtain the image classification accuracy.
It should be understood that, although the steps in the flowcharts of FIGS. 2-4 are shown sequentially in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-4 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 5, an image classification apparatus is provided, comprising an ROI determination module 501, a scale sub-image module 503, a feature extraction module 505, a feature concatenation module 507 and an image classification module 509:
the ROI determination module 501 is configured to acquire an image to be classified and determine a region of interest in the image to be classified;
the scale sub-image module 503 is configured to generate multiple scale sub-images according to the region of interest;
the feature extraction module 505 is configured to perform feature extraction on each of the multiple scale sub-images to obtain the feature sub-vector corresponding to each scale sub-image;
the feature concatenation module 507 is configured to concatenate the feature sub-vectors to obtain the image feature vector corresponding to the region of interest;
the image classification module 509 is configured to classify the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result.
In one embodiment, the ROI determination module 501 comprises a region division unit, a confidence probability unit and an ROI determination unit: the region division unit is configured to divide the image to be classified according to preset region division parameters to obtain region images; the confidence probability unit is configured to determine the confidence probability corresponding to each region image; and the ROI determination unit is configured to sort the region images by confidence probability and determine a preset number of regions of interest from the region images according to the sorting result.
In one embodiment, the confidence probability unit comprises a convolutional network processing unit, configured to identify each region image through a preset region-of-interest convolutional recognition network to obtain the confidence probability corresponding to each region image. The region-of-interest convolutional recognition network is trained through the following steps: obtaining historical region-of-interest data, which includes historical region images and their corresponding historical confidence probabilities; training a convolutional recognition network model with the historical region images and historical confidence probabilities to obtain a model training output; and ending training when the model training output meets the training end condition, obtaining the region-of-interest convolutional recognition network.
In one embodiment, the scale sub-image module 503 comprises a parameter acquisition unit, an ROI segmentation unit and a scale transformation unit: the parameter acquisition unit is configured to acquire sub-image parameters and image scale parameters; the ROI segmentation unit is configured to segment the region of interest according to the sub-image parameters to obtain sub-image blocks; and the scale transformation unit is configured to scale-transform each sub-image block according to the image scale parameters to obtain multiple scale sub-images.
In one embodiment, the feature extraction module 505 comprises a residual network query unit and a feature extraction unit: the residual network query unit is configured to query a preset image recognition residual network model trained on historical scale sub-image data; and the feature extraction unit is configured to perform feature extraction on each of the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image.
In one embodiment, the image classification module 509 comprises a classifier query unit and a classification processing unit: the classifier query unit is configured to query a preset classifier trained on historical image data carrying type labels; and the classification processing unit is configured to input the image feature vector into the classifier to obtain the image classification result.
In one embodiment, a result statistics module is further included, configured to perform statistics on the image classification results to obtain the image classification accuracy.
For specific limitations of the image classification apparatus, reference may be made to the limitations of the image classification method above, which are not repeated here. Each module in the above image classification apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server or a terminal, and its internal structure may be as shown in FIG. 6. The computer device comprises a processor, a memory and a network interface connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions. The internal memory provides an environment for running the operating system and computer-readable instructions in the non-volatile storage medium. The network interface of the computer device communicates with external terminals through a network connection. When executed by the processor, the computer-readable instructions implement an image classification method.
Those skilled in the art can understand that the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution of this application is applied; the specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the following steps: acquiring an image to be classified and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring an image to be classified and determining a region of interest in the image to be classified; generating multiple scale sub-images according to the region of interest; performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image; concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions instructing the relevant hardware; the computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM), etc.
The technical features of the above embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in the combination of these technical features, they should all be considered within the scope of this specification.
The above embodiments only express several implementations of this application, and their description is relatively specific and detailed, but should not therefore be understood as limiting the scope of the invention patent. It should be pointed out that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this application patent shall be subject to the appended claims.

Claims (20)

  1. An image classification method, comprising:
    acquiring an image to be classified and determining a region of interest in the image to be classified;
    generating multiple scale sub-images according to the region of interest;
    performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image;
    concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and
    classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
  2. The method according to claim 1, wherein determining the region of interest in the image to be classified comprises:
    dividing the image to be classified according to preset region division parameters to obtain region images;
    determining a confidence probability corresponding to each region image; and
    sorting the region images by confidence probability, and determining a preset number of regions of interest from the region images according to the sorting result.
  3. The method according to claim 2, wherein determining the confidence probability corresponding to each region image comprises:
    identifying each region image through a preset region-of-interest convolutional recognition network to obtain the confidence probability corresponding to each region image;
    wherein the region-of-interest convolutional recognition network is trained through the following steps:
    obtaining historical region-of-interest data, the historical region-of-interest data comprising historical region images and historical confidence probabilities corresponding to the historical region images;
    training a convolutional recognition network model with the historical region images and the historical confidence probabilities to obtain a model training output; and
    ending training when the model training output meets a training end condition, to obtain the region-of-interest convolutional recognition network.
  4. The method according to claim 1, wherein generating multiple scale sub-images according to the region of interest comprises:
    acquiring sub-image parameters and image scale parameters;
    segmenting the region of interest according to the sub-image parameters to obtain sub-image blocks; and
    scale-transforming each sub-image block according to the image scale parameters to obtain multiple scale sub-images.
  5. The method according to claim 1, wherein performing feature extraction on each of the multiple scale sub-images to obtain the feature sub-vector corresponding to each scale sub-image comprises:
    querying a preset image recognition residual network model, the image recognition residual network model being trained on historical scale sub-image data; and
    performing feature extraction on each of the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image.
  6. The method according to claim 1, wherein concatenating the feature sub-vectors to obtain the image feature vector corresponding to the region of interest comprises:
    pooling the feature sub-vectors to obtain pooled feature sub-vectors; and
    concatenating the pooled feature sub-vectors to obtain the image feature vector corresponding to the region of interest.
  7. The method according to claim 1, wherein classifying the image to be classified according to the image feature vector and the preset classifier to obtain the image classification result comprises:
    querying a preset classifier, the classifier being trained on historical image data carrying type labels; and
    inputting the image feature vector into the classifier to obtain the image classification result.
  8. The method according to claim 1, wherein after the image classification result is obtained, the method further comprises:
    performing statistics on the image classification results to obtain an image classification accuracy.
  9. An image classification apparatus, comprising:
    an ROI determination module, configured to acquire an image to be classified and determine a region of interest in the image to be classified;
    a scale sub-image module, configured to generate multiple scale sub-images according to the region of interest;
    a feature extraction module, configured to perform feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image;
    a feature concatenation module, configured to concatenate the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and
    an image classification module, configured to classify the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
  10. The apparatus according to claim 9, wherein the ROI determination module comprises:
    a region division unit, configured to divide the image to be classified according to preset region division parameters to obtain region images;
    a confidence probability unit, configured to determine a confidence probability corresponding to each region image; and
    an ROI determination unit, configured to sort the region images by confidence probability and determine a preset number of regions of interest from the region images according to the sorting result.
  11. The apparatus according to claim 9, wherein the scale sub-image module comprises:
    a parameter acquisition unit, configured to acquire sub-image parameters and image scale parameters;
    an ROI segmentation unit, configured to segment the region of interest according to the sub-image parameters to obtain sub-image blocks; and
    a scale transformation unit, configured to scale-transform each sub-image block according to the image scale parameters to obtain multiple scale sub-images.
  12. The apparatus according to claim 9, wherein the feature extraction module comprises:
    a residual network query unit, configured to query a preset image recognition residual network model, the image recognition residual network model being trained on historical scale sub-image data; and
    a feature extraction unit, configured to perform feature extraction on each of the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image.
  13. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring an image to be classified and determining a region of interest in the image to be classified;
    generating multiple scale sub-images according to the region of interest;
    performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image;
    concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and
    classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
  14. The computer device according to claim 13, wherein when executing the computer-readable instructions the processor further performs the following steps:
    dividing the image to be classified according to preset region division parameters to obtain region images;
    determining a confidence probability corresponding to each region image; and
    sorting the region images by confidence probability, and determining a preset number of regions of interest from the region images according to the sorting result.
  15. The computer device according to claim 13, wherein when executing the computer-readable instructions the processor further performs the following steps:
    acquiring sub-image parameters and image scale parameters;
    segmenting the region of interest according to the sub-image parameters to obtain sub-image blocks; and
    scale-transforming each sub-image block according to the image scale parameters to obtain multiple scale sub-images.
  16. The computer device according to claim 13, wherein when executing the computer-readable instructions the processor further performs the following steps:
    querying a preset image recognition residual network model, the image recognition residual network model being trained on historical scale sub-image data; and
    performing feature extraction on each of the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image.
  17. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring an image to be classified and determining a region of interest in the image to be classified;
    generating multiple scale sub-images according to the region of interest;
    performing feature extraction on each of the multiple scale sub-images to obtain a feature sub-vector corresponding to each scale sub-image;
    concatenating the feature sub-vectors to obtain an image feature vector corresponding to the region of interest; and
    classifying the image to be classified according to the image feature vector and a preset classifier to obtain an image classification result.
  18. The storage media according to claim 17, wherein when executed by the processor the computer-readable instructions further perform the following steps:
    dividing the image to be classified according to preset region division parameters to obtain region images;
    determining a confidence probability corresponding to each region image; and
    sorting the region images by confidence probability, and determining a preset number of regions of interest from the region images according to the sorting result.
  19. The storage media according to claim 17, wherein when executed by the processor the computer-readable instructions further perform the following steps:
    acquiring sub-image parameters and image scale parameters;
    segmenting the region of interest according to the sub-image parameters to obtain sub-image blocks; and
    scale-transforming each sub-image block according to the image scale parameters to obtain multiple scale sub-images.
  20. The storage media according to claim 17, wherein when executed by the processor the computer-readable instructions further perform the following steps:
    querying a preset image recognition residual network model, the image recognition residual network model being trained on historical scale sub-image data; and
    performing feature extraction on each of the multiple scale sub-images through the image recognition residual network model to obtain the feature sub-vector corresponding to each scale sub-image.
PCT/CN2019/118339 2019-07-05 2019-11-14 Image classification method, apparatus, computer device and storage medium WO2021003938A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910603349.7A CN110427970B (zh) 2019-07-05 2019-07-05 Image classification method, apparatus, computer device and storage medium
CN201910603349.7 2019-07-05

Publications (1)

Publication Number Publication Date
WO2021003938A1 true WO2021003938A1 (zh) 2021-01-14

Family

ID=68408994

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118339 WO2021003938A1 (zh) 2019-07-05 2019-11-14 Image classification method, apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN110427970B (zh)
WO (1) WO2021003938A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112488136A (zh) * 2021-01-22 2021-03-12 山东商业职业技术学院 Image recognition system and image recognition apparatus
CN112883983A (zh) * 2021-02-09 2021-06-01 北京迈格威科技有限公司 Feature extraction method and apparatus, and electronic system
CN113077876A (zh) * 2021-03-31 2021-07-06 广州金域医学检验中心有限公司 Digital pathology image annotation method and apparatus, computer device, and storage medium
CN113807363A (zh) * 2021-09-08 2021-12-17 西安电子科技大学 Image classification method based on a lightweight residual network

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427970B (zh) 2019-07-05 2023-08-01 平安科技(深圳)有限公司 Image classification method, apparatus, computer device, and storage medium
CN111951221B (zh) 2020-07-13 2023-10-31 清影医疗科技(深圳)有限公司 Glomerular cell image recognition method based on a deep neural network
CN112115952B (zh) 2020-08-25 2022-08-02 山东浪潮科学研究院有限公司 Image classification method, device, and medium based on a fully convolutional neural network
CN112927197B (zh) 2021-02-19 2023-06-13 中冶建筑研究总院(深圳)有限公司 Method, apparatus, device, and storage medium for detecting corrosion of air-conditioner outdoor-unit brackets
CN113344040A (zh) 2021-05-20 2021-09-03 深圳索信达数据技术有限公司 Image classification method, apparatus, computer device, and storage medium
CN113837102B (zh) 2021-09-26 2024-05-10 广州华多网络科技有限公司 Image-text fusion classification method and apparatus, device, medium, and product
CN114255329A (zh) 2021-11-19 2022-03-29 苏州微创畅行机器人有限公司 Automatic ROI positioning method and apparatus, surgical robot system, device, and medium
CN115082718A (zh) 2022-05-06 2022-09-20 清华大学 Glioma grading method, apparatus, device, and medium based on histopathological images

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110150328A1 (en) * 2009-12-21 2011-06-23 Electronics And Telecommunications Research Institute Apparatus and method for blockiing objectionable image on basis of multimodal and multiscale features
CN102622587A (zh) * 2012-03-08 2012-08-01 哈尔滨工程大学 Hand-back vein recognition method based on a multi-scale second-order differential structure model and an improved watershed algorithm
CN107292306A (zh) * 2017-07-07 2017-10-24 北京小米移动软件有限公司 Object detection method and apparatus
CN108520214A (zh) * 2018-03-28 2018-09-11 五邑大学 Finger vein recognition method based on multi-scale HOG and SVM
CN108764072A (zh) * 2018-05-14 2018-11-06 浙江工业大学 Blood cell subtype image classification method based on multi-scale fusion
CN108805022A (zh) * 2018-04-27 2018-11-13 河海大学 Remote sensing scene classification method based on multi-scale CENTRIST features
CN109740413A (zh) * 2018-11-14 2019-05-10 平安科技(深圳)有限公司 Pedestrian re-identification method and apparatus, computer device, and computer storage medium
CN110427970A (zh) * 2019-07-05 2019-11-08 平安科技(深圳)有限公司 Image classification method, apparatus, computer device, and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170124409A1 (en) * 2015-11-04 2017-05-04 Nec Laboratories America, Inc. Cascaded neural network with scale dependent pooling for object detection
CN106874921B (zh) * 2015-12-11 2020-12-04 清华大学 Image classification method and apparatus
CN105678333B (zh) * 2016-01-06 2020-07-28 浙江宇视科技有限公司 Method and apparatus for determining a crowded region
CN109344821A (zh) * 2018-08-30 2019-02-15 西安电子科技大学 Small object detection method based on feature fusion and deep learning
CN109740686A (zh) * 2019-01-09 2019-05-10 中南大学 Deep-learning multi-label image classification method based on region pooling and feature fusion

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112488136A (zh) * 2021-01-22 2021-03-12 山东商业职业技术学院 Image recognition system and image recognition apparatus
CN112883983A (zh) * 2021-02-09 2021-06-01 北京迈格威科技有限公司 Feature extraction method and apparatus, and electronic system
CN113077876A (zh) * 2021-03-31 2021-07-06 广州金域医学检验中心有限公司 Digital pathology image annotation method and apparatus, computer device, and storage medium
CN113077876B (zh) 2021-03-31 2023-02-03 广州金域医学检验中心有限公司 Digital pathology image annotation method and apparatus, computer device, and storage medium
CN113807363A (zh) * 2021-09-08 2021-12-17 西安电子科技大学 Image classification method based on a lightweight residual network
CN113807363B (zh) 2021-09-08 2024-04-19 西安电子科技大学 Image classification method based on a lightweight residual network

Also Published As

Publication number Publication date
CN110427970B (zh) 2023-08-01
CN110427970A (zh) 2019-11-08

Similar Documents

Publication Publication Date Title
WO2021003938A1 (zh) Image classification method, apparatus, computer device and storage medium
WO2020253629A1 (zh) Detection model training method and apparatus, computer device, and storage medium
CN110120040B (zh) Slice image processing method and apparatus, computer device, and storage medium
JP7078803B2 (ja) 顔写真に基づくリスク認識方法、装置、コンピュータ設備および記憶媒体
WO2020015076A1 (zh) Facial image comparison method and apparatus, computer device, and storage medium
CN109002766B (zh) Expression recognition method and apparatus
CN111931931B (zh) Deep neural network training method and apparatus for whole-field pathology images
US11256737B2 (en) Image retrieval methods and apparatuses, devices, and readable storage media
CN109308488B (zh) Breast ultrasound image processing apparatus and method, computer device, and storage medium
CN110321968B (zh) Ultrasound image classification apparatus
Gandomkar et al. BI-RADS density categorization using deep neural networks
WO2020034801A1 (zh) Medical feature screening method and apparatus, computer device, and storage medium
CN112241952B (zh) Brain midline recognition method and apparatus, computer device, and storage medium
Li et al. Human sperm health diagnosis with principal component analysis and K-nearest neighbor algorithm
CN111223128A (zh) Target tracking method, apparatus, device, and storage medium
CN113780145A (zh) Sperm morphology detection method and apparatus, computer device, and storage medium
Sarrafzadeh et al. The best texture features for leukocytes recognition
Dhanashree et al. Fingernail analysis for early detection and diagnosis of diseases using machine learning techniques
CN115827877A (zh) Method and apparatus for proposal-assisted case merging, computer device, and storage medium
Al-Dujaili et al. A New Hybrid Model to Predict Human Age Estimation from Face Images Based on Supervised Machine Learning Algorithms
Sreeraj et al. A machine learning based framework for assisting pathologists in grading and counting of breast cancer cells
Carneiro et al. Parameter optimization of a multiscale descriptor for shape analysis on healthcare image datasets
CN111428679B (zh) Image recognition method, apparatus, and device
US20230389879A1 (en) Machine learning techniques for mri processing using regional scoring of non-parametric voxel integrity rankings
Diamant et al. Breast tissue classification in mammograms using visual words

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19936956

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 19936956

Country of ref document: EP

Kind code of ref document: A1