CN109635842A - Image classification method, device and computer-readable storage medium - Google Patents

Image classification method, device and computer-readable storage medium

Info

Publication number
CN109635842A
CN109635842A CN201811350802.XA
Authority
CN
China
Prior art keywords
residual
convolution
feature vector
residual error
section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811350802.XA
Other languages
Chinese (zh)
Inventor
赵峰
王健宗
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201811350802.XA priority Critical patent/CN109635842A/en
Publication of CN109635842A publication Critical patent/CN109635842A/en
Priority to PCT/CN2019/089181 priority patent/WO2020098257A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

This solution relates to artificial intelligence and provides an image classification method, a device and a computer-readable storage medium. The method comprises: constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights; extracting, as feature vectors, the outputs of the last residual unit of each of a plurality of convolution sections of the deep residual network; performing dimensionality reduction on the obtained feature vectors; and classifying the obtained feature vectors with a classifier. The present invention performs image classification on features extracted by a deep residual network: compared with features extracted from shallow layers, features extracted from deeper layers of the residual network capture higher-level characteristics and improve classification performance. The classification accuracy of the present invention is higher than that of a CNN, and the approach is also of reference value to other fields.

Description

Image classification method, device and computer-readable storage medium
Technical field
The present invention relates to the field of artificial intelligence, and in particular to an image classification method, a device and a computer-readable storage medium.
Background art
With the rapid development of artificial intelligence technology, deep neural networks are increasingly used in computer vision, particularly in image classification.
In recent years, deep-learning-based image processing methods have been used to distinguish targets of different classes according to the different characteristics each class exhibits in image information. A computer quantitatively analyzes an image and assigns each pixel or region of the image to one of several classes, and such methods are increasingly used in place of human visual interpretation. However, for large images, current classification methods require a very large amount of computation and their classification accuracy is not high enough.
Summary of the invention
To remedy the deficiencies of the prior art, the present invention provides an image classification method applied to an electronic device. The method comprises: constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights, wherein the deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers; extracting, as feature vectors, the outputs of the last residual unit of each of a plurality of convolution sections of the deep residual network; performing dimensionality reduction on the obtained feature vectors; and classifying the obtained feature vectors with a classifier.
Preferably, the deep residual network is composed of residual units, and each residual unit is expressed as:
y_i = h(x_i) + F(x_i, w_i)
x_{i+1} = f(y_i)
where:
F is the residual function;
f is the ReLU function;
w_i is the weight matrix;
x_i is the input of the i-th layer;
y_i is the output of the i-th layer;
the function h is defined as h(x_i) = x_i;
and the residual function F is defined as:
F(x_i, w_i) = w_i · σ(B(w'_i) · σ(B(x_i)))
where B(x_i) denotes batch normalization, w'_i is the transpose of w_i, "·" denotes convolution, and σ(x_i) = max(x_i, 0).
Preferably, the deep residual network comprises a first convolution section, a second convolution section, a third convolution section, a fourth convolution section and a fifth convolution section connected in sequence, and an input image passes through the first to fifth convolution sections in turn, wherein: the first convolution section comprises a 7x7x64 convolution, where 7x7 is the convolution kernel size and 64 is the number of channels; the second convolution section comprises 3 second residual units, each second residual unit in turn comprising three convolutional layers of 1x1x64, 3x3x64 and 1x1x256; the third convolution section comprises 4 third residual units, each third residual unit in turn comprising three convolutional layers of 1x1x128, 3x3x128 and 1x1x512; the fourth convolution section comprises 6 fourth residual units, each fourth residual unit in turn comprising three convolutional layers of 1x1x256, 3x3x256 and 1x1x1024; and the fifth convolution section comprises 3 fifth residual units, each fifth residual unit in turn comprising three convolutional layers of 1x1x512, 3x3x512 and 1x1x2048.
Preferably, the outputs of the last residual unit of each of the third convolution section, the fourth convolution section and the fifth convolution section are extracted as feature vectors.
Preferably, the method for dimensionality reduction being carried out to the feature vector of extraction be using a convolutional layer, a maximum pond layer, Two full articulamentums and softmax layers, the convolutional layer is made of 1 × 1 filter along 512 channels, and stride is set as 1, and the boundary of convolutional layer is filled using zero.
Preferably, another method of reducing the dimensionality of the extracted feature vectors uses principal component analysis to reduce the feature vector output by the last residual unit of the fifth convolution section to an n-dimensional vector, where n is the number of channels of the convolutional layer from which the feature is extracted.
Preferably, the obtained feature vectors are classified with a linear support vector machine classifier.
The present invention also provides an electronic device comprising a memory and a processor connected to the memory, wherein the memory stores an image classification program executable on the processor, and the image classification program, when executed by the processor, implements the following steps: constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights, wherein the deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers; extracting, as feature vectors, the outputs of the last residual unit of each of a plurality of convolution sections of the deep residual network; performing dimensionality reduction on the obtained feature vectors; and classifying the obtained feature vectors with a classifier.
Preferably, the deep residual network is composed of residual units, and each residual unit is expressed as:
y_i = h(x_i) + F(x_i, w_i)
x_{i+1} = f(y_i)
where:
F is the residual function;
f is the ReLU function;
w_i is the weight matrix;
x_i is the input of the i-th layer;
y_i is the output of the i-th layer;
the function h is defined as h(x_i) = x_i;
and the residual function F is defined as:
F(x_i, w_i) = w_i · σ(B(w'_i) · σ(B(x_i)))
where B(x_i) denotes batch normalization, w'_i is the transpose of w_i, "·" denotes convolution, and σ(x_i) = max(x_i, 0).
The present invention also provides a computer-readable storage medium containing an image classification program which, when executed by a processor, implements the steps of the image classification method described above.
The image classification method, device and computer-readable storage medium proposed by the present invention perform image classification based on features extracted by a deep residual network; features extracted from deeper layers of the deep residual network perform better than features extracted from shallower layers. Experiments show that the classification accuracy is higher than that of a CNN, and the approach is also of reference value to other fields.
Brief description of the drawings
The above features and technical advantages of the present invention will become clearer and easier to understand from the following description of the embodiments in conjunction with the drawings.
Fig. 1 is a flowchart of the image classification method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a residual unit according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the deep residual network according to an embodiment of the present invention;
Fig. 4-1 is a schematic flow diagram of the first dimensionality-reduction method according to an embodiment of the present invention;
Fig. 4-2 is a schematic flow diagram of the second dimensionality-reduction method according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the hardware architecture of the electronic device according to an embodiment of the present invention;
Fig. 6 is a program module diagram of the image classification program according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of the composition of the dimensionality-reduction processing module according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the image classification method, device and computer-readable storage medium of the present invention are described below with reference to the drawings. Those skilled in the art will recognize that the described embodiments can be modified in various different ways, or combinations thereof, without departing from the spirit and scope of the present invention. The drawings and the description are therefore illustrative in nature and are not intended to limit the scope of the claims. In addition, in this specification the drawings are not drawn to scale, and identical reference numerals denote identical parts.
It should be understood that, as used in this specification and the claims, the terms "include" and "comprise" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof. It should also be understood that the term "and/or" used in the description and claims of the present invention refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.
The present invention provides an image classification method applied to an electronic device. The method comprises:
Step S10: construct a deep residual network, pre-train it on ImageNet to obtain weights, and initialize the deep residual network with the weights. ImageNet is the name of a computer-vision recognition project and is currently the world's largest image-recognition database, in effect a huge picture library for image/vision training. The deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers.
Step S30: extract the outputs of a plurality of residual units of the deep residual network as feature vectors.
In a CNN (convolutional neural network) model, shallower convolutional layers have smaller receptive fields and learn features of local regions, while deeper convolutional layers have larger receptive fields and can learn more abstract features. These abstract features are more sensitive to characteristics such as the size, position and orientation of objects, which helps improve recognition performance. A deep residual network is a network with deeper layers, in which a typical residual unit consists of three convolutional layers, as shown in Fig. 2. Feature extraction can be regarded as the output of a deep filter bank. That output is a vector of the form w × h × d, where w and h are the width and height of the resulting feature map and d is the number of channels of the convolutional layer; feature extraction can therefore be regarded as a two-dimensional array of d-dimensional local features. The first convolutional layer is a 1x1 convolution with 64 convolution kernels (i.e. 64 output channels), whose 1x1 convolution reduces the 256 channels to 64; the second convolutional layer, a 3x3 convolution, keeps the number of channels at 64; and the third convolutional layer, a 1x1 convolution, restores the feature vector to 256 dimensions.
Step S50: perform dimensionality reduction on the obtained feature vectors. The output size of a convolutional layer is much larger than the traditional 4096-dimensional CNN feature; for example, the feature vector extracted from the fifth convolution section has size 7 × 7 × 2048, i.e. 100,352 values. To reduce the computational cost of manipulating the feature vectors, dimensionality reduction is performed on the obtained feature vectors.
Step S70: classify the obtained feature vectors with a classifier.
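For orientation, a minimal sketch of step S10 is given below. It is not part of the original disclosure: it assumes a PyTorch/torchvision environment, whose pre-trained ResNet-50 happens to have the five-convolution-section structure described in the following embodiments, and all names are illustrative.

```python
import torch
from torchvision import models

# Step S10 (illustrative): torchvision's ResNet-50 follows the
# five-convolution-section structure described in this embodiment;
# pretrained=True downloads ImageNet pre-trained weights and uses them
# to initialize the network.
resnet = models.resnet50(pretrained=True)
resnet.eval()  # used as a fixed feature extractor, not trained further
```

Steps S30, S50 and S70 are illustrated by further sketches alongside the corresponding passages below.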
Further, the deep residual network is composed of residual units, and each residual unit is expressed as:
y_i = h(x_i) + F(x_i, w_i)
x_{i+1} = f(y_i)
where:
F is the residual function;
f is the ReLU function;
w_i is the weight matrix;
x_i is the input of the i-th layer;
y_i is the output of the i-th layer;
the function h is defined as h(x_i) = x_i;
and the residual function F is defined as:
F(x_i, w_i) = w_i · σ(B(w'_i) · σ(B(x_i)))
where B(x_i) denotes batch normalization, w'_i is the transpose of w_i, "·" denotes convolution, and σ(x_i) = max(x_i, 0).
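As an illustration of the formulas above, the following sketch implements a residual unit in PyTorch. The patent gives no code; the three-convolution bottleneck follows the structure described in the embodiments below, and the channel sizes, the optional projection shortcut and all names are assumptions made only for this sketch.

```python
import torch
from torch import nn

class BottleneckResidualUnit(nn.Module):
    """Illustrative residual unit: y_i = h(x_i) + F(x_i, w_i) with h(x_i) = x_i,
    followed by x_{i+1} = f(y_i) where f is ReLU. F is realized as the
    1x1 -> 3x3 -> 1x1 bottleneck of three convolutional layers with batch
    normalization (B) and ReLU (sigma)."""

    def __init__(self, in_channels, mid_channels, out_channels):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        # h(x_i) = x_i is the identity shortcut; a 1x1 projection is used here
        # only when the channel count changes (an assumption of this sketch).
        self.shortcut = (nn.Identity() if in_channels == out_channels
                         else nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        y = self.shortcut(x) + self.residual(x)  # y_i = h(x_i) + F(x_i, w_i)
        return self.relu(y)                      # x_{i+1} = f(y_i)

# Example: a second-convolution-section unit with 1x1x64, 3x3x64, 1x1x256 layers.
unit = BottleneckResidualUnit(in_channels=256, mid_channels=64, out_channels=256)
out = unit(torch.randn(1, 256, 56, 56))          # output shape: (1, 256, 56, 56)
```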
The deep residual network is pre-trained on ImageNet, that is, the deep residual network is used to perform classification training on the images in ImageNet, which yields the weight matrices w_i; the deep residual network is then initialized with the pre-trained weight matrices w_i.
In an alternative embodiment, the deep residual network comprises a first convolution section (conv1), a second convolution section (conv2), a third convolution section (conv3), a fourth convolution section (conv4) and a fifth convolution section (conv5) connected in sequence, followed by a first fully connected layer FC1; an input image passes through the first to fifth convolution sections in turn and is output through the first fully connected layer FC1.
The first convolution section comprises a 7x7x64 convolution, where 7x7 is the convolution kernel size and 64 is the number of channels;
the second convolution section comprises 3 second residual units, each second residual unit in turn comprising three convolutional layers of 1x1x64, 3x3x64 and 1x1x256;
the third convolution section comprises 4 third residual units, each third residual unit in turn comprising three convolutional layers of 1x1x128, 3x3x128 and 1x1x512;
the fourth convolution section comprises 6 fourth residual units, each fourth residual unit in turn comprising three convolutional layers of 1x1x256, 3x3x256 and 1x1x1024;
the fifth convolution section comprises 3 fifth residual units, each fifth residual unit in turn comprising three convolutional layers of 1x1x512, 3x3x512 and 1x1x2048.
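The structure just listed is that of a standard ResNet-50. Purely as an illustrative cross-check (the patent does not name a particular framework), torchvision's ResNet-50 exposes the same five sections as conv1, layer1, layer2, layer3 and layer4, and the snippet below reproduces the output sizes discussed in the walkthrough that follows; the mapping and all variable names are assumptions of this sketch.

```python
import torch
from torchvision import models

resnet = models.resnet50(pretrained=True).eval()  # pre-trained on ImageNet (step S10)

x = torch.randn(1, 3, 224, 224)                   # a 224x224x3 input image
with torch.no_grad():
    # First convolution section (7x7x64 convolution) plus the max pooling that
    # precedes the residual units: 224x224 -> 112x112 -> 56x56.
    x = resnet.maxpool(resnet.relu(resnet.bn1(resnet.conv1(x))))
    c2 = resnet.layer1(x)   # second section, 3 residual units -> (1, 256, 56, 56)
    c3 = resnet.layer2(c2)  # third section, 4 residual units  -> (1, 512, 28, 28)
    c4 = resnet.layer3(c3)  # fourth section, 6 residual units -> (1, 1024, 14, 14)
    c5 = resnet.layer4(c4)  # fifth section, 3 residual units  -> (1, 2048, 7, 7)

print(c3.shape, c4.shape, c5.shape)
```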
In an alternative embodiment, since the weights learned by deeper layers generally carry more class-discriminative features, output vectors taken from deeper convolutional layers give better classification performance; used properly, the convolutional layers of a deep network form very powerful features. Therefore, the outputs of the last residual unit of each of the third, fourth and fifth convolution sections are extracted as feature vectors; that is, the output of the last convolutional layer of each of the third, fourth and fifth convolution sections is extracted as a feature vector.
The process by which an input image passes through the deep residual network in step S30 is described below, taking an input image of size 224x224x3 as an example.
The input first passes through the first convolution section: the input image has size 224x224x3, the output size becomes 112x112, i.e. the width and height of the image are halved, and the number of channels is 64.
It then passes through the second convolution section, which comprises 3 second residual units, each in turn comprising three convolutional layers of 1x1x64, 3x3x64 and 1x1x256; the number of channels therefore becomes 256 and the output size is 56x56.
It then passes through the third convolution section, which comprises 4 third residual units, each in turn comprising three convolutional layers of 1x1x128, 3x3x128 and 1x1x512; the number of output channels increases to 512 and the output size is 28x28.
It then passes through the fourth convolution section: the number of output channels increases to 1024 and the image is reduced to 14x14.
It then passes through the fifth convolution section: the number of output channels increases to 2048 and the image is reduced to 7x7.
Finally, the result is output through the first fully connected layer FC1. However, this embodiment does not use the final output of the deep residual network; instead, the outputs of the last residual unit of each of the third, fourth and fifth convolution sections are extracted as feature vectors, namely a third feature vector 301, a fourth feature vector 401 and a fifth feature vector 501.
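A hedged sketch of this multi-section feature extraction follows, again assuming the torchvision mapping introduced above (layer2, layer3 and layer4 standing in for the third to fifth convolution sections); the dictionary keys 301, 401 and 501 simply mirror the reference numerals of the feature vectors and are not part of any API.

```python
import torch
from torchvision import models

resnet = models.resnet50(pretrained=True).eval()
features = {}

def save_to(key):
    # The hook stores the output of the last residual unit of a section,
    # i.e. the output of the whole torchvision "layer" module.
    def hook(module, inputs, output):
        features[key] = output.detach()
    return hook

resnet.layer2.register_forward_hook(save_to(301))  # third convolution section
resnet.layer3.register_forward_hook(save_to(401))  # fourth convolution section
resnet.layer4.register_forward_hook(save_to(501))  # fifth convolution section

with torch.no_grad():
    resnet(torch.randn(1, 3, 224, 224))

print(features[301].shape)  # (1, 512, 28, 28)
print(features[401].shape)  # (1, 1024, 14, 14)
print(features[501].shape)  # (1, 2048, 7, 7)
```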
The third feature vector 301, the fourth feature vector 401 and the fifth feature vector 501 are then each subjected to dimensionality reduction.
In an alternative embodiment, in step S50 one method of reducing the dimensionality of the extracted feature vectors uses a dimensionality-reduction convolutional layer (conv6), a max-pooling layer, second and third fully connected layers FC2 and FC3, and a softmax layer connected in sequence, and applies this branch separately to the feature vectors extracted from the third, fourth and fifth convolution sections. For example, as shown in Fig. 4-1, the feature vector extracted from the fifth convolution section is fed in turn through the dimensionality-reduction convolutional layer, the max-pooling layer, the second and third fully connected layers FC2 and FC3, and the softmax layer. The dimensionality-reduction convolutional layer consists of 1 × 1 filters over 512 channels, its stride is set to 1, and its boundary is zero-padded; zero padding keeps the spatial dimensions of the convolutional layer's output the same as those of its input.
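A minimal sketch of this dimensionality-reduction branch for the conv5 feature is given below, assuming PyTorch; the pooling window, the width of the second fully connected layer and the number of classes are not fixed by the text and are chosen here only to make the example concrete.

```python
import torch
from torch import nn

class ConvReductionHead(nn.Module):
    """Illustrative branch for a 7x7x2048 conv5 feature: a 1x1 convolution with
    512 channels and stride 1, a max-pooling layer, two fully connected layers
    and a softmax layer (hidden width and class count are assumptions)."""

    def __init__(self, in_channels=2048, num_classes=1000):
        super().__init__()
        # With a 1x1 filter and stride 1, no extra zero padding is needed to
        # keep the spatial size of the output equal to that of the input.
        self.reduce = nn.Conv2d(in_channels, 512, kernel_size=1, stride=1, padding=0)
        self.pool = nn.MaxPool2d(kernel_size=2)   # 7x7 -> 3x3
        self.fc2 = nn.Linear(512 * 3 * 3, 1024)   # second fully connected layer
        self.fc3 = nn.Linear(1024, num_classes)   # third fully connected layer
        self.softmax = nn.Softmax(dim=1)

    def forward(self, conv5_feature):
        x = self.pool(self.reduce(conv5_feature))
        x = x.flatten(start_dim=1)
        return self.softmax(self.fc3(torch.relu(self.fc2(x))))

head = ConvReductionHead()
probs = head(torch.randn(1, 2048, 7, 7))          # (1, 1000) class probabilities
```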
In an alternative embodiment, in step S50, as shown in Fig. 4-2, another method of reducing the dimensionality of the extracted feature vectors uses principal component analysis (PCA) to reduce the feature vectors output by the last residual unit of each of the third, fourth and fifth convolution sections to n-dimensional vectors, where n is the number of channels of the convolutional layer from which the feature is extracted. For example, the convolutional layer of the last residual unit of the fifth convolution section is 1x1x2048, i.e. it has 2048 channels, so the feature vector output by the last residual unit of the fifth convolution section is reduced to a 2048-dimensional vector.
In an alternative embodiment, the obtained feature vectors are classified with a linear support vector machine (SVM) classifier. Fig. 4-2 shows the pipeline of the PCA-SVM module for the fifth convolution section. Experimental results for this method show that the feature dimensionality can be reduced substantially without significantly reducing performance.
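A minimal sketch of the PCA-SVM pipeline of Fig. 4-2 follows, assuming scikit-learn; the training arrays are random placeholders sized only to satisfy PCA's requirement of at least n samples, and in practice the flattened conv5 features (e.g. features[501] from the sketch above) would be used instead.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

# Placeholder data standing in for flattened conv5 features
# (7 * 7 * 2048 = 100352 values per image) and their class labels.
rng = np.random.default_rng(0)
train_feats = rng.standard_normal((2100, 7 * 7 * 2048), dtype=np.float32)
train_labels = rng.integers(0, 10, size=2100)

# PCA reduces each feature to n = 2048 dimensions, the channel count of the
# convolutional layer the feature was taken from.
pca = PCA(n_components=2048)
reduced = pca.fit_transform(train_feats)

# A linear support vector machine then classifies the reduced feature vectors.
svm = LinearSVC()
svm.fit(reduced, train_labels)
predictions = svm.predict(reduced[:5])
```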
Fig. 5 is a schematic diagram of the hardware architecture of the electronic device 1 of the present invention. The electronic device 1 is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions. The electronic device 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud-computing-based cloud composed of a large number of hosts or network servers, where cloud computing is a form of distributed computing in which a group of loosely coupled computers forms a super virtual computer.
In this embodiment, the electronic device 1 may include, but is not limited to, a memory 11, a processor 14 and a display 15 that can communicate with one another through a system bus. It should be noted that Fig. 5 shows the electronic device 1 with only some of its components; it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead.
The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device 1; the readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g. SD or DX memory), a random access memory (RAM), a static random-access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk or an optical disc. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, for example a hard disk of the electronic device 1; in other embodiments, it may be an external storage device of the electronic device 1, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card equipped on the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and various application software installed on the electronic device 1, such as the program code of the image classification program in this embodiment. The memory 11 may also be used to temporarily store various data that has been output or is to be output.
The processor 14 is configured to run the program code stored in the memory 11 or to process data. The display 15 is used to display the images to be classified.
In addition, the electronic device 1 further includes a network interface, which may include a wireless network interface or a wired network interface and is generally used to establish communication connections between the electronic device 1 and other electronic devices.
The memory 11 stores an image classification program comprising at least one computer-readable instruction, which can be executed by the processor 14 to implement the methods of the embodiments of the present application; according to the functions implemented by its parts, the at least one computer-readable instruction can be divided into different logical modules.
In one embodiment, the image classification program implements the following steps when executed by the processor 14:
Step S10: construct a deep residual network, pre-train it on ImageNet to obtain weights, and initialize the deep residual network with the weights. ImageNet is the name of a computer-vision recognition project and is currently the world's largest image-recognition database, in effect a huge picture library for image/vision training. The deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers.
Step S30: extract, as feature vectors, the outputs of the last residual unit of each of a plurality of convolution sections of the deep residual network.
Step S50: perform dimensionality reduction on the obtained feature vectors.
Step S70: classify the obtained feature vectors with a classifier.
Fig. 6 shows the program module diagram of the image classification program 50. In this embodiment, the image classification program 50 is divided into a plurality of modules, which are stored in the memory 11 and executed by the processor 14 to carry out the present invention. A module in the sense of the present invention is a series of computer program instruction segments that accomplish a specific function.
The image classification program 50 can be divided into a deep residual network pre-training module 501, a deep residual network initialization module 502, a feature vector extraction module 503, a dimensionality-reduction processing module 504 and a classification module 505.
The deep residual network pre-training module 501 performs pre-training on ImageNet to obtain weights. ImageNet is the name of a computer-vision recognition project and is currently the world's largest image-recognition database, in effect a huge picture library for image/vision training. The deep residual network initialization module 502 initializes the deep residual network with the weights.
The feature vector extraction module 503 extracts the outputs of a plurality of residual units of the deep residual network as feature vectors.
In a CNN (convolutional neural network) model, shallower convolutional layers have smaller receptive fields and learn features of local regions, while deeper convolutional layers have larger receptive fields and can learn more abstract features. These abstract features are more sensitive to characteristics such as the size, position and orientation of objects, which helps improve recognition performance. A residual network is a network with deeper layers, in which a typical residual unit consists of three convolutional layers, as shown in Fig. 2. Feature extraction can be regarded as the output of a deep filter bank. That output is a vector of the form w × h × d, where w and h are the width and height of the resulting feature map and d is the number of channels of the convolutional layer; feature extraction can therefore be regarded as a two-dimensional array of d-dimensional local features. The first convolutional layer is a 1x1 convolution with 64 convolution kernels (i.e. 64 output channels), whose 1x1 convolution reduces the 256 channels to 64; the second convolutional layer, a 3x3 convolution, keeps the number of channels at 64; and the third convolutional layer, a 1x1 convolution, restores the feature vector to 256 dimensions.
The dimensionality-reduction processing module 504 performs dimensionality reduction on the obtained feature vectors. The output size of a convolutional layer is much larger than the traditional 4096-dimensional CNN feature; for example, the feature vector extracted from the fifth convolution section has size 7 × 7 × 2048. To reduce the computational cost of manipulating the feature vectors, dimensionality reduction is performed on the obtained feature vectors.
The classification module 505 classifies the obtained feature vectors with a classifier.
Further, the deep residual network is composed of residual units, and each residual unit is expressed as:
y_i = h(x_i) + F(x_i, w_i)
x_{i+1} = f(y_i)
where:
F is the residual function;
f is the ReLU function;
w_i is the weight matrix;
x_i is the input of the i-th layer;
y_i is the output of the i-th layer;
the function h is defined as h(x_i) = x_i;
and the residual function F is defined as:
F(x_i, w_i) = w_i · σ(B(w'_i) · σ(B(x_i)))
where B(x_i) denotes batch normalization, w'_i is the transpose of w_i, "·" denotes convolution, and σ(x_i) = max(x_i, 0).
The deep residual network is pre-trained on ImageNet, that is, the deep residual network is used to perform classification training on the images in ImageNet, which yields the weight matrices w_i; the deep residual network is then initialized with the pre-trained weight matrices w_i.
In an alternative embodiment, the deep residual network comprises a first convolution section (conv1), a second convolution section (conv2), a third convolution section (conv3), a fourth convolution section (conv4) and a fifth convolution section (conv5) connected in sequence, followed by a first fully connected layer FC1; an input image passes through the first to fifth convolution sections in turn and is output through the first fully connected layer FC1.
The first convolution section comprises a 7x7x64 convolution, where 7x7 is the convolution kernel size and 64 is the number of channels;
the second convolution section comprises 3 second residual units, each second residual unit in turn comprising three convolutional layers of 1x1x64, 3x3x64 and 1x1x256;
the third convolution section comprises 4 third residual units, each third residual unit in turn comprising three convolutional layers of 1x1x128, 3x3x128 and 1x1x512;
the fourth convolution section comprises 6 fourth residual units, each fourth residual unit in turn comprising three convolutional layers of 1x1x256, 3x3x256 and 1x1x1024;
the fifth convolution section comprises 3 fifth residual units, each fifth residual unit in turn comprising three convolutional layers of 1x1x512, 3x3x512 and 1x1x2048.
In an alternative embodiment, since the weights learned by deeper layers generally carry more class-discriminative features, output vectors taken from deeper convolutional layers give better classification performance; used properly, the convolutional layers of a deep network form very powerful features. Therefore, the feature vector extraction module 503 extracts the outputs of the last residual unit of each of the third, fourth and fifth convolution sections as feature vectors; that is, it extracts the output of the last convolutional layer of each of the third, fourth and fifth convolution sections as a feature vector.
The process by which an input image passes through the deep residual network in step S30 is described below, taking an input image of size 224x224x3 as an example.
The input first passes through the first convolution section: the input image has size 224x224x3, the output size becomes 112x112, i.e. the width and height of the image are halved, and the number of channels is 64.
It then passes through the second convolution section, which comprises 3 second residual units, each in turn comprising three convolutional layers of 1x1x64, 3x3x64 and 1x1x256; the number of channels therefore becomes 256 and the output size is 56x56.
It then passes through the third convolution section, which comprises 4 third residual units, each in turn comprising three convolutional layers of 1x1x128, 3x3x128 and 1x1x512; the number of output channels increases to 512 and the output size is 28x28.
It then passes through the fourth convolution section: the number of output channels increases to 1024 and the image is reduced to 14x14.
It then passes through the fifth convolution section: the number of output channels increases to 2048 and the image is reduced to 7x7.
Finally, the result is output through the first fully connected layer FC1. However, this embodiment does not use the final output of the deep residual network; instead, the outputs of the last residual unit of each of the third, fourth and fifth convolution sections are extracted as feature vectors, namely a third feature vector 301, a fourth feature vector 401 and a fifth feature vector 501. The third feature vector 301, the fourth feature vector 401 and the fifth feature vector 501 are then each subjected to dimensionality reduction.
In an alternative embodiment, the dimensionality-reduction processing module 504 further includes a first dimensionality-reduction processing unit 5041. The first dimensionality-reduction processing unit 5041 reduces the dimensionality of the extracted feature vectors by means of a dimensionality-reduction convolutional layer, a max-pooling layer, second and third fully connected layers FC2 and FC3, and a softmax layer connected in sequence, applied separately to the feature vectors extracted from the third, fourth and fifth convolution sections. For example, as shown in Fig. 4-1, the feature vector extracted from the fifth convolution section is fed in turn through the dimensionality-reduction convolutional layer, the max-pooling layer, the two fully connected layers and the softmax layer. The dimensionality-reduction convolutional layer consists of 1 × 1 filters over 512 channels, its stride is set to 1, and it is zero-padded.
In an alternative embodiment, the dimensionality-reduction processing module 504 further includes a second dimensionality-reduction processing unit 5042. As shown in Fig. 4-2, the second dimensionality-reduction processing unit 5042 reduces the dimensionality of the extracted feature vectors by using principal component analysis (PCA) to reduce the feature vectors output by the last residual unit of each of the third, fourth and fifth convolution sections to n-dimensional vectors, where n is the number of channels of the convolutional layer from which the feature is extracted. For example, the convolutional layer of the last residual unit of the fifth convolution section is 1x1x2048, i.e. it has 2048 channels, so the feature vector output by the last residual unit of the fifth convolution section is reduced to a 2048-dimensional vector.
In an alternative embodiment, the classification module 505 classifies the obtained feature vectors with a linear support vector machine (SVM) classifier. Fig. 4-2 shows the pipeline of the PCA-SVM module for the fifth convolution section. Experimental results for this method show that the feature dimensionality can be reduced substantially without significantly reducing performance.
In addition, an embodiment of the present invention also provides a computer-readable storage medium, which may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM) and a USB memory. The computer-readable storage medium includes an image classification program 50 which, when executed by the processor 14, performs the following operations:
Step S10: construct a deep residual network, pre-train it on ImageNet to obtain weights, and initialize the deep residual network with the weights;
Step S30: extract, as feature vectors, the outputs of the last residual unit of each of a plurality of convolution sections of the deep residual network;
Step S50: perform dimensionality reduction on the obtained feature vectors;
Step S70: classify the obtained feature vectors with a classifier.
The specific embodiments of the computer-readable storage medium of the present invention are substantially the same as the specific embodiments of the image classification method and the electronic device 1 described above and are not repeated here.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. An image classification method applied to an electronic device, characterized in that the method comprises:
constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights, wherein the deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers;
extracting, as feature vectors, the outputs of the last residual unit of each of a plurality of convolution sections of the deep residual network;
performing dimensionality reduction on the obtained feature vectors;
classifying the obtained feature vectors with a classifier.
2. The image classification method of claim 1, characterized in that the deep residual network is composed of residual units, and each residual unit is expressed as:
y_i = h(x_i) + F(x_i, w_i)
x_{i+1} = f(y_i)
where:
F is the residual function;
f is the ReLU function;
w_i is the weight matrix;
x_i is the input of the i-th layer;
y_i is the output of the i-th layer;
the function h is defined as h(x_i) = x_i;
and the residual function F is defined as:
F(x_i, w_i) = w_i · σ(B(w'_i) · σ(B(x_i)))
where B(x_i) denotes batch normalization, w'_i is the transpose of w_i, "·" denotes convolution, and σ(x_i) = max(x_i, 0).
3. The image classification method of claim 1, characterized in that:
the deep residual network comprises a first convolution section, a second convolution section, a third convolution section, a fourth convolution section and a fifth convolution section connected in sequence, and an input image passes through the first to fifth convolution sections in turn, wherein:
the first convolution section comprises a 7x7x64 convolution, where 7x7 is the convolution kernel size and 64 is the number of channels;
the second convolution section comprises 3 second residual units, each second residual unit in turn comprising three convolutional layers of 1x1x64, 3x3x64 and 1x1x256;
the third convolution section comprises 4 third residual units, each third residual unit in turn comprising three convolutional layers of 1x1x128, 3x3x128 and 1x1x512;
the fourth convolution section comprises 6 fourth residual units, each fourth residual unit in turn comprising three convolutional layers of 1x1x256, 3x3x256 and 1x1x1024;
the fifth convolution section comprises 3 fifth residual units, each fifth residual unit in turn comprising three convolutional layers of 1x1x512, 3x3x512 and 1x1x2048.
4. The image classification method of claim 3, characterized in that:
the outputs of the last residual unit of each of the third convolution section, the fourth convolution section and the fifth convolution section are extracted as feature vectors.
5. The image classification method of claim 1, characterized in that:
the method of reducing the dimensionality of the extracted feature vectors uses a convolutional layer, a max-pooling layer, two fully connected layers and a softmax layer, wherein the convolutional layer consists of 1 × 1 filters over 512 channels, the stride is set to 1, and the boundary of the convolutional layer is zero-padded.
6. The image classification method of claim 3, characterized in that:
another method of reducing the dimensionality of the extracted feature vectors uses principal component analysis to reduce the feature vector output by the last residual unit of the fifth convolution section to an n-dimensional vector, where n is the number of channels of the convolutional layer from which the feature is extracted.
7. The image classification method of claim 1, characterized in that:
the obtained feature vectors are classified with a linear support vector machine classifier.
8. An electronic device, characterized in that the electronic device comprises a memory and a processor connected to the memory, the memory stores an image classification program executable on the processor, and the image classification program, when executed by the processor, implements the following steps:
constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights, wherein the deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers;
extracting, as feature vectors, the outputs of the last residual unit of each of a plurality of convolution sections of the deep residual network;
performing dimensionality reduction on the obtained feature vectors;
classifying the obtained feature vectors with a classifier.
9. The electronic device of claim 8, characterized in that the deep residual network is composed of residual units, and each residual unit is expressed as:
y_i = h(x_i) + F(x_i, w_i)
x_{i+1} = f(y_i)
where:
F is the residual function;
f is the ReLU function;
w_i is the weight matrix;
x_i is the input of the i-th layer;
y_i is the output of the i-th layer;
the function h is defined as h(x_i) = x_i;
and the residual function F is defined as:
F(x_i, w_i) = w_i · σ(B(w'_i) · σ(B(x_i)))
where B(x_i) denotes batch normalization, w'_i is the transpose of w_i, "·" denotes convolution, and σ(x_i) = max(x_i, 0).
10. A computer-readable storage medium, characterized in that the computer-readable storage medium includes an image classification program which, when executed by a processor, implements the steps of the image classification method of any one of claims 1 to 7.
CN201811350802.XA 2018-11-14 2018-11-14 Image classification method, device and computer readable storage medium Pending CN109635842A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811350802.XA CN109635842A (en) Image classification method, device and computer readable storage medium
PCT/CN2019/089181 WO2020098257A1 (en) 2018-11-14 2019-05-30 Image classification method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811350802.XA CN109635842A (en) Image classification method, device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN109635842A true CN109635842A (en) 2019-04-16

Family

ID=66067983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811350802.XA Pending Image classification method, device and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN109635842A (en)
WO (1) WO2020098257A1 (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159164B (en) * 2021-04-19 2023-05-12 杭州科技职业技术学院 Industrial Internet equipment collaborative operation method based on distribution type
CN116385806B (en) * 2023-05-29 2023-09-08 四川大学华西医院 Method, system, equipment and storage medium for classifying strabismus type of eye image


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107229952A (en) * 2017-06-01 2017-10-03 雷柏英 The recognition methods of image and device
US9946960B1 (en) * 2017-10-13 2018-04-17 StradVision, Inc. Method for acquiring bounding box corresponding to an object in an image by using convolutional neural network including tracking network and computing device using the same
CN108596069A (en) * 2018-04-18 2018-09-28 南京邮电大学 Neonatal pain expression recognition method and system based on depth 3D residual error networks
CN108596108B (en) * 2018-04-26 2021-02-23 中国科学院电子学研究所 Aerial remote sensing image change detection method based on triple semantic relation learning
CN109635842A (en) * 2018-11-14 2019-04-16 平安科技(深圳)有限公司 Image classification method, device and computer readable storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650781A (en) * 2016-10-21 2017-05-10 广东工业大学 Convolutional neural network image recognition method and device
CN106709453A (en) * 2016-12-24 2017-05-24 北京工业大学 Sports video key posture extraction method based on deep learning
CN107527044A (en) * 2017-09-18 2017-12-29 北京邮电大学 A kind of multiple car plate clarification methods and device based on search
CN107590774A (en) * 2017-09-18 2018-01-16 北京邮电大学 A kind of car plate clarification method and device based on generation confrontation network
CN108764134A (en) * 2018-05-28 2018-11-06 江苏迪伦智能科技有限公司 A kind of automatic positioning of polymorphic type instrument and recognition methods suitable for crusing robot

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
鲍鲜杰 et al.: "Research on SAR image simulation method based on generative adversarial networks", Proceedings of the 5th Annual Conference on High Resolution Earth Observation, 17 October 2018 (2018-10-17), pages 1-17 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020098257A1 (en) * 2018-11-14 2020-05-22 平安科技(深圳)有限公司 Image classification method and device and computer readable storage medium
CN110651277A (en) * 2019-08-08 2020-01-03 京东方科技集团股份有限公司 Computer-implemented method, computer-implemented diagnostic method, image classification apparatus, and computer program product
WO2021051497A1 (en) * 2019-09-16 2021-03-25 平安科技(深圳)有限公司 Pulmonary tuberculosis determination method and apparatus, computer device, and storage medium
CN111192237A (en) * 2019-12-16 2020-05-22 重庆大学 Glue coating detection system and method based on deep learning
CN111192237B (en) * 2019-12-16 2023-05-02 重庆大学 Deep learning-based glue spreading detection system and method
WO2021179117A1 (en) * 2020-03-09 2021-09-16 华为技术有限公司 Method and apparatus for searching number of neural network channels
CN112200302A (en) * 2020-09-27 2021-01-08 四川翼飞视科技有限公司 Construction method of weighted residual error neural network
CN112465053A (en) * 2020-12-07 2021-03-09 深圳市彬讯科技有限公司 Furniture image-based object identification method, device, equipment and storage medium

Also Published As

Publication number Publication date
WO2020098257A1 (en) 2020-05-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination