CN109635842A - Image classification method, device, and computer-readable storage medium

Info

Publication number: CN109635842A
Application number: CN201811350802.XA
Authority: CN
Inventors: 赵峰; 王健宗; 肖京
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2018-11-14
Filing date: 2018-11-14
Publication date: 2019-04-16
Also published as: WO2020098257A1

Abstract

The present solution relates to artificial intelligence and provides an image classification method, a device, and a computer-readable storage medium. The method comprises: constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights; extracting the output of the last residual unit of each of a plurality of convolution sections of the deep residual network as feature vectors; performing dimensionality reduction on the obtained feature vectors; and classifying the obtained feature vectors with a classifier. The invention classifies images based on features extracted by a deep residual network. Features extracted from the deeper layers of the residual network capture higher-level characteristics than features extracted from shallower layers and therefore improve classification performance. The classification accuracy of the invention is higher than that of a CNN, and the approach may also serve as a reference for other fields.

Description

Image classification method, device, and computer-readable storage medium

Technical field

The present invention relates to the field of artificial intelligence, and in particular to an image classification method, a device, and a computer-readable storage medium.

Background art

With the rapid development of artificial intelligence technology, deep neural networks are increasingly used in computer vision, especially in image classification.

In recent years, deep-learning-based image processing methods have been used to distinguish targets of different classes according to the different features that each class exhibits in image information. Such methods use a computer to analyze an image quantitatively and assign each pixel or region of the image to one of several classes, replacing human visual interpretation, and they are applied more and more widely. However, with current classification methods, large images require a very large amount of computation, and the classification accuracy is not high enough.

Summary of the invention

To overcome the shortcomings of the prior art, the present invention provides an image classification method applied to an electronic device. The method comprises: constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights, where the deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers; extracting the output of the last residual unit of each of the plurality of convolution sections of the deep residual network as feature vectors; performing dimensionality reduction on the obtained feature vectors; and classifying the obtained feature vectors with a classifier.

Preferably, the deep residual network is composed of residual units, and each residual unit is expressed as:

$$y_i = h(x_i) + F(x_i, w_i)$$

$$x_{i+1} = f(y_i)$$

where

$F$ is the residual function;

$f$ is the ReLU function;

$w_i$ is the weight matrix;

$x_i$ is the input of the i-th layer;

$y_i$ is the output of the i-th layer.

The function $h$ is given by: $h(x_i) = x_i$

The residual function $F$ is given by:

$$F(x_i, w_i) = w_i \cdot \sigma(B(w'_i) \cdot \sigma(B(x_i)))$$

where $B(x_i)$ is batch normalization;

$w'_i$ is the transpose of $w_i$;

"$\cdot$" denotes convolution;

$\sigma(x) = \max(x, 0)$.

Preferably, the deep residual network comprises a first convolution section, a second convolution section, a third convolution section, a fourth convolution section, and a fifth convolution section connected in sequence, and the input image passes through the first to fifth convolution sections in order, wherein: the first convolution section comprises a 7×7×64 convolution, where 7×7 is the convolution kernel size and 64 is the number of channels; the second convolution section comprises 3 second residual units, each of which comprises, in order, three convolutional layers of 1×1×64, 3×3×64, and 1×1×256; the third convolution section comprises 4 third residual units, each of which comprises, in order, three convolutional layers of 1×1×128, 3×3×128, and 1×1×512; the fourth convolution section comprises 6 fourth residual units, each of which comprises, in order, three convolutional layers of 1×1×256, 3×3×256, and 1×1×1024; and the fifth convolution section comprises 3 fifth residual units, each of which comprises, in order, three convolutional layers of 1×1×512, 3×3×512, and 1×1×2048.

Preferably, the outputs of the last residual unit of the third convolution section, the fourth convolution section, and the fifth convolution section are extracted respectively as feature vectors.

Preferably, dimensionality reduction is performed on the extracted feature vectors using one convolutional layer, one max pooling layer, two fully connected layers, and a softmax layer. The convolutional layer consists of 1×1 filters along 512 channels with the stride set to 1, and the borders of the convolutional layer are zero-padded.

Preferably, another method of reducing the dimensionality of the extracted feature vectors is to use principal component analysis to reduce the feature vector output by the last residual unit of the fifth convolution section to an n-dimensional vector, where n is the number of channels of the convolutional layer from which the feature was extracted.

Preferably, the obtained feature vectors are classified with a linear SVM classifier.

The present invention also provides an electronic device comprising a memory and a processor connected to the memory. The memory stores an image classification program that can run on the processor, and the following steps are implemented when the image classification program is executed by the processor: constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights, where the deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers; extracting the output of the last residual unit of each of the plurality of convolution sections of the deep residual network as feature vectors; performing dimensionality reduction on the obtained feature vectors; and classifying the obtained feature vectors with a classifier.

Preferably, the deep residual network is composed of residual units, and each residual unit is expressed as:

$$y_i = h(x_i) + F(x_i, w_i)$$

$$x_{i+1} = f(y_i)$$

where

$F$ is the residual function;

$f$ is the ReLU function;

$w_i$ is the weight matrix;

$x_i$ is the input of the i-th layer;

$y_i$ is the output of the i-th layer.

The function $h$ is given by: $h(x_i) = x_i$

The residual function $F$ is given by:

$$F(x_i, w_i) = w_i \cdot \sigma(B(w'_i) \cdot \sigma(B(x_i)))$$

where $B(x_i)$ is batch normalization;

$w'_i$ is the transpose of $w_i$;

"$\cdot$" denotes convolution;

$\sigma(x) = \max(x, 0)$.

The present invention also provides a computer-readable storage medium that contains an image classification program; when the image classification program is executed by a processor, the steps of the image classification method described above are implemented.

The image classification method, device, and computer-readable storage medium proposed by the present invention classify images based on features extracted by a deep residual network. Features extracted from the deeper layers of the deep residual network perform better than features extracted from shallower layers. Experiments show that the classification accuracy is higher than that of a CNN, and the approach may also serve as a reference for other fields.

Brief description of the drawings

The above features and technical advantages of the present invention will become clearer and easier to understand from the following description of the embodiments in conjunction with the accompanying drawings.

Fig. 1 is a flow chart of the steps of the image classification method according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of the residual unit according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of the deep residual network according to an embodiment of the present invention;

Fig. 4-1 is a schematic flow diagram of the first dimensionality reduction method according to an embodiment of the present invention;

Fig. 4-2 is a schematic flow diagram of the second dimensionality reduction method according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of the hardware architecture of the electronic device according to an embodiment of the present invention;

Fig. 6 is a program module diagram of the image classification program according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of the composition of the dimensionality reduction module according to an embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the image classification method, device, and computer-readable storage medium of the present invention are described below with reference to the accompanying drawings. Those of ordinary skill in the art will recognize that the described embodiments can be modified in various ways, or in combinations thereof, without departing from the spirit and scope of the present invention. The drawings and description are therefore illustrative in nature and are not intended to limit the scope of the claims. In addition, in this specification the drawings are not drawn to scale, and identical reference numerals denote identical parts.

It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should also be understood that the term "and/or" as used in the specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.

The present invention provides an image classification method applied to an electronic device. The method comprises:

Step S10: construct a deep residual network, pre-train it on ImageNet to obtain weights, and initialize the deep residual network with the weights. ImageNet is the name of a computer vision recognition project and is currently the largest image recognition database in the world; in effect, it is a huge picture library for image and vision training. The deep residual network comprises a plurality of convolution sections, each convolution section comprises a plurality of residual units, and each residual unit in turn comprises three convolutional layers.

Step S30: extract the outputs of a plurality of residual units of the deep residual network respectively as feature vectors.

In a CNN (convolutional neural network) model, shallower convolutional layers have a smaller receptive field and learn features of local regions, whereas deeper convolutional layers have a larger receptive field and can learn more abstract features. These abstract features are less sensitive to the size, position, and orientation of an object, which helps improve recognition performance. A deep residual network has more layers, and a typical residual unit consists of three convolutional layers, as shown in Fig. 2. The extracted feature can be regarded as the output of a deep filter bank. The output has the form w × h × d, where w and h are the width and height of the resulting feature map and d is the number of channels in the convolutional layer. The extracted feature can therefore be regarded as a two-dimensional array of d-dimensional local features. The first convolutional layer is a 1×1 convolution with 64 convolution kernels (that is, 64 output channels); this 1×1 convolution reduces the 256 channels to 64. The second convolutional layer is a 3×3 convolution that keeps the number of channels at 64. Finally, the third convolutional layer is a 1×1 convolution that restores the feature to 256 channels.
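The channel arithmetic of the three-layer residual unit can be made concrete by counting weights. The following is a minimal sketch under stated assumptions: biases and batch normalization parameters are ignored, the helper `conv_weights` is a hypothetical name, and the comparison against a single 3×3 convolution on 256 channels is an illustration that does not appear in the source.

```python
def conv_weights(kernel, in_ch, out_ch):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_ch * out_ch

# The three layers of the residual unit described in the text.
bottleneck = [
    conv_weights(1, 256, 64),   # 1x1 convolution: 256 -> 64 channels
    conv_weights(3, 64, 64),    # 3x3 convolution: keeps 64 channels
    conv_weights(1, 64, 256),   # 1x1 convolution: 64 -> 256 channels
]
bottleneck_total = sum(bottleneck)

# Illustrative comparison (an assumption, not from the source):
# one 3x3 convolution applied directly to 256 channels.
plain_3x3 = conv_weights(3, 256, 256)

print(bottleneck, bottleneck_total, plain_3x3)
```

Under these assumptions, reducing to 64 channels before the 3×3 convolution needs far fewer weights than applying the 3×3 convolution at 256 channels.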

Step S50: perform dimensionality reduction on the obtained feature vectors. The output of a convolutional layer is much larger than a conventional 4096-dimensional CNN-based feature; for example, the feature vector extracted from the fifth convolution section has size 7 × 7 × 2048. Dimensionality reduction is therefore applied to the obtained feature vectors to reduce the computational cost of manipulating them.

Step S70: classify the obtained feature vectors using a classifier.

Further, the deep residual network is composed of residual units, and each residual unit is expressed as:

$$y_i = h(x_i) + F(x_i, w_i)$$

$$x_{i+1} = f(y_i)$$

where

$F$ is the residual function;

$f$ is the ReLU function;

$w_i$ is the weight matrix;

$x_i$ is the input of the i-th layer;

$y_i$ is the output of the i-th layer.

The function $h$ is given by: $h(x_i) = x_i$

The residual function $F$ is given by:

$$F(x_i, w_i) = w_i \cdot \sigma(B(w'_i) \cdot \sigma(B(x_i)))$$

where $B(x_i)$ is batch normalization;

$w'_i$ is the transpose of $w_i$;

"$\cdot$" denotes convolution;

$\sigma(x) = \max(x, 0)$.
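The structure of a residual unit, an identity shortcut added to a residual branch and followed by ReLU, can be sketched in a few lines. This is a toy illustration only: `toy_residual` is a hypothetical stand-in for the residual function F, which in the network is a stack of convolutions with batch normalization, and the input is a short list rather than a feature map.

```python
def relu(values):
    """sigma(x) = max(x, 0), applied element-wise."""
    return [max(v, 0.0) for v in values]

def residual_unit(x, residual_fn):
    """Compute y = h(x) + F(x) with h(x) = x, then x_next = f(y) with f = ReLU."""
    fx = residual_fn(x)
    y = [a + b for a, b in zip(x, fx)]  # identity shortcut plus residual branch
    return relu(y)

def toy_residual(values):
    """Hypothetical stand-in for F; the real F is built from convolutions."""
    return [-2.0 * v for v in values]

x_next = residual_unit([1.0, -3.0, 0.5], toy_residual)
print(x_next)
```

If the residual branch returns all zeros, the unit reduces to ReLU applied to its input, which is why the shortcut h(x) = x lets a unit behave like a near-identity mapping.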

Pre-training the deep residual network on ImageNet means training the deep residual network to classify the pictures in ImageNet, thereby obtaining the weight matrices $w_i$; the pre-trained weight matrices $w_i$ are then used to initialize the deep residual network.

In an optional embodiment, the deep residual network comprises a first convolution section (conv1), a second convolution section (conv2), a third convolution section (conv3), a fourth convolution section (conv4), and a fifth convolution section (conv5) connected in sequence, followed by a first fully connected layer FC1. The input image passes through the first to fifth convolution sections in order and is output through the first fully connected layer FC1.

The first convolution section comprises a 7×7×64 convolution, where 7×7 is the convolution kernel size and 64 is the number of channels;

the second convolution section comprises 3 second residual units, each of which comprises, in order, three convolutional layers of 1×1×64, 3×3×64, and 1×1×256;

the third convolution section comprises 4 third residual units, each of which comprises, in order, three convolutional layers of 1×1×128, 3×3×128, and 1×1×512;

the fourth convolution section comprises 6 fourth residual units, each of which comprises, in order, three convolutional layers of 1×1×256, 3×3×256, and 1×1×1024;

the fifth convolution section comprises 3 fifth residual units, each of which comprises, in order, three convolutional layers of 1×1×512, 3×3×512, and 1×1×2048.
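As a quick check on the structure just listed, the number of weighted layers can be tallied from the unit counts. This is a small sketch; the tuple layout and variable names are illustrative choices, and only the counts themselves come from the text.

```python
# (section name, residual units in the section, convolutional layers per unit)
sections = [
    ("conv2", 3, 3),
    ("conv3", 4, 3),
    ("conv4", 6, 3),
    ("conv5", 3, 3),
]

residual_units = sum(units for _, units, _ in sections)

# conv1 contributes its single 7x7 convolution.
conv_layers = 1 + sum(units * per_unit for _, units, per_unit in sections)

# Adding the first fully connected layer FC1.
weighted_layers = conv_layers + 1

print(residual_units, conv_layers, weighted_layers)
```

The tally gives 16 residual units, 49 convolutional layers, and 50 weighted layers once FC1 is included.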

In an optional embodiment, the weights learned by deeper layers are usually more class-specific, and the output vectors of deeper layers classify better than those of earlier convolutional layers. When used properly, the convolutional layers of a deep network form very powerful features. Therefore, the outputs of the last residual unit of the third convolution section, the fourth convolution section, and the fifth convolution section are extracted respectively as feature vectors. That is, the output of the last convolutional layer of each of the third, fourth, and fifth convolution sections is extracted as a feature vector.

The processing of the input image by the deep residual network in step S30 is described in detail below, taking an input image of size 224×224×3 as an example.

The input first passes through the first convolution section. The input image has size 224×224×3; the output has size 112×112, that is, the side length of the image is halved, and the number of channels is 64.

The image then passes through the second convolution section, which comprises 3 second residual units, each comprising, in order, three convolutional layers of 1×1×64, 3×3×64, and 1×1×256. The number of channels therefore becomes 256, and the output size is 56×56.

The image then passes through the third convolution section, which comprises 4 third residual units, each comprising, in order, three convolutional layers of 1×1×128, 3×3×128, and 1×1×512. The number of output channels increases to 512, and the output size is 28×28.

The image then passes through the fourth convolution section: the number of output channels increases to 1024, and the size is reduced to 14×14.

The image then passes through the fifth convolution section: the number of output channels increases to 2048, and the size is reduced to 7×7.

Finally, the result is output through the first fully connected layer FC1. However, this embodiment does not use the final output of the deep residual network. Instead, it extracts the outputs of the last residual unit of the third convolution section, the fourth convolution section, and the fifth convolution section respectively as feature vectors; the corresponding feature vectors are the third feature vector 301, the fourth feature vector 401, and the fifth feature vector 501.

The third feature vector 301, the fourth feature vector 401, and the fifth feature vector 501 are then each subjected to dimensionality reduction.
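To see why dimensionality reduction is needed, it helps to compute how many values each extracted feature contains. The sketch below uses the output sizes stated for a 224×224×3 input; the dictionary and variable names are illustrative.

```python
# Output size (width, height, channels) of each convolution section
# for a 224x224x3 input, as stated in the description.
section_outputs = {
    "conv1": (112, 112, 64),
    "conv2": (56, 56, 256),
    "conv3": (28, 28, 512),
    "conv4": (14, 14, 1024),
    "conv5": (7, 7, 2048),
}

# Number of values in each feature when flattened to a single vector.
flattened = {name: w * h * d for name, (w, h, d) in section_outputs.items()}

# The three features that are actually extracted.
for name in ("conv3", "conv4", "conv5"):
    print(name, section_outputs[name], flattened[name])
```

Even the smallest of the three, the 7×7×2048 output of the fifth convolution section, holds 100352 values, far more than a 4096-dimensional feature.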

In an optional embodiment, in step S50, the extracted feature vectors are reduced in dimensionality using a dimensionality reduction convolutional layer (conv6), a max pooling layer, second and third fully connected layers FC2 and FC3, and a softmax layer connected in sequence; the feature vectors extracted from the third convolution section, the fourth convolution section, and the fifth convolution section are each reduced in this way. For example, as shown in Fig. 4-1, the feature vector extracted from the fifth convolution section is fed in order through the dimensionality reduction convolutional layer, the max pooling layer, the second and third fully connected layers FC2 and FC3, and the softmax layer. The dimensionality reduction convolutional layer consists of 1×1 filters along 512 channels with the stride set to 1, and the borders of the convolutional layer are zero-padded. Zero padding allows the output of the convolutional layer to keep the same spatial dimensions as its input.
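A 1×1 convolution with stride 1 is a per-position linear map over channels, so it changes the channel count and leaves the spatial size unchanged. The following toy sketch shows this on a 4×4×4 input reduced to 2 channels and then max-pooled. The sizes, the weight values, and the 2×2 non-overlapping pooling window are assumptions made for illustration; the source does not state the pooling window, and the real layer maps 2048 channels to 512.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution with stride 1.

    feature_map: nested lists of shape h x w x d_in.
    weights: nested lists of shape d_out x d_in (one row per filter).
    """
    return [[[sum(wk * xk for wk, xk in zip(w_row, pixel)) for w_row in weights]
             for pixel in row]
            for row in feature_map]

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling per channel (window size is an assumption)."""
    h, w, d = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    return [[[max(feature_map[i + di][j + dj][c] for di in (0, 1) for dj in (0, 1))
              for c in range(d)]
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]

# Toy 4x4 feature map with 4 channels; value at (i, j, c) is i + j + c.
fmap = [[[float(i + j + c) for c in range(4)] for j in range(4)] for i in range(4)]

# Two hypothetical filters: pick channel 0, and average all four channels.
weights = [[1.0, 0.0, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]]

reduced = conv1x1(fmap, weights)   # shape 4 x 4 x 2
pooled = max_pool_2x2(reduced)     # shape 2 x 2 x 2
print(len(reduced), len(reduced[0]), len(reduced[0][0]))
print(len(pooled), len(pooled[0]), len(pooled[0][0]))
```

The spatial size of `reduced` matches the input because a 1×1 kernel with stride 1 covers exactly one position at a time.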

In an optional embodiment, in step S50, as shown in Fig. 4-2, another method of reducing the dimensionality of the extracted feature vectors is to use principal component analysis (PCA) to reduce the feature vectors output by the last residual unit of the third convolution section, the fourth convolution section, and the fifth convolution section to n-dimensional vectors, where n is the number of channels of the convolutional layer from which the feature was extracted. For example, the last convolutional layer of the last residual unit of the fifth convolution section is 1×1×2048, so the number of channels is 2048, and the feature vector output by the last residual unit of the fifth convolution section is reduced to a 2048-dimensional vector.
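The idea of PCA-based reduction can be sketched without any library. The code below is a minimal illustration under stated assumptions: each row is one flattened feature vector, the principal directions are found by power iteration with deflation (one of several ways to compute PCA, not necessarily the one used in the invention), and the toy data has 3 dimensions reduced to 2 instead of 7×7×2048 values reduced to 2048. The function name `pca_reduce` is hypothetical.

```python
def pca_reduce(rows, n_components, iters=200):
    """Project rows onto their top n_components principal directions."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - mean[k] for k in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        v = [1.0 / (k + 1) for k in range(d)]  # fixed, deterministic start vector
        for _ in range(iters):
            v = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(a * a for a in v) ** 0.5
            v = [a / norm for a in v]
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflation: remove the direction just found before the next search.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return [[sum(c[k] * comp[k] for k in range(d)) for comp in components]
            for c in centered]

# Toy data: four 3-dimensional feature vectors (illustrative values).
features = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 3.0, 0.0], [4.0, 4.0, 1.0]]
reduced = pca_reduce(features, 2)
print(reduced)
```

The sketch requires the data to have at least `n_components` directions of non-zero variance; otherwise the normalization step divides by zero.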

In an optional embodiment, the obtained feature vectors are classified using a linear support vector machine (SVM) classifier. Fig. 4-2 shows the pipeline of the PCA-SVM module for the fifth convolution section. Experimental results for this method show that the dimensionality of the extracted features can be reduced significantly without significantly degrading performance.
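A linear SVM separates classes with a hyperplane w·x + b = 0 chosen to keep a margin around the training samples. The sketch below trains one on toy two-dimensional data by sub-gradient descent on the regularized hinge loss. This training procedure, the hyperparameters (`lr`, `lam`, `epochs`), and the function names are assumptions for illustration; the source only states that a linear SVM classifier is used, and in practice the inputs would be the reduced feature vectors.

```python
def train_linear_svm(samples, labels, lr=0.1, lam=0.01, epochs=20):
    """Fit w and b by sub-gradient descent on the L2-regularized hinge loss.

    labels must be +1 or -1.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wk * xk for wk, xk in zip(w, x)) + b)
            if margin < 1.0:
                # Sample violates the margin: move the hyperplane toward it.
                w = [wk + lr * (y * xk - lam * wk) for wk, xk in zip(w, x)]
                b += lr * y
            else:
                # Margin satisfied: only apply the regularization shrinkage.
                w = [wk - lr * lam * wk for wk in w]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) + b >= 0 else -1

# Toy, linearly separable 2-D data standing in for reduced feature vectors.
samples = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
           (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
predictions = [predict(w, b, x) for x in samples]
print(predictions)
```

For more than two classes, a common choice is to train one such classifier per class (one-vs-rest); the source does not say which multi-class scheme is used.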

Fig. 5 is a schematic diagram of the hardware architecture of the electronic device 1 of the present invention. The electronic device 1 is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions. The electronic device 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing in which a group of loosely coupled computers forms one virtual supercomputer.

In this embodiment, the electronic device 1 may include, but is not limited to, a memory 11, a processor 14, and a display 15 that are communicatively connected to one another via a system bus. It should be noted that Fig. 5 shows only an electronic device 1 with some of its components; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.

The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device 1. The readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, or an optical disc. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as the hard disk of the electronic device 1. In other embodiments, the non-volatile storage medium may also be an external storage device of the electronic device 1, such as a plug-in hard disk, a smart memory card (Smart Media Card), a secure digital (Secure Digital) card, or a flash card (Flash Card) fitted to the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and the various application software installed on the electronic device 1, such as the code of the image classification program in this embodiment. In addition, the memory 11 may also be used to temporarily store various data that have been output or are to be output.

The processor 14 is used to run the program code stored in the memory 11 or to process data. The display 15 is used to display the images to be classified.

In addition, the electronic device 1 further includes a network interface, which may be a wireless network interface or a wired network interface. The network interface is generally used to establish a communication connection between the electronic device 1 and other electronic equipment.

The image classification program is stored in the memory 11 and includes at least one computer-readable instruction stored in the memory. The at least one computer-readable instruction can be executed by the processor 14 to implement the methods of the embodiments of the present application, and it can be divided into different logical modules according to the functions realized by its parts.

In one embodiment, the following steps are implemented when the above image classification program is executed by the processor 14:

Step S30: extract the output of the last residual unit of each of a plurality of convolution sections of the deep residual network as feature vectors.

Step S50: perform dimensionality reduction on the obtained feature vectors.

Step S70: classify the obtained feature vectors using a classifier.

Fig. 6 shows the program module diagram of the image classification program 50. In this embodiment, the image classification program 50 is divided into multiple modules, which are stored in the memory 11 and executed by the processor 14 to carry out the present invention. A module, as referred to in the present invention, is a series of computer program instruction segments capable of performing a specific function.

The image classification program 50 can be divided into a deep residual network pre-training module 501, a deep residual network initialization module 502, a feature vector extraction module 503, a dimensionality reduction module 504, and a classification module 505.

The deep residual network pre-training module 501 performs pre-training on ImageNet to obtain weights. ImageNet is the name of a computer vision recognition project and is currently the largest image recognition database in the world; in effect, it is a huge picture library for image and vision training. The deep residual network initialization module 502 initializes the deep residual network with the weights.

The feature vector extraction module 503 extracts the outputs of a plurality of residual units of the deep residual network respectively as feature vectors.

In a CNN (convolutional neural network) model, shallower convolutional layers have a smaller receptive field and learn features of local regions, whereas deeper convolutional layers have a larger receptive field and can learn more abstract features. These abstract features are less sensitive to the size, position, and orientation of an object, which helps improve recognition performance. A residual network has more layers, and a typical residual unit consists of three convolutional layers, as shown in Fig. 2. The extracted feature can be regarded as the output of a deep filter bank. The output has the form w × h × d, where w and h are the width and height of the resulting feature map and d is the number of channels in the convolutional layer. The extracted feature can therefore be regarded as a two-dimensional array of d-dimensional local features. The first convolutional layer is a 1×1 convolution with 64 convolution kernels (that is, 64 output channels); this 1×1 convolution reduces the 256 channels to 64. The second convolutional layer is a 3×3 convolution that keeps the number of channels at 64. Finally, the third convolutional layer is a 1×1 convolution that restores the feature to 256 channels.

The dimensionality reduction module 504 performs dimensionality reduction on the obtained feature vectors. The output of a convolutional layer is much larger than a conventional 4096-dimensional CNN-based feature; for example, the feature vector extracted from the fifth convolution section has size 7 × 7 × 2048. Dimensionality reduction is therefore applied to the obtained feature vectors to reduce the computational cost of manipulating them.

The classification module 505 classifies the obtained feature vectors using a classifier.

Each residual unit of the deep residual network is expressed as:

$$y_i = h(x_i) + F(x_i, w_i)$$

$$x_{i+1} = f(y_i)$$

where

$F$ is the residual function;

$f$ is the ReLU function;

$w_i$ is the weight matrix;

$x_i$ is the input of the i-th layer;

$y_i$ is the output of the i-th layer.

The function $h$ is given by: $h(x_i) = x_i$

The residual function $F$ is given by:

$$F(x_i, w_i) = w_i \cdot \sigma(B(w'_i) \cdot \sigma(B(x_i)))$$

where $B(x_i)$ is batch normalization;

$w'_i$ is the transpose of $w_i$;

"$\cdot$" denotes convolution;

$\sigma(x) = \max(x, 0)$.

In an optional embodiment, the weights learned by deeper layers are usually more class-specific, and the output vectors of deeper layers classify better than those of earlier convolutional layers. When used properly, the convolutional layers of a deep network form very powerful features. Therefore, the feature vector extraction module 503 extracts the outputs of the last residual unit of the third convolution section, the fourth convolution section, and the fifth convolution section respectively as feature vectors. That is, the output of the last convolutional layer of each of the third, fourth, and fifth convolution sections is extracted as a feature vector.

The deep residual network outputs its final result through the first fully connected layer FC1. However, this embodiment does not use the final output of the deep residual network. Instead, it extracts the outputs of the last residual unit of the third convolution section, the fourth convolution section, and the fifth convolution section respectively as feature vectors; the corresponding feature vectors are the third feature vector 301, the fourth feature vector 401, and the fifth feature vector 501. The third feature vector 301, the fourth feature vector 401, and the fifth feature vector 501 are then each subjected to dimensionality reduction.

In an optional embodiment, the dimensionality reduction module 504 further includes a first dimensionality reduction unit 5041. The first dimensionality reduction unit 5041 reduces the dimensionality of the extracted feature vectors using a dimensionality reduction convolutional layer, a max pooling layer, second and third fully connected layers FC2 and FC3, and a softmax layer connected in sequence; the feature vectors extracted from the third convolution section, the fourth convolution section, and the fifth convolution section are each reduced in this way. For example, as shown in Fig. 4-1, the feature vector extracted from the fifth convolution section is fed in order through the dimensionality reduction convolutional layer, the max pooling layer, the two fully connected layers, and the softmax layer. The dimensionality reduction convolutional layer consists of 1×1 filters along 512 channels with the stride set to 1, and the borders of the convolutional layer are zero-padded.

In an optional embodiment, the dimensionality reduction module 504 further includes a second dimensionality reduction unit 5042. As shown in Fig. 4-2, the second dimensionality reduction unit 5042 reduces the dimensionality of the extracted feature vectors by using principal component analysis (PCA) to reduce the feature vectors output by the last residual unit of the third convolution section, the fourth convolution section, and the fifth convolution section to n-dimensional vectors, where n is the number of channels of the convolutional layer from which the feature was extracted. For example, the last convolutional layer of the last residual unit of the fifth convolution section is 1×1×2048, so the number of channels is 2048, and the feature vector output by the last residual unit of the fifth convolution section is reduced to a 2048-dimensional vector.

In an optional embodiment, the classification module 505 classifies the obtained feature vectors using a linear support vector machine (SVM) classifier. Fig. 4-2 shows the pipeline of the PCA-SVM module for the fifth convolution section. Experimental results for this method show that the dimensionality of the extracted features can be reduced significantly without significantly degrading performance.

In addition, an embodiment of the present invention further provides a computer-readable storage medium. The computer-readable storage medium may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and the like. The computer-readable storage medium includes an image classification program 50 and the like, and the image classification program 50, when executed by the processor 14, implements the following operations:

Step S10: construct a deep residual network, pre-train it on ImageNet to obtain weights, and initialize the deep residual network with the weights;

Step S30: extract the output of the last residual unit of each of multiple convolutional layers of the deep residual network as feature vectors;

Step S50: perform dimension reduction on the obtained feature vectors;

Step S70: classify the obtained feature vectors using a classifier.

The specific embodiments of the computer-readable storage medium of the present invention are substantially the same as those of the image classification method and the electronic device 1 described above, and are not repeated here.

The above is only a preferred embodiment of the present invention and is not intended to limit the invention. For those skilled in the art, the invention may be modified and varied in various ways. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. An image classification method applied to an electronic device, characterized in that the method comprises:

constructing a deep residual network, pre-training it on ImageNet to obtain weights, and initializing the deep residual network with the weights, the deep residual network comprising multiple convolution sections, wherein each convolution section comprises multiple residual units and each residual unit in turn comprises three convolutional layers in sequence;

extracting the output of the last residual unit of each of multiple convolution sections of the deep residual network as feature vectors;

performing dimension reduction on the obtained feature vectors;

classifying the obtained feature vectors using a classifier.

2. The image classification method according to claim 1, characterized in that the deep residual network is composed of residual units, each residual unit being expressed as:

y_i = h(x_i) + F(x_i, w_i)

x_{i+1} = f(y_i)

where

F is the residual function;

f is the ReLU function;

w_i is the weight matrix;

x_i is the input of the i-th layer;

y_i is the output of the i-th layer;

the function h is given by: h(x_i) = x_i

the residual function F is given by:

F(x_i, w_i) = w_i · σ(B(w′_i) · σ(B(x_i)))

where B(x_i) is batch normalization;

w′_i is the transpose of w_i;

"·" denotes convolution;

σ(x) = max(x, 0).

3. image classification method as described in claim 1, which is characterized in that

The depth residual error network includes sequentially connected first convolution section, the second convolution section, third convolution section, Volume Four product Section, the 5th convolution section, input picture successively pass through the first to the 5th convolution section, in which:

First convolution section includes the convolution of 7x7x64, wherein 7X7 indicates convolution kernel, and 64 indicate port number；

Second convolution section includes 3 the second residual units, and the second residual unit successively includes 1X1X64,3X3X64,1X1X256 again Three convolutional layers；

4. The image classification method according to claim 3, characterized in that

the outputs of the last residual units of the third convolution section, the fourth convolution section, and the fifth convolution section are each extracted as feature vectors.

5. The image classification method according to claim 1, characterized in that

the method of reducing the dimensionality of the extracted feature vectors uses one convolutional layer, one max-pooling layer, two fully connected layers, and a softmax layer, the convolutional layer consisting of 1 × 1 filters across 512 channels, with the stride set to 1 and zero padding used at the boundary of the convolutional layer.

6. The image classification method according to claim 3, characterized in that

another method of reducing the dimensionality of the extracted feature vectors uses principal component analysis to reduce the feature vector output by the last residual unit of the fifth convolution section to an n-dimensional vector, where n is the number of channels of the convolutional layer from which the features are extracted.

7. The image classification method according to claim 1, characterized in that

the obtained feature vectors are classified using a linear support vector machine classifier.

8. An electronic device, characterized in that the electronic device comprises a memory and a processor connected to the memory, the memory stores an image classification program executable on the processor, and the image classification program, when executed by the processor, implements the following steps:

performing dimension reduction on the obtained feature vectors;

classifying the obtained feature vectors using a classifier.

9. The electronic device according to claim 8, characterized in that the deep residual network is composed of residual units, each residual unit being expressed as:

y_i = h(x_i) + F(x_i, w_i)

x_{i+1} = f(y_i)

where

F is the residual function;

f is the ReLU function;

w_i is the weight matrix;

x_i is the input of the i-th layer;

y_i is the output of the i-th layer;

the function h is given by: h(x_i) = x_i

the residual function F is given by:

F(x_i, w_i) = w_i · σ(B(w′_i) · σ(B(x_i)))

where B(x_i) is batch normalization;

w′_i is the transpose of w_i;

"·" denotes convolution;

σ(x) = max(x, 0).

10. A computer-readable storage medium, characterized in that the computer-readable storage medium includes an image classification program, and the image classification program, when executed by a processor, implements the steps of the image classification method according to any one of claims 1 to 7.