Detailed Description
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant disclosure, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the disclosure.
It should be noted that, in the absence of conflict, the embodiments of the disclosure and the features of those embodiments may be combined with each other. The disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 in which embodiments of the method for generating a feature map, the apparatus for generating a feature map, the method for recognizing an image, or the apparatus for recognizing an image of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102 and 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102 and 103 to interact with the server 105 through the network 104 in order to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102 and 103, such as image-processing applications, video-playing applications, search applications, instant-messaging tools, and social-platform software.
The terminal devices 101, 102 and 103 may be hardware or software. When the terminal devices 101, 102 and 103 are hardware, they may be various electronic devices. When they are software, they may be installed on the electronic devices described above, and may be implemented as multiple pieces of software or software modules (for example, software or modules for providing distributed services) or as a single piece of software or a single software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, for example a background image-processing server that processes images uploaded by the terminal devices 101, 102 and 103. The background image-processing server may process a received image and obtain a processing result (for example, a feature map of the image).
It should be noted that the method for generating a feature map provided by the embodiments of the disclosure may be executed by the server 105 or by the terminal devices 101, 102 and 103; correspondingly, the apparatus for generating a feature map may be provided in the server 105 or in the terminal devices 101, 102 and 103. Likewise, the method for recognizing an image provided by the embodiments of the disclosure may be executed by the server 105 or by the terminal devices 101, 102 and 103, and correspondingly the apparatus for recognizing an image may be provided in the server 105 or in the terminal devices 101, 102 and 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, software or modules for providing distributed services), or as a single piece of software or a single software module. No specific limitation is imposed here.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation needs. When the images to be processed do not need to be obtained remotely, the system architecture may include no network and only a server or a terminal device.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating a feature map according to the disclosure is shown. The method for generating a feature map includes the following steps:
Step 201: obtain a target image, and determine a feature map of the target image.
In this embodiment, the execution body of the method for generating a feature map (for example, the server or a terminal device shown in Fig. 1) may obtain the target image remotely, or locally, through a wired or wireless connection. Here, the target image is the image to be processed in order to generate its corresponding feature map. For example, the target image may be an image captured by a camera included in the execution body, or an image that the execution body extracts from a preset image collection.
The execution body may further determine the feature map of the target image. Here, a feature map is used to characterize features of an image (for example, color features or grayscale features). In general, a feature map corresponds to feature matrices of at least one channel. Each channel corresponds to one kind of image feature and, at the same time, to one feature matrix, and each element of a feature matrix corresponds to one pixel included in the target image.
The execution body may determine the feature map of the target image in various ways. As an example, from the color values of each pixel included in the target image (including the R (red) value, G (green) value and B (blue) value), the execution body may generate a feature map including three channels (the R channel, the G channel and the B channel respectively), each channel corresponding to one feature matrix whose elements are the color values of the corresponding color.
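The per-channel decomposition described above can be sketched as follows. This is a minimal illustration in plain Python; `rgb_to_feature_matrices` is a hypothetical helper name, not part of the disclosure:

```python
def rgb_to_feature_matrices(image):
    """Split an H x W image of (R, G, B) tuples into three H x W
    feature matrices, one per color channel; each element of a
    feature matrix is the color value of the corresponding pixel."""
    channels = []
    for c in range(3):  # 0 = R channel, 1 = G channel, 2 = B channel
        channels.append([[pixel[c] for pixel in row] for row in image])
    return channels

# A 1 x 2 image: one pure-red pixel, one pure-blue pixel.
img = [[(255, 0, 0), (0, 0, 255)]]
r, g, b = rgb_to_feature_matrices(img)
# r == [[255, 0]], g == [[0, 0]], b == [[0, 255]]
```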
As another example, the execution body may input the target image into a preset convolutional neural network; a convolutional layer included in the network can extract features of the target image and generate a feature map. In general, a convolutional layer may include at least one convolution kernel, and each convolution kernel can be used to generate one feature matrix. It should be noted that a convolutional neural network usually includes multiple convolutional layers, and the feature map used in this embodiment may be the feature map generated by any one of those layers.
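How one kernel yields one feature matrix can be sketched with a minimal 2D convolution (CNN-style cross-correlation, no padding, stride 1). This is an illustrative toy, not the disclosure's network:

```python
def conv2d(matrix, kernel):
    """Valid 2D convolution of a single-channel matrix with one
    kernel: each output element is the weighted sum of the window
    under the kernel, so each kernel produces one feature matrix."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(matrix[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 3 x 3 input with a 2 x 2 kernel gives a 2 x 2 feature matrix.
fm = conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])
# fm == [[6, 8], [12, 14]]
```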
Step 202: perform first-order channel attention modulation on the feature map to obtain a modulated feature map.
In this embodiment, the execution body may perform first-order channel attention modulation on the feature map to obtain a modulated feature map. Here, first-order channel attention modulation (channel-wise attention) means that the at least one feature matrix corresponding to the feature map is remapped into multiple vectors, each vector corresponding to one of the at least one feature matrix (that is, to one channel). A preset function is then used to operate on these vectors (for example, operations such as weighting, classification or pooling), and the operation result is finally converted into a new feature map that serves as the modulated feature map, where the modulated feature map includes at least one channel and each channel corresponds to one feature matrix. Since the operation performed on the vectors is a linear operation, the modulation is called first-order channel attention modulation.
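A minimal sketch of the remap-operate-reshape cycle described above, using per-channel scaling as the linear operation. The weighting scheme and the helper name `channel_attention` are illustrative assumptions, not the disclosure's exact function:

```python
def channel_attention(feature_maps, weights):
    """First-order (linear) channel attention sketch: flatten each
    channel's feature matrix into a vector, scale it by that channel's
    weight, and reshape the result back into a feature matrix."""
    modulated = []
    for matrix, w in zip(feature_maps, weights):
        vec = [x for row in matrix for x in row]   # remap matrix to a vector
        vec = [w * x for x in vec]                 # the linear operation
        width = len(matrix[0])
        modulated.append([vec[i:i + width] for i in range(0, len(vec), width)])
    return modulated

fmap = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # 2 channels, 2 x 2 each
out = channel_attention(fmap, [0.5, 2.0])
# out == [[[0.5, 1.0], [1.5, 2.0]], [[10.0, 12.0], [14.0, 16.0]]]
```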
The modulated feature map obtained after first-order channel attention modulation can be used to characterize the various features (for example, stripe-shape features or texture features) of each channel included in the feature map of the target image. In practice, the modulated feature map is commonly used for operations such as classifying the pixels included in an image, so first-order channel attention modulation can be applied to fields such as image recognition and image classification.
Step 203: perform first convolution processing on the modulated feature map to obtain first-convolved feature matrices of a preset number of channels.
In this embodiment, the execution body may perform first convolution processing on the modulated feature map to obtain first-convolved feature matrices of the preset number of channels. In general, the execution body may use a preset number of preset convolution kernels to perform the first convolution processing on the at least one feature matrix corresponding to the modulated feature map, thereby obtaining the first-convolved feature matrices of the preset number of channels. A convolution kernel usually takes the form of a matrix whose elements are preset weight values; using these weight values, a convolution operation can be performed on the at least one feature matrix corresponding to the modulated feature map. It should be noted that the weight values included in a convolution kernel may be set in advance, or may be determined in advance by training, with a machine learning method, the convolutional neural network to which the kernel belongs. In this embodiment, the preset number is typically greater than or equal to 2.
Step 204: for each first-convolved feature matrix among the first-convolved feature matrices of the preset number of channels, convert the first-convolved feature matrix into a first channel feature vector.
In this embodiment, for each first-convolved feature matrix among the first-convolved feature matrices of the preset number of channels, the execution body may convert the first-convolved feature matrix into a first channel feature vector. Here, a first channel feature vector is a vector generated from the elements included in the first-convolved feature matrix. As an example, all the elements of the first-convolved feature matrix may be rearranged into one vector that serves as the first channel feature vector. Alternatively, the elements of the matrix obtained after various kinds of processing (for example, normalization or average pooling) of the elements included in the first-convolved feature matrix may be rearranged into one vector that serves as the first channel feature vector.
In general, the number of elements included in a first channel feature vector equals the number of elements included in the first-convolved feature matrix. For example, if a first-convolved feature matrix has H rows and W columns, it can be converted into an N-dimensional vector as the first channel feature vector, where N = H × W. If the preset number is C, then C N-dimensional first channel feature vectors can be obtained.
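The flattening described above can be sketched directly; `to_channel_vectors` is a hypothetical helper name:

```python
def to_channel_vectors(feature_maps):
    """Rearrange each channel's H x W feature matrix into an
    N-dimensional channel feature vector, where N = H * W."""
    return [[x for row in matrix for x in row] for matrix in feature_maps]

# C = 2 channels of 2 x 2 matrices (H = W = 2) give two 4-dimensional vectors.
vecs = to_channel_vectors([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
# vecs == [[1, 2, 3, 4], [5, 6, 7, 8]]
```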
Step 205: determine a channel relation matrix based on the obtained first channel feature vectors.
In this embodiment, the execution body may determine a channel relation matrix based on the obtained first channel feature vectors. The elements included in the channel relation matrix are used to characterize the relationships among the first-convolved feature matrices of the preset number of channels (for example, the relationship between a first-convolved feature matrix characterizing a stripe-shape feature and a first-convolved feature matrix characterizing an image texture feature).
Specifically, as an example, the execution body may combine the first channel feature vectors into a combined feature matrix, multiply the combined feature matrix by the transpose of the combined feature matrix, and determine the matrix obtained after the multiplication as the channel relation matrix. Continuing the example of step 204, the C N-dimensional first channel feature vectors can be combined into a combined feature matrix of C rows and N columns; the transpose of the combined feature matrix has N rows and C columns, so the matrix obtained after the multiplication is the channel relation matrix of C rows and C columns. Each row of the combined feature matrix corresponds to one first channel feature vector, and each column of the transpose corresponds to one first channel feature vector; therefore, each element of the channel relation matrix in this example corresponds to two first channel feature vectors, and can be used to characterize the relationship between those first channel feature vectors, that is, the relationship between the corresponding first-convolved feature matrices. For example, for an element of the channel relation matrix, the closer the value of that element is to the sum of squares of the elements included in the two corresponding first channel feature vectors, the more similar the features characterized by the channels corresponding to those two vectors (for example, channel A characterizing a first kind of curve and channel B characterizing a second kind of curve).
In some optional implementations of this embodiment, the execution body may determine the channel relation matrix based on the obtained first channel feature vectors according to the following steps:
Step 1: combine the obtained first channel feature vectors into a first combined matrix.
As an example, the C N-dimensional first channel feature vectors can be combined into a first combined matrix of C rows and N columns.
Step 2: perform second convolution processing on the modulated feature map to obtain second-convolved feature matrices of the preset number of channels.
Specifically, the execution body may use a preset number of preset convolution kernels to perform the second convolution processing on the at least one feature matrix corresponding to the modulated feature map, thereby obtaining the second-convolved feature matrices of the preset number of channels. It should be noted that the convolution kernels used here may differ from those used in step 203; the features characterized by the second-convolved feature matrices here therefore differ from the features characterized by the first-convolved feature matrices of step 203.
Step 3: for each pixel among the pixels included in the target image, determine the pixel feature vector corresponding to that pixel from the second-convolved feature matrices of the preset number of channels.
In general, the elements of each second-convolved feature matrix correspond one-to-one to the pixels included in the target image. For a given pixel, the execution body may extract the element corresponding to that pixel from each second-convolved feature matrix and combine the extracted elements into one vector as the pixel feature vector. As an example, suppose there are second-convolved feature matrices of C channels, each second-convolved feature matrix being a matrix of H rows and W columns, where H is the number of rows of pixels included in the target image and W is the number of columns. Then N C-dimensional pixel feature vectors can be obtained, where N = H × W.
Step 4: combine the obtained pixel feature vectors into a second combined matrix.
Continuing the example of step 3, the execution body may combine the N C-dimensional pixel feature vectors into a second combined matrix of N rows and C columns.
Step 5: multiply the first combined matrix by the second combined matrix, and generate the channel relation matrix based on the matrix obtained after the multiplication.
Continuing the example of step 4, after the first combined matrix of C rows and N columns is multiplied by the second combined matrix of N rows and C columns, a matrix of C rows and C columns is obtained, which may be determined as the channel relation matrix. Because the channel relation matrix obtained in this implementation is also generated from the second-convolved feature matrices, it can be used to characterize the relationship between the first-convolved feature matrix and the second-convolved feature matrix corresponding to the same channel of the modulated feature map, that is, the relationship between the two kinds of features corresponding to that channel. This helps make the features extracted from the target image more comprehensive.
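The (C × N) by (N × C) product of step 5 can be sketched with a plain matrix multiply; the example dimensions (C = 2, N = 4) are illustrative:

```python
def matmul(a, b):
    """Plain matrix product: for a of shape C x N and b of shape
    N x C, the result is the C x C channel relation matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

first_combined = [[1, 2, 3, 4], [5, 6, 7, 8]]        # C = 2 rows of N = 4
second_combined = [[1, 0], [0, 1], [1, 0], [0, 1]]   # N = 4 rows of C = 2
rel = matmul(first_combined, second_combined)
# rel == [[4, 6], [12, 14]]
```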
In some optional implementations of this embodiment, the execution body may normalize the elements included in the matrix obtained after the multiplication to obtain the channel relation matrix. After normalization, the elements included in the resulting channel relation matrix lie between 0 and 1; they can therefore serve as weights for extracting other features, helping the extracted features reflect the relationships among the channels included in the feature map. The normalization algorithm may include, but is not limited to, any of the following: the z-score standardization algorithm and the softmax algorithm.
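The softmax option can be sketched row-wise, which maps every element into (0, 1) so the rows are usable as weights; applying it per row is an assumption of this sketch:

```python
import math

def softmax_rows(matrix):
    """Apply softmax to each row of the matrix so that every element
    lies in (0, 1) and each row sums to 1."""
    out = []
    for row in matrix:
        m = max(row)                          # subtract max for stability
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

weights = softmax_rows([[0.0, 0.0], [1.0, 1.0]])
# Equal inputs give uniform weights: each row is [0.5, 0.5]
```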
Step 206: transform the modulated feature map based on the channel relation matrix to generate a transformed feature map.
In this embodiment, the execution body may transform the modulated feature map based on the channel relation matrix to generate the transformed feature map. Here, the transformed feature map can be used to characterize the relationships among the channels included in the feature map of the target image, which helps an electronic device extract richer features from the target image using the transformed feature map.
In some optional implementations of this embodiment, the modulated feature map corresponds to feature matrices of the preset number of channels. This step may be executed as follows:
First, for each feature matrix among the feature matrices of the preset number of channels corresponding to the modulated feature map, convert the feature matrix into a second channel feature vector. As an example, if a feature matrix has H rows and W columns, it can be converted into an N-dimensional vector as the second channel feature vector, where N = H × W.
Then, combine the obtained second channel feature vectors into a third combined matrix. As an example, if the preset number is C, then C N-dimensional second channel feature vectors can be obtained; combining them yields a third combined matrix of C rows and N columns.
Finally, multiply the channel relation matrix by the third combined matrix, and generate the transformed feature map based on the matrix obtained after the multiplication. Continuing the example above, multiplying the channel relation matrix of C rows and C columns by the third combined matrix of C rows and N columns yields a matrix of C rows and N columns. Each row of the matrix obtained after the multiplication corresponds to one channel, and the N elements included in each row can be converted back into a feature matrix of H rows and W columns, thereby obtaining the transformed feature map corresponding to the feature matrices of the preset number of channels.
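The flatten, multiply and reshape sequence of this implementation can be sketched end to end. The helper name and the identity-matrix check are illustrative assumptions:

```python
def transform_feature_map(relation, feature_maps):
    """Flatten each H x W channel into an N-vector (N = H * W),
    left-multiply the resulting C x N stack by the C x C channel
    relation matrix, and reshape each of the C output rows back
    into an H x W feature matrix."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    combined = [[x for row in m for x in row] for m in feature_maps]  # C x N
    mixed = [[sum(relation[i][k] * combined[k][j] for k in range(len(combined)))
              for j in range(h * w)] for i in range(len(relation))]
    return [[row[j * w:(j + 1) * w] for j in range(h)] for row in mixed]

fmap = [[[1, 2], [3, 4]], [[10, 20], [30, 40]]]      # C = 2, H = W = 2
out = transform_feature_map([[1, 0], [0, 1]], fmap)  # identity relation
# The identity relation matrix leaves each channel unchanged: out == fmap
```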
Optionally, when the number of channels included in the modulated feature map differs from the preset number, a preset number of preset convolution kernels (different from those used above to generate the first-convolved and second-convolved feature matrices) may be used to convolve the modulated feature map, obtaining feature matrices of the preset number of channels. The transformed feature map is then generated from the obtained feature matrices of the preset number of channels according to the optional implementation above.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating a feature map according to this embodiment. In the application scenario of Fig. 3, an electronic device 301 first obtains a pre-stored target image 302 locally. It then uses a preset convolutional neural network to perform feature extraction on the target image 302 and obtain the feature map of the target image 302, where the feature map corresponds to feature matrices 303 of at least one channel. Next, the electronic device 301 performs first-order channel attention modulation on the feature map to obtain a modulated feature map, where the modulated feature map includes feature matrices 304 of a preset number of channels (denoted C here). This is written C × H × W, where H is the number of rows of the feature matrices and W the number of columns. Subsequently, the electronic device 301 uses the convolutional layer of the convolutional neural network corresponding to the modulated feature map to perform first convolution processing on the modulated feature map, obtaining first-convolved feature matrices 305 of C channels, likewise denoted C × H × W here.
Then, the electronic device 301 converts each first-convolved feature matrix into a first channel feature vector. For example, for a first-convolved feature matrix, the electronic device 301 rearranges all the elements of that matrix into one vector as the first channel feature vector. That is, each first channel feature vector includes N elements, where N is the number of elements included in a first-convolved feature matrix, i.e. N = H × W.
Subsequently, the electronic device 301 determines a channel relation matrix 306 based on the obtained first channel feature vectors. For example, the electronic device 301 combines the first channel feature vectors into a combined feature matrix 307 (of size C × (H × W)), multiplies the combined feature matrix 307 by the transpose 308 of the combined feature matrix (of size (H × W) × C), and determines the matrix obtained after the multiplication as the channel relation matrix 306 (of size C × C).
Finally, the electronic device 301 transforms the modulated feature map based on the channel relation matrix 306 to generate a transformed feature map 309. For example, each feature matrix corresponding to the modulated feature map is converted into an N-dimensional feature vector; the obtained feature vectors are combined into a matrix 310 of C rows and N columns; the channel relation matrix 306 is then multiplied by the matrix 310 to obtain a matrix 311 of C rows and N columns; finally, the N elements included in each row are converted back into a feature matrix of H rows and W columns, where H is the number of rows of pixels included in the target image 302 and W the number of columns. The transformed feature map 309 corresponding to the feature matrices of the C channels is thus finally obtained.
In the method provided by the above embodiment of the disclosure, a target image is obtained and its feature map determined; first-order channel attention modulation is performed on the feature map to obtain a modulated feature map; first convolution processing is performed on the modulated feature map to obtain first-convolved feature matrices of a preset number of channels; each first-convolved feature matrix is converted into a first channel feature vector; a channel relation matrix is determined based on the first channel feature vectors; and finally, the modulated feature map is transformed based on the channel relation matrix to generate a transformed feature map. Because the transformed feature map is generated according to the determined channel relation matrix, and is produced from the feature map of the target image through this series of processing steps, the transformed feature map can characterize the relationships among the channels included in the feature map of the target image. Richer features can therefore be extracted from the target image using the obtained transformed feature map; these features can characterize the target image more comprehensively and accurately, helping to improve the accuracy of recognizing images and the accuracy of extracting target object images from images.
With continued reference to Fig. 4, a flow 400 of one embodiment of the method for recognizing an image according to the disclosure is shown. The method for recognizing an image includes the following steps:
Step 401: obtain an image to be recognized.
In this embodiment, the execution body of the method for recognizing an image (for example, the server or a terminal device shown in Fig. 1) may obtain the image to be recognized remotely or locally. Here, the image to be recognized includes a target object image. A target object image is an image characterizing a target object, and the target object may be the object indicated by an image that the convolutional neural network described below can recognize. As an example, the target object image may include, but is not limited to, at least one of the following: a face image, a human body image, or an animal image.
Step 402: input the image to be recognized into a pre-trained convolutional neural network, and output location information for characterizing the position of the target object image within the image to be recognized and classification information for characterizing the category to which the target object image belongs.
In this embodiment, the execution body may input the image to be recognized into the pre-trained convolutional neural network, and output the location information for characterizing the position of the target object image within the image to be recognized and the classification information for characterizing the category to which the target object image belongs.
Here, the convolutional neural network includes a convolutional layer and a classification layer. The convolutional layer is used to execute, on the image to be recognized, the method described in the embodiment corresponding to Fig. 2 (that is, with the image to be recognized serving as the target image of that embodiment) and generate a transformed feature map. The classification layer is used to classify the pixels included in the image to be recognized based on the transformed feature map and to generate the classification information and the location information.
In general, the classification layer may include a fully connected layer and a classifier. The fully connected layer integrates the various feature maps generated by the convolutional layer (including the transformed feature map above, and possibly feature maps generated by methods other than the one described in the embodiment corresponding to Fig. 2) to generate a feature vector for classification. Using this feature vector, the classifier can classify the pixels included in the image to be recognized, thereby determining the region composed of the pixels belonging to a certain category; the region can be characterized by the location information, and the category by the classification information.
As an example, the location information may include the coordinate values of the four corner points of a rectangle, each coordinate value corresponding to a pixel in the image to be recognized; the position of the target object image within the image to be recognized can be determined from these coordinate values.
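Determining a position from the four corner coordinates can be sketched by reducing them to a bounding box; the (row, col) convention and the helper name are assumptions of this sketch:

```python
def corners_to_bbox(corners):
    """Reduce four (row, col) corner coordinates from the location
    information to a bounding box (top, left, bottom, right) that
    locates the target object image."""
    rows = [r for r, _ in corners]
    cols = [c for _, c in corners]
    return (min(rows), min(cols), max(rows), max(cols))

box = corners_to_bbox([(10, 20), (10, 80), (50, 20), (50, 80)])
# box == (10, 20, 50, 80)
```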
The classification information may include, but is not limited to, information of at least one of the following forms: text, numbers, or symbols. For example, the classification information may be the text "face", for characterizing that the target object image is a face image.
In practice, the execution body or another electronic device may train an initial convolutional neural network using a preset set of training samples to obtain the convolutional neural network above. Specifically, as an example, a training sample may include a sample image together with the annotated classification information and annotated location information marking that sample image. The execution body performing the training may use a machine learning method, taking the sample images included in the training samples as input and the annotated classification information and annotated location information corresponding to each input sample image as the desired output, to train the initial convolutional neural network; for each input sample image, an actual output is obtained. Here, the actual output is the data actually output by the initial convolutional neural network, characterizing classification information and location information. The execution body may then use gradient descent and back-propagation, based on the actual output and the desired output, to adjust the parameters of the initial convolutional neural network, taking the network obtained after each parameter adjustment as the initial network for the next round of training, and terminate the training when a preset training termination condition is met, thereby obtaining the trained convolutional neural network. The preset training termination condition may include, but is not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations exceeds a preset count; or the loss value computed using a preset loss function (for example, a cross-entropy loss function) falls below a preset loss threshold.
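The shape of this training loop, with its gradient update and termination conditions, can be illustrated on a deliberately tiny stand-in model (one weight, squared-error loss) rather than the disclosure's network; all names here are hypothetical:

```python
def train(samples, lr=0.1, max_iters=1000, loss_threshold=1e-6):
    """Toy training loop: one weight w is adjusted by gradient descent
    on a squared-error loss, stopping when the loss falls below a
    preset threshold or the iteration count reaches a preset limit."""
    w = 0.0
    for _ in range(max_iters):                 # termination: iteration cap
        loss = sum((w * x - y) ** 2 for x, y in samples) / len(samples)
        if loss < loss_threshold:              # termination: loss threshold
            break
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad                         # parameter adjustment
    return w

w = train([(1.0, 2.0), (2.0, 4.0)])            # learns w close to 2
```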
The location information and classification information may be output in various ways. For example, the location information and classification information may be displayed on a display included in the execution body; or sent to an electronic device in communication connection with the execution body; or used, according to the location information, to render in the image to be recognized a colored rectangular frame corresponding to the classification information.
Because the convolutional neural network used in this embodiment can execute the method described in the embodiment corresponding to Fig. 2, the transformed feature map it generates can characterize the relationships among the channels included in the feature map of the image to be recognized. According to the relationships among the channels, richer features can be extracted from the target image; these features can characterize the target image more comprehensively and accurately, so that the pixels included in the image to be recognized can be classified more accurately, achieving more precise and efficient image recognition.
In some optional implementations of this embodiment, the executing body may further extract the target object image from the image to be recognized based on the position information and display it. Specifically, the executing body may determine, according to the position information, the position of the target object image in the image to be recognized, and thereby extract the target object image. The target object image may be displayed on a display screen included in the executing body, or on a display screen of an electronic device communicatively connected to the executing body. Because this implementation utilizes the above convolutional neural network, the target object image can be extracted from the image to be recognized and displayed more accurately.
In the method provided by the above embodiment of the disclosure, the image to be recognized is recognized by using a convolutional neural network that executes the method described in the embodiment corresponding to Fig. 2, and position information for characterizing the position of the target object image in the image to be recognized and class information for characterizing the class to which the target object image belongs are output. The transformed feature map is thus effectively utilized to characterize the relationships between the channels included in the feature map of the target image, and richer features are extracted from the image to be recognized, so that the pixels included in the image to be recognized are classified more accurately, thereby recognizing the image more precisely and efficiently.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the disclosure provides an embodiment of an apparatus for generating a feature map. The apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may specifically be applied to various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating a feature map of this embodiment includes: a first acquisition unit 501, configured to acquire a target image and determine a feature map of the target image, where the feature map corresponds to a feature matrix of at least one channel; a modulation unit 502, configured to perform first-order channel attention modulation on the feature map to obtain a modulated feature map; a convolution unit 503, configured to perform first convolution processing on the modulated feature map to obtain first convolved feature matrices of a preset number of channels; a converting unit 504, configured to convert, for each first convolved feature matrix among the first convolved feature matrices of the preset number of channels, the first convolved feature matrix into a first channel feature vector; a determination unit 505, configured to determine a channel relation matrix based on the obtained first channel feature vectors, where the elements included in the channel relation matrix are used to characterize relationships between the first convolved feature matrices of the preset number of channels; and a transform unit 506, configured to transform the modulated feature map based on the channel relation matrix to generate a transformed feature map.
In this embodiment, the first acquisition unit 501 may acquire the target image remotely, through a wired or wireless connection, or locally. The target image is the image to be processed in order to generate its corresponding feature map. For example, the target image may be an image captured by a camera included in the apparatus 500, or an image extracted by the apparatus 500 from a preset image collection.
The first acquisition unit 501 may further determine the feature map of the target image. The feature map is used to characterize features of the image (for example, color features, grayscale features, and the like). In general, the feature map corresponds to a feature matrix of at least one channel. Each channel corresponds to one kind of feature of the image, and each channel corresponds to one feature matrix; each element in the feature matrix corresponds to one pixel included in the target image.
The first acquisition unit 501 may determine the feature map of the target image in various ways. As an example, according to the color values (including the R value, G value and B value) of each pixel included in the target image, the first acquisition unit 501 may generate a feature map including three channels (an R channel, a G channel and a B channel, respectively), each channel corresponding to one feature matrix whose elements are the color values of the corresponding color.
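The three-channel example above can be sketched directly: an H x W RGB image yields three H x W feature matrices, one per color channel. The pixel values here are illustrative.

```python
import numpy as np

# A 2 x 2 RGB target image: each pixel holds (R, G, B) color values
image = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[0, 0, 255], [255, 255, 255]]])  # shape (H=2, W=2, 3)

r_matrix = image[:, :, 0]  # feature matrix of the R channel
g_matrix = image[:, :, 1]  # feature matrix of the G channel
b_matrix = image[:, :, 2]  # feature matrix of the B channel

# The feature map: one feature matrix per channel
feature_map = np.stack([r_matrix, g_matrix, b_matrix])  # shape (3, H, W)
```

Each element of each feature matrix corresponds to one pixel of the target image, as described in the text.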
As another example, the first acquisition unit 501 may input the target image into a preset convolutional neural network; a convolutional layer included in the convolutional neural network may extract features of the target image and generate a feature map. In general, a convolutional layer may include at least one convolution kernel, and each convolution kernel may be used to generate one feature matrix. It should be noted that, in general, a convolutional neural network may include multiple convolutional layers, and the feature map used in this embodiment may be the feature map generated by any of the convolutional layers.
In this embodiment, the modulation unit 502 may perform first-order channel attention modulation on the feature map to obtain a modulated feature map. First-order channel attention modulation (channel-wise attention) means that the at least one feature matrix corresponding to the feature map is remapped into multiple vectors, each of which corresponds to one feature matrix (that is, one channel) among the at least one feature matrix. A preset function is then used to perform operations on these vectors (for example, weighting, classification, pooling, and the like), and the operation result is finally converted into a new feature map as the modulated feature map, where the modulated feature map includes at least one channel and each channel corresponds to one feature matrix. Because the above operations perform linear operations on the multiple vectors, this is referred to as first-order channel attention modulation.
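A minimal sketch of such channel-wise attention, under simplifying assumptions: each channel is remapped to a single scalar by global average pooling, a preset weighting (here a softmax over the pooled values) produces one coefficient per channel, and each feature matrix is rescaled by its coefficient. The specific pooling and weighting choices are illustrative, not fixed by the disclosure.

```python
import numpy as np

def channel_attention(feature_map):
    # feature_map: shape (C, H, W), one feature matrix per channel
    pooled = feature_map.mean(axis=(1, 2))            # one value per channel (remapped vectors)
    weights = np.exp(pooled) / np.exp(pooled).sum()   # preset function: per-channel weights
    # re-weight each channel's feature matrix; output is the modulated feature map
    return feature_map * weights[:, None, None]

fm = np.ones((3, 2, 2))            # 3 channels, each a 2 x 2 feature matrix
modulated = channel_attention(fm)  # same shape, each channel rescaled
```

With identical channels the weights are uniform (1/3 each), so every element of the modulated feature map becomes 1/3.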
In this embodiment, the convolution unit 503 may perform first convolution processing on the modulated feature map to obtain first convolved feature matrices of a preset number of channels. In general, the convolution unit 503 may use a preset number of preset convolution kernels to perform first convolution processing on the at least one feature matrix corresponding to the modulated feature map, thereby obtaining the first convolved feature matrices of the preset number of channels. A convolution kernel usually takes the form of a matrix whose elements are preset weight values; using the weight values, a convolution operation can be performed on the at least one feature matrix corresponding to the modulated feature map. It should be noted that the weight values included in a convolution kernel may be set in advance, or may be determined in advance by training, with a machine learning method, the convolutional neural network to which the convolution kernel belongs. In this embodiment, the preset number is typically greater than or equal to 2.
In this embodiment, for each first convolved feature matrix among the first convolved feature matrices of the preset number of channels, the converting unit 504 may convert the first convolved feature matrix into a first channel feature vector. The first channel feature vector is a vector generated based on the elements included in the first convolved feature matrix. As an example, all the elements of the first convolved feature matrix may be rearranged into one vector as the first channel feature vector. Alternatively, the elements of the matrix obtained after applying various kinds of processing (for example, normalization, average pooling, and the like) to the elements included in the first convolved feature matrix may be rearranged into one vector as the first channel feature vector.

In general, the number of elements included in a first channel feature vector is equal to the number of elements included in the corresponding first convolved feature matrix. For example, assuming that a first convolved feature matrix has H rows and W columns, it can be converted into an N-dimensional vector as the first channel feature vector, where N = H x W. Assuming that the preset number is C, C first channel feature vectors of dimension N can be obtained.
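The rearrangement described above amounts to flattening each H x W matrix into an N-dimensional vector with N = H x W, giving C such vectors. The sizes here are illustrative.

```python
import numpy as np

C, H, W = 4, 2, 3  # preset number of channels, matrix rows, matrix columns
# the C first convolved feature matrices, one per channel (values illustrative)
convolved = np.arange(C * H * W, dtype=float).reshape(C, H, W)

# rearrange each H x W feature matrix into one N-dimensional vector, N = H * W
channel_vectors = convolved.reshape(C, H * W)  # C first channel feature vectors
```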
In this embodiment, the determination unit 505 may determine a channel relation matrix based on the obtained first channel feature vectors. The elements included in the channel relation matrix are used to characterize relationships between the first convolved feature matrices of the preset number of channels (for example, the relationship between a first convolved feature matrix characterizing a stripe-shaped feature and a first convolved feature matrix characterizing a texture feature of the image).

Specifically, as an example, the determination unit 505 may combine the first channel feature vectors to obtain a combined feature matrix, then multiply the combined feature matrix by the transposed matrix of the combined feature matrix, and determine the matrix obtained after the multiplication as the channel relation matrix.
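The example above can be sketched as follows: stacking the C first channel feature vectors (each N-dimensional) gives a C x N combined feature matrix M, and M multiplied by its transpose is a C x C channel relation matrix whose (i, j) element is the inner product of channels i and j, characterizing how related the two channels are. The vector values are illustrative.

```python
import numpy as np

# C = 3 first channel feature vectors, each of dimension N = 2 (illustrative)
channel_vectors = np.array([[1.0, 0.0],
                            [0.0, 1.0],
                            [1.0, 1.0]])

combined = channel_vectors          # combined feature matrix, shape (C, N)
relation = combined @ combined.T    # channel relation matrix, shape (C, C)
```

Note that channels 0 and 1 are orthogonal (relation[0, 1] = 0), while channel 2 overlaps both; the matrix is symmetric by construction.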
In this embodiment, the transform unit 506 may transform the modulated feature map based on the channel relation matrix to generate a transformed feature map. The transformed feature map can be used to characterize the relationships between the channels included in the feature map of the target image, which helps the electronic device extract richer features from the target image by using the transformed feature map.
In some optional implementations of this embodiment, the determination unit 505 may include: a first combination module (not shown in the figure), configured to combine the obtained first channel feature vectors to obtain a first combined matrix; a convolution module (not shown), configured to perform second convolution processing on the modulated feature map to obtain second convolved feature matrices of the preset number of channels; a first conversion module (not shown), configured to determine, for each pixel included in the target image, a pixel feature vector corresponding to the pixel from the second convolved feature matrices of the preset number of channels; a second combination module (not shown), configured to combine the obtained pixel feature vectors to obtain a second combined matrix; and a first generation module (not shown), configured to multiply the first combined matrix by the second combined matrix and generate the channel relation matrix based on the matrix obtained after the multiplication.
In some optional implementations of this embodiment, the first generation module may be further configured to normalize the elements included in the matrix obtained after the multiplication, to obtain the channel relation matrix.
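One plausible form of this normalization is a row-wise softmax, a common choice for attention-style relation matrices; the disclosure does not fix a particular normalization, so this is an assumption for illustration.

```python
import numpy as np

def normalize_rows(raw_relation):
    # subtract the row maximum before exponentiating, for numerical stability
    e = np.exp(raw_relation - raw_relation.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)   # each row now sums to 1

# matrix obtained after the multiplication (values illustrative)
raw = np.array([[2.0, 0.0],
                [0.0, 0.0]])
relation = normalize_rows(raw)  # normalized channel relation matrix
```

After normalization, each row of the channel relation matrix is a probability-like weighting over the channels.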
In some optional implementations of this embodiment, the modulated feature map corresponds to feature matrices of the preset number of channels; and the transform unit 506 may include: a second conversion module (not shown), configured to convert, for each feature matrix among the feature matrices of the preset number of channels corresponding to the modulated feature map, the feature matrix into a second channel feature vector; a third combination module (not shown), configured to combine the obtained second channel feature vectors to obtain a third combined matrix; and a second generation module (not shown), configured to multiply the channel relation matrix by the third combined matrix and generate the transformed feature map based on the matrix obtained after the multiplication.
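The transform above can be sketched as follows: the C x C channel relation matrix multiplies the third combined matrix (the C modulated feature matrices, each flattened to N = H x W elements), and the resulting C x N product is reshaped back into a transformed feature map. The shapes follow the text; the relation matrix and feature values are illustrative.

```python
import numpy as np

C, H, W = 2, 2, 2
# modulated feature map: C feature matrices of size H x W (values illustrative)
modulated = np.arange(C * H * W, dtype=float).reshape(C, H, W)

# second channel feature vectors, stacked into the third combined matrix (C x N)
third_combined = modulated.reshape(C, H * W)
relation = np.array([[1.0, 0.0],     # channel relation matrix (illustrative)
                     [1.0, 1.0]])

# multiply and reshape the C x N product back into the transformed feature map
transformed = (relation @ third_combined).reshape(C, H, W)
```

Each channel of the transformed feature map is thus a relation-weighted mixture of all channels of the modulated feature map, which is how the transformed feature map comes to characterize the relationships between channels.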
The apparatus 500 provided by the above embodiment of the disclosure acquires a target image and determines a feature map of the target image; performs first-order channel attention modulation on the feature map to obtain a modulated feature map; performs first convolution processing on the modulated feature map to obtain first convolved feature matrices of a preset number of channels; converts each first convolved feature matrix into a first channel feature vector; determines a channel relation matrix based on the first channel feature vectors; and finally transforms the modulated feature map based on the channel relation matrix to generate a transformed feature map. Because the transformed feature map is generated according to the determined channel relation matrix, and because the transformed feature map is generated by applying a series of processing to the feature map of the target image, the transformed feature map can be used to characterize the relationships between the channels included in the feature map of the target image. Richer features can thus be extracted from the target image by using the obtained transformed feature map; these features can characterize the target image more comprehensively and accurately, which helps improve the accuracy of recognizing images and the accuracy of extracting a target object image from an image.
With further reference to Fig. 6, as an implementation of the method shown in Fig. 4, the disclosure provides an embodiment of an apparatus for recognizing an image. The apparatus embodiment corresponds to the method embodiment shown in Fig. 4, and the apparatus may specifically be applied to various electronic devices.
As shown in Fig. 6, the apparatus 600 for recognizing an image of this embodiment includes: a second acquisition unit 601, configured to acquire an image to be recognized, where the image to be recognized includes a target object image; and an output unit 602, configured to input the image to be recognized into a pre-trained convolutional neural network and output position information for characterizing the position of the target object image in the image to be recognized and class information for characterizing the class to which the target object image belongs, where the convolutional neural network includes a convolutional layer and a classification layer, the convolutional layer is used to execute, using the image to be recognized, the method described in the embodiment corresponding to Fig. 2 to generate a transformed feature map, and the classification layer is used to classify the pixels included in the image to be recognized based on the transformed feature map to generate the class information and the position information.
In this embodiment, the second acquisition unit 601 may acquire the image to be recognized remotely or locally. The image to be recognized includes a target object image. The target object image is an image for characterizing a target object, and the target object may be an object indicated by an image that the convolutional neural network described below can recognize. As an example, the target object image may include, but is not limited to, at least one of the following: a face image, a human body image, an animal image.
In this embodiment, the output unit 602 may input the image to be recognized into a pre-trained convolutional neural network, and output position information for characterizing the position of the target object image in the image to be recognized and class information for characterizing the class to which the target object image belongs.
The convolutional neural network includes a convolutional layer and a classification layer. The convolutional layer is used to execute, using the image to be recognized, the method described in the embodiment corresponding to Fig. 2 (that is, taking the image to be recognized as the target image in the embodiment corresponding to Fig. 2) to generate a transformed feature map. The classification layer is used to classify the pixels included in the image to be recognized based on the transformed feature map, to generate the class information and the position information.

In general, the classification layer may include a fully connected layer and a classifier. The fully connected layer is used to integrate the various feature maps generated by the convolutional layer (including the above transformed feature map, and possibly feature maps generated by other methods that do not use the method described in the embodiment corresponding to Fig. 2), to generate a feature vector for classification. The classifier can use the feature vector to classify the pixels included in the image to be recognized, thereby determining the region formed by the pixels belonging to a certain class; the region can be characterized by the position information, and the class can be characterized by the class information.
As an example, the position information may include the coordinate values of the four corner points of a rectangle, each coordinate value corresponding to one pixel in the image to be recognized; the position of the target object image in the image to be recognized can be determined according to the coordinate values.

The class information may include, but is not limited to, information in at least one of the following forms: text, numbers, symbols. For example, the class information may be the text "face", for characterizing that the target object image is a face image.
The position information and the class information may be output in various ways. For example, the position information and the class information may be displayed on a display included in the apparatus 600; or sent to an electronic device communicatively connected to the apparatus 600; or used to generate, in the image to be recognized and according to the position information, a colored rectangular frame corresponding to the class information.
In some optional implementations of this embodiment, the apparatus 600 may further include: a display unit (not shown in the figure), configured to extract the target object image from the image to be recognized based on the position information and display it.
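Extracting the target object image from the position information can be sketched as a simple crop, assuming the position information is the four corner coordinates of an axis-aligned rectangle (as in the example above). The image and corner values are illustrative.

```python
import numpy as np

def extract_target(image, corners):
    # corners: four (row, col) pairs, the rectangle's corner points
    rows = [r for r, c in corners]
    cols = [c for r, c in corners]
    # crop the region delimited by the corner points (inclusive)
    return image[min(rows):max(rows) + 1, min(cols):max(cols) + 1]

image = np.arange(25).reshape(5, 5)  # a 5 x 5 single-channel "image"
target = extract_target(image, [(1, 1), (1, 3), (3, 1), (3, 3)])
```

The cropped array is the target object image, which the display unit would then render on a display screen.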
The apparatus 600 provided by the above embodiment of the disclosure recognizes the image to be recognized by using a convolutional neural network that executes the method described in the embodiment corresponding to Fig. 2, and outputs position information for characterizing the position of the target object image in the image to be recognized and class information for characterizing the class to which the target object image belongs. The transformed feature map is thus effectively utilized to characterize the relationships between the channels included in the feature map of the target image, and richer features are extracted from the image to be recognized, so that the pixels included in the image to be recognized are classified more accurately, thereby recognizing the image more precisely and efficiently.
Referring now to Fig. 7, a schematic structural diagram of an electronic device 700 (for example, the server or a terminal device shown in Fig. 1) suitable for implementing embodiments of the disclosure is shown. The terminal device in embodiments of the disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (for example, vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 7 is only an example and should not impose any restriction on the functions and the scope of use of embodiments of the disclosure.
As shown in Fig. 7, the electronic device 700 may include a processing device (for example, a central processing unit, a graphics processor, and the like) 701, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. Various programs and data required for the operation of the electronic device 700 are also stored in the RAM 703. The processing device 701, the ROM 702 and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
In general, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, and the like; an output device 707 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; a storage device 708 including, for example, a magnetic tape, a hard disk, and the like; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 7 shows an electronic device 700 having various devices, it should be understood that it is not required to implement or include all the devices shown; more or fewer devices may alternatively be implemented or included. Each block shown in Fig. 7 may represent one device, or may represent multiple devices as needed.
In particular, according to embodiments of the disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 709, or installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above functions defined in the methods of the embodiments of the disclosure are executed.
It should be noted that the computer-readable medium described in embodiments of the disclosure may be a computer-readable signal medium or a computer-readable medium, or any combination of the two. A computer-readable medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above.
In embodiments of the disclosure, a computer-readable medium may be any tangible medium that includes or stores a program, and the program may be used by or in connection with an instruction execution system, apparatus or device. In embodiments of the disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable medium; the computer-readable signal medium may send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. The program code included on the computer-readable medium may be transmitted with any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or may exist alone without being assembled into the electronic device. The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: acquire a target image and determine a feature map of the target image, where the feature map corresponds to a feature matrix of at least one channel; perform first-order channel attention modulation on the feature map to obtain a modulated feature map; perform first convolution processing on the modulated feature map to obtain first convolved feature matrices of a preset number of channels; convert, for each first convolved feature matrix among the first convolved feature matrices of the preset number of channels, the first convolved feature matrix into a first channel feature vector; determine a channel relation matrix based on the obtained first channel feature vectors, where the elements included in the channel relation matrix are used to characterize relationships between the first convolved feature matrices of the preset number of channels; and transform the modulated feature map based on the channel relation matrix to generate a transformed feature map.
In addition, when the one or more programs are executed by the electronic device, the electronic device may also be caused to: acquire an image to be recognized, where the image to be recognized includes a target object image; and input the image to be recognized into a pre-trained convolutional neural network, and output position information for characterizing the position of the target object image in the image to be recognized and class information for characterizing the class to which the target object image belongs.
Computer program code for executing the operations of embodiments of the disclosure may be written in one or more programming languages or combinations thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a part of code, and the module, program segment or part of code contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may, in fact, be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that executes the specified functions or operations, or may be implemented by a combination of dedicated hardware and computer instructions.
The units described in embodiments of the disclosure may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including a first acquisition unit, a modulation unit, a convolution unit, a converting unit, a determination unit and a transform unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves; for example, the first acquisition unit may also be described as "a unit that acquires a target image and determines a feature map of the target image".
The above description is only a preferred embodiment of the disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in embodiments of the disclosure is not limited to the technical solutions formed by the specific combinations of the above technical features, but should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with the technical features having similar functions disclosed in (but not limited to) embodiments of the disclosure.