CN105469376B - The method and apparatus for determining picture similarity


Publication number: CN105469376B (application CN201410393684.6A; other version CN105469376A)
Authority: CN (China)
Prior art keywords: picture, feature vector, similarity, layer, feature
Legal status: Active (granted)
Inventors: 姚伶伶, 赫南, 何琪, 胡繁星, 卫望, 王兵
Assignee: Tencent Technology Shenzhen Co Ltd (original assignee; application filed by Tencent Technology Shenzhen Co Ltd)
Original language: Chinese (zh)

Classification: Image Analysis (AREA)

Abstract

The invention discloses a method and apparatus for determining picture similarity, belonging to the communications field. The method includes: training a neural network model with training data to obtain a picture feature extraction model; inputting a first picture and a second picture into the picture feature extraction model to obtain, on each layer whose output neurons serve as features, a feature vector of the first picture and a feature vector of the second picture; for each such layer, calculating the similarity between the feature vector of the first picture and the feature vector of the second picture; and calculating the similarity of the first picture and the second picture from the per-layer similarities. The apparatus includes a training module, a feature extraction module, a first calculation module, and a second calculation module. Because the picture feature extraction model extracts features automatically, no manual work is needed, which greatly saves manpower and time and improves the efficiency of determining similarity.

Description

The method and apparatus for determining picture similarity
Technical field
The present invention relates to the communications field, and in particular to a method and apparatus for determining picture similarity.
Background technique
With the development of Internet technology, effect advertising is used more and more. Effect advertising refers to advertising for which the advertiser pays only for measurable results. It is widely used on e-commerce websites, social platforms, mobile applications, and so on. The content of effect advertising is overwhelmingly pictures, so picture-based feature extraction and similarity calculation have become essential technical means for implementing the service logic.
Currently, a common picture similarity calculation method is as follows: first extract features from the picture, such as GIST features; then abstract certain local features, for example by clustering; then generalize these features into a feature vector; finally compute the picture similarity using a similarity formula such as the Euclidean distance formula.
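As an illustration only, the last step of this conventional pipeline can be sketched as follows. The patent does not specify the exact distance-to-similarity mapping, so the `1 / (1 + distance)` convention and the hand-crafted feature vectors below are placeholder assumptions:

```python
import math

def euclidean_similarity(fv1, fv2):
    """Map the Euclidean distance between two feature vectors to a
    similarity in (0, 1]: identical vectors score 1, distant ones near 0."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(fv1, fv2)))
    return 1.0 / (1.0 + dist)

# Placeholder feature vectors standing in for e.g. clustered GIST features.
fv_a = [0.2, 0.8, 0.5]
fv_b = [0.2, 0.8, 0.5]
fv_c = [0.9, 0.1, 0.0]

print(euclidean_similarity(fv_a, fv_b))  # 1.0 (identical vectors)
print(euclidean_similarity(fv_a, fv_c))  # well below 1.0
```

The sketch only stands in for the similarity formula itself; the hand-crafted feature extraction that precedes it is exactly the manual step the patent sets out to replace.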
However, in the above method feature extraction is performed manually, which requires personnel with strong professional knowledge. The whole process is time-consuming, so efficiency is low.
Summary of the invention
In view of this, embodiments of the present invention provide a method and apparatus for determining picture similarity that extract features automatically and improve the efficiency of determining similarity. The technical solution is as follows:
In one aspect, a method of determining picture similarity is provided, the method comprising:
Training a neural network model with training data to obtain a picture feature extraction model;
Inputting a first picture and a second picture into the picture feature extraction model to obtain, on each layer whose output neurons serve as features, a feature vector of the first picture and a feature vector of the second picture;
For each such layer, calculating the similarity between the feature vector of the first picture and the feature vector of the second picture;
Calculating the similarity of the first picture and the second picture according to the per-layer similarities between the feature vector of the first picture and the feature vector of the second picture.
In another aspect, an apparatus for determining picture similarity is provided, the apparatus comprising:
A training module, configured to train a neural network model with training data to obtain a picture feature extraction model;
A feature extraction module, configured to input a first picture and a second picture into the picture feature extraction model and obtain, on each layer whose output neurons serve as features, a feature vector of the first picture and a feature vector of the second picture;
A first calculation module, configured to calculate, for each such layer, the similarity between the feature vector of the first picture and the feature vector of the second picture;
A second calculation module, configured to calculate the similarity of the first picture and the second picture according to the per-layer similarities between the two feature vectors.
The technical solution provided by the embodiments of the present invention has the following benefit: a neural network model is trained with training data to obtain a picture feature extraction model; a first picture and a second picture are input into the model to obtain their feature vectors on each layer whose output neurons serve as features; the similarity between the two feature vectors is calculated for each such layer; and the similarity of the two pictures is calculated from the per-layer similarities. Because the picture feature extraction model extracts features automatically, no manual work is needed, which greatly saves manpower and time and improves the efficiency of determining similarity.
Detailed description of the invention
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Apparently, the drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method of determining picture similarity provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a method of determining picture similarity provided by another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a deep convolutional neural network provided by another embodiment of the present invention;
Fig. 4 is a schematic diagram of visualized convolutional-layer features provided by another embodiment of the present invention;
Fig. 5(a)~Fig. 5(d) are schematic diagrams of picture similarity calculation results provided by another embodiment of the present invention;
Fig. 6(a)~Fig. 6(c) are schematic diagrams of picture classification provided by another embodiment of the present invention;
Fig. 7 is a structural diagram of an apparatus for determining picture similarity provided by another embodiment of the present invention;
Fig. 8 is a structural diagram of an apparatus for determining picture similarity provided by another embodiment of the present invention;
Fig. 9 is a structural diagram of a server provided by another embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, one embodiment of the invention provides a kind of method of determining picture similarity, comprising:
101: Train a neural network model with training data to obtain a picture feature extraction model.
A neural network (NN) is a complex network system formed by the wide interconnection of a large number of simple processing units (called neurons). It reflects many essential characteristics of the human brain and is a highly complex nonlinear dynamic learning system. Neural network models feature large-scale parallelism, distributed storage and processing, self-organization, adaptation, and self-learning, and are particularly suitable for problems that require many factors and conditions to be considered simultaneously, as well as for imprecise and fuzzy information processing.
In this embodiment, the training data may be pictures of any type, such as pictures from the e-commerce industry or pictures in the ImageNet database.
102: Input the first picture and the second picture into the picture feature extraction model to obtain, on each layer whose output neurons serve as features, a feature vector of the first picture and a feature vector of the second picture.
Here, the first picture and the second picture are any two pictures; the method provided by this embodiment determines the similarity of the first picture and the second picture.
103: For each such layer, calculate the similarity between the feature vector of the first picture and the feature vector of the second picture.
In this embodiment, the layers whose output neurons serve as features usually comprise multiple layers, and each of these layers yields a feature vector of the first picture as well as a feature vector of the second picture.
For example, suppose the feature extraction model has three layers whose output neurons serve as features: layer A, layer B, and layer C. Inputting the first picture into the model yields its feature vector Va1 output at layer A, Vb1 at layer B, and Vc1 at layer C; inputting the second picture yields Va2 at layer A, Vb2 at layer B, and Vc2 at layer C. When calculating feature-vector similarities, the similarity of Va1 and Va2, of Vb1 and Vb2, and of Vc1 and Vc2 can be calculated separately.
104: Calculate the similarity of the first picture and the second picture according to the per-layer similarities between the feature vector of the first picture and the feature vector of the second picture.
Here a regression algorithm can be used to calculate the similarity of the two pictures. The per-layer feature-vector similarities can be given different weights, and adjusting the weight values can make the calculated picture similarity more accurate, improving calculation precision.
In this embodiment, the picture similarity is characterized by a floating-point number in [0, 1]; a larger value indicates more similar pictures and a smaller value less similar ones.
Optionally, in this embodiment, training a neural network model with training data to obtain a picture feature extraction model may include:
Training a deep convolutional neural network model with training data based on the back-propagation algorithm to obtain a picture feature extraction model; or,
Training a deep auto-encoder with training data to obtain a picture feature extraction model.
Further, training a deep convolutional neural network model with training data based on the back-propagation algorithm to obtain a picture feature extraction model may include:
Choosing a deep convolutional neural network model comprising M convolutional layers and N fully connected layers;
Designating K of the M+N layers for feature extraction, with the neuron outputs of those K layers serving as feature vectors;
Training the deep convolutional neural network model with training data based on the back-propagation algorithm to obtain a picture feature extraction model with K feature vectors.
In this embodiment, preferably, the K layers are fully connected layers, or include both convolutional and fully connected layers.
Optionally, in this embodiment, calculating for each layer the similarity between the feature vector of the first picture and the feature vector of the second picture may include:
Using the cosine algorithm, the Jaccard algorithm, or the Pearson algorithm to calculate, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture.
Optionally, in this embodiment, calculating the similarity of the first picture and the second picture according to the per-layer similarities between the feature vector of the first picture and the feature vector of the second picture may include:
Using a linear regression algorithm or a logistic regression algorithm to calculate the similarity of the first picture and the second picture from the per-layer similarities between the feature vectors of the two pictures.
In the above method provided by this embodiment, a neural network model is trained with training data to obtain a picture feature extraction model; the first and second pictures are input into the model to obtain their feature vectors on each layer whose output neurons serve as features; the similarity between the two feature vectors is calculated for each such layer; and the similarity of the two pictures is calculated from the per-layer similarities. Because the model extracts features automatically, no manual work is needed, which greatly saves manpower and time and improves the efficiency of determining similarity.
Referring to Fig. 2, another embodiment of the present invention provides a method of determining picture similarity, the method comprising:
201: Train a deep convolutional neural network model with training data based on the back-propagation algorithm to obtain a picture feature extraction model.
A deep convolutional neural network is a kind of artificial neural network and a supervised machine learning model with depth, one of the representative methods of deep learning. Deep learning is a branch of machine learning that attempts to learn multi-layer representations of features or concepts, composing low-level features into more abstract high-level features so as to discover distributed representations of data.
Back propagation (BP) is a supervised learning algorithm often used to train multi-layer perceptrons and feed-forward neural networks. In this embodiment, the deep convolutional neural network model is trained with the back-propagation algorithm. The algorithm mainly comprises two phases: excitation propagation and weight update. These two phases are iterated in a loop until the model's response to the input reaches a predetermined target range.
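As a toy illustration of these two phases (not the patent's model — just a single sigmoid neuron on two made-up examples), each iteration propagates the excitation forward and then updates the weights from the output error, until the outputs approach the targets:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy data: a constant bias input of 1.0 plus one feature; targets 1.0 and 0.0.
data = [([1.0, 0.9], 1.0), ([1.0, 0.1], 0.0)]
w = [0.0, 0.0]  # weights, updated in place by the training loop
lr = 1.0        # learning rate

for _ in range(5000):
    for x, target in data:
        # Phase 1: forward (excitation) propagation through the neuron.
        out = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        # Phase 2: weight update from the error between output and target.
        grad = (target - out) * out * (1.0 - out)
        w = [wi + lr * grad * xi for wi, xi in zip(w, x)]

for x, target in data:
    out = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    print(round(out), target)  # trained outputs, rounded, next to targets
```

In a real deep convolutional network the same two phases run through many layers, with the error signal propagated backward through each one; the single neuron here only shows the shape of the loop.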
The advantage of using a deep convolutional neural network is that it can take pictures directly as network input, avoiding the series of complex pre-processing steps required when features are extracted manually; explicit feature extraction is replaced by implicit feature extraction, which greatly saves time and improves the efficiency of feature extraction. In addition, this network structure is highly invariant to translation, scaling, tilt, and other forms of deformation.
Alternatively, this step can be replaced by the following step:
Train a deep auto-encoder with training data to obtain a picture feature extraction model.
A deep auto-encoder is an unsupervised deep learning method, also a kind of neural network; it trains a model by reproducing the input signal as faithfully as possible and can be used for feature extraction.
202: Input the first picture and the second picture into the picture feature extraction model to obtain, on each layer whose output neurons serve as features, a feature vector of the first picture and a feature vector of the second picture.
203: For each layer, use the cosine algorithm, the Jaccard algorithm, or the Pearson algorithm to calculate the similarity between the feature vector of the first picture and the feature vector of the second picture.
Cosine similarity measures the similarity between two vectors by the cosine of the angle between them in inner-product space. The Jaccard algorithm and the Pearson algorithm are also similarity algorithms and are not explained further here. Taking the cosine algorithm as an example, the calculation can be expressed by the following formula:

simlayer_n = cosine(fv_ni, fv_nj);

where i and j denote any two different pictures, n denotes the layer number in the deep convolutional neural network, fv_ni denotes the feature vector of picture i extracted at layer n, fv_nj denotes the feature vector of picture j extracted at layer n, and simlayer_n denotes the similarity between the feature vectors of the two pictures at layer n.
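The per-layer cosine calculation can be sketched as follows; the short feature vectors below are illustrative values, not outputs of a trained model:

```python
import math

def cosine_similarity(fv_i, fv_j):
    """Cosine of the angle between two feature vectors:
    dot(fv_i, fv_j) / (|fv_i| * |fv_j|)."""
    dot = sum(a * b for a, b in zip(fv_i, fv_j))
    norm_i = math.sqrt(sum(a * a for a in fv_i))
    norm_j = math.sqrt(sum(b * b for b in fv_j))
    return dot / (norm_i * norm_j)

# Feature vectors of picture i and picture j as output at some layer n.
fv_ni = [0.5, 0.8, 0.1]
fv_nj = [0.4, 0.9, 0.2]
sim_layer_n = cosine_similarity(fv_ni, fv_nj)
print(sim_layer_n)  # close to 1: the vectors point in similar directions
```

Because the cosine depends only on direction, it is insensitive to the overall magnitude of a layer's activations, which is one reason it is a common choice for comparing neural-network feature vectors.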
204: Use a linear regression algorithm or a logistic regression algorithm to calculate the similarity of the first picture and the second picture from the per-layer similarities between the feature vectors of the two pictures.
Linear regression and logistic regression are regression algorithms and are not repeated here; of course, in other embodiments other regression algorithms may also be used, which this embodiment does not specifically limit. Taking linear regression as an example, the calculation can be expressed by the following formula:

similarity = Σ_{k=1..m} W_k · simlayer_k;

where m denotes the number of layers designated for feature extraction in the deep convolutional neural network, k is any of the m layers, k = 1, ..., m, simlayer_k is the similarity between the feature vectors of the two pictures calculated at layer k, and W_k is the weight corresponding to simlayer_k. similarity denotes the similarity between the two pictures and lies in the interval [0, 1]; a larger value indicates more similar pictures, and the similarity of identical pictures is 1. The weights can be set according to test results on test data, which this embodiment does not specifically limit.
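The weighted combination can be sketched as follows. The per-layer similarities and the weights are made-up values — the patent leaves the weights to be tuned on test data — with the weights summing to 1 so that identical pictures score exactly 1:

```python
def combine_layer_similarities(layer_sims, weights):
    """similarity = sum over k of W_k * simlayer_k.
    With weights summing to 1, identical pictures (every
    simlayer_k equal to 1) score exactly 1."""
    return sum(w * s for w, s in zip(weights, layer_sims))

# Illustrative per-layer similarities for four designated layers (m = 4).
layer_sims = [0.92, 0.75, 0.66, 0.50]
weights = [0.1, 0.3, 0.3, 0.3]  # hypothetical tuned weights, summing to 1
print(combine_layer_similarities(layer_sims, weights))  # about 0.665
```

Giving the lower (detail-oriented) layer a smaller weight than the higher (semantic) layers, as in the hypothetical values above, is one plausible outcome of tuning, but the actual weights would come from test data.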
In the above method, step 201 may specifically include:
Choosing a deep convolutional neural network model comprising M convolutional layers and N fully connected layers; designating K of the M+N layers for feature extraction, with the neuron outputs of those K layers serving as feature vectors; and training the deep convolutional neural network model with training data based on the back-propagation algorithm to obtain a picture feature extraction model with K feature vectors.
Preferably, the K layers are fully connected layers, or include both convolutional and fully connected layers. In this embodiment, a convolutional layer includes but is not limited to: convolution, rectified linear units (ReLU, a neural network activation function), max-pooling (a down-sampling method), normalization, and so on; this embodiment does not specifically limit this. To improve the calculation precision of the similarity, the K layers can be chosen to include both convolutional and fully connected layers.
For example, Fig. 3 is a schematic structural diagram of the deep convolutional neural network provided by an embodiment of the present invention. The left end is the input and the right end the output; from left to right there are five convolutional layers followed by three fully connected layers. The input is the original picture and the output is the picture's feature vector. For each layer of the network, specific network parameters can be preset so that the model is trained according to the set parameters. This embodiment does not specifically limit the parameters; Table 1 gives an example.
Table 1
Here, the input picture is scaled to 227*227 pixels and fed in as the three RGB color channels. The first to fifth layers L1~L5 are convolutional layers, and the sixth to eighth layers L6~L8 are fully connected layers. Taking the first layer as an example, the convolution filter size is 11*11 with a convolution stride of 4; this layer has 96 convolution filters and outputs 96 pictures of size 55*55. After the first layer's convolutional filtering, ReLU and max-pooling operations are also performed. The three fully connected layers are equivalent to a three-layer fully connected neural network classifier added on top of the five convolutional layers. The sixth and seventh layers each have 4096 neurons, and the eighth layer has 1000 neurons, corresponding to the 1000 picture categories of the training objective.
In this structure, the layers designated for feature extraction are not limited; they can be, for example, the sixth to eighth layers or the third to seventh layers. Preferably, the fifth to eighth layers can be chosen for feature extraction, i.e., one convolutional layer and three fully connected layers. The outputs of these four layers are recorded as four vectors, whose dimensions are 43264 (= 256*13*13), 4096, 4096, and 1000 respectively.
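The sizes quoted above can be checked with the standard convolution output-size formula, output = (input + 2·padding − filter) / stride + 1; the text does not state the padding, so zero padding is assumed for the first layer here:

```python
def conv_output_size(input_size, filter_size, stride, padding=0):
    """Spatial output size of a convolution on a square input."""
    return (input_size + 2 * padding - filter_size) // stride + 1

# First convolutional layer: 227x227 input, 11x11 filters, stride 4.
print(conv_output_size(227, 11, 4))  # 55 -> the 96 output maps are 55x55

# Fifth-layer feature vector: 256 maps of 13x13, flattened.
print(256 * 13 * 13)  # 43264, the first of the four dimensions quoted
```

The same formula, applied layer by layer with each layer's filter size, stride, padding, and pooling, would reproduce the 13x13 maps of the fifth layer.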
In this embodiment, the convolutional layers are low layers and the fully connected layers are high layers. Low-layer features characterize picture details such as texture, displacement, and edges. For example, referring to Fig. 4, for the original picture in the figure, the picture features characterized by the first, second, and third convolutional layers L1, L2, and L3 can be obtained according to the above network structure; it can be seen that these features reflect the details of the picture. High-layer features characterize the semantics of the picture, such as scene and category. Therefore, when the K designated layers include both convolutional and fully connected layers, the extracted features are more accurate: the calculated picture similarity captures not only detail differences but, more importantly, a semantic understanding of the picture, so the similarity calculation is more accurate and the accuracy rate is markedly improved.
To better illustrate the accuracy of the above method, specific examples are given below. Fig. 5 shows schematic diagrams of picture similarity calculation results provided by another embodiment of the present invention. The two people in the two pictures of Fig. 5(a) are the same, but their poses and text annotations differ; the calculated similarity is 0.641919076443, still quite high. The two pictures of Fig. 5(b) are not very similar at the pixel level, but from the standpoint of picture understanding both are women's shoes and should have a certain similarity; the calculated result is 0.492526054382. In the two pictures of Fig. 5(c), the rendering and position of the character differ, but since it is the same character there is a certain similarity; the result is 0.509093165398. The two pictures of Fig. 5(d) are both half-length portraits of a girl, but one emphasizes a game and the other clothing, so the similarity is relatively low; the result is 0.216924488544. These simple examples show that the method provided by the present invention understands the picture while also capturing its details, so the calculated picture similarity is more reasonable.
In addition, it should be noted that in this embodiment training the deep convolutional neural network model also yields a picture classification model; if the layers for feature extraction are designated during training, the picture feature extraction model is obtained at the same time as the picture classification model. Thus training produces the picture classification model and the feature extraction model simultaneously. The picture classification model can be used to classify pictures, and with good precision.
For example, picture classification training data for the e-commerce industry was obtained from Paipai (拍拍网), and picture classification training data for 1000 categories across all industries was obtained from ImageNet. Using these training data, deep convolutional neural network models were trained on the open-source training platform Caffe, yielding a Paipai e-commerce picture classification model and an ImageNet picture classification model. The collected pictures were tested with the two models; Fig. 6 shows the test results for three of them. Fig. 6(a) and Fig. 6(b) are examples from the Paipai e-commerce picture classifier, and Fig. 6(c) is an example from the ImageNet picture classifier. Below each picture are five category names, ranked Top1~Top5, and the number to the right of each category is the calculated probability that the picture belongs to that category. For example, the probability that the picture in Fig. 6(a) belongs to the top-ranked category "Designer Men's Wear" is 0.910874903202. From the probabilities of the five categories for the three pictures, it can be seen that the Paipai e-commerce classifier's Top1 accuracy reaches 50% and its Top1~Top5 accuracy reaches 85%, while the ImageNet classifier's Top1 accuracy reaches 40% and its Top1~Top5 accuracy reaches 80%.
In the above method provided by this embodiment, a neural network model is trained with training data to obtain a picture feature extraction model; the first and second pictures are input into the model to obtain their feature vectors on each layer whose output neurons serve as features; the similarity between the two feature vectors is calculated for each such layer; and the similarity of the two pictures is calculated from the per-layer similarities. Because the model extracts features automatically, no manual work is needed, which greatly saves manpower and time and improves the efficiency of determining similarity.
Referring to Fig. 7, another embodiment of the present invention provides an apparatus for determining picture similarity, comprising:
A training module 701, configured to train a neural network model with training data to obtain a picture feature extraction model;
A feature extraction module 702, configured to input the first picture and the second picture into the picture feature extraction model and obtain, on each layer whose output neurons serve as features, a feature vector of the first picture and a feature vector of the second picture;
A first calculation module 703, configured to calculate, for each such layer, the similarity between the feature vector of the first picture and the feature vector of the second picture;
A second calculation module 704, configured to calculate the similarity of the first picture and the second picture according to the per-layer similarities between the two feature vectors.
Referring to Fig. 8, in this embodiment the training module 701 may optionally include:
A first training unit 701a, configured to train a deep convolutional neural network model with training data based on the back-propagation algorithm to obtain a picture feature extraction model; or,
A second training unit 701b, configured to train a deep auto-encoder with training data to obtain a picture feature extraction model.
In this embodiment, optionally, the first training unit may be configured to:
Choose a deep convolutional neural network model comprising M convolutional layers and N fully connected layers;
Designate K of the M+N layers for feature extraction, with the neuron outputs of those K layers serving as feature vectors;
Train the deep convolutional neural network model with training data based on the back-propagation algorithm to obtain a picture feature extraction model with K feature vectors.
Preferably, the K layers are fully connected layers, or include both convolutional and fully connected layers.
In this embodiment, optionally, the first calculation module 703 may include:
A first calculation unit, configured to use the cosine algorithm, the Jaccard algorithm, or the Pearson algorithm to calculate, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture.
In this embodiment, optionally, the second calculation module 704 may include:
A second calculation unit, configured to use a linear regression algorithm or a logistic regression algorithm to calculate the similarity of the first picture and the second picture from the per-layer similarities between the feature vectors of the two pictures.
The device provided in this embodiment can execute the method provided by any of the above method embodiments; for details of the process, see the description in the method embodiments, which is not repeated here.
The device provided in this embodiment obtains a picture feature extraction model by training a neural network model with training data; inputs the first picture and the second picture into the picture feature extraction model to obtain, on each layer whose neuron outputs serve as features, the feature vector of the first picture and the feature vector of the second picture; calculates, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture; and calculates the similarity between the first picture and the second picture according to the per-layer similarities. Since features are extracted automatically by the picture feature extraction model, without relying on manual work, considerable manpower and time are saved and the efficiency of determining similarity is improved.
Another embodiment of the present invention provides a server, which can be used to execute the method for determining picture similarity provided in the above embodiments. Referring to Fig. 9, the server 1900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 1922 (for example, one or more processors), a memory 1932, and one or more storage media 1930 (such as one or more mass storage devices) storing application programs 1942 or data 1944. The memory 1932 and the storage medium 1930 may provide transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown), and each module may include a series of instruction operations on the server. Further, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and execute, on the server 1900, the series of instruction operations in the storage medium 1930.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
Specifically, in this embodiment, the one or more processors are configured to execute the one or more programs, which contain instructions for performing the following operations:
obtaining a picture feature extraction model by training a neural network model with training data;
inputting a first picture and a second picture into the picture feature extraction model to obtain, on each layer whose neuron outputs serve as features, the feature vector of the first picture and the feature vector of the second picture;
calculating, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture;
calculating the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture.
Wherein the obtaining a picture feature extraction model by training a neural network model with training data comprises:
training a deep convolutional neural network model based on a back-propagation algorithm using training data to obtain the picture feature extraction model; or,
training a deep auto-encoder using training data to obtain the picture feature extraction model.
Wherein the training a deep convolutional neural network model based on a back-propagation algorithm using training data to obtain the picture feature extraction model comprises:
choosing a deep convolutional neural network model, the deep convolutional neural network model comprising M convolutional layers and N fully connected layers;
choosing K specified layers among the M+N layers for feature extraction, taking the neuron outputs of the K layers as feature vectors;
training the deep convolutional neural network model based on the back-propagation algorithm using training data to obtain a picture feature extraction model with K feature vectors.
Wherein the K layers are fully connected layers, or comprise both convolutional layers and fully connected layers.
Wherein the calculating, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture comprises:
using a cosine algorithm, a Jaccard algorithm, or a Pearson algorithm to calculate, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture.
Wherein the calculating the similarity between the first picture and the second picture according to the per-layer similarities comprises:
using a linear regression algorithm or a logistic regression algorithm to calculate the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture.
The server provided in this embodiment obtains a picture feature extraction model by training a neural network model with training data; inputs the first picture and the second picture into the picture feature extraction model to obtain, on each layer whose neuron outputs serve as features, the feature vector of the first picture and the feature vector of the second picture; calculates, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture; and calculates the similarity between the first picture and the second picture according to the per-layer similarities. Since features are extracted automatically by the picture feature extraction model, without relying on manual work, considerable manpower and time are saved and the efficiency of determining similarity is improved.
Another embodiment of the present invention provides a non-volatile readable storage medium storing one or more modules (programs). When the one or more modules are applied in a device, the device can be made to execute instructions for the following steps:
obtaining a picture feature extraction model by training a neural network model with training data;
inputting a first picture and a second picture into the picture feature extraction model to obtain, on each layer whose neuron outputs serve as features, the feature vector of the first picture and the feature vector of the second picture;
calculating, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture;
calculating the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture.
Wherein the obtaining a picture feature extraction model by training a neural network model with training data comprises:
training a deep convolutional neural network model based on a back-propagation algorithm using training data to obtain the picture feature extraction model; or,
training a deep auto-encoder using training data to obtain the picture feature extraction model.
Wherein the training a deep convolutional neural network model based on a back-propagation algorithm using training data to obtain the picture feature extraction model comprises:
choosing a deep convolutional neural network model, the deep convolutional neural network model comprising M convolutional layers and N fully connected layers;
choosing K specified layers among the M+N layers for feature extraction, taking the neuron outputs of the K layers as feature vectors;
training the deep convolutional neural network model based on the back-propagation algorithm using training data to obtain a picture feature extraction model with K feature vectors.
Wherein the K layers are fully connected layers, or comprise both convolutional layers and fully connected layers.
Wherein the calculating, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture comprises:
using a cosine algorithm, a Jaccard algorithm, or a Pearson algorithm to calculate, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture.
Wherein the calculating the similarity between the first picture and the second picture according to the per-layer similarities comprises:
using a linear regression algorithm or a logistic regression algorithm to calculate the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture.
The non-volatile readable storage medium provided in this embodiment obtains a picture feature extraction model by training a neural network model with training data; inputs the first picture and the second picture into the picture feature extraction model to obtain, on each layer whose neuron outputs serve as features, the feature vector of the first picture and the feature vector of the second picture; calculates, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture; and calculates the similarity between the first picture and the second picture according to the per-layer similarities. Since features are extracted automatically by the picture feature extraction model, without relying on manual work, considerable manpower and time are saved and the efficiency of determining similarity is improved.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing relevant hardware. The program may be stored in a non-volatile readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is merely preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (13)

1. A method for determining picture similarity, characterized in that the method comprises:
obtaining a picture feature extraction model by training a neural network model with training data;
inputting a first picture and a second picture into the picture feature extraction model to obtain, on each layer whose neuron outputs serve as features, a feature vector of the first picture and a feature vector of the second picture;
calculating, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture;
calculating, using a regression algorithm, the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture, wherein the similarities calculated on different layers are correspondingly given different weights.
2. The method according to claim 1, characterized in that the obtaining a picture feature extraction model by training a neural network model with training data comprises:
training a deep convolutional neural network model based on a back-propagation algorithm using training data to obtain the picture feature extraction model; or,
training a deep auto-encoder using training data to obtain the picture feature extraction model.
3. The method according to claim 2, characterized in that the training a deep convolutional neural network model based on a back-propagation algorithm using training data to obtain the picture feature extraction model comprises:
choosing a deep convolutional neural network model, the deep convolutional neural network model comprising M convolutional layers and N fully connected layers;
choosing K specified layers among the M+N layers for feature extraction, taking the neuron outputs of the K layers as feature vectors;
training the deep convolutional neural network model based on the back-propagation algorithm using training data to obtain a picture feature extraction model with K feature vectors.
4. The method according to claim 3, characterized in that the K layers are fully connected layers, or comprise both convolutional layers and fully connected layers.
5. The method according to claim 1, characterized in that the calculating, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture comprises:
using a cosine algorithm, a Jaccard algorithm, or a Pearson algorithm to calculate, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture.
6. The method according to claim 1, characterized in that the calculating the similarity between the first picture and the second picture according to the per-layer similarities comprises:
using a linear regression algorithm or a logistic regression algorithm to calculate the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture.
7. A device for determining picture similarity, characterized in that the device comprises:
a training module, configured to obtain a picture feature extraction model by training a neural network model with training data;
a feature extraction module, configured to input a first picture and a second picture into the picture feature extraction model to obtain, on each layer whose neuron outputs serve as features, a feature vector of the first picture and a feature vector of the second picture;
a first computing module, configured to calculate, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture;
a module configured to calculate, using a regression algorithm, the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture, wherein the similarities calculated on different layers are correspondingly given different weights.
8. The device according to claim 7, characterized in that the training module comprises:
a first training unit, configured to train a deep convolutional neural network model based on a back-propagation algorithm using training data to obtain the picture feature extraction model; or,
a second training unit, configured to train a deep auto-encoder using training data to obtain the picture feature extraction model.
9. The device according to claim 8, characterized in that the first training unit is configured to:
choose a deep convolutional neural network model, the deep convolutional neural network model comprising M convolutional layers and N fully connected layers;
choose K specified layers among the M+N layers for feature extraction, taking the neuron outputs of the K layers as feature vectors;
train the deep convolutional neural network model based on the back-propagation algorithm using training data to obtain a picture feature extraction model with K feature vectors.
10. The device according to claim 9, characterized in that the K layers are fully connected layers, or comprise both convolutional layers and fully connected layers.
11. The device according to claim 7, characterized in that the first computing module comprises:
a first computing unit, configured to use a cosine algorithm, a Jaccard algorithm, or a Pearson algorithm to calculate, for each layer, the similarity between the feature vector of the first picture and the feature vector of the second picture.
12. The device according to claim 7, characterized in that the device further comprises a second computing module, the second computing module comprising:
a second computing unit, configured to use a linear regression algorithm or a logistic regression algorithm to calculate the similarity between the first picture and the second picture according to the per-layer similarities between the feature vectors of the first picture and the second picture.
13. A computer-readable storage medium, the storage medium storing one or more instructions, the one or more instructions being configured to be executed by a processor to perform the method for determining picture similarity according to any one of claims 1 to 6.
CN201410393684.6A 2014-08-12 2014-08-12 The method and apparatus for determining picture similarity Active CN105469376B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410393684.6A CN105469376B (en) 2014-08-12 2014-08-12 The method and apparatus for determining picture similarity


Publications (2)

Publication Number Publication Date
CN105469376A CN105469376A (en) 2016-04-06
CN105469376B true CN105469376B (en) 2019-10-25

Family

ID=55607037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410393684.6A Active CN105469376B (en) 2014-08-12 2014-08-12 The method and apparatus for determining picture similarity

Country Status (1)

Country Link
CN (1) CN105469376B (en)


Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106251292B (en) * 2016-08-09 2019-04-16 央视国际网络无锡有限公司 A kind of photo resolution method for improving
CN108242046B (en) * 2016-12-27 2022-02-18 阿里巴巴集团控股有限公司 Picture processing method and related equipment
CN108509466A (en) * 2017-04-14 2018-09-07 腾讯科技(深圳)有限公司 A kind of information recommendation method and device
CN107145910A (en) * 2017-05-08 2017-09-08 京东方科技集团股份有限公司 Performance generation system, its training method and the performance generation method of medical image
CN107729931A (en) * 2017-10-09 2018-02-23 北京京东尚科信息技术有限公司 Picture methods of marking and device
CN107801090A (en) * 2017-11-03 2018-03-13 北京奇虎科技有限公司 Utilize the method, apparatus and computing device of audio-frequency information detection anomalous video file
CN109389745A (en) * 2018-08-29 2019-02-26 广州伊思高科技有限公司 A kind of intelligent goods selling equipment and method based on image processing
CN111291765A (en) * 2018-12-07 2020-06-16 北京京东尚科信息技术有限公司 Method and device for determining similar pictures
CN111381573B (en) * 2018-12-27 2021-07-09 北京安控科技股份有限公司 Method and system for predicting safety of industrial control system
CN111061586B (en) * 2019-12-05 2023-09-19 深圳先进技术研究院 Container cloud platform anomaly detection method and system and electronic equipment
CN113760415A (en) * 2020-06-04 2021-12-07 Oppo广东移动通信有限公司 Dial plate generation method and device, electronic equipment and computer readable storage medium
CN112036501A (en) * 2020-09-04 2020-12-04 深圳壹账通智能科技有限公司 Image similarity detection method based on convolutional neural network and related equipment thereof
CN114816961A (en) * 2022-05-16 2022-07-29 宝德计算机系统股份有限公司 Server pressure testing method and device and computer storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101706964A (en) * 2009-08-27 2010-05-12 北京交通大学 Color constancy calculating method and system based on derivative structure of image
CN103020645A (en) * 2013-01-06 2013-04-03 深圳市彩讯科技有限公司 System and method for junk picture recognition
CN103366180A (en) * 2013-06-14 2013-10-23 山东大学 Cell image segmentation method based on automatic feature learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on Content-Based Image Retrieval Systems"; Zhang Ling; China Master's Theses Full-text Database, Information Science and Technology Series; 2006-11-15; pp. I138-868 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107341509A (en) * 2017-06-29 2017-11-10 北京小米移动软件有限公司 The training method and device of convolutional neural networks
CN107341509B (en) * 2017-06-29 2020-12-04 北京小米移动软件有限公司 Convolutional neural network training method and device and readable storage medium

Also Published As

Publication number Publication date
CN105469376A (en) 2016-04-06

Similar Documents

Publication Publication Date Title
CN105469376B (en) The method and apparatus for determining picture similarity
CN106709461B (en) Activity recognition method and device based on video
EP3333768A1 (en) Method and apparatus for detecting target
Najar et al. Unsupervised learning of finite full covariance multivariate generalized Gaussian mixture models for human activity recognition
CN103988232B (en) Motion manifold is used to improve images match
CN108197592B (en) Information acquisition method and device
WO2021063056A1 (en) Facial attribute recognition method and apparatus, and electronic device and storage medium
CN106326857A (en) Gender identification method and gender identification device based on face image
CN106485227A (en) A kind of Evaluation of Customer Satisfaction Degree method that is expressed one's feelings based on video face
CN110222718B (en) Image processing method and device
CN111954250B (en) Lightweight Wi-Fi behavior sensing method and system
CN113558603B (en) Multi-modal cognitive disorder recognition method based on deep learning
Wang et al. GKFC-CNN: Modified Gaussian kernel fuzzy C-means and convolutional neural network for apple segmentation and recognition
CN104537273B (en) A kind of drowned pattern intelligent inference system and method
CN110082821A (en) A kind of no label frame microseism signal detecting method and device
Yang et al. Centrality graph convolutional networks for skeleton-based action recognition
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
CN111597933A (en) Face recognition method and device
CN111401343B (en) Method for identifying attributes of people in image and training method and device for identification model
CN107563319A (en) Face similarity measurement computational methods between a kind of parent-offspring based on image
CN104978569A (en) Sparse representation based incremental face recognition method
Wang et al. A light iris segmentation network
CN112233102A (en) Method, device and equipment for identifying noise in image sample set and storage medium
Lucio et al. Simultaneous iris and periocular region detection using coarse annotations
Saranya et al. FBCNN-TSA: An optimal deep learning model for banana ripening stages classification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant