CN108073934A

CN108073934A - Near-duplicate image detection method and device

Info

Publication number: CN108073934A
Application number: CN201611020242.2A
Authority: CN
Inventors: 安山; 陈宇; 黄志标; 汪振华; 麻晓珍; 翁志
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2016-11-17
Filing date: 2016-11-17
Publication date: 2018-05-25

Abstract

The invention discloses a near-duplicate image detection method and device, and relates to the field of picture search. The method includes: inputting each picture in a picture set to be detected into a deep learning network model, and outputting a global feature for each picture in the set; quantizing the global feature of each picture into a binary bit string by means of a binary hash algorithm; and determining the near-duplicate pictures in the set according to the distances between the binary bit strings of the pictures. Near-duplicate pictures can thereby be detected accurately and quickly in large-scale data sets.

Description

Near-duplicate image detection method and device

Technical field

The present invention relates to the field of picture search, and in particular to a near-duplicate image detection method and device.

Background technology

With the development of the Internet, more and more commodity pictures are being generated. Users now need to obtain near-duplicate pictures (that is, pictures that differ from one another by less than a preset value). However, picture characteristics are complex, and for a large-scale database such as that of an online mall, which can easily hold hundreds of millions of commodity pictures, detecting near-duplicate pictures is a difficult problem.

Existing image detection methods usually first detect similar pictures and then decide which of them are near-duplicates by setting a reasonable threshold. Most of these methods rely on the visual bag-of-words model. The visual bag-of-words model extracts local features of a picture and weights them with TF-IDF (term frequency-inverse document frequency) scoring, a weighting technique commonly used in information retrieval and image retrieval, to obtain similar pictures. Because this approach detects pictures using local features, its false detection rate is relatively high and its detection efficiency is relatively low, which makes it difficult to apply to near-duplicate detection on large-scale data sets.

Summary of the invention

The technical problem solved by the present invention is to provide a near-duplicate picture detection technique suitable for large-scale data sets.

According to one aspect of the embodiments of the present invention, a near-duplicate image detection method is provided, characterized by comprising: inputting each picture in a picture set to be detected into a deep learning network model, and outputting a global feature for each picture in the set; quantizing the global feature of each picture in the set into a binary bit string by means of a binary hash algorithm; and determining the near-duplicate pictures in the set according to the distances between the binary bit strings of the pictures.

In some embodiments, inputting each picture in the picture set to be detected into the deep learning network model and outputting the global feature of each picture comprises: inputting each picture in the set into a GoogLeNet network model; and using the output of the average pooling layer of the GoogLeNet network model as the global feature of the input picture.

In some embodiments, the method further comprises: training the binary hash algorithm using the global features of the pictures in the picture set to be detected.

In some embodiments, determining the near-duplicate pictures in the set according to the distances between the binary bit strings comprises: clustering the pictures in the set according to their global features; and, within each cluster, calculating the distances between the binary bit strings of the pictures and determining pictures whose distance is less than a preset threshold to be near-duplicates.

In some embodiments, determining the near-duplicate pictures in the set according to the distances between the binary bit strings comprises: clustering the pictures in the set according to their global features; determining the cluster center closest to the global feature of a sample picture; calculating the distance between the binary bit string of each picture in the cluster corresponding to that cluster center and the binary bit string of the sample picture; and determining pictures whose distance is less than a preset threshold to be near-duplicates of the sample picture.

In some embodiments, determining the near-duplicate pictures in the set according to the distances between the binary bit strings comprises: on the basis of the distances between the binary bit strings of the pictures in the set, applying color filtering to the pictures whose distance is less than a preset threshold, and determining pictures whose color similarity satisfies a preset value to be near-duplicates.

In some embodiments, determining pictures whose color similarity satisfies the preset value to be near-duplicates comprises: converting the pictures whose distance meets the requirement from the RGB color space to the HSV color space; quantizing each pixel to a corresponding color according to its H, S and V values; counting the proportion of pixels of each color; and determining pictures for which the difference in the pixel proportions of the colors is less than a preset value to be near-duplicates.

According to another aspect of the embodiments of the present invention, a near-duplicate image detection device is provided, characterized by comprising: a global feature determining module, configured to input each picture in a picture set to be detected into a deep learning network model and to output the global feature of each picture in the set; a global feature quantization module, configured to quantize the global feature of each picture in the set into a binary bit string by means of a binary hash algorithm; and a near-duplicate determining module, configured to determine the near-duplicate pictures in the set according to the distances between the binary bit strings of the pictures.

In some embodiments, the global feature determining module comprises: a model input unit, configured to input each picture in the picture set to be detected into a GoogLeNet network model; and a global feature determining unit, configured to use the output of the average pooling layer of the GoogLeNet network model as the global feature of the input picture.

In some embodiments, the device further comprises: a binary hash algorithm training module, configured to train the binary hash algorithm using the global features of the pictures in the picture set to be detected.

In some embodiments, the near-duplicate determining module comprises: a clustering unit, configured to cluster the pictures in the picture set to be detected according to their global features; a distance calculation unit, configured to calculate, within each cluster, the distances between the binary bit strings of the pictures; and a near-duplicate determining unit, configured to determine pictures whose distance is less than a preset threshold to be near-duplicates.

In some embodiments, the near-duplicate determining module comprises: a clustering unit, configured to cluster the pictures in the picture set to be detected according to their global features; a cluster center determining unit, configured to determine the cluster center closest to the global feature of a sample picture; a distance calculation unit, configured to calculate the distance between the binary bit string of each picture in the cluster corresponding to that cluster center and the binary bit string of the sample picture; and a near-duplicate determining unit, configured to determine pictures whose distance is less than a preset threshold to be near-duplicates of the sample picture.

In some embodiments, the near-duplicate determining module comprises: a distance filtering unit, configured to select, according to the distances between the binary bit strings of the pictures in the picture set to be detected, the pictures whose distance is less than a preset threshold; and a color filtering unit, configured to apply color filtering to the pictures whose distance is less than the preset threshold and to determine pictures whose color similarity satisfies a preset value to be near-duplicates.

In some embodiments, the color filtering unit comprises: a color space conversion subunit, configured to convert the pictures whose distance meets the requirement from the RGB color space to the HSV color space; a color quantization subunit, configured to quantize each pixel to a corresponding color according to its H, S and V values; an information statistics subunit, configured to count the proportion of pixels of each color; and a near-duplicate determining subunit, configured to determine pictures for which the difference in the pixel proportions of the colors is less than a preset value to be near-duplicates.

According to a further aspect of the embodiments of the present invention, a near-duplicate image detection device is provided, characterized by comprising: a memory; and a processor coupled to the memory, the processor being configured to execute the above image detection method on the basis of instructions stored in the memory.

The present invention detects pictures using their global features, which gives higher accuracy, and it quantizes the global features into binary bit strings and determines near-duplicates from the distances between those bit strings, which gives higher detection efficiency. Such an accurate and fast detection method is suitable for near-duplicate detection on large-scale data sets.

Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments of the invention, given with reference to the accompanying drawings.

Description of the drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 shows a flow diagram of one embodiment of the near-duplicate image detection method of the invention.

Fig. 2 shows a flow diagram of another embodiment of the near-duplicate image detection method of the invention.

Fig. 3A and Fig. 3B respectively show a commodity picture bearing a logo and the commodity picture after the logo has been removed.

Fig. 4 shows a structural diagram of one embodiment of the near-duplicate image detection device of the invention.

Fig. 5 shows a structural diagram of another embodiment of the near-duplicate image detection device.

Fig. 6 shows a structural diagram of a further embodiment of the near-duplicate image detection device of the invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some of the embodiments of the invention, not all of them. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application or its use. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the invention without creative effort fall within the scope of protection of the invention.

In the present invention, near-duplicate pictures are pictures that differ from one another by less than a preset value. For example, essentially identical pictures, a picture and its scaled versions, or pictures that are essentially identical except at the position of a trademark can all be regarded as near-duplicates. In general, pictures of different colors are regarded as non-near-duplicates, unless the difference is only a subtle color deviation.

The near-duplicate image detection method provided by the invention, which is suitable for large-scale data sets, is described below with reference to Fig. 1.

Fig. 1 shows a flow diagram of one embodiment of the near-duplicate image detection method of the invention. As shown in Fig. 1, the near-duplicate image detection method of this embodiment includes:

Step S102: each picture in the picture set to be detected is input into a deep learning network model, which outputs the global feature of each picture in the set.

Specifically, the deep learning network model may be a GoogLeNet network model, a VGG network structure or an AlexNet network structure.

Preferably, the GoogLeNet network model, which has better feature description ability, is selected. Among the outputs of the layers of the GoogLeNet network model, the 1024-dimensional feature output by the last average pooling layer describes the global characteristics of the input picture best, so the output of the average pooling layer of the GoogLeNet network model is preferably used as the global feature of the input picture.

Step S104: the global feature of each picture in the picture set to be detected is quantized into a binary bit string by means of a binary hash algorithm.

Step S106: the near-duplicate pictures in the picture set to be detected are determined according to the distances between the binary bit strings of the pictures in the set.

The distance between binary bit strings may be, for example, the Hamming distance or the Euclidean distance; the Hamming distance is preferred. The Hamming distance is obtained by XORing the two binary codes and counting the bits that are set in the result. To improve detection efficiency further, the Hamming distance can be computed with the more efficient hardware instruction _mm_popcnt_u64 (a POPCNT intrinsic, available on processors with SSE4.2). Specifically, each binary code is stored as a sequence of 64-bit words, and the hardware instruction is then used to compute the Hamming distance. For a 1024-bit binary code, that is, 16 words of 64 bits, a single query against a data set of 100,000 codes takes 0.00203068 seconds, compared with 0.06971355 seconds when the Hamming distance is computed by bit shifting, so the hardware instruction gives a clear speed-up.
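The XOR-and-count computation of the Hamming distance can be sketched in a few lines. This is a minimal illustration in Python rather than the hardware-intrinsic implementation the text describes; the function name hamming_distance and the representation of a code as a list of 64-bit integers are assumptions made for the example.

```python
def hamming_distance(code_a, code_b):
    """Hamming distance between two codes stored as equal-length lists of 64-bit words."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same number of words")
    # XOR leaves a 1 wherever the two codes differ; counting those 1s gives the distance.
    return sum(bin(word_a ^ word_b).count("1") for word_a, word_b in zip(code_a, code_b))


# A 1024-bit code is 16 words of 64 bits each.
a = [0] * 16
b = [0] * 16
b[0] = 0b1011        # three differing bits in the first word
b[15] = 1 << 63      # one differing bit in the last word
print(hamming_distance(a, b))  # 4
```

In a compiled implementation the per-word bit count would be the single hardware instruction mentioned above, which is where the reported speed-up over bit shifting comes from.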

The implementation above detects pictures using their global features, which gives higher accuracy, and it quantizes the global features into binary bit strings and determines near-duplicates from the distances between those bit strings, which gives higher detection efficiency. Such an accurate and fast detection method is suitable for near-duplicate detection on large-scale data sets. In terms of scale, the invention supports near-duplicate detection on picture data sets of more than 100 million pictures and thus better meets users' needs.

The image detection method of the invention can be implemented in two parts, offline training and online detection, and detection efficiency can additionally be improved by bucketed search. The process is described below with reference to Fig. 2.

Fig. 2 shows a flow diagram of another embodiment of the near-duplicate image detection method of the invention. As shown in Fig. 2, the near-duplicate image detection method of this embodiment comprises the following steps:

First, offline training is carried out, including:

Step S201: if a picture contains a picture mark, the mark can be removed before the subsequent steps are performed; if it does not, the subsequent steps can be performed directly.

Commodity pictures usually contain picture marks such as logos; the logo can be removed before the subsequent training and detection.

A picture mark can be removed, for example, as follows: randomly select a preset number of commodity pictures from the commodity picture library, collect statistics on where the mark appears in these pictures, and determine a rectangular region Z from the statistics such that Z covers the mark region of, for example, more than 95% of the pictures; then remove region Z from each picture or set region Z to a uniform color.
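The two operations in this procedure, checking how many sampled mark regions a candidate rectangle Z covers and painting Z in a uniform color, can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how Z is derived from the statistics, so the sketch takes Z as given; the function names, the (left, top, right, bottom) rectangle convention and the list-of-rows image representation are all hypothetical.

```python
def coverage(rect, boxes):
    """Fraction of mark bounding boxes that lie entirely inside rect.

    Rectangles are (left, top, right, bottom) with right/bottom exclusive.
    """
    left, top, right, bottom = rect
    inside = sum(
        1 for (l, t, r, b) in boxes
        if l >= left and t >= top and r <= right and b <= bottom
    )
    return inside / len(boxes)


def fill_rectangle(image, rect, color):
    """Return a copy of image (rows of RGB tuples) with rect painted in one color."""
    left, top, right, bottom = rect
    return [
        [color if (top <= y < bottom and left <= x < right) else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]


# Hypothetical mark boxes observed in a random sample of pictures.
boxes = [(0, 0, 2, 1), (0, 0, 3, 2), (1, 0, 3, 1), (5, 5, 6, 6)]
z = (0, 0, 3, 2)
print(coverage(z, boxes))  # 0.75

image = [[(10, 20, 30)] * 4 for _ in range(3)]
masked = fill_rectangle(image, z, (255, 255, 255))
```

A candidate Z would be accepted once its coverage reaches the chosen level (95% in the example given in the text); here the fourth box falls outside Z, so the coverage is only 0.75.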

For example, referring to Fig. 3A and Fig. 3B, Fig. 3A is a commodity picture bearing a logo, and Fig. 3B is the commodity picture after the logo has been removed.

Removing picture marks eliminates differences in picture content that are caused only by differing marks and removes the interference of the marks with the picture content, which facilitates the subsequent detection of near-duplicates.

Step S202: the deep learning network model is trained using a training picture set, which further improves the accuracy of picture detection.

The training picture set may be, for example, a subset of the picture set to be detected, or it may be a separate picture set.

Step S203: each picture in the picture set to be detected is input into the deep learning network model, which outputs the global feature of each picture in the set.

Step S204: the binary hash algorithm is trained using the global features of the pictures in the picture set to be detected.

Preferably, binary hash training uses the iterative quantization (Iterative Quantization) method. Iterative quantization first applies a linear dimensionality reduction to the data and then performs binary quantization in the resulting space. The iterative quantization algorithm is used to find the matrix whose columns are the hyperplane coefficient vectors that minimize the binary hash quantization error; the binary code matrix is then determined from this matrix and the global feature matrix of pictures chosen from the picture set to be detected, and the binary code matrix is taken as the trained binary hash algorithm. The training process of the binary hash algorithm is described in detail below.

For $n$ data points $\{x_1, x_2, \ldots, x_n\}$ with $x_i \in \mathbb{R}^d$, which form the rows of the data matrix $X \in \mathbb{R}^{n \times d}$, each data point is the global feature of one picture; for the GoogLeNet network model, $d = 1024$. The data are assumed to be zero-centered, i.e. $\sum_{i=1}^{n} x_i = 0$. The goal is to learn a binary code matrix $B \in \{-1, 1\}^{n \times c}$, where $c$ is the code length, and to use $B$ as the binary hash algorithm. For each bit $k = 1, \ldots, c$, the binary encoding function is defined as $h_k(x) = \operatorname{sgn}(x w_k)$, where $w_k$ is a column vector of hyperplane coefficients and

$$\operatorname{sgn}(v) = \begin{cases} 1, & v \ge 0 \\ -1, & \text{otherwise.} \end{cases}$$

The whole encoding process can therefore be written as $B = \operatorname{sgn}(XW)$, where $W \in \mathbb{R}^{d \times c}$ is the matrix whose columns are the $w_k$. The aim in generating the binary code is that each bit has maximum variance and that the bits are pairwise uncorrelated. This can be achieved by maximizing the following objective function:

$$\mathcal{I}(W) = \sum_{k} \operatorname{var}(h_k(x)) = \sum_{k} \operatorname{var}(\operatorname{sgn}(x w_k)), \qquad \frac{1}{n} B^{T} B = I.$$

The variance is maximized by generating perfectly balanced bits, that is, $h_k(x) = 1$ for half of the data points and $h_k(x) = -1$ for the other half. This additional requirement, however, makes the formula above intractable. It can be relaxed to the following continuous objective function:

$$\tilde{\mathcal{I}}(W) = \sum_{k} \mathbb{E}\left(\lVert x w_k \rVert_2^2\right) = \frac{1}{n} \sum_{k} w_k^{T} X^{T} X w_k = \frac{1}{n} \operatorname{tr}\left(W^{T} X^{T} X W\right), \qquad W^{T} W = I.$$

The constraint $W^{T} W = I$ requires all hash hyperplanes to be mutually orthogonal. For a code of $c$ bits, $W$ is obtained by taking the $c$ eigenvectors of the data covariance matrix $X^{T} X$ with the largest eigenvalues. Solving for the binary code matrix $B$ can thus be converted into solving for $W$.
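Once W is known, the encoding B = sgn(XW) reduces to taking the sign of one dot product per bit. The sketch below illustrates this for a single feature vector and also shows how the resulting bits could be packed into 64-bit words for the distance computation. The function names, the storage of a code value of -1 as bit 0, and the toy hyperplane coefficients are assumptions made for the example; learning W itself is not shown.

```python
def encode(feature, hyperplanes):
    """Quantize one zero-centered feature vector into a bit string.

    hyperplanes holds the columns w_k of W, one list per bit.
    Bit k is 1 when x . w_k >= 0 (code value +1) and 0 otherwise (code value -1).
    """
    bits = 0
    for w_k in hyperplanes:
        projection = sum(x * w for x, w in zip(feature, w_k))
        bits = (bits << 1) | (1 if projection >= 0 else 0)
    return bits


def pack_words(bits, n_bits, word_size=64):
    """Split an n_bits-wide integer into word_size-bit words, most significant first."""
    n_words = (n_bits + word_size - 1) // word_size
    mask = (1 << word_size) - 1
    return [(bits >> (word_size * i)) & mask for i in reversed(range(n_words))]


# Toy example: d = 3 features, c = 4 bits, made-up hyperplane coefficients.
W_columns = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
code = encode([0.5, -2.0, 0.25], W_columns)
print(format(code, "04b"))  # 1010
```

With d = 1024 and c = 1024 as in the text, pack_words would return 16 words per picture.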

One specific solution by iterative quantization is described below.

If a vector in the projected space is denoted $v \in \mathbb{R}^c$, then $\operatorname{sgn}(v)$ is the vertex of the hypercube $\{-1, 1\}^c$ closest to $v$ in Euclidean distance. The smaller the quantization error $\lVert \operatorname{sgn}(v) - v \rVert^2$, the better the resulting binary code preserves the original local structure of the data.

If $W$ is an optimal solution, then for any orthogonal $c \times c$ matrix $R$, $\tilde{W} = WR$ is also an optimal solution. The projected data $V = XW$ can therefore be transformed orthogonally so as to minimize the quantization error

$$Q(B, R) = \lVert B - VR \rVert_F^2,$$

where $\lVert \cdot \rVert_F$ denotes the Frobenius norm.

$R$ is first initialized randomly, and a minimum of the quantization error is then sought with steps similar to those of the k-means method. In each iteration, every data point is first assigned to the nearest vertex of the binary hypercube, and $R$ is then updated to minimize the quantization error.

With $R$ fixed, update $B$:

Because the projected data matrix $V = XW$ is fixed, minimizing $Q(B, R)$ is equivalent to maximizing

$$\operatorname{tr}\left(B R^{T} V^{T}\right) = \sum_{i=1}^{n} \sum_{j=1}^{c} B_{ij} \tilde{V}_{ij},$$

where $\tilde{V}_{ij}$ denotes an element of $\tilde{V} = VR$. This expression is maximized with respect to $B$ by setting $B_{ij} = 1$ when $\tilde{V}_{ij} \ge 0$ and $B_{ij} = -1$ when $\tilde{V}_{ij} < 0$; that is, $B = \operatorname{sgn}(VR)$.

With $B$ fixed, update $R$:

For fixed $B$, the objective function corresponds to the classical orthogonal Procrustes problem, that is, finding the rotation that aligns one point set with another; here the two point sets are given by the projected data $V$ and the target binary code matrix $B$. For fixed $B$, the objective is minimized by first computing the singular value decomposition of the $c \times c$ matrix $B^{T} V$ as $S \Omega \hat{S}^{T}$ and then setting $R = \hat{S} S^{T}$.

In this solution, alternating between the updates of $B$ and $R$ yields the value of $W$ (that is, the rotated projection $WR$) that minimizes the quantization error $Q$, and $B$ is then determined from that value.

Online detection is then carried out, including:

Step S205: the pictures in the picture set to be detected are clustered on the basis of their global features, so that searches can be carried out within a class.

Step S206: the global feature of each picture in the picture set to be detected is quantized into a binary bit string by means of the binary hash algorithm.

Step S207: near-duplicates are searched for according to the distances between the binary bit strings of the pictures and according to the clusters, which implements bucketed search. Two typical cases are described below.

Step S207A: detecting the near-duplicates within the picture set to be detected. Specifically, within each cluster, the distances between the binary bit strings of the pictures are calculated, and pictures whose distance is less than a preset threshold are determined to be near-duplicates.

Step S207B: detecting the near-duplicates of a sample picture in the picture set to be detected. Specifically, the cluster center closest to the global feature of the sample picture is determined, the distance between the binary bit string of each picture in the cluster corresponding to that cluster center and the binary bit string of the sample picture is calculated, and pictures whose distance is less than a preset threshold are determined to be near-duplicates of the sample picture.
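The two bucketed-search cases can be sketched together. The sketch assumes that clustering has already produced a list of cluster centers and, for each cluster, a mapping from picture id to binary code held as a Python integer; the function names, the use of squared Euclidean distance to pick the nearest center, and the toy 8-bit codes are illustrative assumptions rather than details given in the text.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def hamming(a, b):
    return bin(a ^ b).count("1")


def nearest_center(feature, centers):
    """Index of the cluster center closest to a global feature vector."""
    return min(range(len(centers)), key=lambda i: squared_distance(feature, centers[i]))


def duplicates_in_bucket(codes, threshold):
    """Step S207A: all pairs of picture ids in one cluster whose codes are close."""
    ids = sorted(codes)
    return [(p, q) for i, p in enumerate(ids) for q in ids[i + 1:]
            if hamming(codes[p], codes[q]) < threshold]


def duplicates_of_sample(feature, code, centers, buckets, threshold):
    """Step S207B: search only the cluster whose center is nearest to the sample."""
    bucket = buckets[nearest_center(feature, centers)]
    return sorted(pid for pid, c in bucket.items() if hamming(c, code) < threshold)


centers = [[0.0, 0.0], [10.0, 10.0]]
buckets = [
    {"a": 0b11110000, "b": 0b11110001, "c": 0b00001111},
    {"d": 0b10101010},
]
print(duplicates_in_bucket(buckets[0], threshold=2))                      # [('a', 'b')]
print(duplicates_of_sample([1.0, 0.5], 0b11110011, centers, buckets, 3))  # ['a', 'b']
```

Restricting each comparison to one bucket is what keeps the cost manageable: a sample picture is compared with the cluster centers and then only with the codes in one cluster, not with the whole set.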

Take 10 million pictures as an example. If they are clustered into 100 classes, each class holds about 100,000 pictures; with 1024-bit binary codes a single query within a class takes about 2 milliseconds, so near-duplicate detection over a whole class (about 100,000 queries) takes about 200 seconds. When a new sample picture needs to be retrieved, it is first compared with the 100 class centers, the nearest class center is selected, and the pictures in that class are used as the search range to obtain its near-duplicates. By choosing a suitable number of clusters, near-duplicate detection can be supported on picture sets of hundreds of millions of pictures, which further improves detection efficiency.

Optionally, the online detection part may further include:

Step S208: color filtering is applied to the pictures whose distance meets the requirement, and pictures whose color similarity satisfies a preset value are determined to be near-duplicates.

In one specific embodiment, the pictures whose distance meets the requirement are converted from the RGB color space to the HSV color space, and each pixel is quantized to a corresponding color according to its H, S and V values, for example to one of 33 colors. The proportion of pixels of each color is then counted, and pictures for which the difference in the pixel proportions of the colors is less than a preset value are determined to be near-duplicates. For example, the two most frequent colors of a detection-result picture and their proportions are obtained; if these two colors are the same as the two most frequent colors of the sample picture and the sum of the differences in their proportions is within 5%, the two pictures are considered to have the same color and the detection result is retained.
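The color filter can be sketched with the standard-library colorsys module. The text does not specify how the 33 colors are defined, so the sketch substitutes a much coarser, purely illustrative quantization (black, white, grey and six hue sectors); the function names, the bin boundaries and the reading of "first two colors" as the two most frequent colors are assumptions.

```python
import colorsys
from collections import Counter


def quantize(rgb):
    """Map an (R, G, B) pixel with 0-255 channels to a coarse color label.

    The bin boundaries are illustrative assumptions, not the 33-color
    scheme mentioned in the text.
    """
    h, s, v = colorsys.rgb_to_hsv(*(channel / 255.0 for channel in rgb))
    if v < 0.2:
        return "black"
    if s < 0.15:
        return "white" if v > 0.8 else "grey"
    return "hue%d" % (int(h * 6) % 6)  # six hue sectors of 60 degrees


def top_two(pixels):
    """The two most frequent quantized colors with their pixel shares."""
    counts = Counter(quantize(p) for p in pixels)
    return [(color, n / len(pixels)) for color, n in counts.most_common(2)]


def same_color(pixels_a, pixels_b, tolerance=0.05):
    """True when both pictures share their top two colors and the summed
    difference of the corresponding shares is within the tolerance."""
    top_a, top_b = dict(top_two(pixels_a)), dict(top_two(pixels_b))
    if set(top_a) != set(top_b):
        return False
    return sum(abs(top_a[c] - top_b[c]) for c in top_a) <= tolerance


red, white, blue = (255, 0, 0), (255, 255, 255), (0, 0, 255)
sample = [red] * 60 + [white] * 40
print(same_color(sample, [red] * 62 + [white] * 38))   # True
print(same_color(sample, [blue] * 60 + [white] * 40))  # False
```

In the second comparison the dominant color differs, so the candidate would be dropped from the detection results even if its binary code were close to that of the sample.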

In the above embodiment, color filtering is applied to the near-duplicate detection results so that results whose colors differ are excluded. This removes the influence of color on the detection results and further improves the accuracy of near-duplicate detection.

The near-duplicate image detection device of one embodiment of the invention is described below with reference to Fig. 4.

Fig. 4 shows a structural diagram of one embodiment of the near-duplicate image detection device of the invention. As shown in Fig. 4, the near-duplicate image detection device 40 of this embodiment includes:

A global feature determining module 402, configured to input each picture in the picture set to be detected into a deep learning network model and to output the global feature of each picture in the set;

A global feature quantization module 404, configured to quantize the global feature of each picture in the picture set to be detected into a binary bit string by means of a binary hash algorithm;

A near-duplicate determining module 406, configured to determine the near-duplicate pictures in the picture set to be detected according to the distances between the binary bit strings of the pictures in the set.

The above embodiment obtains the global feature of each picture in the picture set to be detected, quantizes the global feature of each picture into a binary bit string by means of the binary hash algorithm, and finally determines the near-duplicate pictures in the set according to the distances between the binary bit strings. Near-duplicate detection on large-scale data sets is thereby achieved accurately and quickly. In terms of scale, the invention supports near-duplicate detection on picture data sets of more than 100 million pictures and thus better meets users' needs.

Optionally, when near-duplicates within the picture set to be detected are being detected, the global feature determining module 402 includes:

A model input unit 4022, configured to input each picture in the picture set to be detected into the GoogLeNet network model.

A global feature determining unit 4024, configured to use the output of the average pooling layer of the GoogLeNet network model as the global feature of the input picture.

Optionally, the image detection device 40 further includes:

A binary hash algorithm training module 403, configured to train the binary hash algorithm using the global features of the pictures in the picture set to be detected.

Optionally, when near-duplicates of a sample picture are being detected in the picture set to be detected, the near-duplicate determining module 406 includes:

A clustering unit 4062, configured to cluster the pictures in the picture set to be detected according to their global features.

A distance calculation unit 4064, configured to calculate, within each cluster, the distances between the binary bit strings of the pictures.

A near-duplicate determining unit 4066, configured to determine pictures whose distance is less than a preset threshold to be near-duplicates.

Optionally, the near-duplicate determining module 406 includes:

A clustering unit 4062, configured to cluster the pictures in the picture set to be detected according to their global features;

A cluster center determining unit 4063, configured to determine the cluster center closest to the global feature of a sample picture;

A distance calculation unit 4065, configured to calculate the distance between the binary bit string of each picture in the cluster corresponding to that cluster center and the binary bit string of the sample picture;

A near-duplicate determining unit 4066, configured to determine pictures whose distance is less than a preset threshold to be near-duplicates of the sample picture.

Optionally, the near-duplicate determining module 406 includes:

A distance filtering unit 4067, configured to select, according to the distances between the binary bit strings of the pictures in the picture set to be detected, the pictures whose distance is less than a preset threshold;

A color filtering unit 4068, configured to apply color filtering to the pictures whose distance is less than the preset threshold and to determine pictures whose color similarity satisfies a preset value to be near-duplicates.

Optionally, the color filtering unit 4068 includes:

A color space conversion subunit, configured to convert the pictures whose distance meets the requirement from the RGB color space to the HSV color space;

A color quantization subunit, configured to quantize each pixel to a corresponding color according to its H, S and V values;

An information statistics subunit, configured to count the proportion of pixels of each color;

A near-duplicate determining subunit, configured to determine pictures for which the difference in the pixel proportions of the colors is less than a preset value to be near-duplicates.

Fig. 5 is a structural diagram of one embodiment of the near-duplicate image detection device of the present invention. As shown in Fig. 5, the near-duplicate image detection device 50 of this embodiment includes a memory 510 and a processor 520 coupled to the memory 510. The processor 520 is configured to perform, based on instructions stored in the memory 510, the image detection method of any one of the foregoing embodiments.

The memory 510 may include, for example, a system memory and a fixed non-volatile storage medium. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.

Fig. 6 is a structural diagram of another embodiment of the near-duplicate image detection device of the present invention. As shown in Fig. 6, the device 60 of this embodiment includes the memory 510 and the processor 520, and may further include an input/output interface 630, a network interface 640, a storage interface 650, and the like. The interfaces 630, 640, and 650, the memory 510, and the processor 520 may be connected, for example, via a bus 650. The input/output interface 630 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 640 provides a connection interface for various networked devices. The storage interface 650 provides a connection interface for external storage devices such as SD cards and USB flash drives.

The present invention further includes a computer-readable storage medium on which computer instructions are stored. When executed by a processor, the instructions implement the image detection method of any one of the foregoing embodiments.

Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The foregoing are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A near-duplicate image detection method, characterized by comprising:

Inputting the pictures in a picture set to be detected into a deep learning network model respectively, and outputting the global feature of each picture in the picture set to be detected;

Quantizing the global feature of each picture in the picture set to be detected into a binary bit string by a binary hash algorithm;

Determining the near-duplicate images in the picture set to be detected according to the distances between the binary bit strings of the pictures in the picture set to be detected.

2. The method of claim 1, characterized in that inputting the pictures in the picture set to be detected into the deep learning network model respectively and outputting the global feature of each picture in the picture set to be detected comprises:

Inputting the pictures in the picture set to be detected into a GoogleNET network model respectively;

Using the information output by the average pooling layer of the GoogleNET network model as the global feature of the input picture.

3. The method of claim 1, characterized in that the method further comprises:

Training the binary hash algorithm using the global features of the pictures in the picture set to be detected.

4. The method of claim 1, characterized in that determining the near-duplicate images in the picture set to be detected according to the distances between the binary bit strings of the pictures in the picture set to be detected comprises:

Clustering the pictures in the picture set to be detected according to their global features;

Determining the cluster center closest to the global feature of a sample picture;

Calculating the distance between the binary bit string of each picture in the cluster corresponding to the closest cluster center and the binary bit string of the sample picture;

Determining pictures whose distance is less than a predetermined threshold as near-duplicate images of the sample picture.

5. The method of claim 1, characterized in that determining the near-duplicate images in the picture set to be detected according to the distances between the binary bit strings of the pictures in the picture set to be detected comprises:

According to the distances between the binary bit strings of the pictures in the picture set to be detected, performing color filtering on the pictures whose distance is less than a predetermined threshold, and determining pictures whose color similarity meets a preset value as near-duplicate images.

6. The method of claim 5, characterized in that determining pictures whose color similarity meets the preset value as near-duplicate images comprises:

Converting the pictures whose distance meets the requirement from the RGB color space to the HSV color space;

Quantizing each pixel to a corresponding color according to its H, S, and V values;

Counting the proportion of pixels of each color;

Determining pictures for which the difference in the pixel proportion of each color is less than a preset value as near-duplicate images.

7. A near-duplicate image detection device, characterized by comprising:

A global feature determination module, configured to input the pictures in a picture set to be detected into a deep learning network model respectively and to output the global feature of each picture in the picture set to be detected;

A global feature quantization module, configured to quantize the global feature of each picture in the picture set to be detected into a binary bit string by a binary hash algorithm;

A near-duplicate image determination module, configured to determine the near-duplicate images in the picture set to be detected according to the distances between the binary bit strings of the pictures in the picture set to be detected.

8. The device of claim 7, characterized in that the global feature determination module comprises:

A model input unit, configured to input the pictures in the picture set to be detected into a GoogleNET network model respectively;

A global feature determination unit, configured to use the information output by the average pooling layer of the GoogleNET network model as the global feature of the input picture.

9. The device of claim 7, characterized in that the device further comprises:

A binary hash algorithm training module, configured to train the binary hash algorithm using the global features of the pictures in the picture set to be detected.

10. The device of claim 7, characterized in that the near-duplicate image determination module comprises:

A clustering unit, configured to cluster the pictures in the picture set to be detected according to their global features;

A cluster center determination unit, configured to determine the cluster center closest to the global feature of a sample picture;

A distance calculation unit, configured to calculate the distance between the binary bit string of each picture in the cluster corresponding to the closest cluster center and the binary bit string of the sample picture;

A near-duplicate image determination unit, configured to determine pictures whose distance is less than a predetermined threshold as near-duplicate images of the sample picture.

11. The device of claim 7, characterized in that the near-duplicate image determination module comprises:

A distance filtering unit, configured to select, according to the distances between the binary bit strings of the pictures in the picture set to be detected, the pictures whose distance is less than a predetermined threshold;

A color filtering unit, configured to perform color filtering on the pictures whose distance is less than the predetermined threshold, and to determine pictures whose color similarity meets a preset value as near-duplicate images.

12. The device of claim 11, characterized in that the color filtering unit comprises:

A color space conversion subunit, configured to convert the pictures whose distance meets the requirement from the RGB color space to the HSV color space;

A near-duplicate image determination subunit, configured to determine pictures for which the difference in the pixel proportion of each color is less than a preset value as near-duplicate images.

13. A near-duplicate image detection device, characterized by comprising:

A memory; and

A processor coupled to the memory, the processor being configured to perform, based on instructions stored in the memory, the image detection method of any one of claims 1 to 8.