CN102122389A - Method and device for judging image similarity - Google Patents

Method and device for judging image similarity

Info

Publication number
CN102122389A
Authority
CN
China
Prior art keywords
picture
commodity
color value
main color
pictures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010100022407A
Other languages
Chinese (zh)
Inventor
戴能
贾梦雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN2010100022407A
Publication of CN102122389A

Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for judging image similarity, which address the problem of inaccurate picture similarity judgment. The method comprises performing the following steps on each of two pictures subjected to similarity judgment to obtain the vector space of each picture: dividing the picture into a plurality of zones; determining the main color value of each zone as the mean of the color values of the pixels in that zone; determining the main color value of the whole picture as the mean of the color values of all pixels of the picture; and obtaining the vector space from the main color values of the zones and the main color value of the whole picture. The vector spaces corresponding to the two pictures are then compared to determine the similarity of the two pictures. Because each picture yields a vector space comprising a plurality of main color values, and these values are stable, picture similarity can be judged accurately.

Description

Method and device for judging image similarity
Technical field
The present application relates to the technical field of image processing, and in particular to a method and device for judging image similarity.
Background art
In current online shopping, classifying commodities into different sets is an important way of helping users shop. Commodities are classified by one attribute into several categories and then classified more finely within each subcategory by other attributes. Through successive classification a specific set of commodities is finally obtained, for example mobile phones that are black, made by NOKIA, and of model N73. Here 'black', 'NOKIA', 'N73', and even 'mobile phone' are each values of different attributes.
For some commodities the classification is fairly clear-cut. A mobile phone, for example, necessarily has a specific brand and a specific model. Once the attributes of the phone (such as brand, model, and style) have been registered, a program can classify the commodities automatically and assign them to the same or different sets.
For other commodities the classification is less clear-cut, and a commodity may fall into more than one category, for example tops, sweatshirts, and long-sleeved garments within clothing. Sometimes the attributes of these commodities cannot be fully registered, and the values of some attributes, such as color and pattern, cannot be determined. These difficulties directly prevent such commodities from being assigned to the same or different sets.
For this second case, one existing solution uses the picture information of the commodities: commodities that use similar pictures are aggregated, so that they can be assigned to the same or different sets. Every commodity has a picture, so the picture can serve as an attribute that is always defined. A major advantage of using the picture as an attribute is that a picture costs more to modify than text and is therefore more trustworthy. Information that is sufficient to distinguish different commodities and to judge similarity is extracted from the picture and used as the value of the attribute, so that different commodities can be compared with one another and classified. Using the picture as an attribute therefore first requires obtaining, from the picture, information that distinguishes different commodities and supports similarity comparison.
At present, because the content of a picture cannot be understood, a hash value of the picture is computed with the MD5 algorithm and used to represent the picture in comparisons. This has the following drawback: a hash value obtained with MD5 can only identify a picture uniquely. If a picture changes even slightly, even imperceptibly, a completely different hash value results. Similarity matching is therefore impossible: from the computed hash values, a picture that has changed a little cannot be told apart from one that has changed a lot. Thus, in the prior art, a commodity picture that has been changed slightly cannot be accurately identified when similarity is judged, so the similarity judgment of commodity pictures is not accurate enough; in addition, the amount of computation is large, so the similarity judgment is inefficient.
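The drawback of hash-based comparison described above can be illustrated with a minimal sketch. It uses Python's standard `hashlib` module; the byte strings standing in for picture data are hypothetical and are not taken from the application.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for picture data: 300 bytes of repeated RGB values.
original = bytes([0x10, 0x20, 0x30] * 100)

# Flip a single bit in the first byte to mimic an imperceptible change.
altered = bytes([original[0] ^ 1]) + original[1:]

hash_original = md5_hex(original)
hash_altered = md5_hex(altered)

# The two digests differ, and nothing in them says how small the change was.
print(hash_original)
print(hash_altered)
print(hash_original == hash_altered)
```

The digests only support an equal or not-equal test, which is why they cannot express a degree of similarity.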
Summary of the invention
To solve the prior-art problem that the similarity judgment of commodity pictures is not accurate enough, an embodiment of the present application provides a method for judging image similarity, comprising:
dividing an obtained commodity picture into a plurality of zones, and calculating the main color value of each zone and the main color value of the commodity picture as a whole, where a main color value is obtained by averaging the color values of the pixels in the zone or of all pixels of the whole commodity picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole commodity picture;
comparing the vector spaces corresponding to a plurality of commodity pictures, and determining that the compared commodity pictures are similar when the difference is within a certain threshold range.
An embodiment of the present application also provides a method for aggregating commodity information, comprising:
performing the following steps on each of two commodity pictures subjected to image similarity judgment, to obtain the vector space of each picture:
dividing the commodity picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in the zone, and determining the main color value of the whole commodity picture as the mean of the color values of all pixels of the commodity picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole commodity picture;
comparing the vector spaces corresponding to the two commodity pictures subjected to image similarity judgment, to determine the similarity of the two commodity pictures;
aggregating the commodity information of commodities that use similar commodity pictures into the same set.
An embodiment of the present application also provides an image search method, comprising:
a search engine server receiving a search request, sent by a client, in which a user queries a picture;
the search engine server dividing the picture to be queried into several zones, determining the main color value of each zone as the mean of the color values of the pixels in the zone, and determining the main color value of the whole picture to be queried as the mean of the color values of all pixels of the picture to be queried;
the search engine server obtaining a vector space from the main color value of the whole picture to be queried and the main color value of each zone;
the search engine server obtaining the main color value of each whole picture stored in a database and the main color value of each zone of that picture, and obtaining the corresponding vector space from them, where the number of zones of a picture in the database is the same as the number of zones of the picture to be queried;
the search engine server comparing the vector space of the picture to be queried with the vector spaces of the pictures in the database one by one, to determine the similarity of two pictures;
the search engine server sending to the client the pictures that the comparison found to be similar to the picture to be queried.
An embodiment of the present application also provides a device for judging image similarity, comprising:
a first computing module, configured to divide an obtained commodity picture into a plurality of zones and to calculate the main color value of each zone and the main color value of the commodity picture as a whole, where a main color value is obtained by averaging the color values of the pixels in the zone or of all pixels of the whole commodity picture;
a second computing module, configured to obtain a vector space from the main color values of the plurality of zones and the main color value of the whole commodity picture;
a comparing module, configured to compare the vector spaces corresponding to a plurality of commodity pictures and to determine that the compared commodity pictures are similar when the difference is within a certain threshold range.
An embodiment of the present application also provides a device for judging image similarity, comprising:
a computing module, configured to perform the following steps on each of two pictures subjected to picture similarity judgment, to obtain the vector space of each picture:
dividing the picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in the zone, and determining the main color value of the whole picture as the mean of the color values of all pixels of the picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole picture;
a comparing module, configured to compare the vector spaces corresponding to the two pictures subjected to picture similarity judgment, to determine the similarity of the two pictures.
As can be seen from the specific embodiments provided by the present application, a vector space comprising a plurality of main color values can be obtained from a picture. These values are relatively stable: small changes to the picture can be ignored, while the extent of a change can still be judged. On this basis the vector space can be matched for similarity against the vector spaces of other pictures, which makes the similarity judgment of commodity pictures more accurate. Moreover, because two vector spaces are compared using only the main color differences of the corresponding zones and the main color difference of the whole pictures as parameters, few parameters enter the computation, which speeds up the similarity judgment.
Brief description of the drawings
Fig. 1 is a system structure diagram of the first embodiment provided by the present application;
Fig. 2 is a method flowchart of the first embodiment provided by the present application;
Fig. 3 is a device structure diagram of the second embodiment provided by the present application;
Fig. 4 is a device structure diagram of the third embodiment provided by the present application.
Detailed description of embodiments
The first embodiment provided by the present application is a method for judging image similarity. The method is applied in the system shown in Fig. 1, which comprises a server 10 and a number of clients 20. The server 10 collects and organizes the commodity pictures that merchants upload through the clients, and performs similarity judgment on the obtained commodity pictures. A client 20 may be a mobile terminal, a computer, or the like.
The flow of the method, shown in Fig. 2, comprises:
Step 101: Merchant A opens the commodity upload form through a client 20, adds commodity picture 01 of commodity A to be uploaded, provides a link to commodity picture 02 of commodity A in the commodity description of the upload form, and uploads the commodity.
After the upload, commodity picture 01 and commodity picture 02 of commodity A are stored in the commodity picture library for use by subsequent steps.
Step 102: Merchant B opens the commodity upload form through a client 20, adds commodity picture 02' of commodity B to be uploaded, and uploads the commodity.
Likewise, after the upload, commodity picture 02' of commodity B is stored in the commodity picture library for use by subsequent steps.
Step 103: The server 10 obtains commodity picture 01, commodity picture 02, and commodity picture 02' from the commodity picture library.
The commodity picture library may be located in the server 10 or in a dedicated storage server 11. The storage server 11 may be connected to the server 10 over a network, so that the server 10 can conveniently obtain commodity pictures from the library.
Step 104: The server 10 divides commodity picture 01 into 9 zones, calculates the main color values of the 9 zones and the main color value of commodity picture 01 as a whole, and obtains a vector space from them. The vector spaces corresponding to commodity picture 02 and commodity picture 02' are obtained in turn in the same way.
Dividing commodity picture 01 into 9 zones is only the preferred scheme in this embodiment; picture 01 may equally be divided into 4 zones or 16 zones.
In this embodiment a color value is written in hexadecimal notation and is composed of red, green, and blue (RGB) values. Each component ranges from a minimum of 0 (hexadecimal #00) to a maximum of 255 (hexadecimal #FF). For example, the color value of a pure black pixel is #000000, and the color value of a pure white pixel is #FFFFFF. Commodity picture 01 in this embodiment contains 640*480 pixels and is divided into 9 zones of essentially equal size, so each zone contains approximately 34,000 pixels. Averaging the color values of all pixels in zone 1 gives the main color value of zone 1, #102030. To illustrate the averaging: two pixels with color values #111111 and #333333 average to #222222. The main color values of zones 2 to 9 are obtained in the same way; see Table 1, in which identifiers 1 to 9 denote zones 1 to 9 and identifier 0 denotes the commodity picture as a whole.
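The averaging of color values described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and averaging each of the red, green, and blue components separately with integer division is an assumption, since the text gives only the #111111 and #333333 example and does not say how rounding is handled.

```python
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def parse_hex(color: str) -> RGB:
    """Convert '#RRGGBB' into an (R, G, B) tuple of integers 0..255."""
    digits = color.lstrip("#")
    return (int(digits[0:2], 16), int(digits[2:4], 16), int(digits[4:6], 16))


def to_hex(rgb: RGB) -> str:
    """Convert an (R, G, B) tuple back into '#RRGGBB' notation."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def mean_color(colors: Iterable[str]) -> str:
    """Average hexadecimal color values component by component."""
    parsed = [parse_hex(c) for c in colors]
    count = len(parsed)
    # Integer division is an assumption; the source does not specify rounding.
    return to_hex(tuple(sum(p[ch] for p in parsed) // count for ch in range(3)))


print(mean_color(["#111111", "#333333"]))  # prints #222222
print(640 * 480 // 9)  # prints 34133, the approximate pixel count per zone
```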
Figure G2010100022407D00061
Table 1
A vector space r1 is obtained from the main color values in Table 1; the vector space r2 corresponding to commodity picture 02 and the vector space r2' corresponding to commodity picture 02' are obtained in the same way.
Step 105: The server 10 compares the vector spaces r1, r2, and r2' corresponding to commodity picture 01, commodity picture 02, and commodity picture 02', and determines that the compared pictures are similar when the difference is within a certain threshold Δ.
Vector space r1 is compared with vector space r2; the difference is less than or equal to the threshold Δ, so commodity picture 01 is determined to be similar to commodity picture 02. Vector space r2 is compared with vector space r2'; the difference is less than or equal to the threshold Δ, so commodity picture 02 is determined to be similar to commodity picture 02'. Vector space r1 is compared with vector space r2'; the difference is less than or equal to the threshold Δ, so commodity picture 01 is determined to be similar to commodity picture 02'. Conversely, if the difference X between vector space r1 and vector space r2 were greater than the threshold Δ, commodity picture 01 and commodity picture 02 would be determined to be dissimilar; likewise, if the difference between vector space r2 and vector space r2' were greater than the threshold Δ, commodity picture 02 and commodity picture 02' would be determined to be dissimilar. The following description assumes that commodity picture 02' is similar to commodity picture 02 and that commodity picture 02 is similar to commodity picture 01.
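As a minimal sketch of how such a vector space might be assembled, the following code divides a picture into a grid of zones, averages each zone, and appends the average of the whole picture. The representation of a picture as a list of rows of (R, G, B) tuples, the function names, the row-by-row numbering of zones, and the use of integer division are all assumptions made for illustration; the picture must be at least as large as the grid in each dimension.

```python
def main_color(pixels):
    """Mean (R, G, B) of a list of pixels, using integer division (assumed)."""
    count = len(pixels)
    return tuple(sum(p[ch] for p in pixels) // count for ch in range(3))


def feature_vector(image, grid=3):
    """Main colors of grid*grid zones (row by row), then of the whole picture."""
    height, width = len(image), len(image[0])
    vector = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            zone = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            vector.append(main_color(zone))
    # The last element is the main color of the picture as a whole.
    vector.append(main_color([p for row in image for p in row]))
    return vector


# Hypothetical 3x3 test picture whose red channel grows from 0 to 80.
demo = [[(10 * (3 * y + x), 0, 0) for x in range(3)] for y in range(3)]
r1 = feature_vector(demo)
print(len(r1))  # 9 zones plus the whole picture
print(r1[9])    # main color of the whole picture
```

With the default grid of 3 this yields the 10 main color values used in the embodiment; `grid=2` or `grid=4` would give the 4-zone and 16-zone variants.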
The difference between vector space r1 and vector space r2 is calculated and compared with the threshold Δ as follows. Let $r1 = (r1_{ID1}, r1_{ID2}, \dots, r1_{ID10})$ and $r2 = (r2_{ID1}, r2_{ID2}, \dots, r2_{ID10})$, where $r1_{ID1}$ to $r1_{ID9}$ denote the main color values of zones 1 to 9 of picture 01, $r1_{ID10}$ denotes the main color value of commodity picture 01 as a whole, $r2_{ID1}$ to $r2_{ID9}$ denote the main color values of zones 1 to 9 of picture 02, and $r2_{ID10}$ denotes the main color value of commodity picture 02 as a whole. The difference is
$X = [(r1_{ID1} - r2_{ID1})^2 + (r1_{ID2} - r2_{ID2})^2 + \dots + (r1_{ID10} - r2_{ID10})^2]^{1/2}$.
The difference is compared with the threshold Δ; because it is less than or equal to Δ, commodity picture 01 is determined to be similar to commodity picture 02. Comparing vector space r1 with vector space r2 by this calculation is the preferred scheme in this embodiment. Other methods may also be used to compare the two vector spaces and determine that commodity picture 01 is similar to commodity picture 02. For example, the difference X may also be expressed as
$X = (r1_{ID1} - r2_{ID1})^2 + (r1_{ID2} - r2_{ID2})^2 + \dots + (r1_{ID10} - r2_{ID10})^2$
or
$X = (r1_{ID1} - r2_{ID1})^4 + (r1_{ID2} - r2_{ID2})^4 + \dots + (r1_{ID10} - r2_{ID10})^4$
or
$X = |r1_{ID1} - r2_{ID1}| + |r1_{ID2} - r2_{ID2}| + \dots + |r1_{ID10} - r2_{ID10}|$.
Thus, given the main color values of the 9 zones and of picture 01 as a whole in vector space r1, and the main color values of the 9 zones and of picture 02 as a whole in vector space r2, any of several predetermined algorithms that take the main color differences of the corresponding zones and the main color difference of the whole pictures as parameters can be used to compare vector space r1 with vector space r2 and so determine that commodity picture 01 is similar to commodity picture 02. The algorithms above merely illustrate preferred embodiments of the technical solution of the present application and do not limit the application.
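The four expressions for the difference X can be sketched in code as follows. The text does not say how two color values are subtracted, so this sketch assumes that each main color is an (R, G, B) triple and that the differences are taken component by component over all 30 numbers; the function names, the sample vectors, and the threshold values are hypothetical.

```python
import math


def flatten(vector):
    """Turn a list of (R, G, B) main colors into one flat list of numbers."""
    return [channel for color in vector for channel in color]


def diff_euclidean(r1, r2):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(flatten(r1), flatten(r2))))


def diff_power(r1, r2, power):
    """Sum of differences raised to an even power (2 or 4 in the text)."""
    return sum((a - b) ** power for a, b in zip(flatten(r1), flatten(r2)))


def diff_abs(r1, r2):
    """Sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(flatten(r1), flatten(r2)))


def is_similar(r1, r2, threshold, diff=diff_euclidean):
    """Two pictures are judged similar when the difference is <= threshold."""
    return diff(r1, r2) <= threshold


# Hypothetical vectors: every main color differs by 3 in the blue channel.
r1 = [(16, 32, 48)] * 10
r2 = [(16, 32, 51)] * 10
print(diff_power(r1, r2, 2), diff_power(r1, r2, 4), diff_abs(r1, r2))
print(is_similar(r1, r2, threshold=10.0))
```

Each variant needs its own threshold, because the same pair of pictures produces differences on very different scales under the four expressions.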
The foregoing method applies to commodity pictures and equally to other pictures.
As the above description shows, a vector space comprising a plurality of main color values can be obtained from a picture. These values are relatively stable: small changes to the picture can be ignored, while the extent of a change can still be judged. A large number of experiments have shown that dividing a picture into 9 zones, calculating the main color value of each zone, and combining these with the main color value of the whole picture is sufficient to distinguish different commodity pictures, while eliminating the variations introduced by enlarging or shrinking a commodity picture or by adding a slight watermark. Because a main color is a stable, continuous value, the main colors of the 9 zones and of the whole picture can form a vector space, which can then be matched for similarity against the vector spaces of other pictures, making the similarity judgment of commodity pictures more accurate.
Further, on the basis of the above method, this embodiment also provides a method for aggregating commodity information: after the vector space is obtained from each commodity picture, the vector spaces are used for similarity matching, and the commodities corresponding to similar commodity pictures can be clustered into the same set.
To this end, the server 10 in this embodiment may also aggregate commodities that use similar commodity pictures into the same set.
In a specific implementation, commodity A is first obtained from the commodity picture library. Commodity A is a newly added commodity at this point, and it uses 2 commodity pictures: commodity picture 01 and commodity picture 02. Because no set exists yet, no commodity in any set uses commodity picture 01 or commodity picture 02, so a new set, set 1, is created. Commodity B is then obtained from the commodity picture library. Commodity B uses commodity picture 02'; because commodity picture 02' is similar to commodity picture 02, commodity B is added to set 1. Commodity C is then obtained from the commodity picture library as a newly added commodity; it uses commodity picture 03 and commodity picture 04. Commodity picture 03 and commodity picture 04 are both dissimilar to commodity picture 01, commodity picture 02, and commodity picture 02' used by commodity A and commodity B in set 1 (the similarity judgment is as described above and is not repeated here), so commodity C is added to a new set, set 2. Finally, commodity D is obtained from the commodity picture library as a newly added commodity; it uses commodity picture 03' and commodity picture 01'. Because commodity picture 01' is similar to commodity picture 01 and commodity picture 03' is similar to commodity picture 03, set 1 and set 2 are merged into set 3, and commodity picture 03' and commodity picture 01' used by commodity D are added to set 3. If commodity D is instead obtained from the commodity picture library before commodity C, only set 1 exists at that point, and the commodity pictures of the commodities in set 1 comprise commodity picture 01, commodity picture 02, and commodity picture 02'. Commodity picture 03' and commodity picture 01' of commodity D are then added to set 1, and commodity picture 03 is likewise treated as one of the commodity pictures of the commodities in set 1.
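The aggregation walk-through above can be sketched as follows. To keep the example short, each picture is represented by a hypothetical one-element vector and compared with a sum of absolute differences; the function names, the toy values, and the threshold of 5 are assumptions, and real vectors would hold the 10 main color values described earlier.

```python
def similar(v1, v2, threshold):
    """Toy similarity test: sum of absolute differences within a threshold."""
    return sum(abs(a - b) for a, b in zip(v1, v2)) <= threshold


def aggregate(commodities, threshold=5):
    """Group commodities that share at least one similar picture."""
    groups = []  # each entry: (commodity names, picture vectors)
    for name, vectors in commodities:
        # Indexes of existing sets that contain a picture similar to a new one.
        matched = [
            i for i, (_, known) in enumerate(groups)
            if any(similar(v, k, threshold) for v in vectors for k in known)
        ]
        # No match creates a new set; several matches merge those sets.
        names = [n for i in matched for n in groups[i][0]] + [name]
        merged = [k for i in matched for k in groups[i][1]] + list(vectors)
        groups = [g for i, g in enumerate(groups) if i not in matched]
        groups.append((names, merged))
    return [names for names, _ in groups]


# Hypothetical vectors for pictures 01, 02, 02', 03, 04, 03', 01'.
commodity_a = ("A", [[10], [50]])
commodity_b = ("B", [[52]])
commodity_c = ("C", [[100], [150]])
commodity_d = ("D", [[101], [11]])

print(aggregate([commodity_a, commodity_b, commodity_c]))
print(aggregate([commodity_a, commodity_b, commodity_c, commodity_d]))
```

After commodity C there are two sets; adding commodity D, whose pictures resemble one picture in each set, leaves a single merged set.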
Suppose User A has 5 commodities, A1 to A5, where commodity A1 uses commodity picture 11, commodity A2 uses commodity picture 12, commodity A3 uses commodity picture 13, commodity A4 uses commodity picture 14, and commodity A5 uses commodity picture 15, and commodity pictures 11 to 15 are similar to one another. Because User A's commodity count, 5, is not greater than the predetermined quantity 6, commodities A1 to A5 are aggregated into the same set on the basis of the similarity of commodity pictures 11 to 15. Similarly, if commodities A1 to A5 belong to 5 different user IDs, they are aggregated into the same set on the basis of the similarity of commodity pictures 11 to 15. However, in a different application system in which commodities A1 to A5 belong to User A and User A's commodity count, 5, is greater than the predetermined quantity 4, commodities A1 to A5 are not aggregated into the same set even though commodity pictures 11 to 15 are similar.
When commodities are aggregated, the picture that every commodity has is used as an attribute, so the attribute is always defined. A major advantage of using the picture as an attribute is that a picture costs more to modify than text. During aggregation, information that is sufficient to distinguish different commodities and to judge similarity, namely the vector space described above, is extracted from the picture and used as the value of the attribute. Different commodities can then be compared with one another and classified, and the classification result is more accurate.
The method of the present application for judging image similarity can be applied in different technical fields, such as spam picture filtering and picture search. When it is applied to spam picture filtering, a spam picture library may be set up in the server in advance. The library stores pictures that violate laws or social moral standards and any other pictures considered unsuitable for distribution on the internet, such as obscene pictures and violent pictures. When a user transmits picture information through a client, the server may scan and obtain the picture, compare it one by one with the pictures in the spam picture library in the server, and use the image similarity judging method of the present application to determine whether the picture is a spam picture. If it is a spam picture, its transmission is blocked.
When the image similarity judging method of the present application is applied to picture search, the server, on receiving a picture that the user wishes to search for, may compare it one by one with pictures stored in the server in advance or with pictures collected by crawler technology. The two vector spaces are compared using the main color differences of the corresponding zones of the two pictures and the main color difference of the whole pictures as parameters, to determine the similarity of the two pictures. All similar pictures are then sent to the user's client as search results. Because the method only calculates the main color difference of each zone, the amount of computation is small, which improves search efficiency while finding as many similar pictures as possible.
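The spam filtering use described above might look like the following sketch. It assumes that each picture has already been reduced to a vector of 10 main colors given as (R, G, B) triples, and that the root-of-squares difference is used; the library contents, the threshold, and the function names are hypothetical.

```python
import math


def difference(r1, r2):
    """Root of the summed squared differences over all color components."""
    return math.sqrt(
        sum((a - b) ** 2 for c1, c2 in zip(r1, r2) for a, b in zip(c1, c2))
    )


def is_spam(candidate, spam_library, threshold):
    """A picture is blocked if it is similar to any picture in the library."""
    return any(difference(candidate, known) <= threshold for known in spam_library)


# Hypothetical library of two known spam pictures, as 10-color vectors.
spam_library = [[(200, 30, 30)] * 10, [(20, 20, 20)] * 10]

slightly_altered = [(201, 30, 31)] * 10  # near the first library picture
unrelated = [(90, 160, 220)] * 10

THRESHOLD = 10.0  # assumed value; the application does not give one
print(is_spam(slightly_altered, spam_library, THRESHOLD))
print(is_spam(unrelated, spam_library, THRESHOLD))
```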
The specific steps of the method flow in this embodiment comprise:
Step 301: The search engine server receives a search request, sent by a client, in which a user queries for similar pictures.
A user who wishes to use the search engine server to find pictures identical or similar to a certain picture to be queried can specify that picture to the search engine client, which sends a picture search request to the search engine server. The picture to be searched for may be a picture that the user uploads to the client or a picture that the client obtains from the internet.
Step 302: The search engine server divides the picture to be queried into several zones, determines the main color value of each zone as the mean of the color values of the pixels in the zone, and determines the main color value of the whole picture to be queried as the mean of the color values of all pixels of the picture to be queried.
The search engine server divides the picture to be queried into N zones, keeping the zones as close to equal in size as possible. Here N is an integer greater than 1, for example 9, 4, or 16. Then, for each of the N zones, the server counts the pixels in the zone, obtains the color value of each pixel, and takes the mean of the color values of the pixels in the zone as the main color value of the zone. In this embodiment a color value is written in hexadecimal notation and is composed of red, green, and blue (RGB) values. Each component ranges from a minimum of 0 (hexadecimal #00) to a maximum of 255 (hexadecimal #FF). For example, the color value of a pure black pixel is #000000, and the color value of a pure white pixel is #FFFFFF. Taking a picture to be queried that contains 640*480 pixels as an example, the picture is divided into 9 zones of essentially equal size, so each zone contains approximately 34,000 pixels. Averaging the color values of all pixels in zone 1 gives the main color value of zone 1, #102030. To illustrate the averaging: two pixels with color values #111111 and #333333 average to #222222. The main color values of zones 2 to 9 are obtained in the same way; see Table 2, in which identifiers 1 to 9 denote zones 1 to 9 and identifier 0 denotes the picture to be queried as a whole.
Figure G2010100022407D00111
Table 2
Step 303: The search engine server obtains a vector space from the main color value of the whole picture to be queried and the main color value of each zone.
A vector space r1 is obtained from the main color values in Table 2: $r1 = (r1_{ID1}, r1_{ID2}, r1_{ID3}, r1_{ID4}, r1_{ID5}, r1_{ID6}, r1_{ID7}, r1_{ID8}, r1_{ID9}, r1_{ID10})$, where $r1_{ID1}$ to $r1_{ID9}$ denote the main color values of zones 1 to 9 of picture 01, and $r1_{ID10}$ denotes the main color value of picture 01, the picture to be queried, as a whole.
Step 304: The search engine server obtains the main color values of the whole pictures stored in the database and the main color value of each zone of each of those pictures, and obtains the corresponding vector spaces from the main color value of each whole picture and the main color values of its zones.
The pictures stored in the database may be a large number of pictures that the search engine server has collected from the internet using crawler technology, or pictures uploaded by users and stored by a shopping website itself; the embodiments of the present application place no limitation on the source of the pictures. The search engine server may divide all pictures in the database into zones in advance. In this embodiment each picture may be divided into 9 zones of essentially equal size, and the main color value of each zone and of each picture as a whole is calculated. The search engine server may build an index table over the picture identifiers, the identifiers of the zones in each picture, and the main color values; the structure of the index table may be as shown in Table 3. In each identifier in the table, the first digit from the right denotes the zone within the picture, with 1 to 9 denoting zones 1 to 9 and 0 denoting the picture as a whole, and the first three digits from the left denote the picture, for example picture identifiers 001 and 002. Alternatively, the main color values of the pictures stored in the database may be determined by the method of step 302 only after a query request is received.
Table 3 (index table of identifiers and main color values; reproduced in the original publication as image G2010100022407D00121)
The vector space corresponding to each picture can be obtained from the main color values in Table 3. Taking picture identifier 001 as an example, its corresponding vector space is r2, $r2 = (r2_{ID1}, r2_{ID2}, r2_{ID3}, r2_{ID4}, r2_{ID5}, r2_{ID6}, r2_{ID7}, r2_{ID8}, r2_{ID9}, r2_{ID10})$, where $r2_{ID1}$ to $r2_{ID9}$ denote the main color values of zone 1 to zone 9 of picture 001 (the main color values of identifiers 0011, 0012, 0013, 0014, 0015, 0016, 0017, 0018 and 0019), and $r2_{ID10}$ denotes the main color value of picture 001 as a whole (the main color value of identifier 0010).
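The lookup from the index table to a vector can be sketched as below. The identifier layout (three-digit picture identifier followed by one zone digit, with 0 for the whole picture) follows the description; the dictionary representation, the function name, and the numeric values are assumptions made purely for illustration.

```python
def vector_from_index(index, picture_id, zones=9):
    """Assemble (zone 1..zones, whole picture) from an identifier -> value table.

    `index` maps four-character identifiers such as "0011" to main color
    values; the trailing digit 0 marks the whole-picture entry.
    """
    zone_values = [index[f"{picture_id}{z}"] for z in range(1, zones + 1)]
    whole_value = index[f"{picture_id}0"]
    return zone_values + [whole_value]


# Hypothetical index rows; these values are made up, not taken from Table 3.
index = {"0010": 120.0}
index.update({f"001{z}": 100.0 + 5 * z for z in range(1, 10)})

r2 = vector_from_index(index, "001")
```

Precomputing the index in this way means that a query only needs to compute the vector of the picture to be queried; the stored vectors are read back rather than recalculated.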
Step 305: The search engine server compares the vector space of the picture to be queried with the vector space of each picture in the database one by one, and determines the similarity of each pair of pictures.
This embodiment is described using the vector spaces of the picture to be queried and of picture identifier 001 as an example. Vector space r1 is compared with vector space r2; if the difference X is less than or equal to the threshold Δ, the picture to be queried is determined to be similar to picture 001. Conversely, if the difference X is greater than the threshold Δ, the picture to be queried is determined to be dissimilar to picture 001.
The difference between vector space r1 and vector space r2 is calculated and compared with the threshold Δ as follows. The difference is
$X = \left[ (r1_{ID1} - r2_{ID1})^2 + (r1_{ID2} - r2_{ID2})^2 + \ldots + (r1_{ID10} - r2_{ID10})^2 \right]^{1/2}$,
and X is compared with the threshold Δ; when X is less than or equal to Δ, the picture to be queried is determined to be similar to picture 001. Comparing vector space r1 with vector space r2 by the above calculation is merely the preferred scheme of this embodiment. Other methods may also be adopted, and the difference X may instead be expressed as
$X = (r1_{ID1} - r2_{ID1})^2 + (r1_{ID2} - r2_{ID2})^2 + \ldots + (r1_{ID10} - r2_{ID10})^2$, or
$X = (r1_{ID1} - r2_{ID1})^4 + (r1_{ID2} - r2_{ID2})^4 + \ldots + (r1_{ID10} - r2_{ID10})^4$, or
$X = |r1_{ID1} - r2_{ID1}| + |r1_{ID2} - r2_{ID2}| + \ldots + |r1_{ID10} - r2_{ID10}|$.
It can be seen that, given the main color values of the 9 zones and of the whole picture in vector space r1 and in vector space r2, and taking the main color differences of the corresponding zones and the main color difference of the whole pictures as parameters, a variety of predetermined algorithms can be used to compare vector space r1 with vector space r2 and thereby determine whether the picture to be queried is similar to a picture in the database. The algorithms above are given in this embodiment only to illustrate preferred embodiments of the technical solution of the present application and do not limit the application.
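The four difference measures and the threshold test can be written out as a short sketch. The function names are hypothetical, and the threshold value is left to the caller because the source does not give one; note that a threshold suitable for one measure is generally not suitable for another, since the measures are on different scales.

```python
import math


def euclidean(r1, r2):
    """X = [sum of squared differences]^(1/2), the preferred scheme."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))


def sum_of_squares(r1, r2):
    """X = sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(r1, r2))


def sum_of_fourth_powers(r1, r2):
    """X = sum of fourth powers of the differences."""
    return sum((a - b) ** 4 for a, b in zip(r1, r2))


def sum_of_absolute(r1, r2):
    """X = sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(r1, r2))


def is_similar(r1, r2, threshold, difference=euclidean):
    """Two pictures are judged similar when X <= threshold.

    Both vectors are assumed to have the same length; zip would silently
    ignore surplus elements of the longer one.
    """
    return difference(r1, r2) <= threshold
```

For example, with the two-element toy vectors `[0, 0]` and `[3, 4]` the four measures give 5.0, 25, 337 and 7 respectively.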
Step 306: The search engine server sends the pictures that the comparison has found to be similar to the picture to be queried to the client.
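Steps 305 and 306 together amount to a linear scan over the stored vectors. The sketch below assumes the stored vectors are held in a dictionary keyed by picture identifier and uses the Euclidean difference; the names, the data, and the threshold are illustrative assumptions only.

```python
import math


def difference(r1, r2):
    """Euclidean difference X between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))


def find_similar(query_vector, stored_vectors, threshold):
    """Compare the query vector with every stored vector in turn and
    return the identifiers of the pictures whose difference X is <= threshold."""
    return [
        picture_id
        for picture_id, vector in stored_vectors.items()
        if difference(query_vector, vector) <= threshold
    ]


# Made-up vectors for three stored pictures and one picture to be queried.
stored = {"001": [100.0] * 10, "002": [180.0] * 10, "003": [101.0] * 10}
query = [100.5] * 10
matches = find_similar(query, stored, threshold=5.0)
```

The list of matching identifiers is what the server would then resolve to pictures and send back to the client.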
As the above description shows, a vector space comprising a plurality of main color values can be obtained from a picture. These values are relatively stable: small changes to the picture can be ignored, and the extent to which a picture has changed can also be judged. A large number of experiments showed that dividing a picture into 9 zones, calculating the main color value of each zone, and combining these with the main color value of the whole picture is sufficient to distinguish different commodity pictures, and can eliminate the variation introduced by enlarging or shrinking a commodity picture or by a slight watermark. Moreover, because the main color is a stable, continuous value, the main colors of the 9 zones and of the whole picture can form a vector space, on the basis of which similarity matching against the vector spaces of other pictures can be performed, so that the similarity judgement of commodity pictures is comparatively accurate.
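The stability under enlargement can be checked on a toy case. The sketch below is an illustration under stated assumptions, not a reproduction of the experiments mentioned above: the picture is a 6 x 6 grid of made-up scalar values, enlargement is done by plain pixel replication with an integer factor, and all names are hypothetical.

```python
from statistics import fmean


def spans(size, parts):
    """Split range(size) into `parts` contiguous, nearly equal spans."""
    return [(i * size // parts, (i + 1) * size // parts) for i in range(parts)]


def main_color_vector(pixels, grid=3):
    """Zone means (row by row) followed by the whole-picture mean."""
    height, width = len(pixels), len(pixels[0])
    vector = [
        fmean(pixels[y][x] for y in range(top, bottom) for x in range(left, right))
        for top, bottom in spans(height, grid)
        for left, right in spans(width, grid)
    ]
    vector.append(fmean(v for row in pixels for v in row))
    return vector


def upscale(pixels, factor):
    """Enlarge a picture by repeating every pixel `factor` times in x and y."""
    return [
        [v for v in row for _ in range(factor)]
        for row in pixels
        for _ in range(factor)
    ]


small = [[(7 * x + 13 * y) % 256 for x in range(6)] for y in range(6)]
large = upscale(small, 2)

# Largest element-wise change of the vector caused by the enlargement.
drift = max(
    abs(a - b) for a, b in zip(main_color_vector(small), main_color_vector(large))
)
```

In this idealized case every zone of the enlarged picture contains the same values as the corresponding zone of the original, each repeated four times, so the means coincide. With interpolating resizers or non-integer factors the vectors would be expected to agree only approximately, which is why a threshold is used in the comparison.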
The second embodiment provided by the present application is a device for judging image similarity. The structure of the device is shown in Figure 3 and comprises:
A computing module 201, configured to perform the following steps on each of two pictures subjected to picture similarity judgement, to obtain the vector space of each picture:
dividing the picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, and determining the main color value of the whole picture as the mean of the color values of all pixels of the whole picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole picture;
A comparing module 202, configured to compare the vector spaces corresponding to the two pictures subjected to picture similarity judgement, and to determine the similarity of the two pictures.
Further, the computing module 201 is specifically configured to divide the obtained commodity picture into nine zones.
Further, the computing module 201 is specifically configured to compare the two vector spaces, according to the main color values of the plurality of zones and the main color value of the whole picture in the respective vector spaces of the two pictures subjected to picture similarity judgement, taking the main color differences of the corresponding zones and the main color difference of the whole pictures as parameters, and to determine the similarity of the two pictures.
The third embodiment provided by the present application is a device for aggregating commodity information. The structure of the device is shown in Figure 4 and comprises:
A computing module 201, configured to perform the following steps on each of two pictures subjected to picture similarity judgement, to obtain the vector space of each picture:
dividing the picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, and determining the main color value of the whole picture as the mean of the color values of all pixels of the whole picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole picture;
A comparing module 202, configured to compare the vector spaces corresponding to the two pictures subjected to picture similarity judgement, and to determine the similarity of the two pictures.
An aggregation module 203, configured to aggregate commodities that use similar commodity pictures into the same set.
Further, the device also comprises:
A comparison module 204, configured to compare the commodity pictures of a newly added commodity with the commodity pictures of the commodities in the existing sets;
The aggregation module 203 is specifically configured as follows. If all commodity pictures of the newly added commodity are dissimilar to the commodity pictures of the commodities in the existing sets, the commodity is added to a new set. If all commodity pictures of the newly added commodity are similar to commodity pictures of commodities in one existing set, the newly added commodity is added to that existing set. If some of the commodity pictures of the newly added commodity are similar to commodity pictures of commodities in one existing set, and the remaining commodity pictures are dissimilar to the commodity pictures in all other existing sets, the newly added commodity is added to that set, and the remaining commodity pictures of the newly added commodity become part of the commodity pictures of the commodities in that existing set. If the commodity pictures of the newly added commodity are respectively similar to commodity pictures in several sets, those several sets are merged into one set.
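The four cases handled by the aggregation module can be sketched as one routine. The representation of a set as a dictionary with `items` and `pictures` lists, the function names, and the use of the Euclidean difference are assumptions for illustration. Adding the new commodity to the merged set in the last case is also an assumption: the text only states that the sets are merged.

```python
import math


def similar(v1, v2, threshold):
    """Two picture vectors are similar when their Euclidean difference is <= threshold."""
    return math.dist(v1, v2) <= threshold


def aggregate(sets, item_id, pictures, threshold):
    """Place a newly added commodity into `sets` and return the set it joined.

    `sets` is a list of dicts {"items": [...], "pictures": [...]};
    `pictures` holds the picture vectors of the new commodity.
    """
    matched = [
        s for s in sets
        if any(similar(p, q, threshold) for p in pictures for q in s["pictures"])
    ]
    if matched:
        # One match: join that set. Several matches: merge them into the first.
        target = matched[0]
        absorbed = matched[1:]
        for other in absorbed:
            target["items"].extend(other["items"])
            target["pictures"].extend(other["pictures"])
        sets[:] = [s for s in sets if not any(s is o for o in absorbed)]
    else:
        # No similar picture anywhere: open a new set.
        target = {"items": [], "pictures": []}
        sets.append(target)
    # Pictures not yet represented in the set become part of its pictures.
    unseen = [
        p for p in pictures
        if not any(similar(p, q, threshold) for q in target["pictures"])
    ]
    target["items"].append(item_id)
    target["pictures"].extend(unseen)
    return target
```

Because a set keeps the pictures of every commodity it contains, a later commodity can match a set through a picture that was contributed by any earlier member.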
Further, the aggregation module 203 is also configured to aggregate commodities that use similar commodity pictures into the same set when the commodities using similar commodity pictures belong to different user identifiers, or when the number of commodities using similar commodity pictures is not greater than a predetermined number and the commodities belong to the same user identifier.
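One possible reading of this condition is sketched below. The interpretation is an assumption: commodities owned by more than one user are always eligible for aggregation, while commodities that all belong to a single user are eligible only if their number does not exceed the predetermined number. The function and parameter names are hypothetical.

```python
def may_aggregate(user_ids, max_same_user):
    """Decide whether commodities sharing a similar picture may be aggregated.

    `user_ids` lists the owner of each such commodity; `max_same_user`
    is the predetermined number applied when all owners are the same.
    """
    if len(set(user_ids)) > 1:
        # The commodities belong to different user identifiers.
        return True
    # All commodities belong to the same user identifier.
    return len(user_ids) <= max_same_user
```

Under this reading, `may_aggregate(["u1", "u2"], 3)` allows aggregation, while four commodities from the same user with a limit of three do not.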
For convenience of description, the above device is described as being divided into various modules according to function. Of course, when the present application is implemented, the functions of the modules may be realized in one or more pieces of software and/or hardware.
Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to include them.

Claims (15)

1. A method for judging image similarity, characterized by comprising:
performing the following steps on each of two pictures subjected to picture similarity judgement, to obtain the vector space of each picture:
dividing the picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, and determining the main color value of the whole picture as the mean of the color values of all pixels of the whole picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole picture;
comparing the vector spaces corresponding to the two pictures subjected to picture similarity judgement, and determining the similarity of the two pictures.
2. The method of claim 1, characterized in that the obtained picture is divided into nine zones.
3. The method of claim 1, characterized in that, according to the main color values of the plurality of zones and the main color value of the whole picture in the respective vector spaces of the two pictures subjected to picture similarity judgement, the two vector spaces are compared taking the main color differences of the corresponding zones and the main color difference of the whole pictures as parameters, and the similarity of the two pictures is determined.
4. The method of claim 3, characterized in that the formula for comparing the two vector spaces, taking the main color differences of the corresponding zones and the main color difference of the whole pictures as parameters, specifically comprises:
$X = \left[ (r1_{ID1} - r2_{ID1})^2 + (r1_{ID2} - r2_{ID2})^2 + \ldots + (r1_{ID10} - r2_{ID10})^2 \right]^{1/2}$, or
$X = (r1_{ID1} - r2_{ID1})^2 + (r1_{ID2} - r2_{ID2})^2 + \ldots + (r1_{ID10} - r2_{ID10})^2$, or
$X = (r1_{ID1} - r2_{ID1})^4 + (r1_{ID2} - r2_{ID2})^4 + \ldots + (r1_{ID10} - r2_{ID10})^4$, or
$X = |r1_{ID1} - r2_{ID1}| + |r1_{ID2} - r2_{ID2}| + \ldots + |r1_{ID10} - r2_{ID10}|$,
where X is the difference; $r1_{ID1}$ to $r1_{ID9}$ denote the main color values of the respective zones of the first picture subjected to picture similarity judgement, and $r1_{ID10}$ denotes the main color value of that picture as a whole; $r2_{ID1}$ to $r2_{ID9}$ denote the main color values of the respective zones of the second picture subjected to picture similarity judgement, and $r2_{ID10}$ denotes the main color value of that picture as a whole.
5. A method for aggregating commodity information, characterized by comprising:
performing the following steps on each of two commodity pictures subjected to image similarity judgement, to obtain the vector space of each picture:
dividing the commodity picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, and determining the main color value of the whole commodity picture as the mean of the color values of all pixels of the whole commodity picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole commodity picture;
comparing the vector spaces corresponding to the two commodity pictures subjected to image similarity judgement, and determining the similarity of the two commodity pictures;
aggregating the commodity information of commodities that use similar commodity pictures into the same set.
6. The method of claim 5, characterized by further comprising:
comparing the commodity pictures of a newly added commodity with the commodity pictures of the commodities in the existing sets;
wherein the aggregation of commodity information is specifically:
if all commodity pictures of the newly added commodity are dissimilar to the commodity pictures of the commodities in the existing sets, adding the commodity information of the newly added commodity to a new set;
if all commodity pictures of the newly added commodity are similar to commodity pictures of commodities in one existing set, adding the commodity information of the newly added commodity to that existing set;
if some of the commodity pictures of the newly added commodity are similar to commodity pictures of commodities in one existing set, and the remaining commodity pictures are dissimilar to the commodity pictures in all other existing sets, adding the commodity information of the newly added commodity to that set, and making the remaining commodity pictures of the newly added commodity part of the commodity pictures of the commodities in that existing set;
if the commodity pictures of the newly added commodity are respectively similar to commodity pictures in several sets, merging those several sets into one set.
7. The method of claim 5, characterized in that the commodities using similar commodity pictures belong to different user identifiers, or the number of commodities using similar commodity pictures is not greater than a predetermined number and the commodities belong to the same user identifier.
8. An image search method, characterized by comprising:
a search engine server receiving a search request, sent by a client, in which a user queries a picture;
the search engine server dividing the picture to be queried into several zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, and determining the main color value of the whole picture to be queried as the mean of the color values of all pixels of the whole picture to be queried;
the search engine server obtaining a vector space from the main color value of the whole picture to be queried and the main color values of the zones;
the search engine server obtaining the main color values of a plurality of whole pictures stored in a database and the main color values of the zones of the pictures in the database, and obtaining the corresponding vector spaces from the main color values of the whole pictures in the database and the main color values of the zones, the number of zones of a picture in the database being the same as the number of zones of the picture to be queried;
the search engine server comparing the vector space of the picture to be queried with the vector spaces of the pictures in the database one by one, and determining the similarity of each pair of pictures;
the search engine server sending the pictures that the comparison has found to be similar to the picture to be queried to the client.
9. The method of claim 8, characterized in that the search engine server obtaining the main color values of the whole pictures stored in the database and the main color values of the zones of the pictures in the database specifically comprises:
dividing a picture in the database into several zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, determining the main color value of the whole picture to be queried as the mean of the color values of all pixels of the whole picture to be queried, and building an index table from the main color values and the corresponding picture identifiers;
the search engine server obtaining the corresponding main color values from the index table.
10. A device for judging image similarity, characterized by comprising:
a computing module, configured to perform the following steps on each of two pictures subjected to picture similarity judgement, to obtain the vector space of each picture:
dividing the picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, and determining the main color value of the whole picture as the mean of the color values of all pixels of the whole picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole picture;
a comparing module, configured to compare the vector spaces corresponding to the two pictures subjected to picture similarity judgement, and to determine the similarity of the two pictures.
11. The device of claim 10, characterized in that the computing module is specifically configured to divide the obtained picture into nine zones.
12. The device of claim 10, characterized in that the computing module is specifically configured to compare the two vector spaces, according to the main color values of the plurality of zones and the main color value of the whole picture in the respective vector spaces of the two pictures subjected to picture similarity judgement, taking the main color differences of the corresponding zones and the main color difference of the whole pictures as parameters, and to determine the similarity of the two pictures.
13. A device for aggregating commodity information, characterized by comprising: a computing module, configured to perform the following steps on each of two pictures subjected to picture similarity judgement, to obtain the vector space of each picture:
dividing the picture into a plurality of zones, determining the main color value of each zone as the mean of the color values of the pixels in that zone, and determining the main color value of the whole picture as the mean of the color values of all pixels of the whole picture;
obtaining a vector space from the main color values of the plurality of zones and the main color value of the whole picture;
a comparing module, configured to compare the vector spaces corresponding to the two pictures subjected to picture similarity judgement, and to determine the similarity of the two pictures;
an aggregation module, configured to aggregate commodities that use similar commodity pictures into the same set.
14. The device of claim 13, characterized by further comprising:
a comparison module, configured to compare the commodity pictures of a newly added commodity with the commodity pictures of the commodities in the existing sets;
wherein the aggregation module is specifically configured to: if all commodity pictures of the newly added commodity are dissimilar to the commodity pictures of the commodities in the existing sets, add the commodity to a new set; if all commodity pictures of the newly added commodity are similar to commodity pictures of commodities in one existing set, add the newly added commodity to that existing set; if some of the commodity pictures of the newly added commodity are similar to commodity pictures of commodities in one existing set, and the remaining commodity pictures are dissimilar to the commodity pictures in all other existing sets, add the newly added commodity to that set and make the remaining commodity pictures of the newly added commodity part of the commodity pictures of the commodities in that existing set; and if the commodity pictures of the newly added commodity are respectively similar to commodity pictures in several sets, merge those several sets into one set.
15. The device of claim 13, characterized in that the aggregation module is also configured to aggregate commodities that use similar commodity pictures into the same set when the commodities using similar commodity pictures belong to different user identifiers, or when the number of commodities using similar commodity pictures is not greater than a predetermined number and the commodities belong to the same user identifier.
CN2010100022407A 2010-01-12 2010-01-12 Method and device for judging image similarity Pending CN102122389A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010100022407A CN102122389A (en) 2010-01-12 2010-01-12 Method and device for judging image similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010100022407A CN102122389A (en) 2010-01-12 2010-01-12 Method and device for judging image similarity

Publications (1)

Publication Number Publication Date
CN102122389A true CN102122389A (en) 2011-07-13

Family

ID=44250940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010100022407A Pending CN102122389A (en) 2010-01-12 2010-01-12 Method and device for judging image similarity

Country Status (1)

Country Link
CN (1) CN102122389A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1159833

Country of ref document: HK

C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110713

REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1159833

Country of ref document: HK