CN105989043A - Method and device for automatically acquiring trademark in commodity image and searching trademark - Google Patents


Info

Publication number
CN105989043A
Authority
CN
China
Prior art keywords
commodity image
point
weight
trade mark
candidate feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510059267.2A
Other languages
Chinese (zh)
Inventor
薛晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510059267.2A
Publication of CN105989043A
Legal status: Pending


Abstract

The invention relates to search technology and discloses a method and a device for automatically acquiring a trademark in a commodity image and for retrieving the trademark. The method for automatically acquiring the trademark comprises: extracting local features from the commodity images to obtain multiple feature points; performing the following on each commodity image: for each current feature point, selecting candidate feature points from the feature points of each of the other commodity images, and calculating the weight of the current feature point according to the distance ranking of the candidate feature points relative to the current feature point and whether the commodity image corresponding to each candidate feature point belongs to the same trademark sample category as the commodity image; and taking the region in which the weights of the feature points exceed a preset threshold as the trademark sample region. The method can automatically acquire the trademark sample regions of a large number of commodity images, and thus the trademark sample images, without manual annotation. During retrieval, the weights of the feature points closest in the feature space to the feature points of the commodity image to be retrieved are used to vote on the trademark sample category of that image, which improves retrieval accuracy.

Description

Method and device for automatically acquiring a trademark in a commodity image and retrieving the trademark
Technical field
The present invention relates to the field of search, and in particular to a method and a device for automatically acquiring a trademark in a commodity image and retrieving the trademark.
Background art
Trademark retrieval and recognition systems are widely applied, for example in intellectual property protection of trademarks, analysis of brand trademark exposure, and commodity search based on trademark images.
The general flow of trademark recognition in the prior art consists of collecting trademark sample images, building a feature index, and retrieving an input image. Such a system is described in detail below; it generally includes the following steps:
1) Collection of trademark sample images: the trademark region is extracted from commodity images, mainly by hand, to obtain the trademark sample images. Assume the training set contains S trademark sample images belonging to T trademark categories; for the trademark sample images $\{x_i,\ i = 1, 2, \dots, S\}$ in the training set, the corresponding trademark categories are $\{y_i,\ i = 1, 2, \dots, S\}$. A trademark sample image is characterized by the trademark region occupying the main body of the picture, with little or no other region of the commodity included.
2) Local feature extraction from the sample images: the SIFT feature extraction algorithm is generally used. The feature points extracted from the i-th image are indexed by $j = 1, 2, \dots, N_i$, where $N_i$ denotes the number of feature points extracted from the i-th image.
3) Index construction: over the feature space formed by the features extracted from the S sample images, an index is built with a tree data structure (such as a kd-tree), so that the indexed feature point most similar to an input feature (nearest in Euclidean distance) can later be retrieved quickly.
4) Retrieval: SIFT features are extracted from the input image (a commodity image); for each extracted SIFT feature point, the K closest feature points in the feature space are found, and the vote of the trademark sample category corresponding to each of these K feature points is incremented by one.
5) Result output: after every feature point of the input image to be retrieved has voted once as in the step above, the final score of each trademark sample category is tallied, and the highest-scoring trademark sample category is output as the matching result for the input image.
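The prior-art voting flow in steps 4) and 5) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the toy 2-dimensional descriptors and the brute-force neighbour search (standing in for the tree index of step 3) are all assumptions made for the example.

```python
import math
from collections import Counter

def brute_force_neighbours(query, indexed_points, k):
    """Return the k indexed (vector, category) entries closest to `query` (Euclidean)."""
    return sorted(indexed_points, key=lambda entry: math.dist(query, entry[0]))[:k]

def classify_unweighted(query_features, indexed_points, k=1):
    """Prior-art vote: every neighbour adds exactly one vote to its trademark category."""
    votes = Counter()
    for feature in query_features:
        for _, category in brute_force_neighbours(feature, indexed_points, k):
            votes[category] += 1
    return votes.most_common(1)[0][0], dict(votes)

# Hypothetical toy index: 2-D vectors stand in for 128-D SIFT descriptors.
index = [((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((5.0, 5.0), "B")]
best, votes = classify_unweighted([(0.0, 0.1), (4.9, 5.0), (0.2, 0.0)], index, k=1)
print(best, votes)
```

Every matched feature point counts equally here, which is exactly the behaviour the second shortcoming listed next objects to.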
The shortcomings of the above technical scheme are:
1. The number of trademark sample images directly determines the performance of the final system, yet collecting these images requires a large amount of manual annotation; and as the brands in the training set (and new trademark sample images) keep increasing, the maintenance cost rises accordingly.
2. In the traditional scheme, every feature point in the feature space contributes equally to the category vote, an assumption that does not hold in practical applications.
Summary of the invention
The object of the present invention is to provide a method and a device for automatically acquiring a trademark in a commodity image and retrieving the trademark, which can acquire the trademark sample region in a commodity image automatically, without a large amount of manual annotation, and can improve retrieval accuracy by using the weights of the different feature points in the trademark sample region.
To solve the above technical problem, embodiments of the present invention disclose a method for automatically acquiring a trademark in a commodity image, comprising the following steps:
Performing local feature extraction on the commodity images in the training set to obtain multiple feature points of each commodity image;
Repeating the following steps for each commodity image:
Taking one feature point of the commodity image as the current feature point; from the feature points of each of the remaining commodity images in the training set, selecting the one with the minimum distance to the current feature point as a candidate feature point; and, according to the distance ranking of each candidate feature point relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample belonging to the same trademark sample category as the commodity image, calculating the weight of the current feature point for effectively identifying the trademark in the commodity image;
After the weights of all feature points of the commodity image have been calculated, selecting the region corresponding to the feature points whose weight is greater than a predetermined threshold as the trademark sample image in the commodity image.
Embodiments of the present invention also disclose a method for retrieving a trademark in a commodity image, in which the feature points within the trademark sample regions of all commodity images in the training set whose weight is greater than the predetermined threshold constitute a feature space, the method comprising the following steps:
Performing local feature extraction on the input commodity image to be retrieved to obtain multiple feature points;
Taking each extracted feature point in turn as the current feature point, finding in the feature space the K feature points closest to the current feature point, and adding the weight of each of the K feature points to the vote score of the trademark sample category to which its corresponding commodity image belongs;
Accumulating the vote score of each trademark sample category, and taking the trademark of the highest-scoring trademark sample category as the retrieval result for the commodity image to be retrieved.
Embodiments of the present invention also disclose a device for automatically acquiring a trademark in a commodity image, comprising the following modules:
A feature extraction module, configured to perform local feature extraction on the commodity images in the training set to obtain multiple feature points of each commodity image;
A feature point weight calculation module, configured to repeat the following operation for each commodity image:
Taking one feature point of the commodity image as the current feature point; from the feature points of each of the remaining commodity images in the training set, selecting the one with the minimum distance to the current feature point as a candidate feature point; and, according to the distance ranking of each candidate feature point relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample belonging to the same trademark sample category as the commodity image, calculating the weight of the current feature point for effectively identifying the trademark in the commodity image;
A trademark region selection module, configured to select, after the weights of all feature points of the commodity image have been calculated, the region corresponding to the feature points whose weight is greater than a predetermined threshold as the trademark sample image in the commodity image.
Embodiments of the present invention also disclose a device for retrieving a trademark in a commodity image, in which the feature points within the trademark sample regions of all commodity images in the training set whose weight is greater than the predetermined threshold constitute a feature space, the device comprising the following modules:
A feature extraction module, configured to perform local feature extraction on the input commodity image to be retrieved to obtain multiple feature points;
A weighted vote accumulation module, configured to take each extracted feature point in turn as the current feature point, find in the feature space the K feature points closest to the current feature point, and add the weight of each of the K feature points to the vote score of the trademark sample category to which its corresponding commodity image belongs;
A trademark retrieval module, configured to accumulate the vote score of each trademark sample category and take the trademark of the highest-scoring trademark sample category as the retrieval result for the commodity image to be retrieved.
Compared with the prior art, the main differences and effects of the embodiments of the present invention are as follows:
The trademark sample regions in a large number of commodity images can be acquired automatically to obtain trademark sample images, without a large amount of manual annotation, and the maintenance cost remains unchanged as the number of trademarks and commodity images increases.
The trademark sample category of the commodity image to be retrieved is voted on according to the weights of the feature points in the feature space that are closest to the feature points of the commodity image to be retrieved, which improves retrieval accuracy.
Further, if a feature point of a commodity image tends to match candidate feature points that correspond to positive samples in the training set, its weight is increased, and otherwise it is reduced, which improves the precision and recall of the system.
Further, using local features for feature extraction makes the method applicable to overlapping and occluded images.
Further, building an index over the feature space makes it faster to find the indexed feature points most similar to an input feature.
Brief description of the drawings
Fig. 1 is a schematic diagram of the relationship between a commodity image and a trademark sample image;
Fig. 2 is a schematic diagram of a trademark sample category comprising multiple trademark sample images;
Fig. 3 is a schematic flowchart of a method for automatically acquiring a trademark in a commodity image in the first embodiment of the present invention;
Fig. 4 is a schematic flowchart of a method for retrieving a trademark in a commodity image in the third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a device for automatically acquiring a trademark in a commodity image in the fourth embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a device for retrieving a trademark in a commodity image in the sixth embodiment of the present invention.
Detailed description of the embodiments
In the following description, many technical details are set forth so that the reader may better understand the application. However, those skilled in the art will understand that the technical solutions claimed in the claims of this application can be realized even without these technical details and with many variations and modifications based on the following embodiments.
Explanation of terms:
Trademark sample image: in the present invention, refers specifically to an image that contains little or no background and only the trademark, typically obtained from a commodity image by manual annotation. Fig. 1 shows the relationship between a commodity image and a trademark sample image: the full picture is the commodity image, and the part selected by the rectangular frame is the trademark sample image.
Trademark sample category: sample images or commodity images are organized by sample category. For example, "Starbucks" is one trademark sample category, which may contain multiple trademark sample images taken in different environments; Fig. 2 shows a trademark sample category comprising multiple trademark sample images.
Trademark recognition (system): given an input image to be retrieved (which may or may not contain a trademark), the system is required to identify and return whether the image contains a brand in the trademark sample categories, the specific brand information, and the region where it is located.
General flow of trademark recognition: the common flow comprises three steps: collection of sample images, building of the feature index, and retrieval of the input image (or image to be retrieved).
Local feature: global features describe the overall characteristics of a whole image, for example a color histogram. The drawback of global features is that they are unsuitable for overlapping and occluded images. A local feature generally covers only part of the spatial extent of an image; a good local feature needs the following properties: repeatability, distinctiveness, locality, quantity, accuracy and efficiency, of which repeatability is the most important.
Local feature matching: this essentially reduces to similarity retrieval between high-dimensional vectors by means of a distance function. There are roughly two classes of solution: the first is the exhaustive method (linear scan), in which every point in the data set is compared with the query point one by one; the second is to build an index for fast matching, such as the commonly used kd-tree and the improved kd-tree query method (BBF, Best-Bin-First).
To make the object, technical solutions and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
The first embodiment of the present invention relates to a method for automatically acquiring a trademark in a commodity image; Fig. 3 is a schematic flowchart of this method.
Specifically, as shown in Fig. 3, the method for automatically acquiring a trademark in a commodity image includes the following steps:
Step 101: perform local feature extraction on the commodity images in the training set to obtain multiple feature points of each commodity image.
Repeat the following steps for each commodity image:
Step 102: take one feature point of the commodity image as the current feature point; from the feature points of each of the remaining commodity images in the training set, select the one with the minimum distance to the current feature point as a candidate feature point; and, according to the distance ranking of each candidate feature point relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample belonging to the same trademark sample category as the commodity image, calculate the weight of the current feature point for effectively identifying the trademark in the commodity image.
It will be appreciated that candidate feature points may be selected by distance, for example Euclidean distance, or cosine or covariance distance. Moreover, candidate feature points need not be selected by distance; other measures of feature point similarity may also be used.
Step 103: after the weights of all feature points of the commodity image have been calculated, select the region corresponding to the feature points whose weight is greater than a predetermined threshold as the trademark sample image in the commodity image.
This embodiment can automatically acquire the trademark sample regions in a large number of commodity images to obtain trademark sample images, without a large amount of manual annotation, and the maintenance cost remains unchanged as the number of trademarks and commodity images increases.
The second embodiment of the present invention relates to a method for automatically acquiring a trademark in a commodity image. The second embodiment improves on the first, the main improvements being: if a feature point of a commodity image tends to match candidate feature points that correspond to positive samples in the training set, its weight is increased, and otherwise it is reduced, which improves the precision and recall of the system; and local features are used for feature extraction, which makes the method applicable to overlapping and occluded images. Specifically:
In step 102 above, the step of "according to the distance ranking of each candidate feature point relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample belonging to the same trademark sample category as the commodity image, calculating the weight of the current feature point for effectively identifying the trademark in the commodity image" includes the following sub-steps:
Step 1021: sort the candidate feature points in ascending order of their distance to the current feature point;
Step 1022: judge in turn whether the commodity image corresponding to each candidate feature point in the ascending order is a positive sample;
Step 1023: calculate the weight of the current feature point according to the judgment of whether the commodity image corresponding to each candidate feature point is a positive sample and the position of the candidate feature point in the ascending order, where the earlier a candidate feature point whose corresponding commodity image is a positive sample appears in the ascending order, the larger its weight contribution value to the current feature point.
Here, once the candidate feature points have been sorted in ascending order of their distance to the current feature point, an earlier position in the order means a smaller distance to the current feature point.
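Steps 1021 and 1022 can be illustrated with a short sketch. It assumes Euclidean distance, toy 2-dimensional vectors in place of real descriptors, and a hypothetical data layout in which each remaining image is a `(category, feature_list)` pair; none of these names come from the patent.

```python
import math

def rank_candidates(current, own_category, other_images):
    """Pick the nearest feature point of each remaining image, then sort ascending.

    other_images: list of (category, [feature vectors]) for the S-1 remaining images.
    Returns a list of (distance, is_positive) pairs in ascending order of distance.
    """
    ranked = []
    for category, features in other_images:
        if not features:
            continue  # assumption: an image without feature points yields no candidate
        nearest = min(math.dist(current, f) for f in features)  # candidate of this image
        ranked.append((nearest, category == own_category))      # positive-sample check
    ranked.sort(key=lambda pair: pair[0])                        # step 1021: ascending order
    return ranked

# Hypothetical toy data: the current feature point belongs to an image of category "A".
others = [("A", [(1.0, 0.0), (3.0, 0.0)]), ("B", [(0.0, 2.0)]), ("A", [(0.0, 3.0)])]
ranked = rank_candidates((0.0, 0.0), "A", others)
flags = [is_positive for _, is_positive in ranked]
print(ranked)
```

The resulting sequence of positive and negative flags is the only input that step 1023 needs.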
Preferably, step 1023 above includes the following sub-steps:
If the commodity image corresponding to a candidate feature point is a positive sample, the weight contribution value of this candidate feature point is added to the weight of the current feature point;
If the commodity image corresponding to a candidate feature point is a negative sample, the weight of the current feature point is not increased, but the weight contribution value of the candidate feature points ranked after this candidate feature point is reduced.
Here a negative sample is a commodity image, corresponding to a candidate feature point, that does not belong to the same trademark sample category as the commodity image.
If a feature point of a commodity image tends to match candidate feature points that correspond to positive samples in the training set, its weight is increased, and otherwise it is reduced, which improves the precision and recall of the system.
Furthermore, it will be appreciated that other embodiments of the present invention may calculate the weight of the current feature point in other ways and are not limited to this one. For example, if the commodity image corresponding to a candidate feature point is a positive sample, the weight contribution value of this candidate feature point is added to the weight of the current feature point; if it is a negative sample, the weight of the current feature point is reduced accordingly, and the candidate feature point corresponding to the negative sample does not affect the weight contribution value of the candidate feature points ranked after it.
Preferably, step 1023 above specifically includes the following sub-steps:
Initialize $Q_k = 0$, $P_k = 0$ at $k = 0$, where k denotes the k-th commodity image in the training set;
For the ascending sequence of candidate feature points, $k = 1, 2, \dots, S-1$, where the k-th candidate feature point is the feature point of the k-th commodity image with the minimum distance to the j-th feature point of commodity image i, and $S-1$ is the number of commodity images in the training set other than commodity image i: if the commodity image corresponding to the candidate feature point is a positive sample, $P_k$ and $Q_k$ are updated with the following formulas:
$P_k = P_{k-1} + 1$
$Q_k = Q_{k-1} + \frac{1}{k} P_k$
If it is a negative sample, $P_k$ and $Q_k$ are updated with the following formulas:
$Q_k = Q_{k-1}$
$P_k = P_{k-1}$
The weight of the current feature point is normalized with the following formula:
$\mathrm{Score}_i^j = \frac{Q_{S-1}}{S_{y_i} - 1}$
where $S_{y_i} - 1$ denotes the number of positive samples among the commodity images corresponding to the candidate feature points.
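The recurrences for $P_k$ and $Q_k$ and the final normalization translate directly into a short loop. The sketch below is an illustration under two assumptions that are not stated in the source: the function name is hypothetical, and a score of 0.0 is returned when there is no positive sample at all (the formula itself is undefined in that case).

```python
def feature_weight(is_positive_ranked):
    """Weight of one feature point from its ranked candidates.

    is_positive_ranked: booleans in ascending-distance order, one per candidate
    (length S-1); True means the candidate's image is a positive sample.
    """
    positives_seen = 0   # P_k
    accumulated = 0.0    # Q_k
    for k, is_positive in enumerate(is_positive_ranked, start=1):
        if is_positive:
            positives_seen += 1                  # P_k = P_{k-1} + 1
            accumulated += positives_seen / k    # Q_k = Q_{k-1} + P_k / k
        # negative sample: P_k and Q_k stay unchanged, but k still advances
    if positives_seen == 0:
        return 0.0  # assumption: no positive samples means zero weight
    return accumulated / positives_seen          # Q_{S-1} / (S_{y_i} - 1)

print(feature_weight([True, True, False]))   # all positives ranked first
print(feature_weight([True, False, True]))   # a negative pushes one positive back
```

Because every image contributes exactly one candidate, the number of positives counted in the loop equals $S_{y_i} - 1$, so no separate count has to be passed in.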
Preferably, in step 101, the local feature is the scale-invariant feature transform feature. Using local features for feature extraction makes the method applicable to overlapping and occluded images.
Furthermore, it will be appreciated that a scale-invariant feature transform feature (Scale-invariant feature transform, SIFT feature for short) generally includes a 1-dimensional X coordinate, a 1-dimensional Y coordinate, 1-dimensional scale information, 1-dimensional principal direction information and 128-dimensional feature descriptor information. In other embodiments of the present invention, local features may also be extracted in other ways, for example by extracting SURF features.
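The layout of a SIFT feature described above (position, scale, principal direction and a 128-dimensional descriptor) can be captured in a small record type. This is only an illustrative container; the class and field names are assumptions, and no feature extraction is performed here.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class LocalFeature:
    """One SIFT-style feature point: 4 geometric values plus a 128-D descriptor."""
    x: float                        # X coordinate (1 dimension)
    y: float                        # Y coordinate (1 dimension)
    scale: float                    # scale information (1 dimension)
    orientation: float              # principal direction (1 dimension)
    descriptor: Tuple[float, ...]   # feature descriptor (128 dimensions)

    def __post_init__(self):
        if len(self.descriptor) != 128:
            raise ValueError("a SIFT descriptor has 128 dimensions")

    def as_vector(self):
        """Flatten all 132 values into one tuple."""
        return (self.x, self.y, self.scale, self.orientation) + tuple(self.descriptor)

feature = LocalFeature(10.0, 20.0, 1.5, 0.3, tuple([0.0] * 128))
print(len(feature.as_vector()))
```

In a matching step, typically only the descriptor part would be compared, while the coordinates locate the trademark region in the image.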
As a preferred example of this embodiment, the flow for obtaining trademark sample images from commodity images mainly includes:
1. Training set preparation. Compared with the traditional scheme, the sample pictures in the training set required by this preferred example need not be trademark regions annotated in and extracted from commodity pictures, as the traditional scheme requires; they can be the commodity pictures containing the trademark themselves. The trademark region is computed automatically by the subsequent algorithm without manual intervention. The commodity pictures can also be obtained automatically by technical means or by simple manual collection, for example by batch retrieval with brand trademark keywords through a text search engine (Baidu, Taobao), to serve as the training set of this preferred example. Assume the training set contains S sample images in total, belonging to T trademark categories; that is, for the trademark sample images $\{x_i,\ i = 1, 2, \dots, S\}$, the corresponding trademark sample categories are $\{y_i,\ i = 1, 2, \dots, S\}$.
2. Local feature extraction. The classic SIFT algorithm is generally used. The feature points extracted from the i-th image are indexed by $j = 1, 2, \dots, N_i$, where $N_i$ denotes the number of feature points extracted from the i-th image.
3. Weight calculation. For each feature point in each trademark sample picture, find, in each of the other S-1 images, the feature point with the minimum Euclidean distance to it. The specific steps are as follows:
1) Take the j-th feature point of the i-th image and calculate its Euclidean distance to every feature point of the other S-1 images.
2) From the results calculated above, select for each image (S-1 images in total, not including the i-th image itself) the feature point with the minimum distance to the input feature point as a candidate point; the candidate point of the k-th image is thus the feature point in the k-th image with the minimum Euclidean distance to the j-th feature point of the i-th image, recorded together with the distance between the two.
3) Sort the candidate points in ascending order of that distance, which yields a new sequence of trademark sample pictures $\{x_k,\ k = 1, 2, \dots, S-1\}$ and corresponding trademark sample categories $\{y_k,\ k = 1, 2, \dots, S-1\}$.
4) For $k = 0$, initialize $Q_k = 0$, $P_k = 0$.
5) For each feature point in the sequence output by step 3), if the following is satisfied:
$y_i = y_k$
that is, the picture to which the k-th feature point belongs is in the same trademark sample category as the input picture, then $P_k$ and $Q_k$ are updated with the following formulas:
$P_k = P_{k-1} + 1$
$Q_k = Q_{k-1} + \frac{1}{k} P_k$
If it is not satisfied, the following formulas are used for the update:
$Q_k = Q_{k-1}$
$P_k = P_{k-1}$
The principle of the above formulas is that a feature point of a negative sample in the matching sequence contributes nothing to the score, but it reduces the weight of the positive-sample feature points ranked after it (because the denominator k has increased).
6) After all the feature points in the sequence have been processed once, the final weight of the j-th feature point of the i-th image is obtained with the following formula:
$\mathrm{Score}_i^j = \frac{Q_{S-1}}{S_{y_i} - 1}$
In the above formula, $S_{y_i} - 1$ denotes the number of images, among the S sample images in total, whose category is the same as $y_i$ (positive samples, not including the input picture itself). The formula means that the earlier the positive samples rank, the higher the final score; the score as a whole lies in the interval from 0 to 1; if all positive samples in the sequence appear before the negative samples, the score is 1.
7) The result $\mathrm{Score}_i^j$ output by the above formula can serve as the weight of the j-th feature of the i-th image; the more matches this feature has among the positive samples and the fewer among the negative samples, the greater its importance and weight. Feature points with high weights generally correspond to the trademark sample region in the commodity image. By setting a reasonable threshold $\theta$, all feature points satisfying the following condition can be extracted as the valid feature points for trademark recognition: $\{\mathrm{Score}_i^j \ge \theta,\ i = 1, 2, \dots, S;\ j = 1, 2, \dots, N_i\}$.
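A small worked example may help to check the arithmetic of steps 4) to 7). The ranked sequence (positive, negative, positive) and the threshold $\theta = 0.8$ are hypothetical values chosen for illustration; with three candidates, $S - 1 = 3$ and $S_{y_i} - 1 = 2$.

```latex
\begin{align*}
k = 0:&\quad P_0 = 0, \quad Q_0 = 0 \\
k = 1\ (\text{positive}):&\quad P_1 = P_0 + 1 = 1, \quad Q_1 = Q_0 + \tfrac{1}{1} P_1 = 1 \\
k = 2\ (\text{negative}):&\quad P_2 = P_1 = 1, \quad Q_2 = Q_1 = 1 \\
k = 3\ (\text{positive}):&\quad P_3 = P_2 + 1 = 2, \quad Q_3 = Q_2 + \tfrac{1}{3} P_3 = \tfrac{5}{3} \\
\mathrm{Score}_i^j &= \frac{Q_3}{S_{y_i} - 1} = \frac{5/3}{2} = \frac{5}{6} \approx 0.83 \ge \theta
\end{align*}
```

If the negative sample had ranked first instead, the sequence (negative, positive, positive) would give $Q_3 = \tfrac{1}{2} + \tfrac{2}{3} = \tfrac{7}{6}$ and a score of $\tfrac{7}{12} \approx 0.58$, which falls below the same threshold, so the feature point would be discarded.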
The third embodiment of the present invention relates to a method for retrieving a trademark in a commodity image; Fig. 4 is a schematic flowchart of this method.
Specifically, the feature points within the trademark sample regions of all commodity images in the training set whose weight is greater than the predetermined threshold constitute a feature space. As shown in Fig. 4, the method for retrieving a trademark in a commodity image includes the following steps:
Step 401: perform local feature extraction on the input commodity image to be retrieved to obtain multiple feature points.
Step 402: take each extracted feature point in turn as the current feature point, find in the feature space the K feature points closest to the current feature point, and add the weight of each of the K feature points to the vote score of the trademark sample category to which its corresponding commodity image belongs.
Step 403: accumulate the vote score of each trademark sample category, and take the trademark of the highest-scoring trademark sample category as the retrieval result for the commodity image to be retrieved.
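Steps 402 and 403 can be sketched as a weighted variant of nearest-neighbour voting. The sketch assumes Euclidean distance, a brute-force search and a hypothetical feature-space layout of `(vector, category, weight)` triples; the names and toy values are illustrative only.

```python
import math
from collections import defaultdict

def retrieve_trademark(query_features, feature_space, k=1):
    """Weighted voting: each of the K nearest feature points adds its own weight.

    feature_space: list of (vector, category, weight) for the retained feature points.
    Returns (best_category, scores); best_category is None if nothing voted.
    """
    scores = defaultdict(float)
    for feature in query_features:
        neighbours = sorted(feature_space, key=lambda e: math.dist(feature, e[0]))[:k]
        for _, category, weight in neighbours:
            scores[category] += weight          # step 402: add weight, not a unit vote
    if not scores:
        return None, {}
    best = max(scores, key=scores.get)          # step 403: highest accumulated score
    return best, dict(scores)

# Hypothetical feature space: one strong "A" point, two weak "B" points.
space = [((0.0, 0.0), "A", 0.9), ((5.0, 5.0), "B", 0.3), ((5.0, 6.0), "B", 0.35)]
best, scores = retrieve_trademark([(0.1, 0.0), (5.0, 5.1), (5.0, 5.9)], space, k=1)
print(best, scores)
```

In this toy case category "B" receives two of the three matches but still loses to "A", because its matched feature points carry lower weights; with unit votes the outcome would be reversed.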
Furthermore, it will be appreciated that the trademark region usually occupies only a small part of a commodity image. If the feature space were not built in advance from the feature points extracted within the trademark sample regions, and the vote scores of all feature points were tallied directly, the accumulated score of the feature points outside the trademark region might overwhelm the score of the feature points in the trademark region, so that the trademark sample category of the commodity image could not be retrieved correctly.
In the training set, the feature points of all original commodity images whose weight is greater than the predetermined threshold generally correspond to the trademark regions of the original commodity images. Therefore, using the feature points of the trademark regions as the feature space in this embodiment resolves the situation described above, in which the accumulated score of the non-trademark-region feature points overwhelms the score of the trademark-region feature points.
Preferably, the step of "taking each extracted feature point in turn as the current feature point, and finding in the feature space the K feature points closest to the current feature point" includes the following sub-steps:
Building an index over the feature space with a tree data structure.
Finding, by searching the index, the K feature points closest to the current feature point of the commodity image to be retrieved.
Building an index over the feature space makes it faster to find the indexed feature points most similar to an input feature. Furthermore, it will be appreciated that the tree index may use, for example, a kd-tree and the improved kd-tree query method (BBF, Best-Bin-First).
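A tree index of this kind can be sketched with a plain kd-tree. The code below is a minimal exact K-nearest-neighbour search written for illustration; it is not the BBF variant mentioned above (BBF is an approximate search with a priority queue over bins), and the function names and toy points are assumptions.

```python
import heapq
import itertools
import math

def build_kdtree(points, depth=0):
    """points: list of (vector, payload). Returns (point, left, right, axis) or None."""
    if not points:
        return None
    axis = depth % len(points[0][0])              # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                        # median point splits the set
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1),
            axis)

def kdtree_k_nearest(tree, query, k):
    """Exact K-nearest search; returns [(distance, payload)] in ascending distance."""
    heap = []                                     # max-heap via negated distances
    tiebreak = itertools.count()                  # avoids comparing payloads on ties

    def visit(node):
        if node is None:
            return
        (vector, payload), left, right, axis = node
        dist = math.dist(query, vector)
        item = (-dist, next(tiebreak), payload)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, item)         # replace the current K-th best
        diff = query[axis] - vector[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near)
        # The far side matters only if the splitting plane is closer than the K-th best.
        if len(heap) < k or abs(diff) < -heap[0][0]:
            visit(far)

    visit(tree)
    return sorted(((-neg, payload) for neg, _, payload in heap), key=lambda t: t[0])

pts = [((2, 3), "a"), ((5, 4), "b"), ((9, 6), "c"), ((4, 7), "d"), ((8, 1), "e"), ((7, 2), "f")]
tree = build_kdtree(pts)
nearest = kdtree_k_nearest(tree, (9, 2), k=2)
print(nearest)
```

For high-dimensional descriptors such as 128-dimensional SIFT vectors, exact kd-tree search prunes poorly, which is one reason approximate strategies such as BBF are commonly preferred in practice.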
In this embodiment, the trademark sample category of the commodity image to be retrieved is voted on according to the weights of the feature points in the feature space that are closest to the feature points of the commodity image to be retrieved, which improves retrieval accuracy.
As a preferred example of the present embodiment, the flow for retrieving a trademark in a commodity image mainly includes:
1. Retrieval: extract SIFT features from the input image (the commodity image); for each extracted SIFT feature point, find the K feature points in the feature space that are closest to it (minimum Euclidean distance), and add the weight of each of these K feature points to the vote for its corresponding trademark category.
2. Result output: after the above step has cast votes for all feature points of the input image to be retrieved, tally the final score of each trademark sample category and output the highest-scoring trademark sample category as the matching result for the input image.
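The two-step flow above can be sketched as follows. This is a minimal illustration under stated assumptions: descriptors are plain tuples of floats rather than real SIFT vectors, the K nearest neighbors are found by a brute-force scan instead of an index, and the names `retrieve_trademark` and `feature_space` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

# Each feature-space entry: (descriptor, weight, trademark category). Hypothetical layout.
Entry = Tuple[Sequence[float], float, str]


def retrieve_trademark(query_features: List[Sequence[float]],
                       feature_space: List[Entry],
                       k: int) -> Tuple[Optional[str], Dict[str, float]]:
    """Weighted voting: each query feature adds the weights of its K nearest
    feature-space points to the scores of their trademark categories."""
    scores: Dict[str, float] = defaultdict(float)
    for query in query_features:
        # Step 1 (retrieval): K nearest feature-space points by Euclidean distance.
        nearest = sorted(feature_space, key=lambda e: math.dist(query, e[0]))[:k]
        for _, weight, category in nearest:
            scores[category] += weight
    # Step 2 (result output): the highest-scoring category is the match.
    best = max(scores, key=scores.get) if scores else None
    return best, dict(scores)


if __name__ == "__main__":
    space = [((0.0, 0.0), 0.9, "A"), ((0.1, 0.0), 0.8, "A"),
             ((5.0, 5.0), 0.7, "B"), ((5.1, 5.0), 0.2, "B")]
    print(retrieve_trademark([(0.0, 0.1), (5.0, 5.1)], space, k=2))
```

Because votes are weighted, a few highly discriminative trademark features can outweigh many low-weight matches.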
Each method embodiment of the present invention can be implemented in software, hardware, firmware, or the like. Regardless of whether the invention is implemented in software, hardware, or firmware, the instruction code can be stored in any type of computer-accessible memory (for example permanent or modifiable, volatile or non-volatile, solid-state or non-solid-state, fixed or removable media, etc.). Likewise, the memory may be, for example, programmable array logic (Programmable Array Logic, "PAL" for short), random access memory (Random Access Memory, "RAM" for short), programmable read-only memory (Programmable Read Only Memory, "PROM" for short), read-only memory (Read-Only Memory, "ROM" for short), electrically erasable programmable read-only memory (Electrically Erasable Programmable ROM, "EEPROM" for short), a magnetic disk, an optical disc, a digital versatile disc (Digital Versatile Disc, "DVD" for short), and so on.
The fourth embodiment of the present invention relates to a device for automatically acquiring a trademark in a commodity image. Fig. 5 is a schematic structural diagram of this device.
Specifically, as shown in Fig. 5, the device for automatically acquiring a trademark in a commodity image includes the following modules:
A feature extraction module, configured to perform local feature extraction on the commodity images in the training set to obtain multiple feature points of each commodity image;
A feature point weight calculation module, configured to repeat the following operation for each commodity image:
Taking one feature point of the commodity image as the current feature point; selecting, from the feature points of each remaining commodity image in the training set, the one with the minimum distance to the current feature point as a candidate feature point; and calculating the weight of the current feature point for effectively identifying the trademark in the commodity image, according to the distance ranking of the candidate feature points relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample, that is, belongs to the same trademark sample category as the commodity image;
A trademark region selection module, configured to select, after the weights of all feature points of the commodity image have been calculated, the region corresponding to the feature points in the commodity image whose weight exceeds the predetermined threshold as the trademark sample image in the commodity image.
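The region selection step can be sketched as follows. The text does not specify how "the region corresponding to the feature points" is delimited; the axis-aligned bounding box used here is an assumption made for illustration, as are the name `select_trademark_region` and the `(x, y, weight)` keypoint layout.

```python
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, weight); hypothetical layout


def select_trademark_region(keypoints: List[Keypoint],
                            threshold: float
                            ) -> Optional[Tuple[float, float, float, float]]:
    """Return (x_min, y_min, x_max, y_max) enclosing all keypoints whose weight
    exceeds the threshold, or None if no keypoint qualifies."""
    selected = [(x, y) for x, y, weight in keypoints if weight > threshold]
    if not selected:
        return None
    xs = [x for x, _ in selected]
    ys = [y for _, y in selected]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    points = [(10.0, 20.0, 0.9), (30.0, 40.0, 0.8), (200.0, 300.0, 0.1)]
    print(select_trademark_region(points, threshold=0.5))
```

The low-weight keypoint at (200, 300) is excluded, so the selected region covers only the high-weight cluster.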
The present embodiment can automatically acquire the trademark sample regions in a large number of commodity images to obtain trademark sample images, without requiring extensive manual annotation, and the maintenance cost remains unchanged as the number of trademarks and commodity images grows.
The first embodiment is the method embodiment corresponding to the present embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the first embodiment remain valid in the present embodiment and, to reduce repetition, are not repeated here. Correspondingly, the relevant technical details mentioned in the present embodiment are also applicable to the first embodiment.
The fifth embodiment of the present invention relates to a device for automatically acquiring a trademark in a commodity image. The fifth embodiment improves on the fourth embodiment, the main improvements being: if a feature point of a commodity image tends to match candidate feature points in the training set that correspond to positive samples, its weight is increased, and otherwise it is decreased, which improves the precision and recall of the system; and using local features for feature extraction makes the approach applicable to images that are overlapped or occluded. Specifically:
The above feature point weight calculation module includes the following submodules:
A candidate feature point sorting submodule, configured to arrange the candidate feature points in ascending order of their distance to the current feature point;
A positive sample judgment submodule, configured to judge in turn whether the commodity image corresponding to each candidate feature point in the ascending arrangement is a positive sample;
A weight calculation key submodule, configured to calculate the weight of the feature point according to the judgment of whether the commodity image corresponding to each candidate feature point is a positive sample and the position of that candidate feature point in the ascending arrangement, wherein the earlier a candidate feature point whose corresponding commodity image is a positive sample appears in the ascending arrangement, the larger its weight contribution value to the current feature point.
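The candidate selection, sorting, and positive sample judgment can be sketched as follows. This is a minimal illustration assuming brute-force Euclidean distances over small tuple descriptors; the name `ranked_candidates` and the `(label, descriptors)` layout of the other training images are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def ranked_candidates(current: Descriptor, current_label: str,
                      other_images: List[Tuple[str, List[Descriptor]]]
                      ) -> List[Tuple[float, bool]]:
    """For each other training image, pick the feature point closest to the
    current feature point, then sort the picks in ascending order of distance.
    Each result is (distance, is_positive), where is_positive means the image
    belongs to the same trademark sample category as the current image."""
    candidates = []
    for label, descriptors in other_images:
        if not descriptors:
            continue  # an image without feature points yields no candidate
        distance = min(math.dist(current, d) for d in descriptors)
        candidates.append((distance, label == current_label))
    candidates.sort(key=lambda c: c[0])
    return candidates


if __name__ == "__main__":
    others = [("A", [(1.0, 0.0), (3.0, 3.0)]),
              ("B", [(0.0, 0.5)]),
              ("A", [(2.0, 0.0)])]
    print(ranked_candidates((0.0, 0.0), "A", others))
```

The resulting list of positive and negative flags, in rank order, is the input that the weight calculation operates on.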
Preferably, the weight calculation key submodule includes the following submodules:
A positive sample processing submodule, configured to add the weight contribution value of a candidate feature point to the weight of the current feature point if the commodity image corresponding to that candidate feature point is a positive sample;
A negative sample processing submodule, configured, if the commodity image corresponding to a candidate feature point is a negative sample, not to increase the weight of the current feature point but to reduce the weight contribution values of the candidate feature points ranked after that candidate feature point.
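A small worked example may make the effect of a negative sample concrete. It assumes the rank-based contribution used by this embodiment's update rule, in which a positive candidate at position $k$ of the ascending arrangement contributes $P_k / k$, with $P_k$ the number of positive candidates seen up to position $k$. The two three-candidate rankings are hypothetical.

```latex
% Ranking A: (+, +, -). Both positives come first and each contributes fully.
Q_A = \frac{1}{1} + \frac{2}{2} = 2
% Ranking B: (+, -, +). The negative at position 2 adds nothing and lowers
% the contribution of the positive ranked after it from 2/2 to 2/3.
Q_B = \frac{1}{1} + \frac{2}{3} = \frac{5}{3}
```

With the same number of positive samples, the ranking in which the positives appear earlier therefore yields the larger accumulated weight.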
Preferably, the feature point weight calculation module further includes the following submodules:
An initialization submodule, configured to initialize $Q_k = 0$ and $P_k = 0$, which gives the starting values $Q_0 = P_0 = 0$ for the updates below; here $k$ denotes the $k$-th commodity image in the training set;
A weight accumulation submodule, configured to traverse the ascending arrangement of the candidate feature points, where each element denotes, for the $k$-th commodity image, the candidate feature point with the minimum distance to the $j$-th feature point of commodity image $i$, and $S-1$ is the number of commodity images in the training set other than commodity image $i$. If the commodity image corresponding to the candidate feature point is a positive sample, $P_k$ and $Q_k$ are updated with the following formulas:
$P_k = P_{k-1} + 1$
$Q_k = Q_{k-1} + \frac{1}{k} P_k$
If it is a negative sample, $P_k$ and $Q_k$ are updated with the following formulas:
$Q_k = Q_{k-1}$
$P_k = P_{k-1}$
A weight normalization submodule, configured to normalize the weight of the current feature point (the $j$-th feature point of commodity image $i$) with the following formula:
$\mathrm{Score}_i^j = \frac{Q_{S-1}}{S_{y_i} - 1}$
where $S_{y_i} - 1$ denotes the number of positive samples among the commodity images corresponding to the candidate feature points.
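The accumulation and normalization rules above can be sketched as follows. This is a minimal illustration, assuming the candidates have already been sorted in ascending order of distance and reduced to positive or negative flags; the name `feature_point_weight` is hypothetical, and the normalizer is taken to be the number of positive samples among the candidates, as the text describes.

```python
from typing import Iterable


def feature_point_weight(is_positive_ranked: Iterable[bool]) -> float:
    """Accumulate Q over the ranked candidates and normalize by the number of
    positive samples. A positive candidate at rank k adds P_k / k, where P_k
    counts the positives seen so far; a negative candidate leaves P and Q unchanged."""
    p = 0      # P_k: number of positive candidates up to the current rank
    q = 0.0    # Q_k: accumulated, unnormalized weight
    for k, positive in enumerate(is_positive_ranked, start=1):
        if positive:
            p += 1
            q += p / k
    # After the loop, p equals the total number of positive samples.
    return q / p if p else 0.0


if __name__ == "__main__":
    print(feature_point_weight([True, True, False]))   # positives ranked first
    print(feature_point_weight([False, True, True]))   # positives ranked last
```

Under these rules the weight lies between 0 and 1, and it reaches 1 only when every positive sample is ranked ahead of every negative one, which matches the usual definition of average precision.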
Preferably, in the feature extraction module, the local feature is a scale-invariant feature transform (SIFT) feature.
The second embodiment is the method embodiment corresponding to the present embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the second embodiment remain valid in the present embodiment and, to reduce repetition, are not repeated here. Correspondingly, the relevant technical details mentioned in the present embodiment are also applicable to the second embodiment.
The sixth embodiment of the present invention relates to a device for retrieving a trademark in a commodity image. Fig. 6 is a schematic structural diagram of this device.
Specifically, for this device, the feature points in the trademark sample regions of all commodity images in the training set, whose weights exceed the predetermined threshold, constitute the feature space. As shown in Fig. 6, the device includes the following modules:
A feature extraction module, configured to perform local feature extraction on the input commodity image to be retrieved to obtain multiple feature points;
A weight voting accumulation module, configured to take each extracted feature point in turn as the current feature point, find in the feature space the K feature points closest to the current feature point, and add the weights of these K feature points to the voting scores of the trademark sample categories to which their respective commodity images belong;
A trademark retrieval module, configured to tally the voting score of each trademark sample category and take the trademark of the highest-scoring trademark sample category as the retrieval result for the commodity image to be retrieved.
Preferably, the weight voting accumulation module further includes the following submodules:
An index building submodule, configured to build an index of the feature space using a tree data structure.
An index search submodule, configured to find, by searching the index, the K feature points closest to the feature point of the commodity image to be retrieved.
Indexing the feature space makes it faster to find, by searching the index, the feature points most similar to an input feature. Furthermore, it should be understood that the tree index can use, for example, a kd-tree or the improved kd-tree query method BBF (Best-Bin-First).
The present embodiment votes on the trademark sample category to which the commodity image to be retrieved belongs according to the weights of the feature points in the feature space that are closest to the feature points of that image, which improves retrieval accuracy.
The third embodiment is the method embodiment corresponding to the present embodiment, and the two can be implemented in cooperation with each other. The relevant technical details mentioned in the third embodiment remain valid in the present embodiment and, to reduce repetition, are not repeated here. Correspondingly, the relevant technical details mentioned in the present embodiment are also applicable to the third embodiment.
The present invention proposes a new scheme that automatically learns and discovers the feature points of the trademark region in a picture, thereby solving the problem that collecting and annotating original commodity images consumes a large amount of manual labor. At the same time, each feature point is given a different weight (depending on the uniqueness and robustness of the feature): if the feature tends to match features in the positive samples of the training set (those belonging to the same category as the picture to be retrieved), its weight is increased, and otherwise it is decreased, which ultimately improves the precision and recall of the system in retrieval.
It should be noted that the modules mentioned in the device embodiments of the present invention are all logical modules. Physically, a logical module may be one physical module, part of a physical module, or a combination of multiple physical modules. The physical implementation of these logical modules is not the most important aspect; the combination of functions they realize is the key to solving the technical problem raised by the present invention. In addition, to highlight the innovative part of the present invention, the above device embodiments do not introduce modules that are not closely related to solving the technical problem raised by the present invention, which does not mean that the above device embodiments contain no other modules.
It should be noted that, in the claims and description of this patent, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "includes a" does not exclude the existence of other identical elements in the process, method, article, or device that includes said element.
Although the present invention has been illustrated and described with reference to certain preferred embodiments thereof, those of ordinary skill in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the present invention.

Claims (14)

1. A method for automatically acquiring a trademark in a commodity image, characterized by comprising the following steps:
Performing local feature extraction on the commodity images in a training set to obtain multiple feature points of each commodity image;
Repeating the following step for each commodity image:
Taking one feature point of said commodity image as the current feature point; selecting, from the feature points of each remaining commodity image in the training set, the one with the minimum distance to the current feature point as a candidate feature point; and calculating the weight of the current feature point for effectively identifying the trademark in said commodity image, according to the distance ranking of said candidate feature points relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample belonging to the same trademark sample category as said commodity image;
After the weights of all feature points of said commodity image have been calculated, selecting the region corresponding to the feature points in said commodity image whose weight exceeds a predetermined threshold as the trademark sample region in said commodity image.
2. The method for automatically acquiring a trademark in a commodity image according to claim 1, characterized in that the step of "calculating the weight of the current feature point for effectively identifying the trademark in said commodity image, according to the distance ranking of said candidate feature points relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample belonging to the same trademark sample category as said commodity image" includes the sub-steps:
Arranging said candidate feature points in ascending order of their distance to the current feature point;
Judging in turn whether the commodity image corresponding to each candidate feature point in said ascending arrangement is a positive sample;
Calculating the weight of the feature point according to the judgment of whether the commodity image corresponding to each candidate feature point is a positive sample and the position of that candidate feature point in said ascending arrangement, wherein the earlier a candidate feature point whose corresponding commodity image is a positive sample appears in said ascending arrangement, the larger its weight contribution value to said current feature point.
3. The method for automatically acquiring a trademark in a commodity image according to claim 2, characterized in that the step of "calculating the weight of the feature point according to the judgment of whether the commodity image corresponding to each candidate feature point is a positive sample and the position of that candidate feature point in said ascending arrangement" includes the sub-steps:
If the commodity image corresponding to a candidate feature point is a positive sample, adding the weight contribution value of that candidate feature point to the weight of said current feature point;
If the commodity image corresponding to a candidate feature point is a negative sample, not increasing the weight of said current feature point, but reducing the weight contribution values of the candidate feature points ranked after that candidate feature point.
4. The method for automatically acquiring a trademark in a commodity image according to claim 2, characterized in that the step of "calculating the weight of the feature point according to the judgment of whether the commodity image corresponding to each candidate feature point is a positive sample and the position of that candidate feature point in said ascending arrangement" includes the sub-steps:
Initializing $Q_k = 0$ and $P_k = 0$, which gives the starting values $Q_0 = P_0 = 0$ for the updates below; wherein $k$ denotes the $k$-th commodity image in the training set, and each candidate feature point is, for the $k$-th commodity image, the feature point with the minimum distance to the $j$-th feature point of said commodity image $i$;
Traversing said ascending arrangement of the candidate feature points, wherein $S-1$ denotes the number of commodity images in the training set other than commodity image $i$; if the commodity image corresponding to the candidate feature point is a positive sample, updating $P_k$ and $Q_k$ with the following formulas:
$P_k = P_{k-1} + 1$
$Q_k = Q_{k-1} + \frac{1}{k} P_k$
If it is a negative sample, updating $P_k$ and $Q_k$ with the following formulas:
$Q_k = Q_{k-1}$
$P_k = P_{k-1}$
Normalizing the weight of said current feature point (the $j$-th feature point of commodity image $i$) with the following formula:
$\mathrm{Score}_i^j = \frac{Q_{S-1}}{S_{y_i} - 1}$
Wherein $S_{y_i} - 1$ denotes the number of positive samples among the commodity images corresponding to the candidate feature points.
5. The method for automatically acquiring a trademark in a commodity image according to any one of claims 1 to 4, characterized in that, in said step of "performing local feature extraction on the commodity images in a training set to obtain multiple feature points of each commodity image", said local feature is a scale-invariant feature transform feature.
6. A method for retrieving a trademark in a commodity image, characterized in that the feature points in the trademark sample regions of all commodity images in a training set, whose weights exceed a predetermined threshold, constitute a feature space, the method comprising the following steps:
Performing local feature extraction on an input commodity image to be retrieved to obtain multiple feature points;
Taking each of said extracted feature points in turn as the current feature point, finding in said feature space the K feature points closest to the current feature point, and adding the weights of said K feature points to the voting scores of the trademark sample categories to which their respective commodity images belong;
Tallying the voting score of each trademark sample category, and taking the trademark of the highest-scoring trademark sample category as the retrieval result for said commodity image to be retrieved.
7. The method for retrieving a trademark in a commodity image according to claim 6, characterized in that said step of "taking each of said extracted feature points in turn as the current feature point, and finding in said feature space the K feature points closest to the current feature point" includes the following sub-steps:
Building an index of said feature space using a tree data structure;
Finding, by searching said index, the K feature points closest to the current feature point of said commodity image to be retrieved.
8. A device for automatically acquiring a trademark in a commodity image, characterized by including the following modules:
A feature extraction module, configured to perform local feature extraction on the commodity images in a training set to obtain multiple feature points of each commodity image;
A feature point weight calculation module, configured to repeat the following operation for each commodity image:
Taking one feature point of said commodity image as the current feature point; selecting, from the feature points of each remaining commodity image in the training set, the one with the minimum distance to the current feature point as a candidate feature point; and calculating the weight of the current feature point for effectively identifying the trademark in said commodity image, according to the distance ranking of said candidate feature points relative to the current feature point and whether the commodity image corresponding to each candidate feature point is a positive sample belonging to the same trademark sample category as said commodity image;
A trademark region selection module, configured to select, after the weights of all feature points of said commodity image have been calculated, the region corresponding to the feature points in said commodity image whose weight exceeds a predetermined threshold as the trademark sample region in said commodity image.
9. The device for automatically acquiring a trademark in a commodity image according to claim 8, characterized in that said feature point weight calculation module includes the submodules:
A candidate feature point sorting submodule, configured to arrange said candidate feature points in ascending order of their distance to the current feature point;
A positive sample judgment submodule, configured to judge in turn whether the commodity image corresponding to each candidate feature point in said ascending arrangement is a positive sample;
A weight calculation key submodule, configured to calculate the weight of the feature point according to the judgment of whether the commodity image corresponding to each candidate feature point is a positive sample and the position of that candidate feature point in said ascending arrangement, wherein the earlier a candidate feature point whose corresponding commodity image is a positive sample appears in said ascending arrangement, the larger its weight contribution value to said current feature point.
10. The device for automatically acquiring a trademark in a commodity image according to claim 9, characterized in that said weight calculation key submodule includes the submodules:
A positive sample processing submodule, configured to add the weight contribution value of a candidate feature point to the weight of said current feature point if the commodity image corresponding to that candidate feature point is a positive sample;
A negative sample processing submodule, configured, if the commodity image corresponding to a candidate feature point is a negative sample, not to increase the weight of said current feature point but to reduce the weight contribution values of the candidate feature points ranked after that candidate feature point.
11. The device for automatically acquiring a trademark in a commodity image according to claim 9, characterized in that said feature point weight calculation module further includes the submodules:
An initialization submodule, configured to initialize $Q_k = 0$ and $P_k = 0$, which gives the starting values $Q_0 = P_0 = 0$ for the updates below; wherein $k$ denotes the $k$-th commodity image in the training set, and each candidate feature point is, for the $k$-th commodity image, the feature point with the minimum distance to the $j$-th feature point of said commodity image $i$;
Traversing said ascending arrangement of the candidate feature points, wherein $S-1$ denotes the number of commodity images in the training set other than commodity image $i$; if the commodity image corresponding to the candidate feature point is a positive sample, updating $P_k$ and $Q_k$ with the following formulas:
$P_k = P_{k-1} + 1$
$Q_k = Q_{k-1} + \frac{1}{k} P_k$
If it is a negative sample, updating $P_k$ and $Q_k$ with the following formulas:
$Q_k = Q_{k-1}$
$P_k = P_{k-1}$
Normalizing the weight of said current feature point (the $j$-th feature point of commodity image $i$) with the following formula:
$\mathrm{Score}_i^j = \frac{Q_{S-1}}{S_{y_i} - 1}$
Wherein $S_{y_i} - 1$ denotes the number of positive samples among the commodity images corresponding to the candidate feature points.
12. The device for automatically acquiring a trademark in a commodity image according to any one of claims 8 to 11, characterized in that, in said feature extraction module, said local feature is a scale-invariant feature transform feature.
13. A device for retrieving a trademark in a commodity image, characterized in that the feature points in the trademark sample regions of all commodity images in a training set, whose weights exceed a predetermined threshold, constitute a feature space, the device including the following modules:
A feature extraction module, configured to perform local feature extraction on an input commodity image to be retrieved to obtain multiple feature points;
A weight voting accumulation module, configured to take each of said extracted feature points in turn as the current feature point, find in said feature space the K feature points closest to the current feature point, and add the weights of said K feature points to the voting scores of the trademark sample categories to which their respective commodity images belong;
A trademark retrieval module, configured to tally the voting score of each trademark sample category and take the trademark of the highest-scoring trademark sample category as the retrieval result for said commodity image to be retrieved.
14. The device for retrieving a trademark in a commodity image according to claim 13, characterized in that said weight voting accumulation module further includes the following submodules:
An index building submodule, configured to build an index of said feature space using a tree data structure;
An index search submodule, configured to find, by searching said index, the K feature points closest to the feature point of said commodity image to be retrieved.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510059267.2A CN105989043A (en) 2015-02-04 2015-02-04 Method and device for automatically acquiring trademark in commodity image and searching trademark

Publications (1)

Publication Number Publication Date
CN105989043A true CN105989043A (en) 2016-10-05

Family

ID=57036064

Country Status (1)

Country Link
CN (1) CN105989043A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161005