CN110263202A - Image search method and equipment - Google Patents

Image search method and equipment

Info

Publication number: CN110263202A
Application number: CN201910244643.3A
Authority: CN (China)
Prior art keywords: image, search, images, user, text
Legal status: Pending
Other languages: Chinese (zh)
Inventor: Sandra Mau
Current and original assignee: See Out Pty Ltd
Priority claimed from: AU2013905002A
Application filed by: See Out Pty Ltd
Publication of CN110263202A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Apparatus for use in searching a plurality of reference images, the apparatus including one or more electronic processing devices that: acquire at least one image; process the image to determine a plurality of sub-images and a plurality of image features associated with the image and/or sub-images; and perform an image search using the image, the sub-images and the image features, wherein the image is at least one of a sample image and one of the plurality of reference images, and wherein the search is performed at least in part by searching the plurality of reference images to identify reference images similar to the sample image.

Description

Image search method and equipment
Cross reference to related applications
This application is a divisional application of Chinese patent application No. 201480053618.2, entitled "Image search method and equipment", having an international filing date of 26 September 2014.
Technical field
The present invention relates to an image search method and apparatus, and more particularly to a method and apparatus for searching a plurality of reference images, such as trademarks or logos.
Background art
Any reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as, an acknowledgement or admission or any form of suggestion that the prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.
For a company, its brand, often taking the form of a trademark or logo bearing its name, is frequently its most important asset. Countries around the world have intellectual property (IP) offices that provide a formal mechanism for companies to protect their brands through trademark registration. This registration system enables a brand owner to oppose, or seek to oppose, other marks or trademarks that are similar and likely to cause commercial confusion, allowing companies to establish a unique public identity on which to build their business.
In addition, the Madrid Agreement and Protocol, created in 1996 and signed by approximately 90 countries, provide for international trademark registration in which a single application can cover more than one country. Compared with a portfolio of separate national registrations, the ability to cover a wide range of countries with a single registration offers advantages both in portfolio management and in cost savings.
However, a trademark application may be rejected if there is an established, substantially similar mark that could cause brand confusion, whether a registered trademark or a common-law mark (that is, unregistered but with an established market presence). This can be a considerable waste of time and money, and it is therefore recommended that, before filing a trademark application, the trademark database of the national office concerned be searched and a common-law search be carried out.
Unfortunately, searching for a trademark device (or logo or image) can currently be very difficult, because it must be performed via a text description of the image. Many national trademark offices maintain standard lists of text-based descriptors for trademark devices, also known as image constituent details (for example, square, square+, rectangle, rhombus, and thousands of others). These include a number of different conventions worldwide, such as the Vienna Convention, which numerically classifies image constituents into categories and sub-categories (for example, 1.1.1 "celestial bodies", 1.1.15 "comets, stars with tails"), the USPTO design code convention, and the Australian convention, which uses text-based image constituent details but at its foundation relies on a text-based description of the image as a whole and of its constituent parts.
Unfortunately, searching for a trademark device (or logo or image) can also be difficult and time consuming because it currently must be performed via text descriptions held as text metadata for the image. For generic or abstract logos, for example the Nike "whirlwind" or the Adidas "flower", finding the most suitable text descriptors can be very challenging. There are a great many descriptors to choose from, and there are often thousands of matches to review. Furthermore, if certain descriptors are missed (that is, if the searcher does not describe the image in the same way it was indexed by the IP office), similar matching devices may be missed. Common-law searching on the internet is also far from trivial; for the most part, such searches must likewise be performed via text descriptions.
Summary of the invention
In one broad form the present invention provides apparatus for performing a search of a plurality of reference images, the apparatus including one or more electronic processing devices that:
A) search the plurality of reference images to identify first reference images similar to a sample image;
B) identify image tags associated with at least one of the first reference images;
C) search the plurality of reference images using at least one of the image tags to identify second reference images; and
D) provide search results including at least some of the first and second reference images.
Typically the one or more electronic processing devices:
A) determine a first image ranking in accordance with a similarity of the first reference images to the sample image; and
B) select the at least one first reference image at least partially in accordance with the first image ranking.
Typically the one or more electronic processing devices:
A) present at least some of the first reference images to a user;
B) determine at least one selected first reference image in accordance with user input commands; and
C) identify image tags associated with the at least one selected first reference image.
Typically the one or more electronic processing devices present the first reference images to the user in accordance with the first image ranking.
Typically the one or more electronic processing devices:
A) determine an image tag ranking in accordance with a frequency of occurrence; and
B) select the at least one image tag at least partially in accordance with the image tag ranking.
Typically the one or more electronic processing devices:
A) present the image tags associated with the at least one first reference image to the user;
B) determine at least one selected image tag in accordance with user input commands; and
C) search the plurality of reference images using the at least one selected image tag.
Typically the one or more electronic processing devices present the image tags in accordance with the image tag ranking.
Typically the image tags include metadata tags.
Typically the one or more electronic processing devices:
A) determine a result ranking of the first and second reference images; and
B) provide the search results in accordance with the result ranking.
Typically the one or more electronic processing devices determine the result ranking in accordance with at least one of:
A) the first image ranking;
B) a second image ranking; and
C) a combination of the first image ranking and the second image ranking.
Typically the one or more electronic processing devices determine the second image ranking in accordance with a similarity of the second reference images to the image tags.
Typically the one or more electronic processing devices receive the sample image from the user.
Typically the one or more electronic processing devices process the sample image.
Typically the one or more electronic processing devices process the sample image by:
A) segmenting the sample image to form sample sub-images; and
B) searching the plurality of reference images using the sample image and the sample sub-images.
Typically the one or more electronic processing devices segment the sample image by:
A) determining image feature clusters; and
B) segmenting the image in accordance with the clusters.
Typically the one or more electronic processing devices segment the sample image by:
A) converting the sample image into a greyscale image;
B) filtering the greyscale image to generate a filtered greyscale image;
C) normalising an image intensity of the filtered greyscale image to generate a normalised image; and
D) determining clusters within the normalised image.
Typically the one or more electronic processing devices process the sample image by at least one of:
A) scaling the sample image and the sample sub-images;
B) determining image features from the sample image and the sample sub-images; and
C) removing at least one of:
i) image background;
ii) noise; and
iii) text.
Typically the one or more electronic processing devices scale the sample image and the sample sub-images by:
A) cropping the images and sub-images to remove background and form cropped images; and
B) resizing the cropped images to a defined image size.
Typically the one or more electronic processing devices process the sample image by:
A) performing optical character recognition to detect text; and
B) removing the text from the image.
Typically the one or more electronic processing devices:
A) process at least one of the sample image and the sample sub-images to determine sample image features; and
B) determine a sample feature vector using the sample image features.
In another broad form the present invention provides a method for performing a search of a plurality of reference images, the method including:
A) searching the plurality of reference images to identify first reference images similar to a sample image;
B) identifying image tags associated with at least one of the first reference images;
C) searching the plurality of reference images using at least one of the image tags to identify second reference images; and
D) providing search results including at least some of the first and second reference images.
In another broad form the present invention provides apparatus for use in searching a plurality of reference images, the apparatus including one or more electronic processing devices that:
A) acquire at least one image;
B) process the image to determine sub-images and image features associated with the image and/or sub-images; and
C) perform an image search using the image, the sub-images and the image features, wherein the image is at least one of a sample image and one of the plurality of reference images, and wherein the search is performed at least in part by searching the plurality of reference images to identify reference images similar to the sample image.
Typically the method includes creating an index including the plurality of reference images, each reference image being associated with sub-images and image features.
Typically the one or more electronic processing devices process the image by segmenting the image to form the sub-images.
Typically the one or more electronic processing devices segment the image by:
A) determining feature clusters within the image; and
B) segmenting the image in accordance with the clusters.
Typically the one or more electronic processing devices segment the image by:
A) converting the image into a greyscale image;
B) filtering the greyscale image to generate a filtered greyscale image;
C) normalising an image intensity of the filtered greyscale image to generate a normalised image; and
D) determining clusters within the normalised image.
Typically the one or more electronic processing devices process the image by at least one of:
A) scaling the image and the sub-images;
B) determining image features from the image and the sub-images; and
C) removing at least one of:
i) image background;
ii) noise; and
iii) text.
Typically the one or more electronic processing devices scale the images by:
A) cropping the images and sub-images to remove background and form cropped images; and
B) resizing the cropped images to a defined image size.
Typically the one or more electronic processing devices process the image by:
A) performing optical character recognition to detect text; and
B) removing the text from the image.
Typically, when the image is a reference image, the one or more electronic processing devices associate the text with the reference image in the index.
Typically the one or more electronic processing devices:
A) process at least one of the image and the sub-images to determine image features; and
B) determine a feature vector using the image features.
In another broad form the present invention provides a method for use in searching a plurality of reference images, the method including:
A) acquiring at least one image;
B) processing the image to determine sub-images and image features associated with the image; and
C) performing an image search using the image, the sub-images and the image features, wherein the image is at least one of a sample image and one of the plurality of reference images, and wherein the search is performed at least in part by searching the plurality of reference images to identify reference images similar to the sample image.
In another broad form, it is an object of the present invention to provide a method for performing an image search, the method including the steps of:
A) a user uploading a query image to a search engine;
B) the search engine using image recognition to identify visually similar matching images in a database;
C) presenting matching image results to the user;
D) the user selecting all or some of those matching image results as the most relevant matching image results;
E) the search system extracting the metadata of the selected results in order to list and rank the most relevant image tags;
F) presenting the list of image tags to the user; and
G) presenting the user with the option of a combined image and text search based on one or more of the image tags.
In another broad form, it is an object of the present invention to provide a search system for performing an image search, the search system including a search engine, and wherein:
A) a user uploads a query image to the search engine;
B) the search engine uses image recognition to identify visually similar matching images in the database;
C) matching image results are presented to the user;
D) the user selects all or some of those matching image results as the most relevant matching image results;
E) the search system extracts the metadata of the selected results in order to list and rank the most relevant image tags;
F) the list of image tags is presented to the user; and
G) the user is presented with the option of a combined image and text search based on one or more of the image tags.
In another broad form, it is an object of the present invention to provide a method for pre-processing an image from a trademark database, the method including:
A) segmenting sub-images within the image;
B) scaling the image and the sub-images to a predefined size;
C) performing feature extraction on each resulting image and sub-image, such that patterns within the image or sub-images are summarised as features; and
D) indexing the images, sub-images and features for searching in a database.
In another broad form, it is an object of the present invention to provide apparatus for pre-processing an image from a trademark database, the apparatus including a computer system that:
A) segments sub-images within the image;
B) scales the image and the sub-images to a predefined size;
C) performs feature extraction on each resulting image and sub-image, such that patterns within the image or sub-images are summarised as features; and
D) indexes the images, sub-images and features for searching in a database.
Brief description of the drawings
An example of the present invention will now be described with reference to the accompanying drawings, in which:
Figure 1A is a flow chart of an example of a method for performing a search of a plurality of reference images;
Figure 1B is a flow chart of an example of processing an image in a method for searching a plurality of reference images;
Figure 2 is a schematic diagram of an example of a distributed computer architecture;
Figure 3 is a schematic diagram of an example of the processing system of Figure 2;
Figure 4 is a schematic diagram of an example of the computer system of Figure 2;
Figures 5A and 5B are a flow chart of a further example of a method of processing an image;
Figure 6 is a flow chart of an example of a method of creating a search index;
Figures 7A and 7B are a flow chart of a further example of a method of searching a plurality of reference images;
Figure 8 is a flow chart of a further example of a method of searching images;
Figure 9 is a flow chart of a further example of a method of searching images;
Figure 10 is a schematic diagram of a user interface used in the search process;
Figure 11 is a schematic diagram of a user interface showing search results;
Figure 12 is a schematic diagram of an example of a user interface showing a selection of search results;
Figure 13 is a schematic diagram of an example of a user interface showing identified image tags;
Figure 14 is a schematic diagram of an example of a user interface showing search results;
Figure 15 is a flow chart of a further example of a method of creating a search index;
Figure 16 is a flow chart of a further example of a method of performing a search; and
Figure 17 is a schematic block diagram of an example of a search process.
Detailed description of the embodiments
An example of a method for performing a search of a plurality of reference images will now be described in more detail with reference to Figure 1A.
In this example it is assumed that the process is performed at least in part using one or more electronic processing devices forming part of one or more processing systems, which are in turn connected to one or more other computer systems via a network architecture, as will be described in more detail below.
For the purpose of illustration, the following terminology will be used. The term "user" is used to refer to an entity, such as an individual or a company, that interacts with the processing system, for example to perform searches. The term "reference image" refers to a stored image against which a search is performed. In one example the reference images are trademarks or logos, but they could also include image assets that may or may not otherwise be registered as trademarks, such as icons or cartoon characters, and it will therefore be appreciated that this is not essential. The term "sample image" refers to an example image submitted as part of a query for searching the reference images.
The term "image tag" is used to refer to information describing objects or semantic information within an image. In the case of trademarks, image tags are sometimes referred to as image descriptors, dictionary entries, design search codes, Vienna classification terms or codes, or the like. Image tags are often, although not exclusively, manually assigned, and can be stored as metadata associated with an image, allowing the image to subsequently be searched.
In this example, at step 100 the one or more electronic processing devices search a plurality of reference images to identify first reference images similar to a sample image. This can be achieved in any one of a number of ways, but typically involves analysing the sample image using image recognition techniques to identify characteristics of the image and then performing a search of the reference images using the results of this analysis. In one particular example this involves analysing the sample image to determine one or more feature vectors indicative of features within the image, which are then compared with feature vectors of the reference images.
At step 110 the one or more electronic processing devices identify image tags associated with at least one of the first reference images. The image tags are typically stored together with the first reference images, for example in the form of metadata, and in one example take the form of text descriptors indicative of the image content.
This can be performed for each of the first reference images, but more typically is performed for a subset of the first reference images that are visually most similar to the sample image. This can be determined in any one of a number of ways and can involve displaying the first reference images to a user, allowing the user to review the images and select first reference images of interest. Alternatively, the first reference images could be ranked based on their similarity to the sample image, with the highest-ranked first reference images being selected automatically.
At step 120 the one or more electronic processing devices use at least one of the image tags to search the plurality of reference images to identify second reference images. Accordingly, the image tags determined at step 110 can be compared with the image tags associated with each of the reference images, thereby allowing the second reference images to be identified.
At step 130 search results including at least some of the first and second reference images are provided, typically by displaying the search results to the user, although alternatively any suitable technique for conveying the search results could be used.
Accordingly, the above-described process operates by first performing a search using image recognition techniques to automatically identify first reference images that are broadly similar to the sample image. An additional search is then performed using image tags associated with at least some of those first reference images. This can be used to return a list of search results based on both image recognition and image tag searching.
This therefore makes use of two independent search approaches, maximising the identification of relevant images of interest. This is particularly important when searching databases such as trademark databases, in which images are typically identified on the basis of image tags. The image tags may be unfamiliar to individuals using the database, making it difficult for an individual to search the trademark database without suitable training. Furthermore, the image tags are typically created manually when a trademark is initially stored in the database, and this can be performed inconsistently depending on the individual creating the descriptors and on changes in the use of descriptors over time. This means that different image descriptors can be used to describe similar images, and similar image tags can be used to describe different images, making the search process more difficult.
Nevertheless, the use of image tags is still generally a more robust search approach than the use of image recognition alone. Accordingly, the above-described process uses image recognition as a coarse filter in order to identify first reference images of interest, which are then used to display image tags. Even users who are unfamiliar with image tags can review the image tags and identify those that are potentially relevant to the sample image, thereby allowing further reference images to be identified in a more refined search process.
In order for the above-described process to operate as effectively as possible, it is preferred that pre-processing is performed on the images to ensure consistency in the format and content of the sample and reference images. This can in turn be used to maximise the effectiveness and speed of the image recognition process and hence of the search process, and an example of an image processing technique will now be described with reference to Figure 1B.
In this example an image is acquired at step 150. The image could be a sample image against which a search is to be performed, or could alternatively include one of the reference images.
At step 160 the image is processed to determine sub-images and image features associated with the image, which are then used at step 170 to perform an image search using the image, sub-images and image features, for example using the techniques outlined above.
Accordingly, processing is performed in order to identify particular image features across a number of images and sub-images. The nature of the sub-images and image features will vary depending on the preferred implementation. In one example the sub-images correspond to particular constituent parts of the image, such as text, logos or graphic portions. Similarly, the image features could include the position, shape, colour, density or the like of particular constituent parts of the image. By identifying these as individual sub-images, constituent parts of the sample image can be compared directly with constituent parts of the reference images, increasing the likelihood that similar images are accurately identified.
A number of further features will now be described.
Typically the one or more electronic processing devices determine a first image ranking in accordance with the similarity of the first reference images to the sample image and select at least one first reference image at least partially in accordance with the first image ranking. Additionally and/or alternatively, the one or more electronic processing devices present at least some of the first reference images to a user, determine at least one selected first reference image in accordance with user input commands and identify image tags associated with the at least one selected first reference image. As part of this, the one or more electronic processing devices can present the first reference images to the user in accordance with the first image ranking. These processes therefore allow the first images most similar to the sample image to be selected as the basis for the further search, thereby enhancing the effectiveness of the further search.
The one or more electronic processing devices can determine an image tag ranking in accordance with a frequency of occurrence and select at least one image tag at least partially in accordance with the image tag ranking. The frequency of occurrence can be the frequency with which a tag occurs across one or more of the first reference images. In this regard, similar image tags may be used across multiple first reference images, in which case those tags are more likely to be relevant and are therefore presented to the user in preference to other image tags. Additionally and/or alternatively, the ranking could be based on the frequency of occurrence of the image tag across the reference images as a whole, rather than only across those reference images identified as first reference images. For example, if there are fewer reference images with the descriptor HAND than with the descriptor CIRCLE, the descriptor HAND may be more distinctive than the descriptor CIRCLE. In one example a combination of the two frequencies, such as a TF-IDF (term frequency-inverse document frequency) combination, can be used, as illustrated in the sketch below.
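By way of illustration only, the following sketch shows one way such a TF-IDF style tag ranking could be computed; the function and variable names are hypothetical and are not part of the described system.

```python
import math
from collections import Counter

def rank_image_tags(selected_image_tags, all_reference_tags):
    """Rank tags by a TF-IDF style score: frequency among the selected first
    reference images, weighted by rarity across the whole reference set."""
    # Term frequency: how often each tag appears across the selected images.
    tf = Counter(tag for tags in selected_image_tags for tag in tags)
    # Document frequency: how many reference images carry each tag overall.
    df = Counter(tag for tags in all_reference_tags for tag in set(tags))
    n_refs = len(all_reference_tags)

    scores = {
        tag: count * math.log(n_refs / (1 + df[tag]))
        for tag, count in tf.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Example: tags of three user-selected first reference images, ranked against
# a tiny, purely illustrative reference set of five tagged images.
selected = [{"CIRCLE", "HAND"}, {"HAND", "STAR"}, {"HAND"}]
reference = [{"CIRCLE"}, {"CIRCLE", "STAR"}, {"HAND"}, {"CIRCLE"}, {"STAR"}]
print(rank_image_tags(selected, reference))  # HAND ranks above STAR and CIRCLE
```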
The one or more electronic processing devices can also present the image tags associated with the at least one first reference image to the user, determine at least one selected image tag in accordance with user input commands and search the plurality of reference images using the at least one selected image tag. As part of this, the one or more electronic processing devices can present the image tags in accordance with the image tag ranking. This therefore allows the user to select the image tags that the user considers most accurately describe the sample image, again enhancing the effectiveness of the further search.
The image tags can be of any suitable form, but in one example include metadata tags.
The one or more electronic processing devices can determine a result ranking of the first and second reference images and provide the search results in accordance with the result ranking. The result ranking can be determined in accordance with the first image ranking, a second image ranking, or a combination of the first image ranking and the second image ranking, where the second image ranking is determined in accordance with the similarity of the second reference images to the image tags. Accordingly, either or both of the first and second reference images can be ranked so that more relevant reference images identified via either search technique are displayed to the user in preference to less relevant reference images.
The one or more electronic processing devices typically receive the sample image from the user, although alternatively the sample image could be retrieved from a database or the like.
The one or more electronic processing devices typically process the sample image so that it can more easily be compared with the reference images. Similarly, when reference images are initially received, the one or more electronic processing devices typically process the reference images and then create an index including the plurality of reference images, with each reference image being associated with sub-images, image features and, optionally, image tags.
When processing the images, the one or more electronic processing devices segment the image to form the sub-images. This is typically performed by determining feature clusters within the image and segmenting the image in accordance with the clusters. In particular, this can involve converting the image into a greyscale image, filtering the greyscale image to generate a filtered greyscale image, normalising the intensity of the filtered image to generate a normalised image and determining clusters within the normalised image. This allows individual constituent parts of the image to be processed separately, for example allowing text to be processed differently to graphics, making the search process more effective.
The one or more electronic processing devices typically process the image by scaling the image and sub-images, determining image features from the image and sub-images and removing image background, noise or text. Scaling is typically performed by cropping the images and sub-images to remove background and form cropped images, and resizing the cropped images to a defined image size, so that the reference images, sample images and corresponding sub-images all have a similar size, again making comparison of the images more effective.
The one or more electronic processing devices can also process the image by performing optical character recognition to detect text and removing the text from the image. The detected text can additionally be compared with the image tags, such as metadata, which typically include an indication of any text in the image, either to ensure accuracy of the character recognition process and/or for the purpose of searching image tags. When the image is a reference image, the one or more electronic processing devices also typically associate the text with the reference image in the index.
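As a hedged illustration only (the specification does not name a particular OCR engine), the sketch below shows how detected text regions could be masked out of an image using the open-source Tesseract engine via pytesseract; the function, its parameters and the white-background assumption are all assumptions made for the sketch.

```python
import cv2
import pytesseract
from pytesseract import Output

def remove_text(image_bgr, min_confidence=60):
    """Detect text with OCR and paint the detected word boxes with the assumed
    background colour, returning the text-free image and the detected text."""
    data = pytesseract.image_to_data(image_bgr, output_type=Output.DICT)
    cleaned = image_bgr.copy()
    words = []
    for i, word in enumerate(data["text"]):
        if word.strip() and float(data["conf"][i]) >= min_confidence:
            x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
            # Fill the word's bounding box with white (assumed background colour).
            cv2.rectangle(cleaned, (x, y), (x + w, y + h), (255, 255, 255), -1)
            words.append(word)
    return cleaned, " ".join(words)
```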
In addition, the one or more processing devices typically process at least one of the image and the sub-images to determine image features and determine a feature vector using the image features. This allows the sample image to be compared with the reference images by comparing feature vectors, thereby allowing more accurate matching to be performed.
In one example, the process is performed by one or more processing systems operating as part of a distributed architecture, an example of which will now be described with reference to Figure 2.
In this example, a base station 201 is coupled to a number of computer systems 203 via communications networks, such as the Internet 202 and/or one or more local area networks (LANs) 204. It will be appreciated that the configuration of the networks 202, 204 is for the purpose of example only, and in practice the base station 201 and computer systems 203 can communicate via any appropriate mechanism, such as wired or wireless connections, including, but not limited to, mobile networks, private networks such as 802.11 networks, the Internet, LANs, WANs or the like, as well as via direct or point-to-point connections such as Bluetooth.
In one example, the base station 201 includes one or more processing systems 210 coupled to one or more databases 211. The base station 201 is adapted to perform searches and to process images to create an index of the reference images. The base station can also be used to perform supporting processes, such as managing billing and other related operations. The computer systems 203 are therefore adapted to communicate with the base station 201, for example to allow sample images to be submitted, to control the search process by selecting relevant first reference images and image tags, and to view search results.
Whilst the base station 201 is shown as a single entity, it will be appreciated that the base station 201 can be distributed over a number of geographically separate locations, for example by using processing systems 210 and/or databases 211 that are provided as part of a cloud-based environment. However, the above-described arrangement is not essential and other suitable configurations could be used.
An example of a suitable processing system 210 is shown in Figure 3. In this example the processing system 210 includes at least one microprocessor 300, a memory 301, an optional input/output device 302 and an external interface 303, interconnected via a bus 304 as shown. In this example the external interface 303 can be used to connect the processing system 210 to peripheral devices, such as the communications networks 202, 204, the databases 211, other storage devices or the like. Although a single external interface 303 is shown, this is for the purpose of example only, and in practice multiple interfaces using various methods (for example Ethernet, serial, USB, wireless or the like) may be provided.
In use, the microprocessor 300 executes instructions in the form of applications software stored in the memory 301 to allow the search and related processes to be performed, as well as to allow communication with the computer systems 203. The applications software may include one or more software modules, and may be executed in a suitable execution environment, such as an operating system environment.
Accordingly, it will be appreciated that the processing system 210 may be formed from any suitable processing system, such as a suitably programmed computer system, PC, web server, network server or the like. In one particular example, the processing system 210 is a standard processing system, such as a 32-bit or 64-bit Intel Architecture based processing system, which executes software applications stored on non-volatile (for example hard disk) storage, although this is not essential. However, it will also be understood that the processing system could be any electronic processing device, such as a microprocessor, microchip processor, logic gate configuration, firmware optionally associated with implementing logic such as an FPGA (Field Programmable Gate Array), or any other electronic device, system or arrangement.
As shown in Figure 4, in one example the computer system 203 includes at least one microprocessor 400, a memory 401, an input/output device 402, such as a keyboard and/or display, and an external interface 403, interconnected via a bus 404 as shown. In this example the external interface 403 can be used to connect the computer system 203 to peripheral devices, such as the communications networks 202, 204, the databases 211, other storage devices or the like. Although a single external interface 403 is shown, this is for the purpose of example only, and in practice multiple interfaces using various methods (for example Ethernet, serial, USB, wireless or the like) may be provided.
In use, the microprocessor 400 executes instructions in the form of applications software stored in the memory 401 to allow communication with the base station 201, for example to allow images to be supplied thereto and to allow details of the search process to be displayed to the user.
Accordingly, it will be appreciated that the computer system 203 may be formed from any suitable processing system, such as a suitably programmed PC, Internet terminal, laptop computer, hand-held PC, smart phone, PDA, web server or the like. Thus, in one example, the computer system 203 is a standard processing system, such as a 32-bit or 64-bit Intel Architecture based processing system, which executes software applications stored on non-volatile (for example hard disk) storage, although this is not essential. However, it will also be understood that the computer system 203 can be any electronic processing device, such as a microprocessor, microchip processor, logic gate configuration, firmware optionally associated with implementing logic such as an FPGA (Field Programmable Gate Array), or any other electronic device, system or arrangement.
Examples of the search process will now be described in further detail. For the purpose of these examples it is assumed that the processing system 210 hosts webpages allowing the user to submit sample images and view search results. The processing system 210 is therefore typically a server which communicates with the computer systems 203 via a communications network or the like, depending on the particular network infrastructure available. To achieve this, the processing system 210 of the base station 201 typically executes applications software for hosting the webpages and for performing the searching and indexing of the reference images, with actions performed by the processing system 210 being performed by the processor 300 in accordance with instructions stored as applications software in the memory 301 and/or input commands received from a user via the I/O device 302 or commands received from the computer systems 203.
It will also be assumed that the user interacts with the processing system 210 via a GUI (Graphical User Interface) presented on the computer system 203, and in one particular example via a browser application that displays webpages hosted by the base station 201. Alternatively, however, this could be achieved using an API (Application Programming Interface) that interfaces with an existing client application. Actions performed by the computer system 203 are performed by the processor 400 in accordance with instructions stored as applications software in the memory 401 and/or input commands received from a user via the I/O device 402.
However, it will be appreciated that the above-described configuration, assumed for the purpose of the following examples, is not essential, and numerous other configurations may be used. It will also be appreciated that the partitioning of functionality between the computer systems 203 and the base station 201 may vary, depending on the particular implementation.
An example of a method of processing an image will now be described in more detail with reference to Figures 5A and 5B.
In this example an image is acquired at step 500. In the case of a reference image, this is typically obtained from an existing reference image database, for example as part of the indexing procedure described in more detail below with reference to Figure 6. In the case of a sample image, this can be submitted by a user via a suitable user interface, such as a webpage, for example as shown in Figure 10 and as described in more detail below.
At step 505 the acquired image is converted into a greyscale image, which is then filtered at step 510, for example using a Gaussian filter to smooth edges within the image. Prior to performing this step, background colour can be removed from the image using binary thresholding. At step 515 the image can be normalised by applying a local maximum filter, so that pixels with maximum intensity are set to a maximum value and pixels with minimum intensity are set to a minimum value. Additionally, further processing, such as filling holes in masks and smoothing, can also be performed, as discussed in the particular example below.
At step 520 feature clusters are determined. The manner in which this is performed will depend on the nature of the features. For example, if the image includes text, this can be identified using optical character recognition (OCR) techniques, with the letters representing particular clusters, whereas for logos, contiguous elements of the image can represent respective clusters.
At step 525 bounding boxes are drawn around the different feature clusters so as to segment the reference image into sub-images. At this stage the sub-images and image can be presented to a user, allowing the user to modify the bounding boxes and hence modify the segmentation. This allows an optional manual review of the image segmentation to be performed, which can be useful in circumstances in which different parts of the image are difficult to identify by purely automated techniques.
The images and sub-images are then cropped at step 530 to remove any extraneous background. This is typically achieved by identifying a background colour and then progressively removing rows of pixels until no background-only portions of the image remain.
Accordingly, at this stage a number of cropped images have been prepared, corresponding to the original reference image and to sub-images for the individual feature clusters. The images are then resized to a standard size at step 535, allowing images of the standard image size to be compared directly.
At step 540 features are extracted from the cropped images, with the features being used to form one or more feature vectors. A feature vector is typically derived for each cropped image, and accordingly multiple feature vectors will be determined for each reference and sample image. The feature vectors are typically indicative of features such as pixel intensities at particular locations, and the manner in which the feature vectors are generated will be appreciated by persons skilled in the art and will not therefore be described in further detail.
In any event, by processing both the sample images and the reference images using a common technique, it is ensured that the feature vectors of the sample and reference images are equivalent, allowing direct comparison to be performed without requiring additional processing of the feature vectors.
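By way of illustration only, a minimal sketch of the pre-processing pipeline described above is given below, assuming the OpenCV and NumPy libraries; the 300x300 target size, the Otsu thresholding, the connected-component clustering and the flattened-greyscale stand-in for a feature vector are choices made for the sketch and are not taken from the specification.

```python
import cv2
import numpy as np

TARGET_SIZE = (300, 300)  # assumed standard image size

def preprocess(image_bgr):
    """Greyscale -> smooth -> normalise -> cluster -> crop -> resize -> feature vectors."""
    grey = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)                # step 505
    smoothed = cv2.GaussianBlur(grey, (5, 5), 0)                      # step 510
    norm = cv2.normalize(smoothed, None, 0, 255, cv2.NORM_MINMAX)     # step 515

    # Steps 520/525: treat contiguous foreground regions as feature clusters
    # and use their bounding boxes as sub-images.
    _, mask = cv2.threshold(norm, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n_labels, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    crops = [image_bgr]                                               # the full image is always used as well
    for x, y, w, h, area in stats[1:]:                                # skip background label 0
        if area > 50:                                                 # ignore noise specks
            crops.append(image_bgr[y:y + h, x:x + w])                 # step 530 (crop)

    # Steps 535/540: resize each crop and summarise it as a feature vector.
    vectors = []
    for crop in crops:
        resized = cv2.resize(crop, TARGET_SIZE)
        grey_crop = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY).astype(np.float32)
        vectors.append(grey_crop.flatten() / 255.0)                   # stand-in for a real descriptor
    return crops, vectors
```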
An example of the process of creating an index will now be described in more detail with reference to Figure 6.
In this example reference images are received at step 600. The reference images are typically extracted from a reference database, such as a trademark database or the like. At step 605 a next image is selected, with the image being processed at step 610 to determine feature vectors, as described above with reference to Figures 5A and 5B.
At step 615 the reference image is added to an index of reference images, together with details of the sub-images and the feature vectors of the image and sub-images. In addition, any image tags associated with the image, for example in the form of metadata tags, are also stored as part of the index, although alternatively these could be stored as part of a separate index.
At step 620 it is determined whether the images are complete, and if not the process returns to step 605, allowing a next image to be selected. Otherwise, once the index is complete, it can be used at step 625 for performing searches.
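Continuing the illustration, the indexing loop of Figure 6 and a simple visual search over the resulting index might be sketched as follows; the in-memory index layout and the L1 nearest-distance scoring are assumptions for the sketch, and preprocess() is the hypothetical helper from the previous sketch.

```python
import numpy as np

def build_index(reference_images):
    """reference_images: iterable of (image_id, image_bgr, image_tags) tuples
    drawn from the trademark database. Returns a simple in-memory index."""
    index = []
    for image_id, image_bgr, image_tags in reference_images:          # steps 600-605
        crops, vectors = preprocess(image_bgr)                        # step 610
        index.append({                                                # step 615
            "id": image_id,
            "tags": set(image_tags),          # metadata tags stored with the entry
            "vectors": np.vstack(vectors),    # one row per image / sub-image
        })
    return index                                                      # step 625

def image_search(index, sample_bgr, top_n=20):
    """Rank reference images by the smallest L1 distance between any sample
    feature vector and any of the entry's image / sub-image feature vectors."""
    _, sample_vectors = preprocess(sample_bgr)
    scores = []
    for entry in index:
        dists = [np.abs(entry["vectors"] - v).sum(axis=1).min() for v in sample_vectors]
        scores.append((min(dists), entry))
    scores.sort(key=lambda pair: pair[0])
    return [entry for _, entry in scores[:top_n]]
```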
An example of the process for performing a search will now be described in more detail with reference to Figures 7A and 7B.
In this example a sample image is uploaded to the processing system 210, for example using an interface such as a webpage presented via a browser on the user's computer system 203. An example interface is shown in Figure 10, including a sample image 1001 and a number of options 1002 that can be selected to control the search process. The options typically include text searching, image colour inversion, sub-image segmentation, and filtering by status (for example, searching for reference images corresponding to trademarks with a particular status), by class, or by dataset (such as trademarks of different countries, web images, app store images or online retail images). Each option typically allows further options to be controlled, with the sub-image segmentation being displayed to allow the segmentation to be adjusted. A control 1003 allows the image to be uploaded and searched.
At step 705 the sample image is processed using the processing techniques described above with reference to Figures 5A and 5B, thereby determining feature vectors. These feature vectors are then searched against the reference images included in the index at step 710. This therefore involves comparing the feature vectors of the sample image with the feature vectors of the reference images in order to identify, at step 715, first reference images similar to the sample image.
At step 720 the first images are displayed to the user via a suitable user interface, such as a webpage presented via a browser on the user's computer system 203. An example of such an interface is shown in Figure 11. As shown, this displays the sample image 1101 and a number of first reference images 1102. A number of search options 1103 can also be provided, such as filters for filtering the results, for example by class, or for displaying text or image search results. This allows the user to review the first reference images and then select those images considered relevant, for example as shown in Figure 12.
At step 730 the processing system 210 obtains the image tags associated with the selected first reference images and then ranks the image tags at step 735. In this regard, it will be appreciated that each of the selected reference images will have one or more image tags, and that common image tags may be used across multiple images. Accordingly, a frequency analysis can be performed to determine the relative frequency of occurrence of each of the image tags, thereby allowing the image tags to be ranked.
At step 740 the ranked image tags are displayed to the user via a user interface, for example as shown in Figure 13. In this example the interface includes the sample image 1301, a list of image tags 1302 and a list of selected images 1303. Search options 1304 with drop-down fields can also be presented, allowing the results to be filtered, with status, class and text fields being shown for the purpose of example only.
This allows the user to select those image tags that appear to best suit the sample image, with the image tags then being used at step 750 to perform a further search of the reference images.
Once relevant second images have been identified at step 755, the first and second images can be ranked, for example on the basis of the degree of similarity, shared image tags or the like. In this regard, it will be appreciated that reference images falling within both the first and second image groups will typically be prioritised. The results can then be displayed to the user at step 765, for example as shown in Figure 14.
Accordingly, the above-described process allows searches to be performed based on both a sample image and descriptors associated with the reference images.
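Purely as an illustrative sketch of how the stages fit together, the two-stage search of Figures 7A and 7B might be wired up as follows; the user-selection steps of Figures 11 to 13 are represented by callback arguments, and all helper names refer to the hypothetical functions in the earlier sketches rather than to any actual implementation.

```python
def two_stage_search(index, sample_bgr, select_images, select_tags, top_n=20):
    """Stage 1: image-recognition search; Stage 2: tag search using the tags
    of the selected first reference images; results from both are combined."""
    # Steps 705-720: visual search, then the user selects relevant hits.
    first_refs = image_search(index, sample_bgr, top_n)
    chosen_refs = select_images(first_refs)

    # Steps 730-750: rank the tags of the chosen images, let the user pick
    # some, then search the whole index by tag.
    ranked_tags = rank_image_tags(
        [entry["tags"] for entry in chosen_refs],
        [entry["tags"] for entry in index],
    )
    chosen_tags = set(select_tags(ranked_tags))
    second_refs = [e for e in index if e["tags"] & chosen_tags]

    # Steps 755-765: images found by both searches are listed first.
    first_ids = {e["id"] for e in first_refs}
    second_ids = {e["id"] for e in second_refs}
    unique = {e["id"]: e for e in first_refs + second_refs}
    return sorted(
        unique.values(),
        key=lambda e: 0 if e["id"] in first_ids and e["id"] in second_ids else 1,
    )
```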
A specific example of an iterative search combining image recognition and metadata will now be described.
This example describes a system and method for searching images in one or more databases in an iterative fashion, such that visually similar images can be identified using computer-based image recognition algorithms, text-based metadata describing the most relevant result images can be identified, and combined image and text searches can be performed in order to improve the relevance of the search results.
As outlined in Figure 8, one example of the proposed system includes the step of a user-initiated image search 800, whereby a query image is uploaded using the interface shown in Figure 10. The system performs image recognition 805 to identify the images in the database most similar to the query image and, at 810, presents those results to the user as shown in Figure 11. At 815 the user can select, via the user interface, all or some of the results that he/she finds most relevant, as shown in Figure 12. The system then extracts the text-based image tags from the metadata of those selected results at 820 in order to list and rank the most relevant descriptors for those results, which are returned at 825 and presented to the user as shown in Figure 13. As shown in Figure 14, the user can then, at 830, combine those text-based image tags with his/her search image to initiate a new search with combined text and image.
A similar process is then used to perform the combined image and text search, following the steps outlined in Figure 9, whereby a query image and query text (text-based image tags) are supplied to the search system at 900. At 905 the search system performs an image recognition search via its image recognition subsystem, as previously described, and at 910 it additionally performs a text search against the database of image metadata (via its metadata text search subsystem). The resulting images from the two subsystem searches are combined at 915, returned at 920 and presented to the user as shown in Figure 14. The remainder of the system is as previously described with reference to Figure 8. Via steps 925 to 940 of this process, the user can iteratively refine the search with additional metadata.
It will be apparent that Figure 9 is a generalised form of Figure 8, whereby the user can start with a query that already combines both an image and text.
The components required for this system include: an image recognition search subsystem; a metadata text search subsystem; a method for combining and ranking the image results from each search subsystem; and a method for combining and ranking the text-based image tags from the selected results. Examples of these components are described below.
There are many potential image recognition algorithms that could be used for the image recognition search subsystem, whereby a query image is compared with a database of known images. A variety of image recognition algorithms are reviewed in Zhao, Chellappa and Phillips, "Face recognition: A literature survey" (2003). One possible approach to image recognition is based on the bag-of-words method. The bag-of-words method is derived from natural language processing, where the order of the words is ignored when analysing documents. In computer vision, the bag-of-words method inspired a similar idea for image representation, where the exact order and position of the extracted image features are not preserved.
According to one example, this system uses a multi-region probabilistic histogram approach for image recognition. Sanderson et al. ("Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference", International Conference on Biometrics, Lecture Notes in Computer Science, Vol. 5558, pp. 198-208, 2009) (hereinafter "Sanderson") describe an example multi-region probabilistic histogram technique. The multi-region probabilistic histogram approach proposes that the image be divided into several large regions. In one example, a closely cropped image is divided into a 3x3 grid, resulting in nine regions roughly corresponding to the eyes, forehead, nose, cheeks, mouth and chin. Within each region, image features are extracted from smaller blocks. Sanderson proposes a method of extracting Discrete Cosine Transform (DCT) features from 8x8 pixel blocks and normalising the coefficients, retaining only the lower-frequency coefficients (the first 16) and discarding the first constant coefficient, resulting in 15 remaining coefficients.
During training, a visual dictionary is built by using a mixture-of-Gaussians approach to cluster the extracted DCT features and generate a likelihood model of visual words, expressed in terms of the principal Gaussian of each Gaussian cluster and an associated probability distribution function. During evaluation, each extracted DCT feature is compared with the visual dictionary to calculate the posterior probability of the feature vector for each visual word in the visual dictionary. This produces a probabilistic histogram vector whose dimension equals the number of Gaussians in the visual dictionary. The system generates a probabilistic histogram for each block and averages the histograms within each image region. The image feature signature is the concatenation of these regional histograms and is the image feature representing the object shown in the image. Two images can be compared by comparing their two image feature signatures using a distance/similarity measure to determine whether they represent the same object. Sanderson proposes calculating the L1 norm between the two signatures; the smaller the distance, the more likely the two images represent the same object.
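A highly simplified sketch of this kind of multi-region probabilistic histogram signature is shown below, assuming scikit-learn's Gaussian mixture model as the visual dictionary and SciPy's DCT. The block size, coefficient count and 3x3 grid follow the description above, but the low-frequency coefficient selection and everything else here are illustrative assumptions rather than the cited implementation.

```python
import numpy as np
from scipy.fftpack import dct
from sklearn.mixture import GaussianMixture

def block_dct_features(grey, block=8):
    """15-D DCT features from 8x8 blocks (first constant coefficient dropped)."""
    feats = []
    for y in range(0, grey.shape[0] - block + 1, block):
        for x in range(0, grey.shape[1] - block + 1, block):
            patch = grey[y:y + block, x:x + block].astype(np.float64)
            coeffs = dct(dct(patch.T, norm="ortho").T, norm="ortho")  # 2-D DCT
            low = coeffs[:4, :4].flatten()      # 16 low-frequency coefficients
            feats.append(low[1:])               # drop the DC term -> 15-D
    return np.array(feats)

def mrh_signature(grey, gmm, grid=3):
    """Average the per-block posterior histograms within each cell of a 3x3
    grid and concatenate the regional histograms into one signature."""
    h, w = grey.shape
    regions = []
    for gy in range(grid):
        for gx in range(grid):
            cell = grey[gy * h // grid:(gy + 1) * h // grid,
                        gx * w // grid:(gx + 1) * w // grid]
            probs = gmm.predict_proba(block_dct_features(cell))
            regions.append(probs.mean(axis=0))
    return np.concatenate(regions)

# Training the "visual dictionary" and comparing two images with the L1 norm:
# gmm = GaussianMixture(n_components=32).fit(block_dct_features(training_grey))
# distance = np.abs(mrh_signature(a, gmm) - mrh_signature(b, gmm)).sum()
```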
For the metadata text search subsystem, there are many open-source and commercial text search systems. Well-known open-source systems include, among others, Lucene (a full-text search engine), SOLR and ElasticSearch (an open-source distributed search engine). For example, Lucene traverses the metadata in the database for each term presented in the query and places matching documents into a heap of size K in order to calculate and return the top K document matches (in this case each document is the metadata associated with an image in the database, so in essence the top K images are returned based on metadata matching).
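The fragment below sketches the essence of such a top-K metadata match over the hypothetical index used in the earlier sketches; it is not Lucene code, and the shared-term count used as the score is purely for illustration.

```python
import heapq

def text_search(index, query_tags, k=10):
    """Return the K index entries whose metadata tags share the most terms
    with the query, using a bounded heap as described above."""
    query = set(query_tags)
    heap = []  # (score, id, entry) triples; smallest score sits at the top
    for entry in index:
        score = len(query & entry["tags"])
        if score:
            item = (score, entry["id"], entry)
            if len(heap) < k:
                heapq.heappush(heap, item)
            else:
                heapq.heappushpop(heap, item)   # keep only the K best matches
    return [entry for _, _, entry in sorted(heap, reverse=True)]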
Both the image and text search subsystems typically return a score indicating how well a match relates to the query relative to the other results.
Different approaches can be used to combine and rank the image recognition results and the text search results. Assuming the databases being searched are identical or overlapping across both the image and text search subsystems, one method for combining the search results is to check whether the matches satisfy different criteria and to assign a priority to each criterion. One example is to prioritise results that co-occur (that is, results found by both the image and text search subsystems). The remaining results, which do not co-occur, can be combined in a number of ways based on the search score of each result. For example, the results can be interleaved in rank order by sorting according to the score within each set. Alternatively, the system can attempt to normalise the scores across the different search subsystems (for example, scaling them linearly between 0 and 1), or convert the scores into probabilities based on known distributions, as described in Mau 2012, "Gaussian Probabilistic Confidence Score for Biometric Applications". If a threshold is applied to the image and text results based on the score or probability of the returned images, the above approaches (sorting, interleaving, normalisation) can still be used. An even simpler approach is to show one set of results after, or alongside, the other.
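As one hedged illustration of these options, the sketch below prioritises results returned by both subsystems and then orders the remainder by their min-max normalised scores (the linear 0-1 scaling mentioned above); the (image_id, score) data layout is assumed, as is the convention that higher scores mean better matches in both subsystems.

```python
def normalise(scored):
    """Min-max scale a list of (image_id, score) pairs to the range 0-1."""
    if not scored:
        return {}
    values = [s for _, s in scored]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return {image_id: (s - lo) / span for image_id, s in scored}

def combine_results(image_hits, text_hits):
    """image_hits / text_hits: lists of (image_id, score), higher = better.
    Results found by both subsystems come first; the rest follow by score."""
    img = normalise(image_hits)
    txt = normalise(text_hits)
    both = [(i, img[i] + txt[i]) for i in img.keys() & txt.keys()]
    only = [(i, img[i]) for i in img.keys() - txt.keys()]
    only += [(i, txt[i]) for i in txt.keys() - img.keys()]
    both.sort(key=lambda p: p[1], reverse=True)
    only.sort(key=lambda p: p[1], reverse=True)
    return [i for i, _ in both + only]
```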
In the proposed system, after the user selects the image results he/she considers most relevant (as shown in Figure 12), the system uses the text-based image label metadata from those results to suggest text-based image labels to be added to the search query (as shown in Figure 13), thereby refining the results shown in Figure 14.
The list of text-based image constructions presented to the user in Figure 13 can be ordered and ranked simply by frequency of occurrence in the metadata of the returned images. The ranking of the text-based image constructions can also be weighted by a measure of how unique (or discriminative) a particular construction is. For example, the words "CIRCLE" and "FOUR" are more common descriptors than the word "QUATREFOIL", so QUATREFOIL is likely to narrow the search results more effectively. One such weight for a construction in the metadata may be the number of results carrying that construction divided by the total number of results in the database.
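The following sketch illustrates one possible way of ranking the suggested labels: frequency among the selected results weighted by an inverse-document-frequency style uniqueness term. The exact weighting is an assumption consistent with the example above, not the only form the weight may take.

```python
import math
from collections import Counter

def rank_suggested_labels(selected_metadata, database_label_counts, database_size):
    """Rank candidate text labels drawn from the user-selected results.

    selected_metadata: list of label lists, one per selected image.
    database_label_counts: label -> number of database images carrying it.
    database_size: total number of images in the database.
    Labels are scored by their frequency among the selected results,
    weighted by an IDF-style uniqueness term, so rare constructions
    such as "QUATREFOIL" outrank common ones such as "CIRCLE".
    """
    freq = Counter(label for labels in selected_metadata for label in labels)

    def score(label):
        df = database_label_counts.get(label, 1)
        uniqueness = math.log(database_size / df)
        return freq[label] * uniqueness

    return sorted(freq, key=score, reverse=True)
```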
Typically in trademark databases, these text-based image constructions or descriptors are provided by the trademark office as metadata. Clearly, this system can easily be extended to image databases generally, not only trademarks.
For other databases, text-based image descriptions can be obtained by various means, including image processing techniques such as object recognition, optical character recognition and color and shape filtering, manually tagged information, or metadata tags of the image, EXIF data or tags surrounding the image (for example, HTML tags).
In addition, a text-based image label can be a word, multiple words, a phrase, a color or a location (including coordinates).
A variant of the system is an automated version of this iterative search system that requires no user. In this variant, instead of letting the user select the most relevant search results, the system automatically takes the top search results (for example, the top N results or, as described earlier, the results whose scores pass a threshold). A ranked list of the most relevant text-based image labels is generated from those top results. The system can then select the top K most frequent text-based image labels (or apply a frequency threshold), add those labels, and complete a subsequent image-plus-text query. For this automated system, the iterative search can stop based on certain predefined rules (for example, when the number of overlapping matches between the image search and the text search stops increasing between successive searches).
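A minimal sketch of such an automated iterative loop is shown below. The callables image_search, text_search and labels_for are placeholders for the image and text subsystems (their signatures are assumptions), and the stopping rule follows the overlap criterion mentioned above.

```python
from collections import Counter

def automated_iterative_search(query_image, image_search, text_search,
                               labels_for, n_top=20, k_labels=3, max_rounds=5):
    """Automated variant of the iterative search described above.

    image_search(query_image) -> ranked list of image ids
    text_search(labels)       -> ranked list of image ids
    labels_for(image_id)      -> text-based image labels (metadata)
    The loop stops when the overlap between the image-search and
    text-search result sets stops increasing.
    """
    image_hits = image_search(query_image)[:n_top]
    chosen_labels, last_overlap = [], -1
    for _ in range(max_rounds):
        freq = Counter(l for iid in image_hits for l in labels_for(iid))
        chosen_labels = [l for l, _ in freq.most_common(k_labels)]
        text_hits = text_search(chosen_labels)[:n_top]
        overlap = len(set(image_hits) & set(text_hits))
        if overlap <= last_overlap:   # predefined stopping rule
            break
        last_overlap = overlap
    return image_hits, chosen_labels
```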
The system can also include a pre-processing step that segments the relevant component parts of the original query image, so that multiple query sequences can be initiated, each starting from a segmented component-part sub-image.
Another variant is to allow the iterative search to add text-based image labels or the selected result images themselves (that is, a search that uses multiple images plus multiple text terms as the query). This can readily be accomplished by combining multiple single-image searches or by using an image-set matching method (Harandi et al., 2011, "Graph embedding discriminant analysis on Grassmannian manifolds for improved image set matching").
Image processing techniques such as object recognition, optical character recognition and color and shape filtering can also be applied to both the indexed image database and the query image to obtain more text descriptors and therefore more text-based metadata searches. In addition, image pre-processing for segmentation can be useful, for example separating the text region of an image from the figurative portion of the image.
A particular example of figurative trademark matching using image processing will now be described.
This example broadly provides a system and method for processing images from one or more databases (for example, a figurative trademark database) so that visually similar images can be identified and ranked using computer-based image recognition algorithms.
Trademark databases present another challenge: the rules governing image formats have changed over the years. Moreover, these rules are minimal, so there are many variations in how the designs are provided in a file. For example, many earlier figurative trademarks were scanned from paper documents, with administrative frames, borders and text at the edges that are not part of the design. In addition, many images in trademark databases contain multiple designs in a single image file. A further compounding problem is that many figurative trademarks contain both a logo and a stylized name. Although the name is considered part of the design, searchers also often consider the similarity of the logo component separately.
All of this means that a system allowing visually similar image search needs to perform significant pre-processing of the images in order to isolate the relevant component parts for comparison, as required by the similarity rules defined for trademark confusion (rather than by arbitrary judgement). Because there are so many registered trademarks (typically millions), this pre-processing must be largely automated.
One example of the proposed system includes the following steps: automated pre-processing of images from the trademark database by segmenting sub-images within each image; scaling the images and sub-images to a predefined size; subjecting each resulting image and sub-image to a feature extraction step, whereby the design in the image is summarized as "features"; and then indexing the images, sub-images and features into a database for searching.
Another example of the proposed system includes automated pre-processing of images from the trademark database by segmenting sub-images within each image, followed by a manual step in which a person reviews those sub-images, then scaling the images and sub-images to a predefined size, subjecting each resulting image and sub-image to a feature extraction step whereby the design in the image is summarized as "features", and then indexing the images, sub-images and features into a database for searching.
The step of pre-processing the images from the trademark database by segmenting sub-images can take various forms.
One specific example involves keeping the original image and segmenting sub-images from it.
Another example is to keep the original image, then segment the text portions of the image, then mask (cover) those text portions in the original image, and segment sub-images from the masked original image.
Another example is as above, except that all images and sub-images are first trimmed.
The trimming step can be carried out by determining or estimating the background pixel color (for example, by assuming that the top-left corner of the image is the background color, or by averaging the four corners or the border of the image), or by simply assuming that the background should be white or a shade of white (that is, intensity values above 240 but below 255, where 255 is pure white). The image is then cropped inward from all four edges until one or more pixels do not match the background pixel color as defined above. The image is then cropped to the resulting bounding box, returning a trimmed image.
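By way of example, a minimal sketch of such a trimming step might look as follows, assuming the background color is estimated by averaging the four corners (one of the options described above) and using an assumed tolerance value for deciding whether a pixel matches the background.

```python
import numpy as np

def trim_background(img, tol=15):
    """Trim an image to the bounding box of its non-background pixels.

    img: grayscale image as a 2-D numpy array with values 0-255.
    The background color is estimated by averaging the four corner
    pixels; `tol` is an assumed tolerance for the background match.
    """
    corners = [img[0, 0], img[0, -1], img[-1, 0], img[-1, -1]]
    background = float(np.mean(corners))
    mask = np.abs(img.astype(float) - background) > tol   # non-background pixels
    if not mask.any():
        return img                                        # nothing but background
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```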
However, other techniques can alternatively be used, such as iteratively raising a threshold on the grayscale image until the number of pixels on either side of the threshold stops changing. The side with the larger number of pixels is then chosen as the background color, and the background is then removed using a mask of those pixels.
One example of how the text portions of an image can be detected and segmented is a variant of the constrained run-length algorithm originally proposed by Wahl, Wong and Casey ("Block segmentation and text extraction in mixed text/image documents", Computer Graphics and Image Processing, 1982). Another method is a variant of the stroke width transform (SWT) algorithm proposed by Epshtein, Ofek and Wexler ("Detecting text in natural scenes with stroke width transform", Computer Vision and Pattern Recognition, 2010).
When text is detected, that portion is cropped and saved as a sub-object. The corresponding region in the original image is then covered, for example by creating a new image in which the original text region is filled with the image background color (usually white). In one example, text detection is performed against a white list of expected text items. For example, the metadata associated with an image generally includes an indication of any text appearing in the image, and this can therefore be used to build a white list, which is then compared with the result of the OCR process to detect text easily and accurately.
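A minimal sketch of white-list-based text removal is shown below, assuming the Tesseract OCR engine accessed via pytesseract; the masking color and the exact word-matching rule are illustrative assumptions.

```python
from PIL import Image, ImageDraw
import pytesseract  # assumes the Tesseract OCR engine is installed

def remove_whitelisted_text(image_path, whitelist, background="white"):
    """Detect expected words via OCR and cover them in the image.

    whitelist: words expected in the image, e.g. taken from the
    trademark's word metadata. Returns the masked image and a record
    of which expected words were actually found and removed.
    """
    img = Image.open(image_path).convert("RGB")
    data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    draw = ImageDraw.Draw(img)
    expected = {w.upper() for w in whitelist}
    found = []
    for i, word in enumerate(data["text"]):
        if word.strip().upper() in expected:
            x, y, w, h = (data[k][i] for k in ("left", "top", "width", "height"))
            draw.rectangle([x, y, x + w, y + h], fill=background)  # cover the text
            found.append(word.strip())
    return img, found
```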
An example of how to segment multiple distinct sub-images (that is, multiple designs in the original image or in an image from which the text portions have been removed, or multiple component parts of a design in an image file) involves first converting the image to grayscale and then using a combination of the following steps: a Gaussian filter to smooth edges (linking nearby adjacent shapes together); a local-maximum filter, which sets all pixels that are the maximum within their neighborhood to the maximum value (white); binary opening and closing operations on the resulting mask to smooth edges and fill holes in the mask; and then thresholding the mask to obtain clusters and outputting the bounding boxes of those clusters. These boxes can be ranked or filtered based on size requirements, and finally trimmed and saved as sub-images. However, it will be appreciated that other techniques can be used.
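By way of illustration, a rough OpenCV sketch of such a segmentation pipeline is shown below. The kernel sizes, thresholds, minimum-area filter and the assumption of dark designs on a light background are illustrative choices rather than disclosed values.

```python
import cv2
import numpy as np

def segment_sub_images(image, min_area=400):
    """Segment candidate sub-images (designs) from a trademark image.

    A rough sketch of the pipeline described above: grayscale conversion,
    Gaussian smoothing, local-maximum (dilation) filtering, thresholding,
    binary close/open to clean the mask, then bounding boxes of the
    remaining clusters, filtered by size and sorted largest first.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    smoothed = cv2.GaussianBlur(gray, (9, 9), 0)
    # Grayscale dilation acts as a local-maximum filter, linking nearby shapes;
    # the image is inverted so dark design pixels become bright.
    local_max = cv2.dilate(255 - smoothed, np.ones((15, 15), np.uint8))
    _, mask = cv2.threshold(local_max, 40, 255, cv2.THRESH_BINARY)
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)   # fill holes
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)    # remove specks
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours]
    boxes = [b for b in boxes if b[2] * b[3] >= min_area]    # drop small objects
    boxes.sort(key=lambda b: -b[2] * b[3])
    return [image[y:y + h, x:x + w] for x, y, w, h in boxes]
```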
Once the images and sub-images (both text and design) have been trimmed, modified and scaled, features can be extracted.
Different features can be applied to different types of image. For example, images cropped as text can be treated differently so that optical character recognition (OCR) can be performed, while different features can be used for "image-only" images.
One possible approach to image recognition is based on the bag-of-words method. The bag-of-words method derives from natural language processing, where the order of words is ignored when analyzing documents. In computer vision, the bag-of-words method has inspired a similar idea for image representation, in which the exact positions of the extracted image features are not preserved.
According to one example, the system uses a multi-region probabilistic histogram method for image recognition. Sanderson et al. ("Multi-Region Probabilistic Histograms for Robust and Scalable Identity Inference", International Conference on Biometrics, Lecture Notes in Computer Science, vol. 5558, pp. 198-208, 2009) (hereinafter "Sanderson") describe an exemplary multi-region probabilistic histogram technique. The multi-region probabilistic histogram method proposes dividing the image into several large regions. According to one example, a closely cropped image is divided into a 3 x 3 grid, producing nine regions that correspond roughly to the eyes, forehead, nose, cheeks, mouth and chin. Within each region, image features are extracted from smaller blocks. Sanderson proposes extracting discrete cosine transform (DCT) features from 8 x 8 pixel blocks and normalizing the coefficients, retaining only the lower-frequency coefficients (the first 16) and discarding the first, constant coefficient (leaving 15 coefficients).
During training, a visual dictionary is built by clustering the extracted DCT features with a mixture-of-Gaussians method, producing a likelihood model of visual words expressed by the principal Gaussian and associated probability distribution function of each Gaussian cluster. During evaluation, each extracted DCT feature is compared against the visual dictionary by computing the posterior probability of the feature vector for every visual word in the dictionary. This yields a probability histogram vector whose dimensionality equals the number of Gaussians in the visual dictionary. The system generates a probability histogram for each block and averages the histograms within each image region. The image feature signature is the concatenation of these regional histograms and is the image feature representing the object in the image. Two images can be compared by applying a distance/similarity measure to their two image feature signatures to determine whether they represent the same object. Sanderson proposes computing the L1 norm between two signatures; the smaller the distance, the more likely the two images represent the same object.
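The block-level DCT feature extraction and the L1 signature comparison can be sketched as follows. The row-major coefficient ordering and the per-block normalization are simplifying assumptions rather than Sanderson's exact procedure (which uses a zig-zag ordering).

```python
import numpy as np
from scipy.fftpack import dct

def block_dct_features(region, block=8, keep=16):
    """Extract DCT features from 8x8 blocks of a grayscale image region.

    Each block yields a 2-D DCT; the first `keep` low-frequency
    coefficients are retained and the constant (DC) coefficient dropped,
    leaving 15 normalized coefficients per block.
    """
    feats = []
    h, w = region.shape
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = region[y:y + block, x:x + block].astype(float)
            coeffs = dct(dct(patch, axis=0, norm="ortho"), axis=1, norm="ortho")
            vec = coeffs.flatten()[:keep][1:]       # drop the DC coefficient
            norm = np.linalg.norm(vec) or 1.0
            feats.append(vec / norm)
    return np.array(feats)

def signature_distance(sig_a, sig_b):
    """L1 distance between two concatenated region-histogram signatures."""
    return float(np.abs(np.asarray(sig_a) - np.asarray(sig_b)).sum())
```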
According to one example, the system performs automatic object detection to detect objects in images (for example, logos, products and brands). Object matching generally involves detecting and extracting distinctive features in the image. To perform reliable object recognition, it is important that the features extracted from the image remain detectable under changes in image scale, noise, illumination and viewpoint. The system detects points that are usually located in high-contrast areas of the image, such as object edges.
According to one example, the system uses the scale-invariant feature transform (SIFT) keypoint detection technique, which involves computing the maxima and minima of the results of a series of difference-of-Gaussian functions applied to progressively smoothed/blurred versions of the image in scale space. The system assigns a dominant orientation to each keypoint and analyzes the gradient magnitudes and orientations to determine a feature vector. The feature vectors can then be converted into a feature histogram using an approach similar to the multi-region histogram method described above, by comparing each feature vector extracted from the image with a visual dictionary of features and storing the resulting probability histograms. The system further matches features between images by comparing those features across images using a nearest-neighbor search, in order to find a match percentage above an acceptable threshold.
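A minimal sketch of such nearest-neighbor keypoint matching, using OpenCV's SIFT implementation together with Lowe's ratio test, might be as follows. The ratio value and the way the match percentage is computed are assumptions, not disclosed parameters.

```python
import cv2

def sift_match_percentage(img_a, img_b, ratio=0.75):
    """Match SIFT keypoints between two grayscale images.

    Returns the percentage of "good" nearest-neighbor matches relative
    to the smaller keypoint set, which can then be compared against an
    acceptance threshold as described above.
    """
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0.0
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des_a, des_b, k=2)
    good = [pair[0] for pair in knn
            if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance]
    return 100.0 * len(good) / max(1, min(len(kp_a), len(kp_b)))
```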
The overall process can therefore be as shown in Figure 15. Broadly, this includes the steps of building the search engine 1500, which involves pre-processing each image with one or more of the following steps. At step 1505, background and noise removal is performed, involving pre-processing to detect the background color, produce an inverted copy or remove the background, and remove noise. At step 1510, metadata-based OCR is performed: words known to be in the trademark from the related metadata are detected via OCR (that is, using a white list). If detected, the words are removed, and a record is kept of whether the expected words were found and removed.
At 1515, segmentation is performed, which involves segmenting blobs once the word text has been removed from the image. Nearby objects are grouped and small objects are deleted, with the result that the logo component of the trademark is cropped out. At 1520, feature extraction is performed on each of the original and segmented images, with the images processed as needed (for example, trimming, grayscale conversion, scaling) and features extracted as described above. At 1530, after all images have been processed, the features are organized into data arrays and loaded into the image search workers.
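At a high level, the offline indexing flow of steps 1505 to 1530 can be sketched as follows. The callables passed in stand for the pre-processing, segmentation and feature extraction steps described above; their signatures are assumptions for illustration only.

```python
def build_index(image_records, preprocess, segment, extract_features):
    """Offline indexing pipeline sketched from steps 1505-1530.

    image_records: iterable of (image_id, image, metadata) tuples.
    preprocess, segment, extract_features: callables standing in for
    background/noise removal plus OCR-based text removal, blob
    segmentation, and feature extraction respectively.
    """
    index = []
    for image_id, image, metadata in image_records:
        cleaned = preprocess(image)                 # 1505/1510: cleanup and text removal
        parts = [cleaned] + list(segment(cleaned))  # 1515: segmentation into sub-images
        for part in parts:
            index.append({
                "image_id": image_id,
                "metadata": metadata,
                "features": extract_features(part),  # 1520: feature extraction
            })
    return index  # 1530: organized features ready to load into search workers
```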
As shown in Figure 16, a similar set of steps is performed at search time. In this case, each sample image is pre-processed at 1600 with one or more of the following steps.
At 1605, background and noise removal is performed, involving pre-processing to detect the background color, produce an inverted copy or remove the background, and remove noise. At 1610, OCR is performed to detect words; any detected words are removed, and a record is kept of whether the expected words were found and removed.
At 1615, segmentation is performed to segment blobs once the word text has been removed from the image. Nearby objects are grouped and small objects are deleted, with the result that the logo component of the trademark is cropped out. At this stage, at 1620, the processed image can optionally be presented to the user to allow the user to approve, modify or create segments.
At 1625, feature extraction is performed, with the image processed as needed, for example trimmed, converted to grayscale, resized and so on, before the features are extracted.
At 1630, the search is performed by the image search workers, and the results are organized and returned to the user.
This process and the associated workflow are further shown in Figure 17.
In this example, an image 1701 is uploaded to the server 210, where, at 1702, the image is processed, for example by performing OCR, segmentation and feature extraction. As part of this process, user input can be sought at 1703, for example to guide the segmentation process.
After processing, the image is forwarded at 1705 to one or more search modules 1704. In this regard, as will be appreciated by those skilled in the art, each module may hold a portion of the entire set of reference images, so that processing of the entire set can be performed by multiple modules in parallel.
The results are then organized, with metadata in the form of image labels obtained at 1706 from the text metadata stored in the index 1707. At 1708, the results, including the metadata, are presented to the user, who at 1709 selects the relevant images and/or metadata. At 1710, the user selection is uploaded to allow a text search to be performed based on the text metadata 1707.
The results are combined at 1711 and provided to the user at 1712, allowing steps 1709 to 1712 to be repeated as needed so that the results can be further refined.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers or steps but not the exclusion of any other integer or group of integers.
Persons skilled in the art will appreciate that numerous variations and modifications will become apparent. All such variations and modifications which become apparent to persons skilled in the art should be considered to fall within the broad spirit and scope of the invention as set out above.

Claims (15)

1. Apparatus for use in searching a plurality of reference images, the apparatus including one or more electronic processing devices that:
a) acquire at least one image;
b) process the image to determine a plurality of sub-images and a plurality of image features associated with the image and/or the sub-images; and
c) perform an image search using the image, the sub-images and the image features, wherein the image is at least one of a sample image and one of the plurality of reference images, and wherein the search is performed at least in part by searching the plurality of reference images to identify reference images similar to the sample image.
2. Apparatus according to claim 1, wherein the one or more electronic processing devices create an index including the plurality of reference images, each reference image being associated with a plurality of sub-images and a plurality of image features.
3. Apparatus according to claim 1, wherein the one or more electronic processing devices process the sample image by segmenting the image to form the sub-images.
4. Apparatus according to claim 3, wherein the one or more electronic processing devices segment the image by:
a) determining a plurality of feature clusters in the image; and
b) segmenting the image in accordance with the clusters.
5. Apparatus according to claim 3, wherein the one or more electronic processing devices segment the image by:
a) converting the image to a grayscale image;
b) filtering the grayscale image to generate a filtered grayscale image;
c) normalizing the image intensity of the filtered grayscale image to generate a normalized image; and
d) determining a plurality of clusters in the normalized image.
6. Apparatus according to claim 1, wherein the one or more electronic processing devices process the image by at least one of:
a) scaling the image and the sub-images;
b) determining a plurality of image features from the image and the sub-images; and
c) removing at least one of:
i) an image background;
ii) noise; and
iii) text.
7. Apparatus according to claim 6, wherein the one or more electronic processing devices scale the images by:
a) cropping the images and sub-images to remove background and form a plurality of cropped images; and
b) resizing the cropped images to a defined image size.
8. Apparatus according to claim 1, wherein the one or more electronic processing devices process the image by:
a) performing optical character recognition to detect text; and
b) removing the text from the image.
9. Apparatus according to claim 8, wherein, when the image is a reference image, the one or more electronic processing devices associate the text with the reference image in an index.
10. Apparatus according to claim 1, wherein the one or more electronic processing devices:
a) process at least one of the image and the sub-images to determine a plurality of image features; and
b) determine a feature vector using the image features.
11. A method for use in searching a plurality of reference images, the method including:
a) acquiring at least one image;
b) processing the image to determine a plurality of sub-images and a plurality of image features associated with the image; and
c) performing an image search using the image, the sub-images and the image features, wherein the image is at least one of a sample image and one of the plurality of reference images, and wherein the search is performed at least in part by searching the plurality of reference images to identify reference images similar to the sample image.
12. A method of performing an image search, the method including the following steps:
a) a user uploading a query image to a search engine;
b) the search engine identifying a plurality of visually similar matching images in a database using image recognition;
c) presenting a plurality of matching image results to the user;
d) the user selecting all or some of those matching image results as the most relevant matching image results;
e) the search system extracting the metadata of the selected results to collate and rank the most relevant image labels;
f) presenting a list of image labels to the user; and
g) presenting the user with the option of a combined image and text search based on one or more of the image labels.
13. A search system for performing an image search, the search system including a search engine, wherein:
a) a user uploads a query image to the search engine;
b) the search engine identifies a plurality of visually similar matching images in a database using image recognition;
c) a plurality of matching image results are presented to the user;
d) the user selects all or some of those matching image results as the most relevant matching image results;
e) the search system extracts the metadata of the selected results to collate and rank the most relevant image labels;
f) a list of image labels is presented to the user; and
g) the user is presented with the option of a combined image and text search based on one or more of the image labels.
14. A method of pre-processing a plurality of images from a trademark database, the method including:
a) segmenting a plurality of sub-images within each image;
b) scaling the image and the sub-images to a predefined size;
c) performing feature extraction on each resulting image and sub-image, so that a plurality of designs in the image or the sub-images are summarized as a plurality of features; and
d) indexing the images, sub-images and features in a database for searching.
15. Apparatus for pre-processing images from a trademark database, the apparatus including a computer system that performs:
a) segmenting a plurality of sub-images within each image;
b) scaling the image and the sub-images to a predefined size;
c) performing feature extraction on each resulting image and sub-image, so that a plurality of designs in the image or the sub-images are summarized as a plurality of features; and
d) indexing the images, sub-images and features in a database for searching.
CN201910244643.3A 2013-12-20 2014-09-26 Image search method and equipment Pending CN110263202A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
AU2013905002A AU2013905002A0 (en) 2013-12-20 Iterative search combining image recognition and meta data
AU2013905002 2013-12-20
CN201480053618.2A CN105793867A (en) 2013-12-20 2014-09-26 Image searching method and apparatus

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201480053618.2A Division CN105793867A (en) 2013-12-20 2014-09-26 Image searching method and apparatus

Publications (1)

Publication Number Publication Date
CN110263202A true CN110263202A (en) 2019-09-20

Family

ID=56389815

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910244643.3A Pending CN110263202A (en) 2013-12-20 2014-09-26 Image search method and equipment
CN201480053618.2A Pending CN105793867A (en) 2013-12-20 2014-09-26 Image searching method and apparatus

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201480053618.2A Pending CN105793867A (en) 2013-12-20 2014-09-26 Image searching method and apparatus

Country Status (1)

Country Link
CN (2) CN110263202A (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108268510B (en) * 2016-12-30 2022-01-28 华为技术有限公司 Image annotation method and device
US11538159B2 (en) * 2017-04-13 2022-12-27 Siemens Healthcare Diagnostics Inc. Methods and apparatus for label compensation during specimen characterization
CN108133745B (en) * 2017-12-21 2020-08-11 成都真实维度科技有限公司 Clinical path complete data correlation method taking medical image as core
EP3611733A1 (en) * 2018-08-15 2020-02-19 Siemens Healthcare GmbH Searching a medical reference image
CN109740007B (en) * 2018-08-27 2022-03-11 广州麦仑信息科技有限公司 Vein image fast retrieval method based on image feature signature
CN110532413B (en) * 2019-07-22 2023-08-08 平安科技(深圳)有限公司 Information retrieval method and device based on picture matching and computer equipment
CN111161234B (en) * 2019-12-25 2023-02-28 北京航天控制仪器研究所 Discrete cosine transform measurement basis sorting method
CN113282779A (en) 2020-02-19 2021-08-20 阿里巴巴集团控股有限公司 Image searching method, device and equipment
EP3882568B1 (en) * 2020-03-16 2022-07-06 Carl Zeiss Industrielle Messtechnik GmbH Computer-implemented method for automatically creating a measurement plan and corresponding computer program, computer program product and coordinate measuring machine
CN112861656B (en) * 2021-01-21 2024-05-14 平安科技(深圳)有限公司 Trademark similarity detection method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100046842A1 (en) * 2008-08-19 2010-02-25 Conwell William Y Methods and Systems for Content Processing
CN102402582A (en) * 2010-09-30 2012-04-04 微软公司 Providing associations between objects and individuals associated with relevant media items
US8489627B1 (en) * 2008-08-28 2013-07-16 Adobe Systems Incorporated Combined semantic description and visual attribute search

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090132462A1 (en) * 2007-11-19 2009-05-21 Sony Corporation Distributed metadata extraction
EP2313847A4 (en) * 2008-08-19 2015-12-09 Digimarc Corp Methods and systems for content processing

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100046842A1 (en) * 2008-08-19 2010-02-25 Conwell William Y Methods and Systems for Content Processing
US8489627B1 (en) * 2008-08-28 2013-07-16 Adobe Systems Incorporated Combined semantic description and visual attribute search
CN102402582A (en) * 2010-09-30 2012-04-04 微软公司 Providing associations between objects and individuals associated with relevant media items

Also Published As

Publication number Publication date
CN105793867A (en) 2016-07-20

Similar Documents

Publication Publication Date Title
US20240070214A1 (en) Image searching method and apparatus
CN110263202A (en) Image search method and equipment
EP3028184B1 (en) Method and system for searching images
Wang et al. Salient object detection for searched web images via global saliency
JP5503046B2 (en) Image search based on shape
US20110188713A1 (en) Facial image recognition and retrieval
US10482146B2 (en) Systems and methods for automatic customization of content filtering
CN104537341B (en) Face picture information getting method and device
CN110059156A (en) Coordinate retrieval method, apparatus, equipment and readable storage medium storing program for executing based on conjunctive word
JP2011248596A (en) Searching system and searching method for picture-containing documents
Wasson An efficient content based image retrieval based on speeded up robust features (SURF) with optimization technique
Zheng et al. Constructing visual phrases for effective and efficient object-based image retrieval
KR101910825B1 (en) Method, apparatus, system and computer program for providing aimage retrieval model
AbdElrazek A comparative study of image retrieval algorithms for enhancing a content-based image retrieval system
Deshmukh et al. An improved content based image retreival
Bhoyar et al. A Review Paper On Automatic Text And Image Classification For News Paper
Ansari et al. A refined approach of image retrieval using rbf-svm classifier
Lotfi et al. Wood image annotation using gabor texture feature
Xu Cross-Media Retrieval: Methodologies and Challenges
Kumari et al. A Study and usage of Visual Features in Content Based Image Retrieval Systems.
Michaud et al. Adaptive features selection for expert datasets: A cultural heritage application
Nodari et al. Color and texture indexing using an object segmentation approach
Wang et al. A new web image searching engine by using sift algorithm
Thinzar et al. The design of content based image retrieval with a combination of visual content features
Ayoub et al. Demo of the SICOS tool for Social Image Cluster-based Organization and Search

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination