CN100414549C

CN100414549C - Image search system, image search method, and storage medium

Info

Publication number: CN100414549C
Application number: CNB2006101056207A
Authority: CN
Inventors: 小山刚弘; 川边惠久
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2005-07-21
Filing date: 2006-07-17
Publication date: 2008-08-27
Anticipated expiration: 2026-07-17
Also published as: CN1900933A; JP2007026386A

Abstract

An image search system includes a first calculation section that calculates a first similarity score of each registered image with respect to an input image on the basis of image features of the registered and the input image, a second calculation section that calculates a second similarity score of each registered image with respect to the input image on the basis of text features of the registered and the input image, a candidate extraction section that extracts one or more candidate images on the basis of the first and the second similarity scores of each registered image, a third calculation section that calculates a third similarity score of each candidate image on the basis of projection waveforms of the input image and the candidate image, and a search section that determines one or more registered images similar to the input image on the basis of the third similarity score.

Description

Image search system, image search method and storage medium

Priority information

This application claims priority to Japanese Patent Application No. 2005-211775, filed on July 21, 2005, and Japanese Patent Application No. 2005-365409, filed on December 19, 2005, the entire contents of which are incorporated herein by reference.

Technical field

The present invention relates to a technique for searching for and retrieving an image similar to an input image from among images registered in a database or the like.

Background technology

Recently, in view of regulatory compliance and the protection of personal information, businesses have placed considerable emphasis on strengthening the security of information and information processing. For example, in order to respond to audits and the like, a company is required to disclose information about the business operations it has carried out. A company therefore needs to record accesses to its services as log data and manage those accesses, so that it can identify the person who handled a piece of information, the kind of information, the processing applied to it, and so on.

In view of this trend, a system has been proposed in which, for operations such as outputting a document by copying, printing, facsimile transmission, or the like, and electronically inputting a paper document by scanning, the image data output or input during the operation is stored as log data together with the date, the operator's name, and so on. If a leak of data relating to a certain document is later suspected, the log data is searched for a document identical to that specific document, thereby identifying the source of the leak and the like.

To realize such a system, an image corresponding to the target document must be searched for and retrieved from the log data. Very high operability can be achieved if the search can use, as the search condition, the image obtained by scanning the target document itself, rather than detailed information such as search keys entered by the user. For purposes other than the security purpose described above, it would likewise be advantageous if the user could search an entire image database and retrieve images similar to a scanned image.

Japanese Patent Laid-Open Publications No. 2004-139210 (Reference 1), No. Hei 9-270902 (Reference 2), No. 2003-281176 (Reference 3), and No. Hei 10-49659 (Reference 4) disclose related art concerning the image search described above. In this related art, an image feature amount is calculated from the image data of a scanned document, and similar image data is searched for on the basis of that image feature amount.

In addition, Japanese Patent Laid-Open Publication No. 2005-149071 (Reference 5) discloses a device that, upon receiving a document image and its additional information (a search query statement) as search conditions, searches the entire database for registered documents whose image features are similar to those of the received document image to obtain an intermediate result, and then obtains, as the final search result, those documents in the intermediate result whose additional information matches the input search condition.

Likewise, Japanese Patent Laid-Open Publication No. 2003-91730 (Reference 6) discloses a method in which a projection waveform of the input image in the horizontal or vertical direction is obtained and compared with the projection waveforms of registered images, thereby obtaining registered images similar to the input image.

Further, Japanese Patent Laid-Open Publication No. 2001-319232 (Reference 7) discloses a method in which an input image is divided into a plurality of blocks, various image feature amounts including an outline feature, a frequency distribution feature, and the like are obtained for each block, and images whose distribution of image feature amounts is similar to that of the input image are retrieved.

In all of References 1 to 4, image similarity is determined essentially on the basis of a single image feature. However, such a determination does not always yield a similarity close to the one perceived visually, because the image feature used may or may not provide high discrimination depending on the type of the input image. More specifically, when one wants to calculate the similarity between landscape photographs, for example, a similarity calculation method that uses a feature amount based on text strings obtained by OCR (optical character recognition) is useless. Conversely, if a similarity calculation method that uses an image feature amount based on the density or density distribution of an image is applied to document images that have very similar layouts but entirely different text content, the resulting scores are unlikely to differ much from one another, making it difficult to identify the desired document among those images.

The related art described above can provide determination results of reasonable precision as long as the search target is limited to a specific image type, for example photographs, and a feature suited to that image type is used. In business activities, however, various types of documents are handled in practice, including photographs, drawings, text, and combinations of these, and all of these documents are stored in a log or database as search targets. An image feature that is useful for discriminating photographs is generally unsuitable for discriminating text, and vice versa. All of the related art above that uses a single image feature is therefore considered insufficient for searching various types of document images.

The technique disclosed in Reference 5 attempts to increase search precision by combining a single image feature amount with additional information. With this method, however, a suitable intermediate result cannot be obtained, and hence the desired precision cannot be reached in the final search result, unless the image feature amount used in the search matches the image type of the input search condition. In addition, this method requires additional information to be associated with both the registered images and the image of the search condition. For example, when one wants to search for images similar to an image obtained simply by scanning a paper document, this method cannot be used because no additional information exists in that case. Using this method therefore requires the user to input additional information such as a search query statement, which increases the burden of user operation.

In addition, in the techniques disclosed in References 6 and 7, as in those disclosed in References 1 to 4, image similarity is determined essentially on the basis of a single image feature. A similarity close to the visually perceived one is therefore often not obtained because, as described above, the image feature used may or may not provide high discrimination depending on the type of the input image.

Summary of the invention

According to an aspect of the present invention, there is provided an image search system including: a first calculation section that calculates a first similarity score of each registered image with respect to an input image on the basis of image features of the registered image and the input image; a second calculation section that calculates a second similarity score of each registered image with respect to the input image on the basis of text features of the registered image and the input image; a candidate extraction section that extracts one or more candidate images on the basis of the first and second similarity scores of each registered image; a third calculation section that calculates a third similarity score of each candidate image on the basis of projection waveforms of the input image and the candidate image; and a search section that determines one or more registered images similar to the input image on the basis of the third similarity score of each candidate image.

Description of drawings

These and other aspects disclosed in this specification will become clearer from the following description taken in conjunction with the accompanying drawings, in which like parts are given like reference numerals. In the accompanying drawings:

Accompanying drawing 1 is a functional block diagram showing the structure of an image search apparatus according to an embodiment of the present invention;

Accompanying drawing 2 is a view for explaining the photograph image search;

Accompanying drawing 3 is a view for explaining the similar text search that takes into account the positions at which words appear;

Accompanying drawing 4 is a view for explaining the processing of calculating similarity on the basis of the projection waveforms of images;

Accompanying drawing 5 is a flowchart showing the processing procedure performed by the candidate extraction section;

Accompanying drawing 6 is a functional block diagram showing the structure of an image search apparatus according to a modified example;

Accompanying drawing 7 is a functional block diagram showing the structure of an image search apparatus according to another modified example; and

Accompanying drawing 8 is a view showing an exemplary hardware structure of a computer system on which the image search system can be implemented.

Embodiment

Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings.

With reference to accompanying drawing 1, the structure of the image search apparatus according to the exemplary embodiment will be described. This image search apparatus searches for and retrieves, from among the registered images stored in an existing image database or image log storage device (not shown), those having a high similarity to an input document image 100. The input document image 100 may be, for example, image data obtained by reading a paper document with a scanner, or a bitmap image obtained by converting an image file created by any of various application programs. Upon receiving the input document image 100, the image search apparatus supplies it to each of a photograph image search section 10 and a text search section 20.

The photograph image search section 10 performs search processing suited to continuous-tone images such as photographs. It may divide the input image into a plurality of blocks and perform an image search on the basis of the similarity of the image feature amounts of the individual blocks. The photograph image search section 10 can typically be realized by executing a program.

An image feature amount extraction section 12 of the photograph image search section 10 extracts an image feature amount of the input document image 100. As the image feature amount, a two-dimensional distribution of edge amounts can be used, for example. Specifically, as shown in accompanying drawing 2, an image 200 of a predetermined size is divided into a predetermined number of blocks 210 of a predetermined size (for example, 8 × 8, 16 × 16, or 32 × 32). An edge extraction filter is then applied to the image 200 to calculate the edge amount of each block 210, and the combination of these values, that is, the distribution of the edge amounts over the blocks 210, is obtained as the image feature amount. This method using the edge amount distribution is suited to photographic images captured by a digital camera or similar device. Alternatively, the average color of each block may be obtained instead of the edge amount, and the combination (or distribution) of the average colors of the blocks may be used as the image feature amount. The image may also be binarized, and the distribution of the ratio of black pixels in each block 210 may be obtained as the image feature amount. Various types of image feature amounts and corresponding extraction methods have been proposed in the conventional art, and the photograph image search section 10 may use any of these image feature amounts with its corresponding extraction method. In addition, the distribution of outline features or frequency features obtained for each block may be used as the image feature amount for matching. Any combination of two or more of the above image feature amounts may also be used for image matching.
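The block-wise edge amount distribution can be illustrated with a minimal sketch. The source does not specify the edge extraction filter, so the forward-difference filter, the per-block averaging, and the function name `block_edge_amounts` are all assumptions made for illustration; the image is assumed to be a grayscale list of rows.

```python
def block_edge_amounts(image, blocks_per_side=8):
    """Return per-block mean edge amounts, row by row, for a grayscale image.

    `image` is a list of rows of numeric pixel values. The simple
    forward-difference edge filter used here is an assumption; the source
    only says that "an edge extraction filter" is applied.
    """
    height, width = len(image), len(image[0])
    block_h = height // blocks_per_side
    block_w = width // blocks_per_side
    sums = [[0.0] * blocks_per_side for _ in range(blocks_per_side)]
    for y in range(block_h * blocks_per_side):
        for x in range(block_w * blocks_per_side):
            # Differences to the right and lower neighbours, clamped at the border.
            right = image[y][min(x + 1, width - 1)]
            below = image[min(y + 1, height - 1)][x]
            edge = abs(right - image[y][x]) + abs(below - image[y][x])
            sums[y // block_h][x // block_w] += edge
    area = block_h * block_w
    return [sums[by][bx] / area
            for by in range(blocks_per_side)
            for bx in range(blocks_per_side)]
```

The returned list plays the role of the image feature amount: one value per block, in a fixed order, so that two images divided the same way can be compared element by element.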

The image feature amounts described above for the photograph image search section 10 rely on relatively simple algorithms and can therefore be calculated at high speed even in software. The calculation of these feature amounts and the matching processing that uses them can also be implemented in hardware circuits, which makes them well suited to high-speed processing.

A feature amount matching section 14 calculates the similarity between the image feature amount (or combination of image feature amounts) of the input document image 100 obtained by the image feature amount extraction section 12 and the image feature amount (or combination of image feature amounts) of each registered image stored in the image database or image log (neither shown). The image feature amount of each registered image is calculated with the same algorithm as that used in the image feature amount extraction section 12 and is recorded in a feature amount DB (database) 30 when the image is registered in the image database or image log. Specifically, the image feature amount of each registered image (for example, the edge amount distribution described above) is recorded in the feature amount DB 30 in association with the document ID (identification information) of that registered image. For each registered image, the feature amount matching section 14 calculates a similarity score indicating the degree of similarity between the image feature amount of the registered image and that of the input document image 100. This similarity score can be calculated using a conventionally known method.
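The source leaves the scoring method open ("a conventionally known method"). As one possible illustration only, the sketch below uses the negated city-block distance between feature vectors, so that identical features give the highest score of 0. The names `feature_similarity` and `first_scores`, and the dictionary standing in for the feature amount DB 30, are hypothetical.

```python
def feature_similarity(query_features, registered_features):
    """Negated city-block distance: an assumed, not source-specified, measure."""
    if len(query_features) != len(registered_features):
        raise ValueError("feature vectors must have the same length")
    return -sum(abs(q - r) for q, r in zip(query_features, registered_features))


def first_scores(query_features, feature_db):
    """Return {document ID: first score} for every entry in the feature DB.

    `feature_db` maps a document ID to the feature vector that was stored
    when the image was registered.
    """
    return {doc_id: feature_similarity(query_features, features)
            for doc_id, features in feature_db.items()}
```

Each (document ID, score) pair in the result corresponds to the pair that the photograph image search section supplies to candidate extraction.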

As described above, the photograph image search section 10 calculates and outputs, for each registered image, the similarity score of that image with respect to the input document image 100 (that is, a score based on the image feature amount). This similarity score is shown as first score 110 in accompanying drawing 1. For example, the photograph image search section 10 outputs, for each registered image, a pair consisting of the document ID of that image and its first score 110, and supplies the pair to a candidate extraction section 50.

Next, the text search section 20 will be described. The text search section 20 searches for similar registered images using features of the text strings present in the input document image 100, and is typically realized by a computer program. It includes a character recognition section 22, a word extraction section 24, and a search processing section 26.

The character recognition section 22 recognizes the characters contained in the input document image 100 using an OCR (optical character recognition) algorithm or OCR circuit. A conventional OCR algorithm or OCR circuit can be used here.

The word extraction section 24 performs known natural language analysis (for example, lexical analysis) on the character string output from the character recognition section 22, thereby extracting the words that appear in the input document image 100. The word extraction section 24 thus outputs, for example, data on the set of words contained in the input document image 100. Preferably, this data includes, for each word, the number of times the word appears in the input document image 100. The extracted words may also be restricted to a specific part of speech (for example, nouns only).

The search processing section 26 searches a text DB 40 using the set of words obtained by the word extraction section 24, and calculates the similarity score of each registered image with respect to the input document image 100.

The text DB 40 records, with each word as an index, a list of the document IDs of the registered images in which that word appears. The text DB 40 can be created as follows. When each image is registered in the image database, log, or the like, character recognition and word extraction are performed on it, and its document ID is recorded in the rows of the text DB 40 indexed by the extracted words. Because a single word is quite likely to appear more than once in a registered image, it is preferable to record, in addition to the document ID associated with each index word, the number of times the word appears in that registered image.
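The structure described for the text DB is essentially an inverted index with occurrence counts. A minimal sketch follows, assuming that OCR and word extraction have already produced a word list per registered image; the function name `build_text_db` and the nested-dictionary layout are illustrative assumptions, not the patent's storage format.

```python
from collections import Counter, defaultdict


def build_text_db(documents):
    """Build {word: {document ID: occurrence count}} from extracted words.

    `documents` maps a document ID to the list of words extracted from that
    registered image (OCR and word extraction are assumed to be done already).
    """
    text_db = defaultdict(dict)
    for doc_id, words in documents.items():
        for word, count in Counter(words).items():
            text_db[word][doc_id] = count
    return dict(text_db)
```

For example, `build_text_db({"d1": ["invoice", "total", "invoice"]})` records that "invoice" appears twice and "total" once in document `d1`.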

The similarity score is calculated by referring to the text DB 40, for example in the following manner. Using each word extracted by the word extraction section 24 as a key, the text search section 20 searches the text DB 40 and obtains, for each word, pairs each consisting of the document ID of a registered document containing the word and the number of occurrences of the word in that document. A single word may of course yield multiple such pairs. In any case, for each word, a score is added to the document IDs of the registered documents containing that word on the basis of the information obtained in this way.

In this calculation, the pairs of document ID and occurrence count obtained for each word are reorganized by document ID, so that for each registered image the pairs consisting of each word appearing in it and that word's occurrence count are obtained. Then, for each word, the difference in occurrence count between the input document image 100 and each registered image is calculated, and the sum of the absolute values of these differences (or the sum of squares, the root mean square, or the like) is computed for each registered image. If the input document image 100 is identical to the registered image, the result is 0, and the result grows as the two differ more. Therefore, by using the result with its sign reversed as the similarity score of the corresponding registered image, the score can be made to increase with the similarity, as text, between the registered image and the input document image 100. This calculation is performed for all registered images recorded in the image database or image log.
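The count-difference scoring can be sketched as follows. This is a minimal illustration under stated assumptions: the text DB is a nested dictionary `{word: {document ID: count}}`, the sum of absolute differences is taken over the union of the words in the input and in the registered image (the source does not say explicitly whether words found only in the registered image are included), and the name `second_scores` is hypothetical.

```python
from collections import Counter


def second_scores(query_words, text_db, all_doc_ids):
    """Return {document ID: negated sum of absolute count differences}."""
    query_counts = Counter(query_words)
    # Reorganize the per-word postings by document ID.
    per_doc = {doc_id: {} for doc_id in all_doc_ids}
    for word, postings in text_db.items():
        for doc_id, count in postings.items():
            per_doc.setdefault(doc_id, {})[word] = count
    scores = {}
    for doc_id, doc_counts in per_doc.items():
        vocabulary = set(query_counts) | set(doc_counts)
        distance = sum(abs(query_counts.get(word, 0) - doc_counts.get(word, 0))
                       for word in vocabulary)
        # Reverse the sign so that more similar text gives a larger score.
        scores[doc_id] = -distance
    return scores
```

A registered image whose word counts match the input exactly receives the maximum score of 0; every mismatch lowers the score.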

Although in the example above the calculation is performed for all registered documents in the image database or image log, the present invention is not limited to this, and the calculation may be performed only for documents that satisfy predetermined conditions. Conditions for narrowing the calculation targets may specify, for example, the range of dates and times at which registered images were recorded, or the group to which the user who registered an image belongs. Specifically, the calculation may be performed only for documents registered within a specified date and time range, or only for documents registered by users belonging to a specified group. The user can specify such narrowing conditions through the user interface of the image search apparatus.

The search processing section 26 outputs, for every registered document, a pair consisting of its document ID and the similarity score obtained in this way (second score 120 shown in accompanying drawing 1). The output data is then supplied to the candidate extraction section 50.

The text-based similarity score calculation is not limited to the exemplary method described above as performed by the text search section 20. In the field of text search, various methods have conventionally been developed for obtaining a search score that indicates how well each registered document matches a search condition expressed as a set of keywords or a logical expression of keywords. Any of these conventional methods can of course be used in the text search section 20 of this embodiment.

The text search of this embodiment differs in purpose from ordinary keyword-based document search. Keyword search basically looks for documents that contain a certain keyword, and the keyword and the documents being searched are independent items. By contrast, in leak verification, which is one of the search purposes of this embodiment, the search basically looks for targets that are identical or very similar to the input item as a whole, for example registered images of one or more pages that are identical or very similar to an image of one or more pages (namely, the input document image 100). In the search of this embodiment, therefore, search accuracy can be further improved by considering not only the number of times each word appears in the image but also the position at which each word appears in the image. An example of such a search method is described below.

In this method, an image 300 is first divided into a plurality of blocks 310, as shown in accompanying drawing 3. In the illustrated example, the image 300 is divided into 64 blocks (8 in the vertical direction × 8 in the horizontal direction). Then, for each word extracted from the image, the block containing the first character of the word is designated as the position at which the word appears. In this method, for each index word, a list containing the document ID of each registered image and the positions in the image at which the word appears (for example, block numbers) is recorded in the text DB 40. In the search, a higher similarity score is assigned to a registered image in which a word extracted from the input document image 100 appears at the same position as in the input document image, and a lower similarity score is assigned otherwise (for example, when a word extracted from the input document image 100 does not appear at the same position in the registered image, or when a word in the registered image is not located at the same position in the input document image 100). As a specific example, for each word in the input document image 100, a ratio is obtained between the number of positions at which the word appears in both the input image and the registered image and the total number of positions, where the total is the number of positions at which the word appears in the input document image 100 plus the number of positions at which it appears in the registered image (a position at which the word appears in both images is counted once). Then, twice the reciprocal of (this ratio + 1) is used as a coefficient, and the difference in the word's occurrence count between the input document image 100 and the registered image, described above, is multiplied by this coefficient. Subsequently, the absolute values of these products are summed over all words appearing in the input document image 100 and the registered image (or, for example, their root mean square is taken). The sign of the result is then reversed to obtain the similarity score. Note that this is only one exemplary method of reflecting word positions in the similarity score, and any of various other methods may be used.
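A minimal sketch of the position-aware scoring described above is given here. It assumes each image is represented as a mapping from a word to the list of block numbers where the word occurs (one entry per occurrence); that representation and the names `position_coefficient` and `position_aware_score` are illustrative assumptions.

```python
def position_coefficient(query_positions, doc_positions):
    """Coefficient 2 / (ratio + 1), where ratio = shared / total positions.

    Both arguments are sets of block numbers. A position present in both
    images is counted once in the total, i.e. the total is the set union.
    """
    shared = len(query_positions & doc_positions)
    total = len(query_positions | doc_positions)
    ratio = shared / total if total else 0.0
    return 2.0 / (ratio + 1.0)


def position_aware_score(query_index, doc_index):
    """Negated sum of |count difference x coefficient| over all words."""
    distance = 0.0
    for word in set(query_index) | set(doc_index):
        q_blocks = query_index.get(word, [])
        d_blocks = doc_index.get(word, [])
        coefficient = position_coefficient(set(q_blocks), set(d_blocks))
        distance += abs((len(q_blocks) - len(d_blocks)) * coefficient)
    return -distance
```

The coefficient ranges from 1 (all positions shared) to 2 (no positions shared), so a count mismatch is penalized up to twice as heavily when the word positions also disagree. Because the coefficient multiplies the count difference, a word that occurs equally often in both images contributes nothing even if its positions differ; this follows from the scheme as described rather than from the sketch.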

Although words are extracted from the character recognition result in the example above, such precise analysis to obtain individual words is not always necessary. It is also possible to take the substrings that appear in the character recognition result and apply the same processing as described above to those substrings.

The text search section 20 described above has the advantages that the search is relatively fast and that high search precision can be obtained when registered images contain a large amount of text. However, high search precision cannot be obtained for registered images from which no text, or only a small amount of text, can be extracted. Thus there are types of registered images for which search by the text search section 20 is unsuitable.

The processing performed in the photograph image search section 10 and the processing performed in the text search section 20, both described above, may be executed in parallel or may be executed sequentially, one at a time.

The candidate extraction section 50 combines the first score 110 output from the photograph image search section 10 and the second score 120 output from the text search section 20 to calculate a combined score as an overall evaluation of the image features and the text features. The candidate extraction section 50 then extracts the registered images with the highest combined scores as candidates for the search target to be used in a subsequent document image search section 52.

The apparatus shown in accompanying drawing 1 adopts the principle of statistical standardization in order to combine the similarity score based on image features and the similarity score based on text features, which have quite different characteristics. Because these similarity scores are different kinds of measurements, a suitable score is unlikely to result from simply comparing them or from directly applying operations such as addition or multiplication to them. In this embodiment, therefore, each raw similarity score is standardized into a value that represents the position of each registered image within the whole group of registered images. As one exemplary standardization method, a method is described in which the similarity score of a registered image is converted into a deviation value over all registered documents. This method is described below with reference to the flowchart shown in accompanying drawing 5.

Consider a similar image search system in which, from a group of images (a group of registered images) G = {G_1, G_2, G_3, ..., G_n}, a set of images similar to a certain image (the input document image 100) is selected, and the selected images are sorted in descending order of similarity and output. Let S_{ij}(a) denote the similarity score, with respect to the input image a, of each image G_i (i = 1, 2, 3, ..., n) in the group of registered images when the feature amount F_j (j = 1, 2, 3, ..., m) is used. A feature amount F_j may be an image feature amount such as the edge amount distribution, a text feature amount based on word occurrence counts, or the like. The similarity score S_{ij}(a) of each registered image G_i for each feature amount F_j is obtained, as described above, as the result of the processing performed by the photograph image search section 10 and the text search section 20.

The candidate extraction section 50 then calculates the deviation value Z_{ij}(a) by standardizing each score S_{ij}(a) for each type of feature amount j. This calculation can be performed on the basis of the following expressions.

Expression formula (1)

Z_{ij}(a) = \frac{S_{ij}(a) - \overline{S_{j}(a)}}{D_{j}(a)}

\overline{S_{j}(a)} = \frac{\sum_{i=1}^{n} S_{ij}(a)}{n}

D_{j}(a) = \sqrt{\frac{\sum_{i=1}^{n} \left( S_{ij}(a) - \overline{S_{j}(a)} \right)^{2}}{n - 1}}

Here,

Expression (2)

\overline{S_{j}(a)}, D_{j}(a), Z_{ij}(a)

denote, respectively, the mean, the standard deviation, and the deviation value of the similarity S_{ij}(a) of the registered image G_i with respect to the input image a when the feature amount F_j is used.
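The standardization in Expression (1) can be sketched in a few lines. The function name `deviation_values` and the handling of the degenerate case in which all scores are equal (returning 0 for every image) are assumptions; the source does not discuss that case.

```python
import math


def deviation_values(scores):
    """Standardize {document ID: S_ij(a)} for one feature amount F_j.

    Uses the mean and the (n - 1)-denominator standard deviation, as in
    Expression (1).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two registered images are required")
    mean = sum(scores.values()) / n
    variance = sum((s - mean) ** 2 for s in scores.values()) / (n - 1)
    std = math.sqrt(variance)
    if std == 0:
        # Assumed convention: identical scores carry no ranking information.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - mean) / std for doc_id, s in scores.items()}
```

Running this once on the image-feature scores and once on the text-feature scores yields values on a common scale that can be compared with each other.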

In the example shown in accompanying drawing 1, the deviation value, among all registered images, of each registered image's similarity score based on the text features is obtained and set as the standardized text score (S2a), while the deviation value, among all registered images, of each registered image's similarity score based on the image features is obtained and set as the standardized image score (S2b).

By using deviate Z _Ij(a), can relatively use the value of the similarity score of different characteristic amount as aforesaid score.Yet, in the case, should suppose that the quantity of document image is enough big, and relate to same characteristic features amount F _jThe distribution of similarity score of document image represent to approach the distribution of normal distribution.If document image comprises for example combination in any or the like of text, picture, photo, these images of various types of images, and if they are quantitatively enough big, and then this supposition is considered to suitable usually.

Once the similarity scores for each feature amount F_j have been standardized as described above, the candidate extraction section 50 combines the standardized scores of the individual feature amounts for the same registered image, thereby calculating the combined score of that registered image (S3). Letting S_i(a) denote the combined score of the registered image G_i with respect to the input image a, this combined score S_i(a) can be obtained by the following expression:

Expression formula (3)

S_{i}(a) = f(Z_{i1}(a), Z_{i2}(a), Z_{i3}(a), \ldots, Z_{im}(a))

Here, f denotes a function that derives the combined score from the deviation values Z_{ij}(a) for the individual feature quantities F_{j}. A function that returns the maximum of its arguments can be used as f. As described above, the discriminating power of each image feature quantity differs between image types. Specifically, when a feature quantity with high discriminating power for the type of the input image is used, registered images identical or very similar to the input image receive high similarity scores, while registered images that differ from the input image receive relatively low ones. The deviation value derived from the similarity score is therefore markedly larger for identical or very similar registered images than for the others. Conversely, when a feature quantity with only low discriminating power for the type of the input image is used, the similarity scores of identical or very similar registered images do not differ significantly from those of the other registered images, so the deviation values obtained for the identical or very similar images are not particularly large. Consequently, when f takes the maximum of the standardized scores (deviation values) Z_{ij}(a) over the feature quantities F_{j}, the combined score is very high for registered images identical or very similar to input image a and much lower for all other registered images, regardless of which feature quantity supplies the maximum. A combined score obtained in this way is therefore well suited to searching for registered images identical or very similar to an input image that may be of any of various types.

Alternatively, other functions, such as one that takes the arithmetic mean or the geometric mean of its arguments, may be used as f.
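The three choices of f mentioned here (maximum, arithmetic mean, geometric mean) can be sketched as below. The function name `combine_scores` and the `method` parameter are hypothetical. Note one assumption made explicit in the code: a geometric mean is only defined for positive inputs, so that option is usable only when the standardized scores are all positive.

```python
import math


def combine_scores(z_values, method="max"):
    """Combine the standardized scores Z_i1(a)..Z_im(a) of one registered image."""
    if not z_values:
        raise ValueError("no scores to combine")
    if method == "max":
        return max(z_values)
    if method == "mean":
        return sum(z_values) / len(z_values)
    if method == "geometric":
        # Assumption: scores must be positive for a geometric mean to exist.
        if any(z <= 0 for z in z_values):
            raise ValueError("geometric mean needs positive scores")
        return math.exp(sum(math.log(z) for z in z_values) / len(z_values))
    raise ValueError("unknown method: " + method)
```

With `method="max"`, an image that stands out under any single feature quantity keeps its high score even if the other feature quantities are uninformative for that image type.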

The advantage of the score combination described above is that, by treating the set of registered images G_{i} to be searched as a sample population, the scores can be standardized, which is impossible when comparing against a single registered image, and a combined similarity score of reasonably high validity can be obtained without detailed knowledge of the feature quantities F_{j} and their corresponding similarity scores. When only a single registered image is considered (which corresponds to the case where the group of images G_{i} contains one image), one could either use similarities based on features with entirely different evaluation criteria or characteristics (for example, text and image density) without correction, or build a probabilistic model for combining these similarities and estimate its parameters. In the former case, however, a highly valid score cannot be obtained, and in the latter case detailed knowledge of at least the feature quantities F_{j} and the corresponding similarities is required, which is difficult to obtain when various types of images are to be searched.

Documents actually used in business come in many different types, for example: documents with a fixed layout consisting almost entirely of text, such as patent specifications; documents in a fixed format made up of text and ruled lines, such as name lists or other tables created with spreadsheet software; documents with highly similar page layouts, such as presentation materials created from a specific template; documents consisting mainly of figures with little text, such as presentation materials containing many drawings, or patent drawings; pamphlets consisting almost entirely of full-page photographs; and pamphlets combining photographs and text. It is therefore extremely difficult to provide a general model applicable to all of these documents. With the score combination of the present embodiment, by contrast, an appropriate combined score can be obtained by bringing together score calculation parts based on various feature quantities, for example one using a feature quantity with high discriminating power for text, one using a feature quantity with high discriminating power for photographs, and one using a feature quantity with high discriminating power for drawings, and then standardizing and combining the scores obtained by each part.

Here, the similarity scores to be combined can be obtained from aspects that have low correlation with one another. For example, when the text-based score described above is used together with a score based on features such as the edge amount distribution or the average color distribution, which provide high discriminating power for photographs, both text and photographs can be searched with high precision.

In the example above, the photograph image search part 10 and the text search part 20 supply the candidate extraction part 50 with the similarity scores of all registered images. Alternatively, only the similarity scores of registered images that are equal to or greater than a predetermined value may be supplied to the candidate extraction part 50. In this case, the candidate extraction part 50 sets the similarity score of each registered image for which it has received no score to a predetermined value, such as "0", and then performs the score combination described above.
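Filling in the scores that a search part did not report is a one-line operation. The sketch below assumes scores are held in a dictionary keyed by image ID; the name `fill_missing` and the data layout are illustrative only.

```python
def fill_missing(partial_scores, all_ids, default=0.0):
    """Give every registered image a score, using `default` where none was received."""
    return {image_id: partial_scores.get(image_id, default) for image_id in all_ids}
```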

In this manner, the candidate extraction part 50 calculates, for each registered image, a combined score representing the similarity of that image to input image a (input document image 100).

The candidate extraction part 50 then selects registered images, in descending order of the combined score thus obtained, as candidates to be searched by the document image search part 52 in the subsequent stage. In other words, the candidate extraction part 50 performs a first narrowing of the group of registered images stored in a document image database or document image log storage device (not shown). When extracting candidates, a predetermined number of registered images with the highest combined scores may be extracted, or the highest-scoring registered images corresponding to a predetermined ratio of all registered images may be extracted. Registered images whose combined score is equal to or greater than a threshold may also be extracted. These extraction methods are of course only examples, and other conditions may be used to extract registered images in descending order of combined score. The candidate extraction part 50 supplies the file ID of each candidate thus extracted to the document image search part 52. Where the document image search part 52 is to take the combined score into account in its search, the candidate extraction part 50 supplies both the file ID and the combined score of each candidate to the document image search part 52.
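The three extraction conditions described above (fixed count, fixed ratio, score threshold) can be sketched in one function. The name `extract_candidates` and its parameters are hypothetical; the function returns (file ID, combined score) pairs so that a later stage can use the combined score if it wishes.

```python
import math


def extract_candidates(combined, top_n=None, ratio=None, threshold=None):
    """Select candidates in descending order of combined score.

    combined: dict mapping file ID -> combined score S_i(a).
    top_n: keep at most this many images.
    ratio: keep this fraction of all registered images (rounded up).
    threshold: keep only images whose score is >= threshold.
    """
    ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        ranked = [(i, s) for i, s in ranked if s >= threshold]
    if ratio is not None:
        ranked = ranked[:math.ceil(len(combined) * ratio)]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked
```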

Note that the method of calculating the combined score described above is only an example; the combined score may also be calculated by other methods.

The document image search part 52 searches the received group of candidates for registered images having high similarity to the input document image 100, using matching based on the projection waveforms of the images. In brief, as shown in Figure 4, the projection waveform 410 in the horizontal direction is obtained by projecting the pixel values of the input document image 100 in the horizontal direction; that is, the pixel values in each row are summed along the horizontal direction, and the waveform is the distribution of these row sums along the column (vertical) direction. A projection waveform obtained by projecting in the vertical direction may be used as the image feature quantity in place of the horizontal projection waveform, or the pair consisting of the horizontal and vertical projection waveforms may be used. Moreover, the projection direction is not limited to the horizontal and vertical directions.

To calculate a similarity score from projection waveforms, the projection waveform 410 obtained from the input document image 100 and the projection waveform 430 of a registered image 420 recorded in the projection waveform DB 54 are first matched in scale and position. A correlation function representing the strength of correlation between the two waveforms, or between their differentiated waveforms, is then obtained and used as the similarity score. The projection waveform information of all registered images is registered in the projection waveform DB 54, and the document image search part 52 reads the projection waveform information of each candidate extracted by the candidate extraction part 50 and compares it with the projection waveform 410 of the input document image 100. When projection waveforms in both the horizontal and vertical directions are used, the sum or the mean of the similarity score for the vertical waveforms and the similarity score for the horizontal waveforms can be used as the final similarity score, for example. Various other methods of determining similarity from projection waveforms may also be used.
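A rough sketch of projection-waveform matching follows. Several choices here are assumptions rather than details from the text: images are plain lists of pixel rows, scale matching is done by linear resampling to a common length (`length=64` is arbitrary), the correlation measure is the Pearson coefficient, and the position-matching (shift search) step is omitted for brevity. All function names are hypothetical.

```python
import math


def projection(image, direction="horizontal"):
    """Sum pixel values along rows (horizontal) or columns (vertical)."""
    if direction == "horizontal":
        return [sum(row) for row in image]       # one value per row
    return [sum(col) for col in zip(*image)]     # one value per column


def resample(wave, length):
    """Linearly interpolate `wave` to `length` samples (crude scale matching)."""
    if len(wave) == 1 or length == 1:
        return [wave[0]] * length
    out = []
    for k in range(length):
        pos = k * (len(wave) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append(wave[lo] * (1 - frac) + wave[hi] * frac)
    return out


def correlation(u, v):
    """Pearson correlation of two equal-length waveforms (0.0 if either is flat)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    if su == 0 or sv == 0:
        return 0.0
    return cov / (su * sv)


def waveform_similarity(image_a, image_b, length=64):
    """Similarity score from horizontal projection waveforms of two images."""
    a = resample(projection(image_a), length)
    b = resample(projection(image_b), length)
    return correlation(a, b)
```

To use both directions, the same comparison can be repeated with `direction="vertical"` and the two scores summed or averaged, as the text describes.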

Matching and searching based on projection waveforms as described above generally achieves high precision and analyzes features different from those handled by the photograph image search part 10 and the text search part 20. It can therefore be combined with both search parts 10 and 20 to evaluate similarity from a different aspect. On the other hand, this method imposes a heavy computational load, and thus requires either a long computation time or an arithmetic device with very high performance. In the present embodiment, however, the matching is performed only on the candidates already narrowed down by the candidate extraction part 50, so computation time and computing performance do not pose a significant problem.

As described above, the document image search part 52 obtains a similarity score from the projection waveform of each candidate, and outputs a list of pairs, each consisting of a candidate's file ID and the corresponding similarity score, arranged in descending order of similarity score. The output list may include only a predetermined number of documents with the highest similarity scores, or only documents whose similarity score is equal to or greater than a predetermined value.

Alternatively, the combined score obtained by the candidate extraction part 50 may be added to the projection-waveform-based similarity score obtained by the document image search part 52 to yield a second combined score; the candidates are then ranked by the second combined score, and the candidate list is created according to this ranking. The second combined score can be calculated by standardizing the projection-waveform-based similarity scores as described above and then combining the standardized values with the combined scores obtained from the candidate extraction part 50.
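One way to read this second combination is sketched below. It assumes that the waveform scores are standardized over the candidate set only (the text does not say which population is used), that the deviation value is a plain z-score, and that the two scores are combined by simple addition. The name `second_combined` is hypothetical.

```python
import statistics


def second_combined(first_scores, waveform_scores):
    """Rank candidates by first-stage combined score plus standardized waveform score.

    first_scores: dict file ID -> combined score from the candidate extraction stage.
    waveform_scores: dict file ID -> raw projection-waveform similarity score.
    """
    ids = list(first_scores)
    raw = [waveform_scores[i] for i in ids]
    mean = statistics.mean(raw)
    # statistics.stdev uses n - 1 in the denominator, matching D_j(a).
    d = statistics.stdev(raw) if len(raw) > 1 else 0.0
    total = {}
    for i, r in zip(ids, raw):
        z = (r - mean) / d if d > 0 else 0.0
        total[i] = first_scores[i] + z
    return sorted(total.items(), key=lambda kv: kv[1], reverse=True)
```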

The search result output part 60 receives the list created by the document image search part 52, sorts the listed registered images in descending order of score, and outputs them as the search result. In this way, a search result 150 can be provided in which the registered images are arranged in descending order of similarity to input image a.

In the system shown in Figure 1 and described above, the photograph image search part 10 and the text search part 20, which can operate at relatively high speed, are first used to extract, from all registered images, those more similar to the input image as candidates. The document image search part 52, which offers relatively high accuracy but requires more processing time, then searches the narrowed-down candidates to obtain the final search result. With this structure, a high-speed, high-precision search can be realized while reducing the time and computing performance required for the operation as a whole.

In the present embodiment, the photograph image search part 10 can generally perform a high-speed, high-precision search, providing low discriminating power for only a few document types; it is used to narrow down the candidates in the early stage of the structure shown in Figure 1, thereby achieving high-speed narrowing of the candidates. Likewise, the text search part 20 can operate at relatively high speed and achieves a high-precision search when the amount of text is large; it too is used in the early stage of the structure shown in Figure 1, thereby achieving high-speed narrowing of the candidates. Although the text search part 20 cannot achieve a high-precision search for images with little text, the photograph image search part 10 provided in parallel compensates for the search of such images.

The projection-waveform-based matching performed by the document image search part 52 in the later stage is advantageous because the corresponding registered image can be found with high precision even when the input is an image formed by adding symbols or marks to a registered image. Specifically, since the area occupied by such symbols and marks is generally small relative to the whole page, they have very little effect on the projection waveform of the whole page. An accurate, high similarity score can therefore be obtained by comparing the projection waveforms of such an input image and the registered image. In contrast, with matching based on the edge amount distribution or the like, as performed by the photograph image search part 10, the similarity score may drop considerably under the influence of such symbols or marks. There are thus cases in which the document image search part 52 is better suited than the photograph image search part 10 to finding registered images with high similarity. In particular, when monitoring document leakage in an enterprise, leaked documents in practice often contain marks and symbols that were added to the original document and that the original itself does not contain. Adopting the document image search part 52 as part of the search mechanism is therefore very important.

Although the document image search part 52 of the present embodiment can operate only at relatively low speed, this drawback is compensated for by having the document image search part 52 process only the candidates narrowed down by the high-speed search parts 10 and 20.

Another example will now be described with reference to Figure 6. In Figure 6, elements identical or similar to those shown in Figure 1 are denoted by the same reference numerals, and their description is not repeated.

The image search apparatus of this example includes a distribution part 5 preceding the photograph image search part 10 and the text search part 20. The distribution part 5 analyzes the input document image 100 to determine whether it has attributes suited to a search by the photograph image search part 10 or by the text search part 20, and selectively distributes the input image 100 to the appropriate search part.

Here, the distribution part 5 applies to the input document image 100 an automatic separation process (also called text/image separation), known from copiers and scanners, for example, thereby dividing the image 100 into text regions and (photographic) image regions. When the text region is larger than the image region, the distribution part 5 supplies the image 100 to the text search part 20; when the image region is larger than the text region, it supplies the image 100 to the photograph image search part 10. The candidate extraction part 50a then receives the list of registered-image scores obtained by whichever of the photograph image search part 10 and the text search part 20 was selected, and extracts a group of registered images with the highest scores as the candidates to be supplied to the document image search part 52. The document image search part 52 can perform processing similar to that performed in the apparatus shown in Figure 1.

Here, when the difference in size between the text region and the image region is small, using only one of the search parts 10 and 20 may not narrow down the candidates with sufficient accuracy. Therefore, when the difference in size between the text region and the image region is equal to or less than a predetermined threshold, the input image may be supplied to both search parts 10 and 20, and the candidate extraction part 50a may combine the scores obtained by the two search parts and extract the candidates on the basis of the resulting combined score.
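The area-based distribution rule, including the fallback to both search parts when the regions are of similar size, can be sketched as follows. The function name `route_by_area`, the string labels, and the `min_gap` parameter (the predetermined threshold on the size difference) are illustrative names, and the areas are assumed to be measured in the same unit, for example pixels.

```python
def route_by_area(text_area, image_area, min_gap=0.0):
    """Return which search parts should handle the input image.

    "text" stands for the text search part 20 and "photo" for the
    photograph image search part 10.
    """
    if abs(text_area - image_area) <= min_gap:
        return ["photo", "text"]  # too close to call: run both and combine scores
    return ["text"] if text_area > image_area else ["photo"]
```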

Although in the above example one of the search parts 10 and 20 is selected on the basis of a size comparison between the text region and the image region, the selection method is not limited to this. For example, since the precision of the search performed by the text search part 20 depends largely on the number of characters contained in the image, the distribution part 5 may count the number of characters contained in the input document image 100 and distribute the input image on the basis of that count. In general, the search precision of the text search part 20 increases with the number of characters. A structure can therefore be adopted in which the distribution part 5 selects the text search part 20 if the number of characters exceeds a predetermined threshold, and selects the photograph image search part 10 otherwise. The text search part 20 cannot achieve sufficient precision for images with little text, whereas the photograph image search part 10 can search with a certain degree of precision even for images consisting only of text. Selecting the photograph image search part 10 when the input document contains few characters therefore allows candidates to be extracted while maintaining a certain level of search precision.

Alternatively, a first threshold and a second threshold smaller than the first threshold may be set for the number of characters, with control performed such that the text search part 20 is selected when the number of characters contained in the input document image 100 exceeds the first threshold, and the photograph image search part 10 is selected when the number of characters is less than the second threshold. In this case, when the number of characters lies between the first and second thresholds, the input document image 100 may be supplied to both search parts 10 and 20, and the scores obtained by these search parts may then be combined in the candidate extraction part 50a.

Here, the number of characters contained in the input document image 100 can be obtained by performing character recognition processing such as OCR in the distribution part 5 and counting the characters thus obtained. At this stage it suffices to know the number of characters; there is no need to identify each character. Complete character recognition is therefore unnecessary at this stage, and the only requirement is segmentation of the individual characters. By allowing the text search part 20 to use the data resulting from this character segmentation, the character recognition part 22 of the text search part 20 need not be equipped with the portion of character recognition processing performed by the distribution part 5. Alternatively, the distribution part 5 may be configured to perform complete character recognition processing, so that the text search part 20 need not include the character recognition part 22.

Furthermore, instead of making the determination solely on the basis of the number of characters contained in the input document image 100 as described above, the image separation described above may also be used to obtain the ratio of the image region (continuous-tone image portion) to the whole page, and this image region ratio may be combined with the number of characters to determine which of the search parts 10 and 20 is better suited to searching for the input document image 100. For example, the text search part 20 may be determined to be better suited when the number of characters is greater than a first predetermined character-count threshold and the image region ratio is less than a first predetermined region threshold, and the photograph image search part 10 may be determined to be better suited when the number of characters is less than a second predetermined character-count threshold (smaller than the first character-count threshold) and the image region ratio is greater than a second predetermined region threshold (greater than the first region threshold). In cases other than these two, the input document image 100 may be supplied to both search parts 10 and 20, and the candidate extraction part 50a may extract the candidates on the basis of the result of combining the similarity scores supplied by the search parts 10 and 20.
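The combined rule on character count and image region ratio can be sketched as below. All names are hypothetical: `chars_hi` and `ratio_lo` play the role of the first character-count and region thresholds, and `chars_lo` and `ratio_hi` the second ones; the numeric values in any call are up to the implementer.

```python
def route_by_count_and_ratio(n_chars, image_ratio,
                             chars_hi, chars_lo, ratio_lo, ratio_hi):
    """Choose search parts from the character count and the image region ratio.

    Requires chars_lo < chars_hi and ratio_lo < ratio_hi, mirroring the
    ordering of the first and second thresholds in the description.
    """
    if not (chars_lo < chars_hi and ratio_lo < ratio_hi):
        raise ValueError("thresholds are not ordered as required")
    if n_chars > chars_hi and image_ratio < ratio_lo:
        return ["text"]            # many characters, little image area
    if n_chars < chars_lo and image_ratio > ratio_hi:
        return ["photo"]           # few characters, mostly image area
    return ["photo", "text"]       # every other case: use both and combine
```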

In this modified example, moreover, because the distribution part 5 supplies the input document image 100 to whichever of the photograph image search part 10 and the text search part 20 is suitable, there are cases in which a sufficiently high similarity score is obtained by the search performed by either the photograph image search part 10 or the text search part 20 alone. Therefore, when the search performed by the photograph image search part 10 or the text search part 20 yields a registered image whose similarity score exceeds a predetermined value (determined separately for each of the photograph image search part 10 and the text search part 20), the search by the document image search part 52 in the subsequent stage may be omitted, and the registered image whose similarity score exceeds the threshold may be output as the search result.

This structure is advantageous in the following respect. The precision of projection-waveform-based search is poor for images whose projection waveform shows no distinctive peaks, such as images containing a background. Thus, even if a very high similarity score is obtained for a certain registered image by the photograph image search part 10 or the text search part 20, the similarity score obtained for that registered image in the document image search part 52 may be low if the image is of a type unsuited to projection-waveform-based search. By adopting the control described above, in which the search by the document image search part 52 in the subsequent stage is skipped when a very high similarity score is obtained by the photograph image search part 10 or the text search part 20, unnecessary computation can be eliminated, and the risk that an unsuitable search technique will adversely affect an otherwise reliable search result can be reduced.
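The skip control can be sketched as a thin wrapper around the later-stage search. The names are hypothetical: `stage1_scores` holds the scores from whichever search part was selected, `skip_threshold` is the predetermined value for that search part, and `waveform_search` stands in for the document image search part 52.

```python
def search_with_skip(stage1_scores, skip_threshold, waveform_search):
    """Return first-stage hits directly if any exceed the threshold.

    stage1_scores: dict file ID -> similarity score from the selected search part.
    waveform_search: callable taking a ranked list of candidate IDs and
    returning the final ranked list; only called when no score is high enough.
    """
    ranked = sorted(stage1_scores, key=stage1_scores.get, reverse=True)
    confident = [i for i in ranked if stage1_scores[i] > skip_threshold]
    if confident:
        return confident  # later-stage waveform search is skipped
    return waveform_search(ranked)
```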

Another example of the image search apparatus will be described with reference to Figure 7. In Figure 7, elements identical or similar to those shown in Figure 1 are denoted by the same reference numerals, and their description is not repeated.

The apparatus of this modified example adopts a structure in which the photograph image search part 10 in the first stage, the text search part 20 in the second stage, and the document image search part 52 in the third stage are arranged in series. With this structure, the photograph image search part 10 first searches all registered images for first candidates similar to the input document image 100; the text search part 20 then checks each of the first candidate registered images against the input document image 100 by text search, and extracts second candidates with high scores from among the first candidates.

Here, in the search performed by the text search part 20, it is advantageous to combine the similarity score obtained by the text search with the image-feature-based similarity score received from the preceding stage, and to narrow the search down to the second candidate registered images on the basis of the resulting combined score, rather than on the basis of the text-search similarity score alone. With this structure, when the input document image 100 contains a large amount of text, registered images more similar to the input document image can be preferentially extracted by virtue of the search performed by the text search part, and even when the input document image 100 contains little text, adding the similarity score obtained by the image-feature-based search suppresses the loss of search precision. The document image search part 52 finally performs matching based on the projection waveforms of each of the second candidate registered images and the input document image 100, thereby providing the final search result.

With this structure, a high-speed search with generally high precision can be performed. In addition, by placing in the first stage the photograph image search part 10, which shows low discriminating power for only a few document types, the candidates can be narrowed down at high speed and with high precision. The text search part 20, placed in the second stage and capable of high-speed processing, then narrows the candidates further at high speed to obtain the second candidate registered images. Moreover, by having the text search part 20 combine the similarity score obtained in the first stage with the similarity score obtained by the text search and narrow down the candidates on the basis of the combined score, the loss of search precision can be suppressed even for an input document image 100 containing little text. Finally, in the third stage, the last narrowing is performed by the document image search part 52, which retains high discriminating power in the presence of overlaid items such as symbols and marks on the image, so that registered images with high similarity can be obtained even when such symbols are present. Even if the processing performed by the document image search part 52 is somewhat slow, the time required is only a minor issue, because the processing is performed only on the second candidate registered images, which are sufficiently few as a result of the narrowing in the preceding two stages.

In this example, as in the example above, if a registered image with a similarity score equal to or greater than a predetermined threshold is obtained by the photograph image search part 10 in the first stage, the searches in the second and subsequent stages can be skipped and the final search result obtained. Similarly, if a registered image with a similarity score equal to or greater than a predetermined threshold (set independently of the first-stage threshold) is obtained by the text search part 20 in the second stage, the search result can be obtained while skipping the search in the third stage.
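The three-stage serial structure with its two early exits can be sketched as follows. This is an illustration under simplifying assumptions: each search part is modeled as a callable from file ID to score, the second-stage combination is a plain sum of the two raw scores (a fuller version would standardize them first), and all names (`cascade_search`, `keep1`, `keep2`, `stop1`, `stop2`) are hypothetical.

```python
def top(scores, k):
    """IDs of the k highest-scoring entries, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def cascade_search(ids, photo, text, waveform, keep1, keep2,
                   stop1=float("inf"), stop2=float("inf")):
    """Photo search, then text search, then waveform matching, in series."""
    # Stage 1: image-feature search over all registered images.
    s1 = {i: photo(i) for i in ids}
    hits = [i for i in top(s1, len(s1)) if s1[i] >= stop1]
    if hits:
        return hits                    # stages 2 and 3 skipped
    first = top(s1, keep1)

    # Stage 2: text search over the first candidates only.
    s2 = {i: text(i) for i in first}
    hits = [i for i in top(s2, len(s2)) if s2[i] >= stop2]
    if hits:
        return hits                    # stage 3 skipped
    combined = {i: s1[i] + s2[i] for i in first}  # assumption: plain sum
    second = top(combined, keep2)

    # Stage 3: projection-waveform matching over the second candidates only.
    s3 = {i: waveform(i) for i in second}
    return top(s3, len(s3))
```

Because the slowest scorer is only ever called on `keep2` images, its cost stays small regardless of how many registered images exist.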

Embodiments and examples of the present invention have been described above. The image search apparatus described above is typically realized by executing, on a general-purpose computer system, a program describing the functions or processing of each of the parts described above. The computer has a circuit structure in which, for example, a CPU (central processing unit) 80, a memory (main memory) 82, and various I/O (input/output) interfaces 84 are connected via a bus 86. In addition, a hard disk drive 88 and a disk drive 90 for reading portable non-volatile storage media of various standards, such as CDs, DVDs, or flash memory, are connected to the bus 86 via the I/O interfaces 84, for example. The drives 88 and 90 serve as external storage devices for the memory. Specifically, a program describing the processing of the embodiment is stored in a fixed storage device such as the hard disk drive 88, either from a storage medium such as a CD or DVD or via a network, and is thereby installed in the computer system. The program stored in the fixed storage device is then read into the memory and executed by the CPU, thereby realizing the processing of the embodiment.

Because existing search applications can be used as they are for the photograph image search part 10 and the text search part 20, the only program specific to the present embodiment is one that describes the function of supplying the input document image 100 to the search parts 10, 20, and 52 so that they calculate scores, and the function performed by the candidate extraction part 50. Each search part, including the photograph image search part 10 and the text search part 20, may be arranged so that it can be added to the program in the form of a plug-in, for example.

Although in the above examples the image search apparatus is constructed on a single computer, this structure is only an example, and a system architecture in which the components constituting the image search apparatus are distributed over a network (such as the Internet or a LAN) also falls within the scope of the present invention. One possible architecture provides one or both of the feature quantity DB 30 and the text DB 40 as a separate database device on the network, independent of the computer on which the other components are installed, to be used over the network by the photograph image search part 10 and the text search part 20. In addition, the feature quantity DB 30 and the text DB 40 may each be provided on the network singly or in plural.

Although exemplary embodiments of the invention have been described using specific terms, this description is for illustrative purposes only, and it should be understood that various changes and modifications may be made without departing from the spirit and scope of the appended claims.

Claims

1. An image search system, comprising:

a first calculation section that calculates a first similarity score of each registered image with respect to an input image based on image features of each registered image and the input image;

a second calculation section that calculates a second similarity score of each registered image with respect to the input image based on text features of each registered image and the input image;

a candidate extraction section that extracts one or more candidate images based on the first and second similarity scores of each registered image;

a third calculation section that calculates a third similarity score of each candidate image based on projection waveforms of the input image and each candidate image; and

a search section that determines one or more registered images similar to the input image based on the third similarity score of each registered image.

2. The image search system according to claim 1, wherein:

the first calculation section divides the input image into a plurality of regions, obtains an image feature quantity for each region, and calculates the first similarity score of each registered image with respect to the input image based on the distribution of the image feature quantities over the regions of the input image and the distribution of the image feature quantities over the regions of each registered image.

3. The image search system according to claim 1, wherein:

the second calculation section obtains a text feature quantity relating to a text string obtained by applying character recognition processing to the input image, and calculates the second similarity score of each registered image with respect to the input image based on the obtained text feature quantity and a text feature quantity of each registered image.

4. The image search system according to claim 1, wherein:

the third calculation section obtains the projection waveform of the input image, and calculates the third similarity score of each candidate image with respect to the input image based on the projection waveform of the input image and the projection waveform of each candidate image.

5. The image search system according to claim 1, wherein:

the candidate extraction section applies statistical standardization processing to the first similarity score of each registered image to obtain a first standardized score of each registered image, applies statistical standardization processing to the second similarity score of each registered image to obtain a second standardized score of each registered image, and extracts the candidate images based on the first standardized score and the second standardized score.

6. The image search system according to claim 5, wherein:

the candidate extraction section calculates, as the first standardized score of a registered image, the deviation value of the first similarity score of that registered image relative to the first similarity scores of all registered images, and calculates, as the second standardized score of that registered image, the deviation value of the second similarity score of that registered image relative to the second similarity scores of all registered images.

7. An image search system, comprising:

a second calculation section that calculates a second similarity score of each registered image with respect to an input image based on text features of each registered image and the input image;

a candidate extraction section that supplies the input image to the first calculation section or the second calculation section, which of the two being determined according to the amount of text in the input image, and that extracts one or more candidate images based on the resulting first or second similarity score of each registered image;

a third calculation section that calculates a third similarity score of each candidate image based on projection waveforms of the input image and each candidate image; and

a search section that determines one or more registered images similar to the input image based on the third similarity score of each registered image.

8. The image search system according to claim 7, wherein:

the candidate extraction section determines, based on the amount of text in the input image, which of a continuous-tone image and text is dominant in the input image, supplies the input image to the first calculation section when the continuous-tone image is determined to be dominant, and supplies the input image to the second calculation section when the text is determined to be dominant.

9. The image search system according to claim 8, wherein:

when it is determined that neither the text nor the continuous-tone image is dominant in the input image, the candidate extraction section supplies the input image to both the first calculation section and the second calculation section, and extracts one or more candidate images based on the first similarity score and the second similarity score of each registered image.

10. An image search system, comprising:

a first candidate extraction section that calculates a first similarity score of each registered image with respect to an input image based on image features of each registered image and the input image, and extracts first candidate images based on the first similarity score;

a second candidate extraction section that calculates a second similarity score of each first candidate image with respect to the input image based on text features of each first candidate image and the input image, and extracts second candidate images based on the second similarity score; and

a search section that calculates a third similarity score of each second candidate image with respect to the input image based on projection waveforms of the input image and each second candidate image, and determines one or more registered images similar to the input image based on the third similarity score of each registered image.

11. An image search method, comprising:

calculating a first similarity score of each registered image with respect to an input image based on image features of each registered image and the input image;

calculating a second similarity score of each registered image with respect to the input image based on text features of each registered image and the input image;

extracting one or more candidate images based on the first and second similarity scores of each registered image;

calculating a third similarity score of each candidate image based on projection waveforms of the input image and each candidate image; and

determining one or more registered images similar to the input image based on the third similarity score of each registered image.

12. An image search method, comprising:

calculating, by a first calculation section, a first similarity score of each registered image with respect to an input image based on image features of each registered image and the input image;

calculating, by a second calculation section, a second similarity score of each registered image with respect to the input image based on text features of each registered image and the input image;

determining, based on the amount of text in the input image, to which of the first calculation section and the second calculation section the input image is to be supplied;

supplying the input image to the first calculation section or the second calculation section according to the result of the determining step;

extracting one or more candidate images based on the resulting first similarity score or second similarity score of each registered image;

13. An image search method, comprising:

extracting first candidate images based on the first similarity score;

calculating a second similarity score of each first candidate image with respect to the input image based on text features of each first candidate image and the input image;

extracting second candidate images based on the second similarity score;

calculating a third similarity score of each second candidate image with respect to the input image based on projection waveforms of the input image and each second candidate image; and

determining one or more registered images similar to the input image based on the third similarity score of each registered image.
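For illustration only, and forming no part of the claims, the flow recited above (two similarity scores, standardization by deviation value, candidate extraction, and a third score from projection waveforms) can be sketched as follows. Several points are assumptions that the claims do not specify: the deviation value is taken to be 50 + 10z, where z is the standard score; candidates are ranked by the larger of the two standardized scores; the projection waveform is the row sums followed by the column sums of a binary image; and the third similarity is the cosine similarity of two waveforms. All function names and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def deviation_values(scores):
    """Standardize raw scores as 50 + 10 * z (assumed meaning of 'deviation value')."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        return {k: 50.0 for k in scores}
    return {k: 50.0 + 10.0 * (v - mu) / sigma for k, v in scores.items()}


def extract_candidates(first, second, top_k):
    """Keep the top_k images by the larger standardized score (assumed rule)."""
    s1, s2 = deviation_values(first), deviation_values(second)
    combined = {k: max(s1[k], s2[k]) for k in first}
    return sorted(combined, key=combined.get, reverse=True)[:top_k]


def projection_waveform(image):
    """Row sums followed by column sums of a binary image (rows of 0/1)."""
    rows = [sum(r) for r in image]
    cols = [sum(c) for c in zip(*image)]
    return rows + cols


def waveform_similarity(a, b):
    """Cosine similarity of two equal-length waveforms (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


def search(input_image, registered, first, second, top_k=2):
    """Return the best-matching candidate id and all third similarity scores."""
    candidates = extract_candidates(first, second, top_k)
    w_in = projection_waveform(input_image)
    third = {k: waveform_similarity(w_in, projection_waveform(registered[k]))
             for k in candidates}
    return max(third, key=third.get), third


# Invented sample data: three tiny registered images and precomputed
# first (image feature) and second (text feature) similarity scores.
registered = {
    "a": [[1, 1, 0], [0, 0, 0], [0, 0, 0]],
    "b": [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    "c": [[0, 0, 0], [0, 0, 0], [0, 1, 1]],
}
input_image = [[1, 1, 0], [0, 0, 0], [0, 0, 0]]
first = {"a": 0.9, "b": 0.2, "c": 0.1}
second = {"a": 0.3, "b": 0.8, "c": 0.1}

best, third = search(input_image, registered, first, second)
```

Standardizing before combining puts the image-based and text-based scores on a common scale, so neither search dominates candidate extraction merely because its raw scores happen to span a wider range.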