CN103559220A - Image searching device, method and system - Google Patents


Info

Publication number
CN103559220A
Authority
CN
China
Prior art keywords
theme
picture
text
textual description
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310492161.2A
Other languages
Chinese (zh)
Other versions
CN103559220B (en
Inventor
何锐邦
唐会军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201310492161.2A priority Critical patent/CN103559220B/en
Publication of CN103559220A publication Critical patent/CN103559220A/en
Application granted granted Critical
Publication of CN103559220B publication Critical patent/CN103559220B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image search device, method and system. The image search device comprises an information receiver for receiving a search word entered by a user, a theme converter for determining the theme to which the search word belongs, an image similarity calculator for calculating the similarity between images and the search word, and a display for showing the images so determined. The image search device, method and system make search results more accurate and improve the user experience.

Description

Picture search device, method and system
Technical field
The present invention relates to the Internet field, and in particular to a device for building a picture theme library, a picture search device, a method for building a picture theme library, a picture search method, and a picture search system.
Background
Picture search is a service that uses a search program to provide users with relevant picture information from the Internet. Classified by the technology used, it divides into search based on context text and search based on picture content. Search based on context text builds an index from label information in the web page, such as the picture's alt attribute (hereinafter "alt"), and searches against that index. Search based on picture content extracts the visual features of the picture itself and matches them against the search request. Visual features can in turn be divided into general visual features and domain-specific visual features.
Search based on picture content is a common traditional picture search method. It analyzes the content of every picture and extracts features of the picture itself, such as color, texture and shape. These features then serve as the index of a database that maps features to pictures. When a user searches for a picture, the similarity between the search word entered by the user and the feature index in the database is calculated, and the corresponding pictures are presented in descending order of similarity.
This method has serious defects, however. First, it supports only exact feature search, not fuzzy search. In practice, the search word a user enters is usually not very precise. For example, the user enters "circle" while the database holds "ellipse". Because the search condition is too strict, the search word cannot reach a sufficiently high similarity with the feature index in the database, and the target picture is not found. Second, because the feature index in the database is explicit, it is hard to enumerate every synonymous index term; the database cannot index a picture under all of its synonymous features, so searches on synonymous features miss results. For example, the user enters one word for "potato" while the database holds a different, synonymous word for it.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a device for building a picture theme library, a picture search device, a picture search system, and corresponding methods for building a picture theme library and for picture search, which overcome or at least partly solve the above problems.
According to one aspect of the present invention, a device for building a picture theme library is provided, comprising:
A picture source library, configured to store at least one picture and the context text of that picture;
A graphical information getter, configured to read pictures from the picture source library, perform graphic feature analysis on every picture to obtain its graphic feature information, and convert that graphic feature information into graphic feature text;
A text combiner, configured to combine, for every picture, the obtained graphic feature text with the context text of the picture to generate a text description, each text description comprising a plurality of text description words;
A theme determiner, configured to establish at least one theme from the text descriptions of the pictures and generate the picture theme library, wherein each theme comprises a plurality of text description words related to that theme, and to determine the theme to which each text description word belongs and the probability of its belonging to that theme, as well as the theme associated with the text description of each picture and the probability of that association.
Optionally, the theme determiner is further configured to establish each theme using an LDA algorithm or an LSA algorithm.
Optionally, the LSA algorithm is a PLSA algorithm.
Optionally, the text combiner is further configured to: for any picture, enumerate the graphic feature text of the picture and the context text of the picture; deduplicate the enumeration result; and combine the deduplicated graphic feature text and context text to generate the text description of the picture.
Optionally, the context text is an HTML statement.
According to another aspect of the present invention, a picture search device is provided, comprising:
An information receiver, configured to receive a search word entered by a user;
A theme converter, configured to obtain the search word from the information receiver and to determine, according to a theme library, the theme to which the search word belongs and its probability distribution, wherein the theme library stores a plurality of themes and the probability distribution of each theme, and each theme comprises a plurality of text description words related to that theme;
A picture similarity calculator, configured to determine, according to the similarity of the probability distributions, the pictures whose similarity to the search word exceeds a certain proportion.
Optionally, the picture search device further comprises: a display, configured to show the pictures determined by the picture similarity calculator.
Optionally, the display is further configured to show the pictures determined by the picture similarity calculator in descending order of similarity.
According to another aspect of the present invention, a picture search system is provided, comprising the device for building a picture theme library and the picture search device described above.
According to another aspect of the present invention, a method for building a picture theme library is provided, comprising:
Obtaining a plurality of pictures;
Processing each picture to obtain its graphic feature information and context text;
Converting the graphic feature information into graphic feature text, and generating the text description of the picture from the context text of the picture together with the converted graphic feature text;
Establishing at least one theme from the text descriptions of the pictures and determining the distribution of each theme to generate the picture theme library, wherein each theme comprises a plurality of text description words related to that theme, and determining the theme to which each text description word belongs and the probability of its belonging to that theme, as well as the theme associated with the text description of each picture and the probability of that association.
Optionally, establishing at least one theme from the text descriptions of the pictures comprises: establishing each theme using the probabilistic latent semantic analysis (PLSA) algorithm.
Optionally, establishing at least one theme from the text descriptions of the pictures comprises: establishing each theme using an LDA algorithm or an LSA algorithm.
Optionally, the text description of a picture is generated as follows:
For every picture, enumerating the graphic feature text of the picture and the context text of the picture;
Deduplicating the enumeration result, and combining the deduplicated graphic feature text and context text to generate the text description of the picture.
Optionally, the context text is an HTML statement.
According to a further aspect of the present invention, a picture search method is provided, comprising:
Receiving a search word entered by a user;
Determining, according to a theme library, the theme to which the search word belongs and its probability distribution, wherein the theme library stores a plurality of themes and the probability distribution of each theme, and each theme comprises a plurality of text description words related to that theme;
Determining, according to the similarity of the probability distributions, the pictures whose similarity to the search word exceeds a certain proportion.
Optionally, the theme library is generated as follows:
Obtaining a plurality of pictures;
Processing each picture to obtain its graphic feature information and context text;
Converting the graphic feature information into graphic feature text, and generating the text description of the picture from the context text of the picture together with the converted graphic feature text;
Establishing at least one theme from the text descriptions of the pictures and determining the distribution of each theme to generate the picture theme library, wherein each theme comprises a plurality of text description words related to that theme, and determining the theme to which each text description word belongs and the probability of its belonging to that theme, as well as the theme associated with the text description of each picture and the probability of that association.
Optionally, after determining, according to the similarity of the probability distributions, the pictures whose similarity to the search word exceeds a certain proportion, the method further comprises: displaying the determined pictures in descending order of similarity.
In the embodiments of the present invention, picture search draws on both the context text and the graphic feature text of a picture. Compared with the content-based search of the prior art, this combined approach addresses the problem that searching on graphic feature text alone supports only exact feature search on explicit graphic features and not fuzzy search, and it reduces the omissions that exact search causes for synonymous features. The result is more accurate search results and a better user experience.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features and advantages of the invention are more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art on reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows a schematic structural diagram of a device for building a picture theme library according to an embodiment of the invention;
Fig. 2 shows a partial schematic diagram of a theme-word probability matrix obtained with the PLSA algorithm according to an embodiment of the invention;
Fig. 3 shows a schematic structural diagram of a picture search device according to an embodiment of the invention;
Fig. 4 shows a schematic structural diagram of a picture search system according to an embodiment of the invention;
Fig. 5 shows a processing flowchart of a method for building a picture theme library according to an embodiment of the invention; and
Fig. 6 shows a processing flowchart of a picture search method according to an embodiment of the invention.
Detailed description
The algorithms and displays provided here are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct such systems is apparent. Moreover, the present invention is not directed at any particular programming language. It should be understood that the content of the invention described here can be implemented in various programming languages, and that any description of a specific language is given to disclose the best mode of the invention.
As noted in the related art, search based on picture content is a common traditional picture search method that builds a feature-to-picture database from features of the picture itself. But it supports only exact feature search, not fuzzy search. Because the search condition is too strict, the search word entered by the user cannot reach a sufficiently high similarity with the feature index in the database, and the target picture is not found.
To address these problems in the prior art, the embodiments of the present invention implement a search device, system and method that describe a picture through both its context text and its own content, making full use of the information in the context text and the important information in the picture itself to improve picture search quality. In addition, the embodiments support fuzzy search and synonym search, reducing the cases in which an overly strict search condition prevents the user from finding the target picture.
Based on the above inventive concept, the embodiments of the present invention provide a device for building a picture theme library and a picture search device. Fig. 1 shows a schematic structural diagram of a device for building a picture theme library according to an embodiment of the invention.
The functions of the components of the device 100 for building a picture theme library in the picture search system, and the connections between them, are now described. As shown in Fig. 1, the device 100 comprises at least the following components: a picture source library 110, a graphical information getter 120, a text combiner 130 and a theme determiner 140.
Specifically, in the device 100 for building a picture theme library, the picture source library 110 stores at least one picture together with the context text corresponding to each picture. Note that the context text used in the embodiments may be a HyperText Markup Language (hereinafter "HTML") statement. The graphical information getter 120 is coupled to the picture source library 110. It first reads the pictures stored in the picture source library 110 and then performs graphic feature analysis on them one by one. Through this analysis, the graphical information getter 120 obtains the graphic feature information of every picture in the picture source library 110. Once the graphic feature information has been obtained, the graphical information getter 120 converts it into graphic feature text.
For any picture stored in the picture source library 110, the graphical information getter 120 can thus obtain its graphic feature information and the corresponding graphic feature text. The graphical information getter 120 is coupled on one side to the picture source library 110 and on the other side to the text combiner 130. The text combiner 130 comprises at least an enumeration unit 131, a deduplication unit 132 and a generation unit 133, coupled in that order. Specifically, the graphical information getter 120 obtains at least one picture from the picture source library 110, and usually several, dozens, or more. Because each picture resembles its theme to a different degree, the enumeration unit 131 enumerates the graphic feature text obtained from the graphical information getter 120 and the context text of the corresponding picture. To simplify the enumeration result, the deduplication unit 132 then takes the result from the enumeration unit 131 and deduplicates it, removing the repeated parts. After deduplication, the generation unit 133 combines the graphic feature text with the context text to generate the text description corresponding to each picture. Each generated text description comprises a plurality of text description words.
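The enumerate, deduplicate and combine steps performed by the text combiner can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_text_description` is hypothetical, and the sketch assumes that both the graphic feature text and the context text have already been reduced to whitespace-separated words.

```python
def build_text_description(feature_text, context_text):
    """Enumerate words from both texts, drop repeats, and combine them.

    Assumption: both inputs are plain strings of whitespace-separated words.
    """
    # Enumeration step: list every word of the graphic feature text,
    # followed by every word of the context text.
    enumerated = feature_text.split() + context_text.split()

    # Deduplication step: keep only the first occurrence of each word,
    # preserving the original order.
    seen = set()
    description_words = []
    for word in enumerated:
        if word not in seen:
            seen.add(word)
            description_words.append(word)

    # Generation step: the remaining words form the text description.
    return description_words


print(build_text_description("red round apple", "fresh apple fruit red"))
```

Keeping the first occurrence preserves word order, which is one possible design choice; the source does not state how duplicates are resolved.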
As the above analysis shows, every picture in the picture source library 110 obtains a corresponding text description through the operation of the graphical information getter 120 and the text combiner 130. On its other side, the text combiner 130 is coupled to the theme determiner 140. From the text descriptions that the text combiner 130 produces for the pictures in the picture source library 110, the theme determiner 140 can establish themes using one of several algorithms. Note that the embodiments may use any algorithm, strategy or rule suitable for establishing themes and are not limited to the algorithms given here; those mentioned below are only examples.
The embodiments of the present invention preferably use the Latent Dirichlet Allocation (hereinafter "LDA") algorithm or the Latent Semantic Analysis (hereinafter "LSA") algorithm to establish themes and generate the picture theme library. Each theme comprises a plurality of text description words related to that theme. The theme determiner 140 can determine the theme to which each text description word belongs and the probability of its belonging to that theme, as well as the theme to which the text description of each picture belongs and the probability of that association.
Note that the core of the LDA method is its use of the Dirichlet distribution. Its main idea is to first choose a theme vector that fixes the probability of each theme being selected; then, to generate each word, a theme is selected from the theme distribution vector, and a word is generated from the word probability distribution of the selected theme.
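The generative idea described above can be sketched in a few lines. This is only an illustration of the sampling process, not a trained LDA model: the vocabulary, the theme-word probabilities and the Dirichlet parameter `alpha` are hypothetical values, and the theme-word distributions are held fixed here rather than being drawn from a Dirichlet prior as in full LDA.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet(alpha) distribution."""
    # Normalising independent Gamma(alpha_i, 1) draws yields a Dirichlet sample.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(theme_word_probs, vocabulary, alpha, length, rng):
    """Generate `length` words: pick a theme per word, then a word from it."""
    theme_weights = sample_dirichlet(alpha, rng)  # per-document theme vector
    theme_ids = list(range(len(theme_word_probs)))
    words = []
    for _ in range(length):
        # Select a theme according to the document's theme vector.
        theme = rng.choices(theme_ids, weights=theme_weights)[0]
        # Generate a word from the selected theme's word distribution.
        word = rng.choices(vocabulary, weights=theme_word_probs[theme])[0]
        words.append(word)
    return theme_weights, words


# Hypothetical setup: theme 0 favours IT words, theme 1 favours fruit words.
vocabulary = ["computer", "software", "apple", "orange"]
theme_word_probs = [
    [0.5, 0.3, 0.2, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
rng = random.Random(0)
theme_weights, words = generate_document(theme_word_probs, vocabulary, [0.5, 0.5], 6, rng)
print(theme_weights, words)
```

Fitting LDA runs this process in reverse: given only the observed words, it infers the theme vectors and theme-word distributions that most plausibly generated them.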
LSA, the other algorithm the theme determiner 140 preferably uses to establish themes, relies on the mathematical technique of singular value decomposition (hereinafter "SVD"); with this decomposition, documents and vocabulary can be represented in matrix form. Preferably, the LSA algorithm used in the embodiments is the probabilistic latent semantic analysis (hereinafter "PLSA") algorithm.
In picture search it is often necessary to uncover the meaning hidden behind characters and words. Simple literal matching lets the widespread phenomena of synonymy and polysemy pull search results away from what is expected, so judging document relevance requires considering document semantics, and the tool of choice for mining semantics is the topic model. In a topic model, a theme represents a concept or an aspect; it appears as a series of related words and is expressed as the conditional probability of those words. Figuratively, a theme is a bucket filled with frequently occurring words that correlate strongly with the theme.
PLSA is a topic model algorithm. It adopts the idea of clustering and ultimately generates many classes. Each class can be a theme with a specific numeric identifier; for example, one class might contain the words "film, director, premiere, award, box office". The words in a class tend to appear together at a relatively high frequency. The algorithm yields the probability that each word belongs to a class, and a word can appear in several classes with different probabilities. "Relatively high frequency" here has no fixed limit; it is determined by the actual number of words. For example, if a text has 20 words, two of which appear more than five times while the rest appear fewer than three times, the two words appearing more than five times can be regarded as the high-frequency words.
The PLSA algorithm produces two matrices: a document-theme probability distribution matrix and a theme-word probability matrix. It uses statistical methods to establish the probabilistic relationship among document, latent semantics and word, and uses these probabilities for semantic analysis, the aim being to discover implicit themes in text. For example, to calculate the theme probability distribution of "apple, mobile phone, dark", the words are substituted into the theme-word probability matrix produced by PLSA, which yields the theme probability distribution of "apple, mobile phone, dark".
Each theme is in effect a clustering of words; the words in one class may be synonyms, near-synonyms or related words. For example, a theme might contain "weight loss, slimming, slender, fat reduction, body weight": "weight loss" and "slimming" are synonyms, "weight loss" and "slender" are near-synonyms, and "weight loss" and "body weight" are related words. In practice, synonyms and near-synonyms are usually not distinguished; instead, all words in the theme containing a given word are used as search words, so the search results include synonym results, near-synonym results and related-word results.
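Using every word of a theme as a search word amounts to a simple form of query expansion, which can be sketched as follows. The function name `expand_search_word`, the theme identifier and the English word list are illustrative assumptions, not part of the source.

```python
def expand_search_word(word, themes):
    """Return every word that shares at least one theme with `word`.

    `themes` maps a theme identifier to the list of words in that theme.
    A word found in no theme is returned unchanged as the only search word.
    """
    expanded = []
    for theme_words in themes.values():
        if word in theme_words:
            for candidate in theme_words:
                if candidate not in expanded:
                    expanded.append(candidate)
    return expanded or [word]


# Hypothetical theme containing synonyms, near-synonyms and related words.
themes = {
    7: ["weight loss", "slimming", "slender", "fat reduction", "body weight"],
}
print(expand_search_word("slimming", themes))
```

Because the expansion does not distinguish synonyms from related words, recall rises at some cost in precision; ranking by theme probability is one way to offset that.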
The PLSA algorithm can handle both polysemy (one word, many meanings) and synonymy (one meaning, many words); it discovers implicit themes in text and uncovers the meaning hidden behind characters and words. With PLSA, input text is divided into classes more accurately, and the words in each resulting theme correlate strongly with that theme, which improves matching accuracy.
As noted in the prior art, picture search can be divided by the technology used into two classes: search based on picture content and search based on context text. The former rests on some prior knowledge (or assumptions) about the content of the images described and is closely tied to a specific application, for example facial features or fingerprint features. The latter describes features common to all images, independent of the particular type or content of an image, mainly color, texture and shape.
The search method based on context text is now analyzed. It builds a database by analyzing the descriptions contained in the code segments that include pictures, then calculates the similarity between the search word entered by the user and the text index in the database to present pictures. Although this method solves the picture search problem to some extent, it takes the description information of the code segment containing the picture in each web page's HTML code as the description document of the picture, which cannot guarantee a comprehensive and accurate description of the picture.
The search method based on context text judges the similarity of two documents by counting the words they share, as in term frequency-inverse document frequency (TF-IDF). This ignores the semantic associations behind words: two documents may share few or no words and still be similar. As a result, the traditional picture search method cannot describe pictures comprehensively and accurately, and so cannot meet users' picture search needs.
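The weakness of shared-word matching is easy to demonstrate. The sketch below uses a plain term-frequency cosine similarity, a simplification that omits the IDF weighting of full TF-IDF; the function name and the example sentences are hypothetical.

```python
import math
from collections import Counter


def term_frequency_cosine(doc_a, doc_b):
    """Cosine similarity between the raw term-frequency vectors of two texts."""
    tf_a, tf_b = Counter(doc_a.split()), Counter(doc_b.split())
    # Only words present in both documents contribute to the dot product.
    dot = sum(tf_a[w] * tf_b[w] for w in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two descriptions of the same subject that share no words score zero.
print(term_frequency_cosine("film director premiere", "movie filmmaker debut"))
# Descriptions that share words score above zero.
print(term_frequency_cosine("film director premiere", "film premiere tonight"))
```

A theme-based comparison avoids this failure because "film" and "movie" can fall into the same theme even though they never match literally.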
By contrast, the picture search device provided by the embodiments of the present invention also describes picture content with graphic feature text. Using graphic feature text effectively overcomes the defect that context text alone cannot describe picture content accurately and so produces low-quality search results; pictures are described more accurately and picture search quality improves.
To set out more clearly how the device for building a picture theme library with the PLSA algorithm is applied and what benefits it brings, specific embodiments are now described.
Embodiment one
This example has 6 documents, each containing only a few words. In practice, the number of documents is unlimited, as is the number of words in each document.
Using the PLSA algorithm, the final document-theme matrix and theme-word matrix can be computed from the input documents. The 6 documents are:
Document 0: computer apple apple
Document 1: computer computer software
Document 2: fruit apple orange
Document 3: apple orange banana
Document 4: film film
Document 5: film movie
The 6 documents contain 8 distinct words in total. With the number of themes set to 3, the theme-word probability distribution matrix calculated by PLSA is shown in Table one:
Table one
         Computer  Software  Apple   Fruit   Orange  Banana  Film    Movie
Theme 0  0.4975    0.1658    0.3367  0       0       0       0       0
Theme 1  0         0         0       0       0       0       0.7500  0.2500
Theme 2  0         0         0.3333  0.1667  0.3333  0.1667  0       0
These results show that theme 0 contains information technology (hereinafter "IT") words, theme 1 contains film words, and theme 2 contains fruit words. "Apple" belongs both to theme 0 and to theme 2, which matches reality: "apple" may refer to the fruit or to the iPhone.
The calculated document-theme probability distribution matrix is shown in Table two:
Table two
            Theme 0  Theme 1  Theme 2
Document 0  0.3300   0        0.6700
Document 1  1.0000   0        0
Document 2  0        0        1.0000
Document 3  0        0        1.0000
Document 4  0        1.0000   0
Document 5  0        1.0000   0
These results show the content of each document and its corresponding themes:
Document 0 covers theme 0 and theme 2, that is, IT and fruit content;
Document 1 covers IT content;
Document 2 covers fruit content;
Document 3 covers fruit content;
Document 4 covers film content;
Document 5 covers film content.
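A calculation of this kind can be sketched with a small expectation-maximization (EM) implementation of PLSA. This is a minimal sketch under stated assumptions, not the implementation used in the embodiment: the function name `plsa`, the iteration count and the random initialization are all hypothetical, document 2 is taken to contain the word "fruit", and the second film-related word is rendered as "movie". Because EM starts from random values and converges to a local optimum, the numbers it produces need not match Table one or Table two, and the themes may come out in a different order.

```python
import random


def plsa(docs, num_themes, iterations=50, seed=0):
    """Fit PLSA with EM; return (vocabulary, document-theme, theme-word)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    counts = [{w: doc.count(w) for w in sorted(set(doc))} for doc in docs]

    def random_distribution(size):
        values = [rng.random() + 0.01 for _ in range(size)]
        total = sum(values)
        return [v / total for v in values]

    # Random starting points for P(theme | document) and P(word | theme).
    doc_theme = [random_distribution(num_themes) for _ in docs]
    theme_word = [dict(zip(vocab, random_distribution(len(vocab))))
                  for _ in range(num_themes)]

    for _ in range(iterations):
        next_doc_theme = [[0.0] * num_themes for _ in docs]
        next_theme_word = [dict.fromkeys(vocab, 0.0) for _ in range(num_themes)]
        for d, word_counts in enumerate(counts):
            for w, n in word_counts.items():
                # E-step: posterior P(theme | document, word).
                joint = [doc_theme[d][z] * theme_word[z][w]
                         for z in range(num_themes)]
                total = sum(joint) or 1.0
                for z in range(num_themes):
                    expected = n * joint[z] / total
                    next_doc_theme[d][z] += expected
                    next_theme_word[z][w] += expected
        # M-step: renormalise the expected counts into probabilities.
        doc_theme = [[v / (sum(row) or 1.0) for v in row]
                     for row in next_doc_theme]
        theme_word = [{w: v / (sum(tw.values()) or 1.0) for w, v in tw.items()}
                      for tw in next_theme_word]
    return vocab, doc_theme, theme_word


# The six example documents (word choices for document 2 and 5 are assumptions).
docs = [
    ["computer", "apple", "apple"],
    ["computer", "computer", "software"],
    ["fruit", "apple", "orange"],
    ["apple", "orange", "banana"],
    ["film", "film"],
    ["film", "movie"],
]
vocab, doc_theme, theme_word = plsa(docs, num_themes=3)
for index, row in enumerate(doc_theme):
    print("Document", index, [round(p, 4) for p in row])
```

Each row of `doc_theme` corresponds to a row of the document-theme matrix, and each dictionary in `theme_word` corresponds to a row of the theme-word matrix.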
Embodiment bis-
In embodiments of the present invention, when implementing PLSA algorithm, need to input a large amount of texts, in inputted text, search more much higher word of the frequency of occurrences, search which word in the word that these frequencies of occurrences are higher and occur simultaneously, and then word higher and that simultaneously occur is polymerized to a class by these frequencies of occurrences.There is not a fixedly limit in the frequency relating in the higher word of the frequency of occurrences herein, but determines according to actual words number.For example, have 20 words in text, wherein have two words to appear at more than five times, all the other words all appear at below three times, can think that appearing at five two above words is the word that the frequency of occurrences is higher.Also such as, in text, have 600 words, wherein have four words to occur that more than 50 times all the other words all appear at 30 times once, can think and occur that 50 four above words are the word that frequency is higher.The result of calculation of PLSA algorithm can obtain theme-Word probability matrix.
Fig. 2 is a partial schematic diagram of the theme-word probability matrix obtained with the PLSA algorithm according to an embodiment of the invention. Fig. 2 lists the words contained in themes 5, 43, 56, and 247 of the embodiment, together with their probability values. As can be seen from Fig. 2, theme 247 is about medical content, theme 5 is about color, theme 43 is about thinking, and theme 56 is about doctors and patients.
The functions of each device or component of the equipment 100 for setting up the picture theme storehouse, and the connections between the parts, have been described in detail above. In the embodiment of the invention, the equipment 100 for setting up the picture theme storehouse obtains the textual description of each stored picture and sets up the corresponding themes from those textual descriptions. The picture searching equipment 300 then searches for and displays pictures according to the search word input by the user and the information in the equipment 100 for setting up the picture theme storehouse.
Fig. 3 shows a structural schematic diagram of the picture searching equipment according to an embodiment of the invention. The functions of each device or component of the picture searching equipment 300 and the connections between the parts are now described. First, the message recipient 310 receives the search word input by the user. Because the message recipient 310 is coupled to one side of the theme converter 320, the message recipient 310 forwards the received search word to the theme converter 320. In the embodiment of the invention a theme storehouse is provided, which stores a plurality of themes and the probability distribution corresponding to each theme, where each theme comprises a plurality of textual description words related to that theme. After receiving the search word forwarded by the message recipient 310, the theme converter 320 matches the search word against the content stored in the theme storehouse, and thereby determines the themes to which the user's search word belongs and their probability distribution.
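The matching step performed by the theme converter is not spelled out in the specification. A minimal Python sketch, assuming the theme storehouse holds P(word | theme) for each theme and assuming a uniform prior over themes, is to normalise the word's probabilities across themes. The matrix values and the names `THEME_WORD` and `word_to_theme_distribution` are hypothetical.

```python
# Hypothetical theme-word probabilities P(word | theme) for three themes.
THEME_WORD = [
    {"apple": 0.2, "phone": 0.5, "computer": 0.3},  # theme 0: IT
    {"film": 0.6, "actor": 0.4},                    # theme 1: film
    {"apple": 0.3, "banana": 0.4, "orange": 0.3},   # theme 2: fruit
]


def word_to_theme_distribution(word, theme_word):
    """Return P(theme | word) for every theme, assuming a uniform theme prior.

    Under a uniform prior, Bayes' rule reduces to normalising P(word | theme)
    over the themes. A word absent from every theme yields all zeros.
    """
    weights = [row.get(word, 0.0) for row in theme_word]
    total = sum(weights)
    if total == 0.0:
        return [0.0] * len(theme_word)
    return [weight / total for weight in weights]


apple_dist = word_to_theme_distribution("apple", THEME_WORD)
```

In this sketch the search word "apple" is spread over the IT theme and the fruit theme, mirroring the ambiguity discussed for Table 1 and Table 2, while a word such as "film" maps entirely to one theme.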
The other side of the theme converter 320 is coupled to the picture analogies degree counter 330. After the theme converter 320 determines the themes to which the user's search word belongs and their probability distribution, the picture analogies degree counter 330 uses that probability distribution to determine the pictures whose similarity to the search word exceeds a certain proportion. The other side of the picture analogies degree counter 330 is coupled to the display 340. Once the picture analogies degree counter 330 has obtained the qualifying pictures by calculation, they are shown on the display 340. Note that because the picture analogies degree counter 330 obtains pictures whose similarity falls within a certain range, there is at least one such picture, and usually several, dozens, or more. Considering that the display 340 is limited in how many pictures it can show at once, and that pictures differ in their similarity to the search word and therefore in the amount of information they carry, the embodiment of the invention can sort the pictures from high to low by the similarity to the search word calculated by the picture analogies degree counter 330 and show them in that order, which completes the picture search.
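The filtering and ordering described above can be sketched as follows. The specification does not name a similarity measure, so cosine similarity between theme distributions is used here as an assumption; the picture names, the threshold value, and the function names are likewise hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_pictures(query_dist, picture_dists, threshold):
    """Keep pictures whose similarity exceeds threshold, sorted from high to low."""
    scored = [(name, cosine(query_dist, dist)) for name, dist in picture_dists.items()]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical theme distributions of four pictures' textual descriptions.
PICTURE_THEMES = {
    "picture_0": [0.33, 0.0, 0.67],
    "picture_1": [1.0, 0.0, 0.0],
    "picture_2": [0.0, 0.0, 1.0],
    "picture_4": [0.0, 1.0, 0.0],
}

# Hypothetical theme distribution of the search word.
ranked = rank_pictures([0.4, 0.0, 0.6], PICTURE_THEMES, threshold=0.5)
```

Pictures whose textual descriptions share no theme with the search word score zero and are dropped by the threshold; the remaining pictures are returned in display order.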
In the embodiment of the invention, picture searching makes combined use of the context text and the graphic feature text of a picture. Content-based image search in the prior art can only perform exact feature search on explicit graphic features and cannot perform fuzzy search. The embodiment of the invention uses the context text of the picture, so that the search scope is no longer confined to explicit graphic features and a moderate degree of fuzzy search becomes possible, which reduces the omissions that exact search causes when searching for synonymous features. In addition, the embodiment of the invention describes picture content in combination with the graphic feature text. Making effective use of the graphic feature text overcomes the defect that describing a picture by its context text alone cannot describe the picture content accurately and leads to low-quality search results; pictures can be described more accurately, and picture search quality improves.
In addition, the text combiner in the embodiment of the invention enumerates the context text and the graphic feature text of a picture and performs re-scheduling (deduplication) on them, which helps generate a more accurate textual description of the picture. The theme-setting algorithm referred to in the embodiment of the invention means any algorithm, strategy, or rule suitable for setting up themes, and is not limited to the algorithms given herein. The embodiment of the invention uses the PLSA algorithm for semantic analysis. This algorithm uses statistical methods to establish the probability distribution relation among "document - latent semantics - word" and uses these probabilities for semantic analysis, with the aim of finding the implicit themes in text. The algorithm can match near-synonyms and handle synonym searches, so that search results better meet expectations and better fit practical application scenarios. In summary, the embodiment of the invention solves problems such as low picture search quality, waste of the important information in the picture itself, and overly strict search conditions, and achieves the beneficial effect that searching better fits practical application scenarios and search results better meet expectations.
Based on the equipment for setting up the picture theme storehouse and the picture searching equipment provided by the preferred embodiments above, and on the same inventive concept, the embodiment of the invention also provides an image searching system. Fig. 4 shows a structural schematic diagram of the image searching system according to an embodiment of the invention. As shown in Fig. 4, the image searching system 400 comprises at least the equipment 410 for setting up the picture theme storehouse and the picture searching equipment 420.
The functions of each device or component of the equipment 410 for setting up the picture theme storehouse in the image searching system, and the connections between the parts, are now described. First, the equipment 410 for setting up the picture theme storehouse comprises at least: a picture library source 411, a graphical information getter 412, a text combiner 413, and a subject determination device 414.
Specifically, in the equipment 410 for setting up the picture theme storehouse, the picture library source 411 stores at least one picture together with the context text corresponding to each picture. Note that the context text used in the embodiment of the invention can be HTML statements. The graphical information getter 412 is coupled to the picture library source 411. It first reads the pictures stored in the picture library source 411 and then performs graphic feature analysis on all of them one by one. Through this analysis the graphical information getter 412 obtains the graphic feature information of every picture in the picture library source 411. Once the graphic feature information has been obtained, the graphical information getter 412 converts it into graphic feature text.
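The specification states only that the context text can be HTML statements and does not describe how words are taken from them. As a hedged illustration, the Python sketch below collects the visible text and the `alt` and `title` attributes of `img` tags from an HTML fragment using the standard library parser. The class name `ContextTextExtractor`, the function `context_text`, and the choice of attributes are assumptions of this sketch.

```python
from html.parser import HTMLParser


class ContextTextExtractor(HTMLParser):
    """Collects visible text and image alt/title attributes from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Image descriptions usually live in the alt and title attributes.
        if tag == "img":
            for name, value in attrs:
                if name in ("alt", "title") and value:
                    self.parts.append(value)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def context_text(html_fragment):
    """Return the context words of an HTML fragment as one space-separated string."""
    parser = ContextTextExtractor()
    parser.feed(html_fragment)
    parser.close()
    return " ".join(parser.parts)


example = context_text('<p>Fresh fruit</p><img src="a.jpg" alt="red apple">')
```

The resulting string plays the role of the context text that is later combined with the graphic feature text.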
For any picture stored in the picture library source 411, its graphic feature information and the corresponding graphic feature text can be obtained by the graphical information getter 412. One side of the graphical information getter 412 is coupled to the picture library source 411 and the other side is coupled to the text combiner 413. The text combiner 413 comprises at least: an enumeration unit 4131, a re-scheduling unit 4132, and a generation unit 4133. These three units are coupled in sequence within the text combiner 413.
Specifically, the graphical information getter 412 obtains at least one picture from the picture library source 411, and usually several, dozens, or more. Considering that each picture differs in its similarity to the themes it belongs to, the enumeration unit 4131 in the text combiner 413 enumerates the graphic feature text obtained from the graphical information getter 412 and the context text of the corresponding picture. Then, to simplify the enumeration result, the re-scheduling unit 4132 in the text combiner 413 takes the enumeration result of the enumeration unit 4131 and performs re-scheduling (deduplication) on it, removing the repeated parts. After re-scheduling, the generation unit 4133 in the text combiner 413 combines the graphic feature text and the context text to generate the textual description corresponding to each picture. Each generated textual description comprises a plurality of textual description words.
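The three stages of the text combiner (enumerate, remove duplicates, generate) can be sketched in Python as follows. Whitespace tokenisation and case-insensitive comparison are assumptions made for this sketch, and the function name `combine_texts` is hypothetical; text in a language without spaces would need a word segmenter instead.

```python
def combine_texts(feature_text, context_text):
    """Build a picture's textual description words from its two text sources."""
    # Enumerate: list every word from the graphic feature text and the context text.
    enumerated = feature_text.split() + context_text.split()

    # Remove duplicates while keeping the first occurrence of each word.
    seen = set()
    description_words = []
    for word in enumerated:
        key = word.lower()  # assumption: duplicates are compared case-insensitively
        if key not in seen:
            seen.add(key)
            description_words.append(word)

    # Generate: the remaining words form the textual description.
    return description_words


description = combine_texts("red apple fruit", "Fresh apple on a table")
```

Here the word "apple" appears in both sources but is kept only once in the resulting description.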
Every picture in the picture library source 411 obtains its corresponding textual description through the operation of the graphical information getter 412 and the text combiner 413. The text combiner 413 is also coupled to the subject determination device 414. Based on the textual description of every picture in the picture library source 411 obtained by the text combiner 413, the subject determination device 414 sets up themes, preferably with the LDA algorithm or the LSA algorithm, and generates the picture theme storehouse. Each theme comprises a plurality of textual description words related to that theme. The subject determination device 414 can determine the theme to which each textual description word belongs and the probability of belonging to that theme, and can also determine the theme to which the textual description of each picture belongs and the probability associated with that theme. Note that the core of the LDA method is its use of the Dirichlet distribution.
The LSA algorithm, the other algorithm that the subject determination device 414 can use to set up themes, relies on the mathematical technique of singular value decomposition (SVD); through this decomposition, documents and vocabulary can be represented in matrix form.
Specifically, the LSA algorithm used in the embodiment of the invention is the PLSA algorithm. PLSA adopts the idea of clustering: it finds words in the input text that occur with high frequency and occur together, and clusters them into one class. Each class is assigned a numeric identifier so that it can be identified in subsequent processing. Here too, high frequency has no fixed threshold and is determined by the actual number of words. By implementing the PLSA algorithm, the input text is divided into classes more accurately, and the words contained in each resulting theme are strongly correlated with that theme, which improves matching accuracy.
The equipment 410 for setting up the picture theme storehouse in the image searching system 400 has been described in detail above. The equipment 410 for setting up the picture theme storehouse obtains the textual description of each stored picture and sets up the corresponding themes from those textual descriptions. The picture searching equipment 420 then searches for and displays pictures according to the search word input by the user and the information in the equipment for setting up the picture theme storehouse.
The functions of each device or component of the picture searching equipment 420 in the image searching system 400 and the connections between the parts are now described. First, the message recipient 421 is coupled to the theme converter 422; the message recipient 421 receives the search word input by the user and forwards it to the theme converter 422. In the embodiment of the invention a theme storehouse is provided, which stores a plurality of themes and the probability distribution corresponding to each theme, where each theme comprises a plurality of textual description words related to that theme. After receiving the search word forwarded by the message recipient 421, the theme converter 422 determines, according to the theme storehouse, the themes to which the user's search word belongs and their probability distribution.
As described above, the equipment 410 for setting up the picture theme storehouse can use the PLSA algorithm to obtain the theme-word probability matrix and the document-theme probability distribution matrix. Referring to Fig. 4, the theme converter 422 is coupled to the picture analogies degree counter 423. After the theme converter 422 determines the themes to which the search word belongs, the picture analogies degree counter 423 uses the probability distribution of those themes to determine the pictures whose similarity to the user's search word exceeds a certain proportion.
On its other side, the picture analogies degree counter 423 is coupled to the display 424. Once the picture analogies degree counter 423 has obtained the qualifying pictures, they are shown on the display 424. Note that because the picture analogies degree counter 423 obtains pictures whose similarity falls within a certain range, there is at least one such picture, and usually several, dozens, or more. Considering that the display 424 is limited in how many pictures it can show at once, and that pictures differ in their similarity to the search word and therefore in the amount of information they carry, the embodiment of the invention can sort the pictures from high to low by the similarity to the search word calculated by the picture analogies degree counter 423 and show them in that order, which completes the picture search.
In the embodiment of the invention, picture searching makes combined use of the context text and the graphic feature text of a picture. Content-based image search in the prior art can only perform exact feature search on explicit graphic features and cannot perform fuzzy search. The embodiment of the invention uses the context text of the picture, so that the search scope is no longer confined to explicit graphic features and a moderate degree of fuzzy search becomes possible, which reduces the omissions that exact search causes when searching for synonymous features. In addition, the embodiment of the invention describes picture content in combination with the graphic feature text. Making effective use of the graphic feature text overcomes the defect that describing a picture by its context text alone cannot describe the picture content accurately and leads to low-quality search results; pictures can be described more accurately, and picture search quality improves.
In addition, the text combiner in the embodiment of the invention enumerates the context text and the graphic feature text of a picture and performs re-scheduling (deduplication) on them, which helps generate a more accurate textual description of the picture. The theme-setting algorithm referred to in the embodiment of the invention means any algorithm, strategy, or rule suitable for setting up themes, and is not limited to the algorithms given herein. The embodiment of the invention uses the PLSA algorithm for semantic analysis. This algorithm uses statistical methods to establish the probability distribution relation among "document - latent semantics - word" and uses these probabilities for semantic analysis, with the aim of finding the implicit themes in text. The algorithm can match near-synonyms and handle synonym searches, so that search results better meet expectations and better fit practical application scenarios. In summary, the embodiment of the invention solves problems such as low picture search quality, waste of the important information in the picture itself, and overly strict search conditions, and achieves the beneficial effect that searching better fits practical application scenarios and search results better meet expectations.
Based on the image searching system and the equipment provided by the preferred embodiments above, and on the same inventive concept, the embodiment of the invention also provides a method of setting up a picture theme storehouse and an image searching method. Fig. 5 shows the processing flow chart of the method of setting up a picture theme storehouse according to an embodiment of the invention. Referring to Fig. 5, the method comprises steps S502 to S508.
In the flow shown in Fig. 5, step S502 is executed first to obtain a plurality of pictures. In the embodiment of the invention, at least one picture and the context text of the picture can be stored in the picture library source, where the context text is HTML statements. Next, step S504 is executed: the pictures obtained in step S502 are read and processed individually, and graphic feature analysis is performed on all of them one by one. The graphic feature analysis in step S504 yields the graphic feature information of every picture. Then step S506 is executed: the graphic feature information of every picture obtained in step S504 is converted into graphic feature text. Still in step S506, for every picture, the graphic feature text and the context text of the picture are enumerated. Specifically, after the graphic feature text and the context text of the picture have been enumerated, re-scheduling (deduplication) is performed on the enumeration result. From the enumerated and deduplicated graphic feature text and context text, the text combiner then generates the textual description corresponding to the picture.
After the textual descriptions of the pictures have been obtained, step S508 is executed: at least one theme is set up from the textual description of every picture using the LDA algorithm or the LSA algorithm, where the LSA algorithm used is the PLSA algorithm. The PLSA algorithm uses statistical methods to establish the probability distribution relation among "document - latent semantics - word" and uses these probabilities for semantic analysis, finding the implicit themes in text. When the at least one theme is set up, the distribution of each theme is determined and the picture theme storehouse is generated. Note that each theme that is set up comprises a plurality of textual description words related to that theme. In addition, for each textual description word, the theme to which it belongs and the probability of belonging to that theme are determined; and for the textual description of each picture, the associated theme and the probability associated with that theme are determined.
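The specification names PLSA but does not give its update equations. The Python sketch below uses the standard expectation-maximisation (EM) formulation of PLSA as an assumption about how step S508 could be realised. The function name `plsa`, its parameters, and the toy count matrix are hypothetical.

```python
import random


def _normalized(row):
    """Scale a row to sum to 1; fall back to a uniform row if it is all zeros."""
    total = sum(row)
    if total == 0.0:
        return [1.0 / len(row)] * len(row)
    return [value / total for value in row]


def plsa(doc_word_counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a count matrix where doc_word_counts[d][w] = n(d, w).

    Returns (p_w_z, p_z_d): P(word | theme) per theme and P(theme | document)
    per document, i.e. the theme-word and document-theme matrices.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(doc_word_counts), len(doc_word_counts[0])

    # Random, strictly positive initialisation of both distributions.
    p_w_z = [_normalized([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [_normalized([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                count = doc_word_counts[d][w]
                if count == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(w | z) * P(z | d).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_topics):
                    expected = count * joint[z] / total
                    new_w_z[z][w] += expected
                    new_z_d[d][z] += expected
        # M-step: renormalise the expected counts into probabilities.
        p_w_z = [_normalized(row) for row in new_w_z]
        p_z_d = [_normalized(row) for row in new_z_d]
    return p_w_z, p_z_d


# Toy corpus: 4 textual descriptions over a 4-word vocabulary.
counts = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
theme_word, doc_theme = plsa(counts, n_topics=2)
```

EM can converge to different local optima depending on the initialisation, so the fixed `seed` only makes a run repeatable; it does not guarantee a particular assignment of words to themes.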
In the embodiment of the invention, picture searching makes combined use of the context text and the graphic feature text of a picture. Content-based image search in the prior art can only perform exact feature search on explicit graphic features and cannot perform fuzzy search. The embodiment of the invention uses the context text of the picture, so that the search scope is no longer confined to explicit graphic features and a moderate degree of fuzzy search becomes possible, which reduces the omissions that exact search causes when searching for synonymous features. In addition, the embodiment of the invention describes picture content in combination with the graphic feature text. Making effective use of the graphic feature text overcomes the defect that describing a picture by its context text alone cannot describe the picture content accurately and leads to low-quality search results; pictures can be described more accurately, and picture search quality improves.
In addition, the embodiment of the invention enumerates the context text and the graphic feature text of a picture and performs re-scheduling (deduplication) on them, which helps generate a more accurate textual description of the picture. The theme-setting algorithm referred to in the embodiment of the invention means any algorithm, strategy, or rule suitable for setting up themes, and is not limited to the algorithms given herein. The embodiment of the invention uses the PLSA algorithm for semantic analysis. This algorithm uses statistical methods to establish the probability distribution relation among "document - latent semantics - word" and uses these probabilities for semantic analysis, with the aim of finding the implicit themes in text. The algorithm can match near-synonyms and handle synonym searches, so that search results better meet expectations and better fit practical application scenarios. In summary, the embodiment of the invention solves problems such as low picture search quality, waste of the important information in the picture itself, and overly strict search conditions, and achieves the beneficial effect that searching better fits practical application scenarios and search results better meet expectations.
The method of setting up a picture theme storehouse has been described above; the image searching method of the invention is described below. Fig. 6 shows the processing flow chart of the image searching method according to an embodiment of the invention. Referring to Fig. 6, the method comprises steps S602 to S606.
In the flow shown in Fig. 6, step S602 is executed first: the search word input by the user is received and sent to the theme converter. Next, step S604 is executed: after the theme converter receives the search word input by the user, the themes to which the search word belongs and their probability distribution are determined according to the theme storehouse.
In the embodiment of the invention a theme storehouse is provided. It is generated as follows. A plurality of pictures is obtained, and each obtained picture is processed individually. The processing yields the graphic feature information and the context text of every picture, and the graphic feature information is converted into graphic feature text. The graphic feature text of each picture is combined with the context text of that picture to generate the textual description of the picture. From the textual description of every picture, at least one theme is set up and the distribution of each theme is determined, generating the picture theme storehouse. The theme storehouse stores a plurality of themes, each of which comprises a plurality of textual description words related to that theme. The theme storehouse also stores the probability distribution corresponding to each theme.
Once step S604 has yielded the themes to which the user's search word belongs and their probability distribution, step S606 is triggered: according to the probability distribution of those themes, the pictures whose similarity to the search word exceeds a certain proportion are determined, and the determined pictures are shown in order of similarity from high to low.
By adopting the above methods and equipment, the embodiment of the invention can bring the following beneficial effects:
In the embodiment of the invention, picture searching makes combined use of the context text and the graphic feature text of a picture. Content-based image search in the prior art can only perform exact feature search on explicit graphic features and cannot perform fuzzy search. The embodiment of the invention uses the context text of the picture, so that the search scope is no longer confined to explicit graphic features and a moderate degree of fuzzy search becomes possible, which reduces the omissions that exact search causes when searching for synonymous features. In addition, the embodiment of the invention describes picture content in combination with the graphic feature text. Making effective use of the graphic feature text overcomes the defect that describing a picture by its context text alone cannot describe the picture content accurately and leads to low-quality search results; pictures can be described more accurately, and picture search quality improves.
C8. The equipment according to claim 7, wherein the display is further configured to show the pictures determined by the picture analogies degree counter in order of similarity from high to low.
C11. The method according to claim 10, wherein setting up at least one theme according to the textual description of each picture comprises: setting up each theme using the probabilistic latent semantic analysis (PLSA) algorithm.
C12. The method according to claim 10, wherein setting up at least one theme according to the textual description of each picture comprises: setting up each theme using the LDA algorithm or the LSA algorithm.
C13. The method according to any one of claims 10 to 12, wherein the textual description of a picture is generated as follows:
for every picture,
enumerating the graphic feature text of the picture and the context text of the picture;
performing re-scheduling (deduplication) on the enumeration result, and combining the deduplicated graphic feature text and context text to generate the textual description of the picture.
C14. The method according to any one of claims 10 to 13, wherein the context text is HTML statements.
C16. The method according to claim 15, wherein the theme storehouse is generated as follows:
obtaining a plurality of pictures;
processing every picture individually to obtain its graphic feature information and context text;
converting the graphic feature information into graphic feature text, and combining the context text of the picture with the converted graphic feature text to generate the textual description of the picture;
setting up at least one theme according to the textual description of each picture, determining the distribution of each theme, and generating the picture theme storehouse, wherein each theme comprises a plurality of textual description words related to that theme, and determining the theme to which each textual description word belongs and the probability of belonging to that theme, as well as the theme associated with the textual description of each picture and the probability associated with that theme.
C17. The method according to claim 15 or 16, wherein, after the pictures whose similarity to the search word exceeds a certain proportion are determined according to the probability distribution, the method further comprises: showing the determined pictures in order of similarity from high to low.
Numerous specific details are described in the specification provided herein. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the equipment of an embodiment can be adaptively changed and arranged in one or more pieces of equipment different from that embodiment. The modules, units, or components of an embodiment can be combined into one module, unit, or component, and can furthermore be divided into a plurality of sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or equipment so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the invention can be implemented in hardware, in software modules running on one or more processors, or in a combination of these. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) can be used in practice to implement some or all of the functions of some or all of the components of the equipment for setting up the picture theme storehouse and the picture searching equipment according to the embodiment of the invention. The invention can also be implemented as equipment or device programs (for example, computer programs and computer program products) for carrying out part or all of the methods described herein. Such programs implementing the invention can be stored on computer-readable media or can take the form of one or more signals. Such signals can be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not denote any order; these words can be interpreted as names.

Claims (10)

1. A device for building a picture theme library, comprising:
A picture source library, configured to store at least one picture and the context text of that picture;
A graphic information acquirer, configured to read pictures from the picture source library, perform graphic feature analysis on each picture to obtain its graphic feature information, and convert that graphic feature information into graphic feature text;
A text combiner, configured to, for each picture, combine the obtained graphic feature text with the context text of the picture to generate a textual description, each textual description comprising a plurality of textual description words;
A theme determiner, configured to establish at least one theme according to the textual descriptions of the pictures and generate the picture theme library, wherein each theme comprises a plurality of textual description words related to that theme, and to determine the theme to which each textual description word belongs and the probability that it belongs to that theme, as well as the themes associated with the textual description of each picture and the probability of association with each such theme.
2. The device according to claim 1, wherein the theme determiner is further configured to establish each theme using the LDA algorithm or the LSA algorithm.
3. The device according to claim 2, wherein the LSA algorithm is the probabilistic latent semantic analysis (PLSA) algorithm.
4. The device according to any one of claims 1 to 3, wherein the text combiner is further configured to:
For any picture,
Enumerate the graphic feature text of the picture and the context text of the picture;
Deduplicate the enumeration result, and combine the deduplicated graphic feature text and context text to generate the textual description of the picture.
5. The device according to any one of claims 1 to 4, wherein the context text is an HTML statement.
6. A picture search device, comprising:
An information receiver, configured to receive a search word input by a user;
A theme converter, configured to obtain the search word from the information receiver and to determine, according to a theme library, the themes to which the search word belongs and their probability distribution, wherein the theme library stores a plurality of themes and the probability distribution of each theme, and each theme comprises a plurality of textual description words related to that theme;
A picture similarity calculator, configured to determine, according to similar probability distributions, the pictures whose degree of similarity to the search word exceeds a certain proportion.
7. The device according to claim 6, further comprising:
A display, configured to display the pictures determined by the picture similarity calculator.
8. A picture search system, comprising the device for building a picture theme library according to any one of claims 1 to 5, and the picture search device according to any one of claims 6 to 7.
9. A method for building a picture theme library, comprising:
Obtaining a plurality of pictures;
Processing each picture separately to obtain its graphic feature information and context text;
Converting the graphic feature information into graphic feature text, and combining the context text of the picture with the converted graphic feature text to generate the textual description of the picture;
Establishing at least one theme according to the textual descriptions of the pictures, determining the distribution of each theme, and generating the picture theme library, wherein each theme comprises a plurality of textual description words related to that theme, and determining the theme to which each textual description word belongs and the probability that it belongs to that theme, as well as the themes associated with the textual description of each picture and the probability of association with each such theme.
10. A picture search method, comprising:
Receiving a search word input by a user;
Determining, according to a theme library, the themes to which the search word belongs and their probability distribution, wherein the theme library stores a plurality of themes and the probability distribution of each theme, and each theme comprises a plurality of textual description words related to that theme;
Determining, according to similar probability distributions, the pictures whose degree of similarity to the search word exceeds a certain proportion.
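As an informal illustration only, and not part of the claims, the text-combining step of claim 4 and the search step of claim 10 might be sketched as follows. The sketch assumes the theme library has already been produced (for example by LDA or PLSA) and is available as plain dictionaries mapping words and pictures to theme probability vectors. The function names, the sample data, the use of cosine similarity as the measure of "similar probability distribution", and the threshold value are all assumptions made for this example; the claims do not fix any of them.

```python
from math import sqrt


def combine_text(feature_words, context_words):
    """Enumerate both word lists and drop duplicates, keeping first-seen order."""
    seen = set()
    description = []
    for word in list(feature_words) + list(context_words):
        if word not in seen:
            seen.add(word)
            description.append(word)
    return description


def cosine(p, q):
    """Cosine similarity between two theme probability vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def search(word, word_themes, picture_themes, threshold=0.9):
    """Return (picture, score) pairs whose similarity to the search word exceeds threshold."""
    query = word_themes.get(word)
    if query is None:
        # The search word is not covered by the theme library.
        return []
    scored = [(name, cosine(query, dist)) for name, dist in picture_themes.items()]
    hits = [(name, score) for name, score in scored if score > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical theme library with two themes; the probabilities are invented.
WORD_THEMES = {"beach": [0.9, 0.1], "skyscraper": [0.1, 0.9]}
PICTURE_THEMES = {
    "pic_a.jpg": [0.8, 0.2],
    "pic_b.jpg": [0.2, 0.8],
    "pic_c.jpg": [0.5, 0.5],
}

if __name__ == "__main__":
    print(combine_text(["blue", "sea"], ["sea", "holiday"]))
    print(search("beach", WORD_THEMES, PICTURE_THEMES))
```

Other distribution distances, such as KL divergence, would fit the same structure; cosine similarity is used here only because it is short to write out.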
CN201310492161.2A 2013-10-18 2013-10-18 Picture searching equipment, method and system Active CN103559220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310492161.2A CN103559220B (en) 2013-10-18 2013-10-18 Picture searching equipment, method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310492161.2A CN103559220B (en) 2013-10-18 2013-10-18 Picture searching equipment, method and system

Publications (2)

Publication Number Publication Date
CN103559220A (en) 2014-02-05
CN103559220B (en) 2017-08-25

Family

ID=50013467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310492161.2A Active CN103559220B (en) 2013-10-18 2013-10-18 Picture searching equipment, method and system

Country Status (1)

Country Link
CN (1) CN103559220B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130254184A1 (en) * 2012-03-22 2013-09-26 Corbis Corporation Proximity-Based Method For Determining Concept Relevance Within A Domain Ontology
CN102799614A (en) * 2012-06-14 2012-11-28 北京大学 Image search method based on space symbiosis of visual words

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
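卓景文: "Research on Automatic Image Annotation Based on Topic Analysis", China Master's Theses Full-text Database, Information Science and Technology *
吴启明 et al.: "A PLSA-Based Personalized Web Information Retrieval System", Software Guide *
谢琳: "Web Portrait Image Retrieval Fusing Text Semantics and Visual Content", China Master's Theses Full-text Database, Information Science and Technology *
陈涛: "An Automatic Internet Image Annotation System Based on Web Page Association Features", China Master's Theses Full-text Database, Information Science and Technology *
黄鹏: "Web Image Retrieval Based on Fusion of Text and Visual Information", China Doctoral Dissertations Full-text Database, Information Science and Technology *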

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015188719A1 (en) * 2014-06-09 2015-12-17 北京奇虎科技有限公司 Association method and association device for structural data and picture
WO2016107126A1 (en) * 2014-12-30 2016-07-07 百度在线网络技术(北京)有限公司 Image search method and device
WO2016107125A1 (en) * 2014-12-30 2016-07-07 百度在线网络技术(北京)有限公司 Information searching method and apparatus
CN105243083B (en) * 2015-09-08 2018-09-07 百度在线网络技术(北京)有限公司 Document subject matter method for digging and device
CN105243083A (en) * 2015-09-08 2016-01-13 百度在线网络技术(北京)有限公司 Document topic mining method and apparatus
US10528670B2 (en) 2017-05-25 2020-01-07 Baidu Online Network Technology (Beijing) Co., Ltd. Amendment source-positioning method and apparatus, computer device and readable medium
CN107221328A (en) * 2017-05-25 2017-09-29 百度在线网络技术(北京)有限公司 The localization method and device in modification source, computer equipment and computer-readable recording medium
CN107221328B (en) * 2017-05-25 2021-02-19 百度在线网络技术(北京)有限公司 Method and device for positioning modification source, computer equipment and readable medium
CN110020153A (en) * 2017-11-30 2019-07-16 北京搜狗科技发展有限公司 A kind of searching method and device
CN110020153B (en) * 2017-11-30 2022-02-25 北京搜狗科技发展有限公司 Searching method and device
CN108681541A (en) * 2018-01-17 2018-10-19 百度在线网络技术(北京)有限公司 Image searching method, device and computer equipment
CN108681541B (en) * 2018-01-17 2021-08-31 百度在线网络技术(北京)有限公司 Picture searching method and device and computer equipment
CN110070512A (en) * 2019-04-30 2019-07-30 秒针信息技术有限公司 The method and device of picture modification

Also Published As

Publication number Publication date
CN103559220B (en) 2017-08-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220727

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.