CN103324646B - Retrieval assisting system and retrieval support method - Google Patents

Retrieval assisting system and retrieval support method

Info

Publication number
CN103324646B
Authority
CN
China
Prior art keywords: collocation, word, keyword, mentioned, retrieval
Prior art date
Legal status
Active
Application number
CN201210082643.6A
Other languages
Chinese (zh)
Other versions
CN103324646A (en)
Inventor
新名博
服部雅一
Current Assignee
Toshiba Corp
Toshiba Digital Solutions Corp
Original Assignee
Toshiba Corp
Toshiba Solutions Corp
Priority date
Filing date
Publication date
Application filed by Toshiba Corp, Toshiba Solutions Corp
Publication of CN103324646A
Application granted
Publication of CN103324646B
Status: Active
Anticipated expiration


Landscapes

  • Information Retrieval; Database Structures and File System Structures Therefor (AREA)

Abstract

The present invention provides a retrieval assisting system and a retrieval support method that can propose search-keyword candidates suited to a document set, and thereby appropriately support document retrieval, without requiring complicated preparatory operations from the user. An extraction unit of the retrieval assisting system extracts keyword candidates from the document set, and a calculation unit calculates collocation probabilities between the keyword candidates. A first detection unit detects collocation keyword groups, a first generation unit generates a collocation dictionary, and a second generation unit generates a character-string completion rule. A first recognition unit recognizes input keywords according to the character-string completion rule. A collocation propagation unit refers to the collocation dictionary and obtains the collocation words to which the collocation relation propagates, starting from the input keywords. A second recognition unit identifies, among the word strings that link the input keywords with the collocation words, those word strings that satisfy a second condition as proposal word strings, and a presentation unit presents the proposal word strings. When a proposal word string is selected, a search unit generates a retrieval expression from the selected proposal word string and performs retrieval.

Description

Retrieval assisting system and retrieval support method
Technical field
The present invention relates to a retrieval assisting system and a retrieval support method.
Background technology
Document retrieval is a technique for retrieving, from a document set that is the object of retrieval, documents that contain search keywords specified by a user. Here, a "document" includes not only electronic documents but also various kinds of content that carry text data. To reduce the operational burden on the user in document retrieval, various retrieval support methods have been proposed.
For example, there is a known method in which candidate search keywords are presented to the user based on a history of past retrieval expressions. In this method, when a search keyword such as "diffusion emphasize as" is entered by the user's input operation, words that have frequently been collocated with it in the history of past retrieval expressions, such as "delay phase place", "fatty", "high RST" and "axle position", are proposed as candidates for the search keyword that follows. With this method, a retrieval expression containing a plurality of search keywords can be generated easily, and the operational burden on the user is reduced. However, in order to propose suitable search-keyword candidates, this method requires a large amount of history; when the history is biased or insufficient, the quality of the proposals deteriorates and the intended documents may not be retrieved.
There is also a known method in which candidate search keywords are proposed to the user by using a collocation dictionary that defines combinations of two words having a collocation relation. In this method, when a certain word is entered as a search keyword by the user's input operation, words registered in the collocation dictionary as having a high collocation probability with the entered search keyword are proposed as candidates for the search keyword that follows. With this method, too, a retrieval expression containing a plurality of search keywords can be generated easily, and the operational burden on the user is reduced. However, this method requires a collocation dictionary defining word-to-word collocation relations to be prepared in advance, and when the prepared collocation dictionary does not fit the document set to be retrieved, the quality of the proposals deteriorates and the intended documents may not be retrieved.
As described above, in the prior art that makes it easy to generate a retrieval expression containing a plurality of search keywords, the quality of the proposals may deteriorate so that the intended documents cannot be retrieved, and improvement is desired.
Prior art literature
Patent document 1: Japanese Patent No. 2850952
Patent document 2: Japanese Unexamined Patent Application Publication No. 2006-48286
Content of the invention
The problem to be solved by the present invention is to provide a retrieval assisting system and a retrieval support method that can propose candidate search keywords suited to the document set to be retrieved, and thereby appropriately support document retrieval, without requiring complicated preparatory operations from the user.
The retrieval assisting system of the embodiments comprises an extraction unit, a calculation unit, a first detection unit, a first generation unit, a second generation unit, a first recognition unit, a collocation propagation unit, a second recognition unit, a presentation unit and a search unit. The extraction unit extracts keyword candidates from the document set to be retrieved. For each combination of two extracted keyword candidates, the calculation unit calculates a collocation probability, i.e. the probability that one keyword candidate appears together with the other keyword candidate in the same document of the document set. The first detection unit detects collocation keyword groups, i.e. combinations of two keyword candidates whose collocation probability satisfies a first condition. The first generation unit generates a collocation dictionary, which is a set of dictionary elements each having the keyword candidate of one side of a collocation keyword group as an entry and the keyword candidate of the other side as a collocation word. The second generation unit generates a character-string completion rule, which is a rule for completing an input character string into a keyword candidate contained in a collocation keyword group. The first recognition unit recognizes, as input keywords, the keyword candidates obtained by completing the input character string according to the character-string completion rule. The collocation propagation unit refers to the collocation dictionary and repeats a process of obtaining the collocation words of the dictionary elements whose entries are the input keywords, and then obtaining the collocation words of the dictionary elements whose entries are the collocation words just obtained. The second recognition unit identifies, among the word strings that link the input keywords with the collocation words obtained by the collocation propagation unit, the word strings that satisfy a second condition as proposal word strings. The presentation unit presents the proposal word strings. When a presented proposal word string is selected, the search unit generates a retrieval expression from the selected proposal word string and performs retrieval on the document set.
Brief description of the drawings
Fig. 1 is a block diagram showing the functional composition of the retrieval assisting system of the 1st embodiment.
Fig. 2 is a figure showing an example of the document set to be retrieved.
Fig. 3 is a figure showing an example of keyword candidates extracted from the document set to be retrieved.
Fig. 4 is a figure showing an example of the relation between the extracted keyword candidates and their occurrence frequencies.
Fig. 5 is a figure showing an example of the occurrence frequencies of combinations of two keyword candidates.
Fig. 6 is a figure showing an example of the collocation probabilities between two keyword candidates.
Fig. 7 is a figure showing an example of a collocation network.
Fig. 8 is a flow chart showing an example of the processing of the collocation keyword group detection unit.
Fig. 9 is a figure showing an example of the data structure of the dictionary elements constituting the collocation dictionary.
Fig. 10 is a figure showing an example of the collocation dictionary.
Fig. 11 is a figure showing an example of the PAT tree.
Fig. 12 is a figure showing an example of the data structure of a word string.
Fig. 13 is a flow chart showing an example of the processing of the input keyword recognition unit, the collocation propagation unit and the proposal word string recognition unit.
Fig. 14 is a block diagram showing the functional composition of the retrieval assisting system of the 2nd embodiment.
Fig. 15 is a figure showing combinations of two keyword candidates whose collocation probability is 0.
Fig. 16 is a figure showing an example of the data structure of the dictionary elements constituting the zero collocation dictionary.
Fig. 17 is a figure showing an example of the zero collocation dictionary.
Fig. 18 is a flow chart showing an example of the processing of the input keyword recognition unit, the collocation propagation unit and the proposal word string recognition unit.
Explanation of reference numerals
100 retrieval assisting system; 101 keyword candidate extraction unit
103 collocation probability calculation unit; 104 collocation keyword group detection unit
105 collocation dictionary generation unit; 106 PAT tree generation unit
108 input keyword recognition unit; 109 collocation propagation unit
110 proposal word string recognition unit; 111 display unit
112 search unit; 200 retrieval assisting system
201 zero collocation keyword group detection unit; 202 zero collocation dictionary generation unit
209 collocation propagation unit; 210 proposal word string recognition unit
Specific embodiment
Hereinafter, the retrieval assisting system and the retrieval support method of the embodiments will be described in detail with reference to the drawings.
(the 1st embodiment)
Fig. 1 is a block diagram showing the functional composition of the retrieval assisting system 100 of the 1st embodiment. As shown in Fig. 1, the retrieval assisting system 100 of the present embodiment comprises a keyword candidate extraction unit (extraction unit) 101, an index generation unit 102, a collocation probability calculation unit (calculation unit) 103, a collocation keyword group detection unit (first detection unit) 104, a collocation dictionary generation unit (first generation unit) 105, a PAT tree generation unit (second generation unit) 106, an input reception unit 107, an input keyword recognition unit (first recognition unit) 108, a collocation propagation unit 109, a proposal word string recognition unit (second recognition unit) 110, a display unit (presentation unit) 111 and a search unit 112.
The keyword candidate extraction unit 101 extracts, from the document set 300 to be retrieved, keyword candidates that are candidates for search keywords. The method of extracting keyword candidates from the document set 300 is not particularly limited, and various methods can be used. For example, by referring to a word dictionary in which a plurality of words usable as search keywords are registered, the words registered in the word dictionary can be extracted as keyword candidates from each document contained in the document set 300.
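The following is a minimal sketch of this extraction step, assuming each document is a plain text string and the word dictionary is a set of registered words; the function and variable names are illustrative and not taken from the patent.

```python
def extract_keyword_candidates(documents, word_dictionary):
    """Return a mapping {keyword candidate: set of ids of documents containing it}."""
    occurrences = {}
    for doc_id, text in documents.items():
        for word in word_dictionary:
            if word in text:  # simple substring match for the sake of the sketch
                occurrences.setdefault(word, set()).add(doc_id)
    return occurrences

documents = {1: "Orion beggar in mansion", 2: "Olympics Japan publication beggar"}
word_dictionary = {"Orion", "Olympics", "Japan", "beggar", "in mansion", "publication"}
print(extract_keyword_candidates(documents, word_dictionary))
```

The occurrence sets produced here also give the occurrence frequencies of Fig. 4 (the size of each set) as a side effect.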
Here, a user of the retrieval assisting system 100 of the present embodiment can arbitrarily specify the document set 300 to be retrieved. That is, in the retrieval assisting system 100 of the present embodiment, when the user specifies the document set 300 to be retrieved, the keyword candidate extraction unit 101 performs the process of extracting keyword candidates from the document set 300 specified by the user. In the following, as a concrete example of the document set 300, the set of book information shown in Fig. 2 is used. In the example of Fig. 2, five items of book information are shown as document 1 to document 5, and each of document 1 to document 5 describes the information of the corresponding book as text data under the items "book title", "author name", "translator name", "publisher", "ISBN" and "summary". The keyword candidate extraction unit 101 extracts keyword candidates from the text data described under these items of each document. The document set 300 to be retrieved generally consists of a large number of documents, but only five documents are shown here to keep the explanation simple.
Fig. 3 shows an example of the keyword candidates extracted by the keyword candidate extraction unit 101 from the document set 300 illustrated in Fig. 2. Fig. 4 shows an example of the relation between the extracted keyword candidates and their occurrence frequencies. The occurrence frequency of a keyword candidate is the number of documents in the document set 300 in which that keyword candidate appears. For example, in the document set 300 illustrated in Fig. 2, the keyword candidate "beggar" is extracted from each of document 1 to document 5, so the occurrence frequency of the keyword candidate "beggar" is 5.
When the keyword candidate extraction unit 101 extracts a keyword candidate from the document set 300, it outputs the extracted keyword candidate, together with document indication information indicating the documents in which that keyword candidate appears, to the index generation unit 102. The keyword candidate extraction unit 101 also outputs the combinations of two extracted keyword candidates to the collocation probability calculation unit 103. In addition, it is preferable that the keyword candidate extraction unit 101 keeps information such as that of Fig. 4, showing the relation between the extracted keyword candidates and their occurrence frequencies, in a storage device (not shown).
The index generation unit 102 generates an index 301 for the document set 300 to be retrieved, based on the keyword candidates and the document indication information input from the keyword candidate extraction unit 101. When a certain search keyword is given, the index 301 holds the document indication information indicating the documents in which that search keyword appears.
For each combination of two keyword candidates input from the keyword candidate extraction unit 101, the collocation probability calculation unit 103 calculates a collocation probability, which is the probability that one keyword candidate appears together with the other keyword candidate in the same document of the document set 300.
Fig. 5 is a figure showing an example of the occurrence frequencies of combinations of two keyword candidates. Fig. 6 is a figure showing an example of the collocation probabilities between two keyword candidates calculated from the occurrence frequencies of Fig. 5 and the total number of documents in the document set 300. Fig. 7 is a figure showing an example of the collocation network generated from the collocation probabilities of Fig. 6. The occurrence frequency of a combination of two keyword candidates is the number of documents in the document set 300 in which the two keyword candidates appear simultaneously. For example, in the document set 300 illustrated in Fig. 2, the keyword candidates "beggar" and "in mansion" appear simultaneously in document 1, document 2 and document 5, so the occurrence frequency of the combination of "beggar" and "in mansion" is 3.
For the combinations of two keyword candidates input from the keyword candidate extraction unit 101, the collocation probability calculation unit 103 first obtains the occurrence frequency of each combination. The collocation probability calculation unit 103 then obtains the collocation probability between the two keyword candidates by dividing the obtained occurrence frequency of the combination by the total number of documents contained in the document set 300. The collocation probability calculation unit 103 then generates a collocation network, as shown in Fig. 7, which holds the obtained collocation probabilities, and outputs the generated collocation network to the collocation keyword group detection unit 104. As shown in Fig. 7, the collocation network is a network that has the keyword candidates as nodes, connects nodes having a collocation relation with links, and associates each link with its collocation probability.
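A sketch of this calculation, following the stated rule (co-occurrence count divided by the total number of documents), is shown below; `occurrences` is assumed to be the mapping produced by the extraction sketch earlier.

```python
from itertools import combinations

def build_collocation_network(occurrences, total_documents):
    """Return {(keyword_a, keyword_b): collocation probability} for every keyword pair."""
    network = {}
    for a, b in combinations(sorted(occurrences), 2):
        co_occurrence = len(occurrences[a] & occurrences[b])
        network[(a, b)] = co_occurrence / total_documents
    return network
```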
The collocation keyword group detection unit 104 detects, from the collocation network input from the collocation probability calculation unit 103, collocation keyword groups, i.e. combinations of two keyword candidates whose collocation probability satisfies the 1st condition, and outputs them to the collocation dictionary generation unit 105 and the PAT tree generation unit 106. Here, as the 1st condition, for example, the condition that the collocation probability is greater than a threshold α can be used, where α is set so that the number of collocation keyword groups finally obtained is greater than or equal to a threshold β. The 1st condition is not limited to this example, and various preset conditions can be used.
Fig. 8 is a flow chart showing an example of the processing of the collocation keyword group detection unit 104 in the case of detecting collocation keyword groups whose collocation probability is greater than the threshold α.
The collocation keyword group detection unit 104 first sets the threshold α to a preset initial value and extracts the combinations of keyword candidates whose collocation probability, calculated by the collocation probability calculation unit 103, is greater than the threshold α (step S101). Specifically, the collocation keyword group detection unit 104 takes out, for example, one node of the collocation network input from the collocation probability calculation unit 103, determines the nodes connected to that node by links whose collocation probability is greater than the threshold α, and extracts the combinations of the two keyword candidates represented by these pairs of nodes. The collocation keyword group detection unit 104 then repeats this process for all nodes of the collocation network, and extracts all combinations of keyword candidates whose collocation probability is greater than the threshold α.
Next, the collocation keyword group detection unit 104 counts the number of combinations of keyword candidates extracted in step S101 and judges whether the number of extracted combinations is greater than or equal to the threshold β (step S102). If the result of this judgement is that the number of extracted combinations is less than the threshold β (step S102: No), the collocation keyword group detection unit 104 lowers the threshold α by a predetermined amount (step S103), returns to step S101 and repeats the above process. On the other hand, if the number of extracted combinations is greater than or equal to the threshold β (step S102: Yes), the extracted combinations of keyword candidates are output as collocation keyword groups to the collocation dictionary generation unit 105 and the PAT tree generation unit 106 (step S104).
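A sketch of this detection loop (steps S101 to S104 of Fig. 8) is given below, assuming the collocation network is the pair-to-probability mapping built above; the initial value of α and the amount by which it is lowered are illustrative, since the text does not fix them numerically.

```python
def detect_collocation_keyword_groups(network, alpha=0.9, beta=10, step=0.05):
    while True:
        groups = [pair for pair, prob in network.items() if prob > alpha]  # S101
        if len(groups) >= beta:                                            # S102: Yes
            return groups                                                  # S104
        alpha -= step                                                      # S103
        if alpha < 0:  # safeguard not present in the flow chart
            return groups
```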
The collocation dictionary generation unit 105 generates a collocation dictionary 302 based on the collocation keyword groups input from the collocation keyword group detection unit 104. As shown in Fig. 9, the collocation dictionary 302 is a set of dictionary elements each having the data structure of a triple (entry, collocation word, collocation probability); when an entry is specified, the dictionary element having that entry can be retrieved and its collocation word and collocation probability obtained. For all of the collocation keyword groups input from the collocation keyword group detection unit 104, the collocation dictionary generation unit 105 generates dictionary elements in which the keyword candidate of one side of each collocation keyword group is the entry, the keyword candidate of the other side is the collocation word, and the collocation probability between these keyword candidates is recorded, and generates the set of these dictionary elements, i.e. the collocation dictionary 302. Fig. 10 shows an example of the collocation dictionary 302 generated by the collocation dictionary generation unit 105. Since the whole collocation dictionary 302 is large, only a part of it is shown in Fig. 10 for convenience.
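A sketch of this generation step follows. Both directions of each pair are registered here, which matches the example of Fig. 10, where for instance (Orion, beggar, 0.75) and (beggar, Orion, 0.75) both appear; treating this as the general rule is an assumption.

```python
def build_collocation_dictionary(collocation_keyword_groups, network):
    dictionary = {}  # entry -> list of (collocation word, collocation probability)
    for a, b in collocation_keyword_groups:
        probability = network[(a, b)]
        dictionary.setdefault(a, []).append((b, probability))
        dictionary.setdefault(b, []).append((a, probability))
    for entry in dictionary:  # keep high-probability collocation words first
        dictionary[entry].sort(key=lambda element: element[1], reverse=True)
    return dictionary
```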
The PAT tree generation unit 106 generates, from the collocation keyword groups input from the collocation keyword group detection unit 104, a PAT tree 303 (character-string completion rule) for completing an input character string into a keyword candidate contained in a collocation keyword group. Fig. 11 shows an example of the PAT tree 303 generated by the PAT tree generation unit 106. Since the whole PAT tree 303 is large, only a part of it is shown in Fig. 11 for convenience.
As shown in Fig. 11, the PAT tree 303 has a tree structure in which each node holds a partial character string. The tree is traversed in order from the root toward the leaves, the partial character strings held by the visited nodes are concatenated in order, and the character string obtained at the moment a leaf node is reached is obtained as an output character string. The PAT tree generation unit 106 divides each keyword candidate contained in the collocation keyword groups input from the collocation keyword group detection unit 104 into partial character strings and merges the shared partial character strings, thereby generating the PAT tree 303 whose output character strings are the keyword candidates contained in the collocation keyword groups. In the part of the PAT tree 303 shown in Fig. 11, "Orion", "Olympics", "Olympus", "Japan", "Tokyo", "spring" and "beggar" are the output character strings.
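The following is a simplified stand-in for the PAT tree 303. A real PAT (Patricia) tree stores shared partial character strings at its nodes; this sketch only reproduces the behaviour needed by the later steps, namely completing an input character string into the registered keyword candidates that begin with it.

```python
import bisect

class PrefixCompleter:
    def __init__(self, keyword_candidates):
        self.keywords = sorted(set(keyword_candidates))

    def complete(self, input_string):
        """Return all registered keyword candidates whose leading part is input_string."""
        i = bisect.bisect_left(self.keywords, input_string)
        matches = []
        while i < len(self.keywords) and self.keywords[i].startswith(input_string):
            matches.append(self.keywords[i])
            i += 1
        return matches

completer = PrefixCompleter(["Orion", "Olympics", "Olympus", "Japan", "Tokyo", "beggar"])
print(completer.complete("O"))  # ['Olympics', 'Olympus', 'Orion']
```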
The input reception unit 107 accepts operation inputs from the user. For example, when the user performs an operation of entering a character string with an input device such as a keyboard, the input reception unit 107 accepts the character string entered by the user and outputs the entered character string (input character string) to the input keyword recognition unit 108. Here, the input character string includes partial character strings that do not form a complete word, and even the input of a single character can be handled as an input character string. Whenever the input character string changes, the input reception unit 107 outputs the new input character string after the change to the input keyword recognition unit 108. When the user performs a predetermined operation that finalizes the input character string, such as pressing the Enter key, the input reception unit 107 outputs the finalized input character string to the search unit 112 as a search keyword. Furthermore, when the user performs an operation of selecting one of the proposal word strings displayed on the display unit 111 described later, the input reception unit 107 accepts this operation and outputs each keyword candidate contained in the selected proposal word string to the search unit 112 as a search keyword.
According to the PAT tree 303 generated by the PAT tree generation unit 106, the input keyword recognition unit 108 identifies, as input keywords, the keyword candidates obtained by completing the input character string accepted by the input reception unit 107, and outputs the identified input keywords to the collocation propagation unit 109. That is, the input keyword recognition unit 108 takes the input character string that the input reception unit 107 has accepted from the user so far, searches the PAT tree 303 to determine the character strings that may be entered subsequently, and obtains the character strings that have the input character string as their leading part and coincide with keyword candidates contained in the collocation keyword groups. The character strings obtained here are the keyword candidates, among all keyword candidates contained in the collocation keyword groups, whose leading part coincides with the input character string. If all of the character strings obtained here were treated as input keywords, their number could become very large and the processing load of the subsequent stages would increase, so it is preferable that the input keyword recognition unit 108 narrows down the obtained character strings when identifying input keywords. As a method of narrowing down, for example, as described above, when the information of Fig. 4 showing the relation between the keyword candidates extracted by the keyword candidate extraction unit 101 and their occurrence frequencies is kept, this information can be referred to and the top M0 character strings (M0 is a preset natural number) in descending order of occurrence frequency can be identified as input keywords.
Furthermore, whenever a new character is entered by the user, the input keyword recognition unit 108 takes the node of the PAT tree 303 reached for the character string entered so far as a base point, explores the next node corresponding to the newly entered character, and obtains the keyword candidates whose leading part coincides with the input character string to which the new character has been added. It then identifies the top M0 of them as input keywords and outputs them to the collocation propagation unit 109. The input keyword recognition unit 108 repeats this process every time a new character is entered by the user.
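A sketch of this narrowing-down step follows: complete the input character string with the prefix completer above and keep only the top M0 candidates by occurrence frequency. `occurrence_frequency` is assumed to be the keyword-to-frequency mapping of Fig. 4 kept by the extraction unit.

```python
def recognize_input_keywords(input_string, completer, occurrence_frequency, m0=3):
    candidates = completer.complete(input_string)
    candidates.sort(key=lambda kw: occurrence_frequency.get(kw, 0), reverse=True)
    return candidates[:m0]
```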
Referring to the collocation dictionary 302 generated by the collocation dictionary generation unit 105, the collocation propagation unit 109 obtains the collocation words of the dictionary elements whose entries are the input keywords input from the input keyword recognition unit 108, and then repeats the process of obtaining the collocation words of the dictionary elements whose entries are the collocation words just obtained. Specifically, for the M0 input keywords input from the input keyword recognition unit 108, the collocation propagation unit 109 searches the collocation dictionary 302, obtains the collocation words for the input keywords, and temporarily stores, among the obtained collocation words, the top M1 collocation words (M1 is a preset natural number) in descending order of collocation probability with the input keywords as 1st-level collocation words. Next, for the M1 1st-level collocation words, the collocation propagation unit 109 searches the collocation dictionary 302, obtains the collocation words for the 1st-level collocation words, and temporarily stores, among the obtained collocation words, the top M2 collocation words (M2 is a preset natural number) in descending order of collocation probability with the 1st-level collocation words as 2nd-level collocation words. At this time, the collocation propagation unit 109 obtains and temporarily stores the cumulative collocation probability, i.e. the value obtained by multiplying together the collocation probability between the input keyword and the 1st-level collocation word and the collocation probability between the 1st-level collocation word and the 2nd-level collocation word. Furthermore, if the obtained cumulative collocation probability is greater than a threshold γ, the collocation propagation unit 109 continues in the same way, from the 2nd-level collocation words to 3rd-level collocation words, from the 3rd-level collocation words to 4th-level collocation words, and so on, repeating the process of obtaining, in order from the input keyword, the keyword candidates to which the collocation relation propagates, and obtaining the cumulative collocation probability, i.e. the product of the collocation probabilities between the keyword candidates having collocation relations. When the obtained cumulative collocation probability becomes less than or equal to the threshold γ, the collocation propagation unit 109 stops the process.
That is, when the collocation word of a dictionary element whose entry is an input keyword is called a 1st-level collocation word, and the collocation word of a dictionary element whose entry is an (L-1)th-level collocation word (L is a natural number greater than or equal to 2) is called an Lth-level collocation word, the collocation propagation unit 109 repeats the process of obtaining the Lth-level collocation words while incrementing L by 1, and stops the process when the cumulative collocation probability, i.e. the value obtained by multiplying in order the collocation probability between the input keyword and the 1st-level collocation word and the collocation probabilities between the (L-1)th-level and Lth-level collocation words, becomes less than or equal to the threshold γ. The collocation propagation unit 109 then outputs to the proposal word string recognition unit 110, together with the cumulative collocation probability of each word string, the word strings that link the input keyword to the Lth-level collocation words, or the word strings that link the input keyword to the (L-1)th-level collocation words, obtained at the stage where the process is stopped. In order to prevent keywords with high collocation probability from propagating back and forth between each other, word strings in which the same keyword candidate is repeated are excluded from the objects of processing. The collocation propagation unit 109 performs the above process every time new input keywords are input from the input keyword recognition unit 108.
Fig. 12 is a figure showing an example of the data structure of the word strings output from the collocation propagation unit 109 to the proposal word string recognition unit 110. As shown in Fig. 12, for example, data combining a word string that links the input keyword, the 1st-level collocation word, ..., and the Lth-level collocation word with the cumulative collocation probability of that word string is output from the collocation propagation unit 109 to the proposal word string recognition unit 110. In the case of a word string linking the input keyword to the (L-1)th-level collocation word, data combining the word string that links the input keyword, ..., and the (L-1)th-level collocation word with the cumulative collocation probability of that word string is output from the collocation propagation unit 109 to the proposal word string recognition unit 110.
In the above example, the collocation propagation unit 109 stops obtaining Lth-level collocation words when the cumulative collocation probability becomes less than or equal to the threshold γ, but the condition for the collocation propagation unit 109 to stop processing is not limited to this example. For example, a condition such as stopping the processing of the collocation propagation unit 109 when the number of obtained word strings reaches an upper limit, or when the number of collocation words chained from a 1st-level collocation word reaches an upper limit, may also be used. Also, in the above example, in the process of obtaining the collocation words from the 1st level to the Lth level, the obtained collocation words are narrowed down to the top M1, ..., top ML in descending order of collocation probability, but it is also possible to preset a collocation probability that a word must satisfy to become a collocation word, and to obtain in order, as the 1st-level to Lth-level collocation words, those words whose collocation probability is at or above this preset value.
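A sketch of the propagation is given below under simplifying assumptions: the top collocation words are chosen per chain rather than over all chains of a level, and M1, M2, ... are all the same value m_level. The cumulative collocation probability of a chain is the product of the collocation probabilities along it, a chain is extended only while that product stays above the threshold γ, and chains that repeat a keyword are discarded.

```python
def propagate_collocations(input_keywords, dictionary, gamma=0.37, m_level=3):
    """Return word strings as (list of keywords, cumulative collocation probability)."""
    chains = [([keyword], 1.0) for keyword in input_keywords]
    while True:
        extended = []
        for chain, cumulative in chains:
            for word, probability in dictionary.get(chain[-1], [])[:m_level]:
                if word in chain:                     # skip repeated keyword candidates
                    continue
                if cumulative * probability > gamma:  # keep only chains above gamma
                    extended.append((chain + [word], cumulative * probability))
        if not extended:                              # no chain can be extended further
            return [(c, p) for c, p in chains if len(c) > 1]
        chains = extended
```

This level-by-level loop loosely corresponds to the repetition of steps S203 to S205 in Fig. 13.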
The proposal word string recognition unit 110 identifies, among the word strings input from the collocation propagation unit 109, the word strings that satisfy the 2nd condition as proposal word strings, and outputs the identified proposal word strings to the display unit 111. Here, as the 2nd condition, for example, the condition of being among the top N word strings (N is a preset natural number) in descending order of cumulative collocation probability can be used. The 2nd condition is not limited to this example, and various preset conditions can be used.
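A sketch of this 2nd condition follows; word strings are assumed to be (keyword list, cumulative probability) pairs as returned by the propagation sketch above.

```python
def select_proposal_word_strings(word_strings, n=2):
    ranked = sorted(word_strings, key=lambda ws: ws[1], reverse=True)
    return ranked[:n]
```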
Fig. 13 is a flow chart showing an example of the processing of the input keyword recognition unit 108, the collocation propagation unit 109 and the proposal word string recognition unit 110.
When a character string is input from the input reception unit 107, the input keyword recognition unit 108 obtains, according to the PAT tree 303, the keyword candidates obtained by completing the input character string, and identifies, among the obtained keyword candidates, the top M0 keyword candidates in descending order of occurrence frequency as input keywords (step S201). The input keyword recognition unit 108 then outputs the identified input keywords to the collocation propagation unit 109.
Next, when the M0 input keywords are input from the input keyword recognition unit 108, the collocation propagation unit 109 refers to the collocation dictionary 302, obtains the collocation words for each of the M0 input keywords, and obtains, among these collocation words, the top M1 keyword candidates in descending order of collocation probability with the input keywords as 1st-level collocation words (step S202). The collocation propagation unit 109 also refers to the collocation dictionary 302, obtains the collocation words for each of the (L-1)th-level collocation words (L is a natural number greater than or equal to 2), and obtains, among these collocation words, the top ML keyword candidates in descending order of collocation probability with the (L-1)th-level collocation words as Lth-level collocation words (step S203). The collocation propagation unit 109 then judges whether the cumulative collocation probability, i.e. the value obtained by multiplying the collocation probability between the input keyword and the 1st-level collocation word through to the collocation probability between the (L-1)th-level and Lth-level collocation words, is less than or equal to the threshold γ (step S204). If the result of this judgement is that the cumulative collocation probability is greater than the threshold γ (step S204: No), the collocation propagation unit 109 increments L by 1 (step S205), returns to step S203 and repeats the process of obtaining Lth-level collocation words. On the other hand, if the cumulative collocation probability is less than or equal to the threshold γ (step S204: Yes), the collocation propagation unit 109 outputs the word strings that link the input keywords with the obtained collocation words, together with the cumulative collocation probability of each word string, to the proposal word string recognition unit 110.
Next, when the word strings are input from the collocation propagation unit 109, the proposal word string recognition unit 110 identifies, among the input word strings, the top N word strings in descending order of cumulative collocation probability as proposal word strings (step S206). The proposal word string recognition unit 110 then outputs the identified proposal word strings to the display unit 111 (step S207).
The display unit 111 displays the proposal word strings input from the proposal word string recognition unit 110 and presents them to the user. The user can select one of the proposal word strings displayed on the display unit 111 with an input device such as a mouse or keyboard. When the user selects one of the proposal word strings displayed on the display unit 111, this operation is accepted by the input reception unit 107, and each keyword candidate contained in the selected proposal word string is output to the search unit 112 as a search keyword. When a retrieval result is input from the search unit 112 described later, the display unit 111 displays the retrieval result.
When one of the proposal word strings displayed on the display unit 111 is selected by the user, the search unit 112 receives each keyword candidate contained in the selected proposal word string, generates a retrieval expression containing all of these keyword candidates as search keywords, and performs retrieval on the document set 300. Specifically, the search unit 112 refers to the index 301 generated by the index generation unit 102, retrieves in order the document indication information for each search keyword contained in the retrieval expression, and obtains the document indication information of the documents that contain all of the search keywords in the retrieval expression. The search unit 112 then obtains, from the document set 300 to be retrieved, the documents indicated by the obtained document indication information, and outputs the necessary information as the retrieval result to the display unit 111.
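A sketch of this retrieval step follows, assuming the index 301 behaves as an inverted index that maps each search keyword to the set of documents containing it; the retrieval expression formed from a selected proposal word string is the conjunction (AND) of all of its keyword candidates.

```python
def retrieve(proposal_word_string, inverted_index):
    document_sets = [inverted_index.get(keyword, set()) for keyword in proposal_word_string]
    return set.intersection(*document_sets) if document_sets else set()
```

For the worked example described below, retrieve(["Orion", "beggar", "in mansion"], index) would return the set containing document 2.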
The retrieval assisting system 100 of the present embodiment can be constituted, as hardware, by using the hardware of an ordinary computer comprising, for example, a control device such as a CPU, internal storage devices such as a ROM and a RAM, input devices such as a keyboard and a mouse, a display device such as a liquid crystal panel, and external storage devices such as a hard disk, CD, DVD or flash memory. In this case, the functional composition of the retrieval assisting system 100 of the present embodiment described above can be realized by executing a program. The index 301, the collocation dictionary 302 and the PAT tree 303 can be stored, for example, in an external storage device such as a hard disk or flash memory.
Next, taking the case of retrieving the document set 300 illustrated in Fig. 2 as the retrieval object, a concrete example of the processing of the retrieval assisting system 100 of the present embodiment will be described. In the following, the collocation dictionary 302 illustrated in Fig. 10 and the PAT tree 303 illustrated in Fig. 11 are assumed to have been generated from the document set 300 of Fig. 2.
First, when the user enters the character string "a" (a partial character string for entering "difficult to understand") with an input device such as a keyboard, the input reception unit 107 accepts the input of this character string and outputs the input character string "a" to the input keyword recognition unit 108. The input keyword recognition unit 108 accepts the input character string "a" from the input reception unit 107, searches the PAT tree 303 illustrated in Fig. 11, and performs the process of identifying input keywords. The input keyword recognition unit 108 first searches the PAT tree 303 illustrated in Fig. 11 in order from the root to find a node that matches the input character string "a", and obtains the node "difficult to understand" as the matching node. The input keyword recognition unit 108 then determines the group of leaf nodes reachable from this node "difficult to understand", and obtains the character strings indicated by these leaf nodes, i.e. the character strings that follow "difficult to understand", as the keyword candidates that complete the input character string "a". In this case, "Olympics", "Orion" and "Olympus" are obtained. The input keyword recognition unit 108 identifies, among the obtained keyword candidates, the top M0 in descending order of occurrence frequency as input keywords. Here, M0 = 3. In this case, the input keyword recognition unit 108 identifies "Olympics", "Orion" and "Olympus" as input keywords and outputs these input keywords to the collocation propagation unit 109.
The collocation propagation unit 109 receives the input keywords "Olympics", "Orion" and "Olympus" from the input keyword recognition unit 108, refers to the collocation dictionary 302 illustrated in Fig. 10, and obtains the dictionary elements whose entries are these input keywords. Here, the dictionary elements shown below are obtained.
(Orion, beggar, 0.75)
(Orion, little to, 0.5)
(Orion, rock, 0.5)
(Orion, secondary youth, 0.5)
(Orion, world cup, 0.5)
(Orion, publication, 0.4)
(Orion, sickle storehouse, 0.33)
(Orion, China fir field, 0.33)
(Orion, Japan, 0.25)
(Orion, in mansion, 0.25)
(Olympics, Japan, 0.67)
(Olympics, hole river, 0.5)
(Olympics, Zhi Pu, 0.5)
(Olympics, Kawasaki, 0.5)
(Olympics, the spring son, 0.5)
(Olympics, secondary youth, 0.5)
(Olympics, the world, 0.5)
(Olympics, track and field, 0.5)
(Olympics, Taro, 0.5)
(Olympics, publication, 0.4)
(Olympics, beggar, 0.4)
(Olympics, in mansion, 0.25)
(Olympus, sickle storehouse, 0.5)
(Olympus, Tokyo, 0.5)
(Olympus, in mansion, 0.33)
(Olympus, publication, 0.2)
(Olympus, beggar, 0.2)
Next, the collocation propagation unit 109 obtains, among the collocation words of the obtained dictionary elements, the top M1 collocation words in descending order of collocation probability with the entry (the input keyword) as 1st-level collocation words. Here, M1 = 3. For collocation words having the same collocation probability with the entry, the collocation words to be obtained can be determined, for example, on the basis of the descending order of occurrence frequency of the entry, the descending order of occurrence frequency of the collocation word, and so on. In this case, "beggar", whose collocation probability with the input keyword "Orion" is 0.75, "Japan", whose collocation probability with the input keyword "Olympics" is 0.67, and "little to", whose collocation probability with the input keyword "Orion" is 0.5, are each obtained as 1st-level collocation words.
Furthermore, in order to obtain 2nd-level collocation words having a collocation relation with the 1st-level collocation words, the collocation propagation unit 109 refers to the collocation dictionary 302 once again and obtains the dictionary elements whose entries are the 1st-level collocation words. Here, the dictionary elements shown below are obtained.
(beggar, Orion, 0.75)
(beggar, in mansion, 0.6)
(beggar, sickle storehouse, 0.4)
(beggar, China fir field, 0.4)
(beggar, Olympics, 0.4)
(Japan, Olympics, 0.67)
(Japan, publication, 0.6)
(Japan, little to, 0.33)
(Japan, Taro, 0.33)
(Japan, hole river, 0.33)
(Japan, Zhi Pu, 0.33)
(little to, China fir field, 0.5)
(little to, Tokyo, 0.5)
(little to, Orion, 0.5)
Next, the collocation propagation unit 109 obtains, among the collocation words of the obtained dictionary elements, the collocation words for which the cumulative collocation probability is greater than the threshold γ, as 2nd-level collocation words associated with the 1st-level collocation words. Here, the threshold γ = 0.37. In this case, for the 1st-level collocation word "beggar", the collocation probability with the input keyword "Orion" is 0.75, so in order for the cumulative collocation probability to exceed 0.37, 2nd-level collocation words whose collocation probability is greater than or equal to 0.49 must be obtained; "Orion" and "in mansion", which satisfy this condition, are each obtained as 2nd-level collocation words. However, "Orion" is excluded from the objects of processing because it duplicates the input keyword. For the 1st-level collocation word "Japan", the collocation probability with the input keyword "Olympics" is 0.67, so in order for the cumulative collocation probability to exceed 0.37, 2nd-level collocation words whose collocation probability is greater than or equal to 0.55 must be obtained; "Olympics" and "publication", which satisfy this condition, are each obtained as 2nd-level collocation words. However, "Olympics" is excluded from the objects of processing because it duplicates the input keyword. For the 1st-level collocation word "little to", the collocation probability with the input keyword "Orion" is 0.5, so in order for the cumulative collocation probability to exceed 0.37, 2nd-level collocation words whose collocation probability is greater than or equal to 0.8 must be obtained; since no collocation word satisfies this condition, no 2nd-level collocation word is obtained.
As a result of the above processing, (Orion, beggar, in mansion, 0.45) and (Olympics, Japan, publication, 0.4) are obtained as word strings linking the input keywords with the 1st-level and 2nd-level collocation words.
After that, in order to obtain 3rd-level collocation words having a collocation relation with the 2nd-level collocation words, the collocation propagation unit 109 refers to the collocation dictionary 302 once again and obtains the dictionary elements whose entries are the 2nd-level collocation words. Here, the dictionary elements shown below are obtained.
(in mansion, sickle storehouse, 0.67)
(in mansion, publication, 0.6)
(in mansion, beggar, 0.6)
(publication, Japan, 0.6)
(publication, in mansion, 0.6)
(publication, Orion, 0.4)
Next, the collocation propagation unit 109 performs the process of obtaining, among the collocation words of the obtained dictionary elements, the collocation words for which the cumulative collocation probability remains greater than the threshold γ = 0.37, as 3rd-level collocation words connected to the 2nd-level collocation words. In this case, for the 2nd-level collocation word "in mansion", the cumulative collocation probability so far is 0.45, so in order for the cumulative collocation probability to remain greater than 0.37, 3rd-level collocation words whose collocation probability is greater than or equal to 0.82 must be obtained; since no collocation word satisfies this condition, no 3rd-level collocation word is obtained. For the 2nd-level collocation word "publication", the cumulative collocation probability so far is 0.4, so in order for the cumulative collocation probability to remain greater than 0.37, 3rd-level collocation words whose collocation probability is greater than or equal to 0.99 must be obtained; since no collocation word satisfies this condition, no 3rd-level collocation word is obtained either. Therefore, the collocation propagation unit 109 stops the process of propagating collocation words and outputs the word strings obtained in the processing so far, i.e. (Orion, beggar, in mansion, 0.45) and (Olympics, Japan, publication, 0.4), to the proposal word string recognition unit 110.
The proposal word string recognition unit 110 receives the word strings (Orion, beggar, in mansion, 0.45) and (Olympics, Japan, publication, 0.4) from the collocation propagation unit 109 and identifies, among these word strings, the top N in descending order of cumulative collocation probability as proposal word strings. Here, N = 2. For word strings having the same cumulative collocation probability, the N proposal word strings can be determined, for example, on the basis of the descending order of occurrence frequency of the input keyword, the descending order of occurrence frequency of the 1st-level collocation word, the descending order of occurrence frequency of the 2nd-level collocation word, and so on. In this case, the proposal word string recognition unit 110 identifies (Orion, beggar, in mansion) and (Olympics, Japan, publication) as proposal word strings and outputs these proposal word strings to the display unit 111.
When the proposal word strings (Orion, beggar, in mansion) and (Olympics, Japan, publication) are received from the proposal word string recognition unit 110, the display unit 111 displays these proposal word strings and presents them to the user. Then, when the user performs an operation of selecting one of the proposal word strings displayed on the display unit 111, for example (Orion, beggar, in mansion), with an input device such as a mouse, this selection operation is accepted by the input reception unit 107, and each keyword candidate contained in the proposal word string, i.e. "Orion", "beggar" and "in mansion", is output to the search unit 112. If, while the proposal word strings are displayed on the display unit 111, the user does not select any proposal word string but instead continues entering characters, for example entering a character string such as "Olympic", the input keyword recognition unit 108 identifies "Olympics" and "Olympus" as input keywords, and the same processing is performed thereafter.
When "Orion", "beggar" and "in mansion" are received from the input reception unit 107, the search unit 112 generates a retrieval expression containing all of these keyword candidates as search keywords and performs retrieval on the document set 300. As a result, "document 2" is obtained from the document set 300 illustrated in Fig. 2, and the information about "document 2" is displayed on the display unit 111 as the retrieval result.
As described in detail above with a specific example, according to the retrieval assisting system 100 of the present embodiment, the user only has to specify the document set 300 to be retrieved; the system then detects, among the keyword candidates contained in the document set 300, the combinations of keyword candidates with high collocation probability as collocation keyword groups, generates the collocation dictionary 302 based on the collocation keyword groups, and generates the PAT tree 303 as the character-string completion rule for completing an input character string into the keyword candidates that constitute the collocation keyword groups, i.e. into input keywords. Then, when the user enters a character string, the proposal word strings that link the input keywords corresponding to the input character string with the collocation words to which the collocation relation propagates, starting from those input keywords, are presented to the user, and when the user selects a presented proposal word string, a retrieval for documents containing all of the keyword candidates in the proposal word string is performed on the document set 300 to be retrieved. Therefore, according to the retrieval assisting system 100 of the present embodiment, search-keyword candidates that match the document set 300 to be retrieved can be proposed without requiring complicated preparatory operations from the user, and document retrieval can be appropriately supported.
Furthermore, according to the retrieval assisting system 100 of the present embodiment, the collocation propagation unit 109 refers to the collocation dictionary 302, treats the collocation word of a dictionary element whose entry is an input keyword as a 1st-level collocation word and the collocation word of a dictionary element whose entry is an (L-1)th-level collocation word as an Lth-level collocation word, repeats the process of obtaining Lth-level collocation words while incrementing L by 1, and stops the process when the cumulative collocation probability becomes less than or equal to the threshold γ, so the cumulative collocation probability of the proposal word strings presented to the user can be kept close to the threshold γ. This means that, instead of making trivially obvious proposals, the system effectively proposes a plurality of keyword candidates contained in the document set 300 as search keywords. That is, word strings with an excessively high cumulative collocation probability are often mere enumerations of generally related words; in the retrieval assisting system 100 of the present embodiment, because the proposal word strings whose cumulative collocation probability is close to the threshold γ are presented to the user, the proposals are not enumerations of generally related words but are preferable proposals for the user.
(the 2nd embodiment)
Next, the 2nd embodiment will be described. The 2nd embodiment detects in advance the combinations of two keyword candidates whose collocation probability is zero, i.e. zero collocation keyword groups, and generates a zero collocation dictionary in advance; when a word string linking an input keyword with collocation words contains two keyword candidates that together form a zero collocation keyword group, that word string is excluded from the candidates for proposal word strings. That is, when the keyword candidates following an input keyword are obtained in order by propagating the collocation relation from the input keyword, a collocation relation is guaranteed between consecutive keyword candidates, such as between the input keyword and a 1st-level collocation word, or between a 1st-level collocation word and a 2nd-level collocation word, but a collocation relation does not necessarily exist between separated keyword candidates, such as between the input keyword and a 2nd-level collocation word. Therefore, if, for example, there is no collocation relation between the input keyword and a 2nd-level collocation word (that is, no document in which the input keyword and the 2nd-level collocation word appear simultaneously exists in the document set to be retrieved), and such a word string is presented to the user as a proposal word string, then when the user selects this proposal word string and performs retrieval on the document set, there is the problem that no retrieval result can be obtained (that is, the number of hits is zero). In the present embodiment, therefore, word strings that simultaneously contain two keyword candidates whose collocation probability is zero are excluded from the candidates for proposal word strings, so that a retrieval result is always obtained when the document set is retrieved according to a proposal word string. In the following, only the characteristic parts of the present embodiment are described; constituent elements shared with the 1st embodiment are given the same reference numerals and their description is not repeated.
Fig. 14 is a block diagram illustrating the functional configuration of the retrieval assisting system 200 of the 2nd embodiment. As shown in Fig. 14, in addition to the configuration of the retrieval assisting system 100 of the 1st embodiment, the retrieval assisting system 200 of the present embodiment further comprises a zero collocation keyword group detection unit (2nd detection unit) 201 and a zero collocation dictionary generation unit (3rd generation unit) 202. The retrieval assisting system 200 of the present embodiment also comprises a collocation propagation unit 209 and a proposal word string identification unit 210 in place of the collocation propagation unit 109 and the proposal word string identification unit 110 of the retrieval assisting system 100 of the 1st embodiment.
The zero collocation keyword group detection unit 201 receives the collocation network illustrated in Fig. 7 from the collocation probability calculation unit 103, detects from this collocation network the combinations of two keyword candidates whose collocation probability is zero, i.e., zero collocation keyword groups, and outputs them to the zero collocation dictionary generation unit 202. Specifically, the zero collocation keyword group detection unit 201, for example, selects, from among the keyword candidates extracted by the keyword candidate extraction unit 101, the keyword candidates whose frequency of occurrence exceeds a prescribed value and, for each of the selected keyword candidates, obtains the combinations with keyword candidates whose collocation probability with it is zero as zero collocation keyword groups. The reason only keyword candidates whose frequency of occurrence exceeds the prescribed value are taken as targets is that a keyword candidate whose frequency of occurrence is very small is presumed not to become a keyword candidate constituting a collocation keyword group in the first place.
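As a rough illustration of this detection step, the following Python sketch assumes an in-memory table of pairwise collocation probabilities and an occurrence-frequency table; the names `collocation_prob`, `freq` and the cutoff `min_freq` are illustrative and not taken from the patent:

```python
def detect_zero_collocation_groups(candidates, freq, collocation_prob, min_freq=2):
    """Return (frequent candidate, other candidate) pairs whose collocation probability is zero.

    candidates       : keyword candidates extracted from the document set
    freq             : dict mapping candidate -> occurrence frequency
    collocation_prob : dict mapping frozenset({a, b}) -> collocation probability
    min_freq         : only candidates appearing at least this often become entries
    """
    frequent = [c for c in candidates if freq.get(c, 0) >= min_freq]
    zero_groups = []
    seen = set()
    for a in frequent:
        for b in candidates:
            if a == b:
                continue
            pair = frozenset((a, b))
            if pair in seen:
                continue
            seen.add(pair)
            # A missing entry is treated as "never co-occurs", i.e. probability zero.
            if collocation_prob.get(pair, 0.0) == 0.0:
                zero_groups.append((a, b))
    return zero_groups
```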
Fig. 15 is a diagram showing the combinations of two keyword candidates whose collocation probability is 0, extracted from the collocation probabilities between two keyword candidates illustrated in Fig. 6. The example of Fig. 15 shows, for each keyword candidate whose frequency of occurrence is greater than or equal to 2, the combinations with other keyword candidates whose collocation probability is zero.
The zero collocation dictionary generation unit 202 generates the zero collocation dictionary 304 from the zero collocation keyword groups input from the zero collocation keyword group detection unit 201. As shown in Fig. 16, the zero collocation dictionary 304 is a set of dictionary elements whose data structure groups the two elements (entry, zero collocation word); when an entry is specified, the dictionary element having that entry can be retrieved and the zero collocation word obtained. For every zero collocation keyword group input from the zero collocation keyword group detection unit 201, the zero collocation dictionary generation unit 202 generates a dictionary element with the keyword candidate on the side that was narrowed down by frequency of occurrence as the entry and the other keyword candidate as the zero collocation word, and generates the set of these dictionary elements as the zero collocation dictionary 304. Fig. 17 shows an example of the zero collocation dictionary 304 generated by the zero collocation dictionary generation unit 202. Since the zero collocation dictionary 304 is large, only a part of it is shown in Fig. 17 for convenience.
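A minimal sketch of how the zero collocation dictionary could be assembled from the detected groups; which side becomes the entry is simplified here to the candidate selected by the frequency cutoff (passed in as `frequent`), which is an illustrative reading of the text above rather than a definitive implementation:

```python
from collections import defaultdict

def build_zero_collocation_dictionary(zero_groups, frequent):
    """Map each entry keyword to the set of its zero collocation words.

    zero_groups : iterable of (candidate_a, candidate_b) pairs with zero collocation probability
    frequent    : set of candidates that passed the frequency cutoff (used as entries)
    """
    dictionary = defaultdict(set)
    for a, b in zero_groups:
        # Illustrative convention: the frequency-selected candidate becomes the entry.
        entry, zero_word = (a, b) if a in frequent else (b, a)
        dictionary[entry].add(zero_word)
    return dictionary
```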
For the M0 input keywords input from the input keyword recognition unit 108, the collocation propagation unit 209 searches the collocation dictionary 302 for each of them to obtain the collocation words for the input keywords, and temporarily stores, as 1st-level collocation words, the top M1 (M1 is a preset natural number) of the obtained collocation words in descending order of collocation probability with the input keyword. Next, for each of the M1 1st-level collocation words, the collocation propagation unit 209 searches the collocation dictionary 302 to obtain the collocation words for that 1st-level collocation word, and temporarily stores, as 2nd-level collocation words, those of the obtained collocation words whose collocation probability with the 1st-level collocation word is greater than or equal to the threshold δ. Furthermore, until the number of collocation words connected to a 1st-level collocation word reaches a preset prescribed value A, the collocation propagation unit 209 repeats, in order, the process of obtaining the collocation words connected to the 1st-level collocation word, from the 2nd-level collocation words to 3rd-level collocation words, from the 3rd-level collocation words to 4th-level collocation words, and so on; when the number of collocation words connected to the 1st-level collocation word reaches the prescribed value A, it stops the process.
That is, when the collocation words of the dictionary elements whose entry is the input keyword are taken as 1st-level collocation words and the collocation words of the dictionary elements whose entry is an (L-1)th-level collocation word are taken as Lth-level collocation words (L is a natural number greater than or equal to 2), the collocation propagation unit 209 repeats the process of obtaining Lth-level collocation words while incrementing L by 1, and when L reaches the prescribed value A, it obtains the integrated collocation probability from the input keyword up to the Lth-level collocation word and stops the process. The collocation propagation unit 209 then outputs, to the proposal word string identification unit 210, the word strings in which the input keyword is associated with the collocation words up to the Lth level obtained at the stage at which the process was stopped, together with the integrated collocation probability of each word string. As in the 1st embodiment, in order to prevent keywords whose collocation probability with each other is high from propagating to each other repeatedly, word strings in which the same keyword candidate appears more than once are excluded from the processing targets. The collocation propagation unit 209 carries out the above process every time a new input keyword is input from the input keyword recognition unit 108.
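The propagation described above can be pictured as the following sketch, where the collocation dictionary 302 is modelled as a mapping from an entry to a list of (collocation word, probability) pairs sorted by descending probability and the stopping rule is the level bound `a_levels`; this is a simplified reading under those assumptions, and the alternative stopping conditions mentioned below could be substituted:

```python
def propagate_collocations(input_keywords, colloc_dict, m1, delta, a_levels):
    """Expand input keywords into word strings of chained collocation words.

    colloc_dict : dict entry -> list of (collocation word, probability), highest probability first
    m1          : total number of 1st-level collocation words kept across all input keywords
    delta       : probability threshold applied from the 2nd level onward
    a_levels    : number of levels to add beyond the 1st-level collocation words
    Returns a list of (word_string_tuple, integrated_collocation_probability).
    """
    # 1st level: the top-M1 (input keyword, collocation word) pairs over all input keywords.
    first_level = sorted(
        ((kw, w, p) for kw in input_keywords for w, p in colloc_dict.get(kw, [])),
        key=lambda t: t[2], reverse=True)[:m1]

    results = []
    for kw, w1, p1 in first_level:
        frontier = [((kw, w1), p1)]
        for _ in range(a_levels):                            # expand level by level
            next_frontier = []
            for words, prob in frontier:
                for w, p in colloc_dict.get(words[-1], []):
                    if p >= delta and w not in words:        # skip repeated keyword candidates
                        next_frontier.append((words + (w,), prob * p))
            if not next_frontier:
                break
            frontier = next_frontier
        results.extend(frontier)
    return results
```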
In the above example, the process of the collocation propagation unit 209 is stopped when the number of collocation words connected to a 1st-level collocation word reaches the prescribed value A, but the condition for stopping the collocation propagation unit 209 is not limited to this example; for example, as in the 1st embodiment, a condition such as stopping the process of the collocation propagation unit 209 when the integrated collocation probability becomes less than or equal to the threshold γ, or stopping the process of the collocation propagation unit 209 when the number of obtained word strings reaches an upper limit, may also be used.
Like the proposal word string identification unit 110 of the 1st embodiment, the proposal word string identification unit 210 identifies, as proposal word strings, the word strings satisfying the 2nd condition among the word strings input from the collocation propagation unit 209, and outputs the identified proposal word strings to the display unit 111. However, the proposal word string identification unit 210 first refers to the zero collocation dictionary 304 and judges whether, among the word strings input from the collocation propagation unit 209, there is a word string that simultaneously contains the two keyword candidates constituting a zero collocation keyword group; when such a word string exists, it excludes that word string and outputs, to the display unit 111, the word strings satisfying the 2nd condition among the remaining word strings as proposal word strings. As in the 1st embodiment, the 2nd condition may be, for example, being among the top N (N is a preset natural number) word strings in descending order of integrated collocation probability.
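The filtering and ranking performed by the proposal word string identification unit 210 could then look like the sketch below, assuming the entry-to-zero-collocation-word mapping sketched earlier; the top-N rule stands in for the 2nd condition:

```python
def identify_proposal_strings(word_strings, zero_dict, n):
    """Drop word strings containing a zero collocation keyword group, then keep the top N.

    word_strings : list of (word_string_tuple, integrated_collocation_probability)
    zero_dict    : dict mapping entry -> set of zero collocation words
    n            : number of proposal word strings to present
    """
    def contains_zero_pair(words):
        for i, w in enumerate(words):
            for other in words[i + 1:]:
                # Check both directions, since only one side is stored as the entry.
                if other in zero_dict.get(w, set()) or w in zero_dict.get(other, set()):
                    return True
        return False

    survivors = [(ws, p) for ws, p in word_strings if not contains_zero_pair(ws)]
    survivors.sort(key=lambda item: item[1], reverse=True)   # descending integrated probability
    return [ws for ws, _ in survivors[:n]]
```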
Fig. 18 is a flowchart illustrating an example of the processing of the input keyword recognition unit 108, the collocation propagation unit 209 and the proposal word string identification unit 210 of the retrieval assisting system 200 of the present embodiment.
When a character string is input from the input receiving unit 107, the input keyword recognition unit 108 obtains, in accordance with the PAT tree 303, the keyword candidates obtained by supplementing the input character string, and identifies, as input keywords, the top M0 of the obtained keyword candidates in descending order of frequency of occurrence (step S301). The input keyword recognition unit 108 then outputs the identified input keywords to the collocation propagation unit 209.
Next, when the M0 input keywords are input from the input keyword recognition unit 108, the collocation propagation unit 209 refers to the collocation dictionary 302, obtains the collocation words for each of the M0 input keywords and, among these collocation words, obtains the top M1 keyword candidates in descending order of collocation probability as 1st-level collocation words (step S302). The collocation propagation unit 209 then refers to the collocation dictionary 302, obtains the collocation words for each (L-1)th-level collocation word (L is a natural number greater than or equal to 2) and, among these collocation words, obtains the keyword candidates whose collocation probability with the (L-1)th-level collocation word is greater than or equal to the threshold δ as Lth-level collocation words (step S303). Next, the collocation propagation unit 209 judges whether the value of L has reached the preset prescribed value A (step S304). If, as a result of this judgement, the value of L is less than the prescribed value A (step S304: No), the collocation propagation unit 209 increments the value of L by 1 (step S305), returns to step S303 and repeats the process of obtaining Lth-level collocation words. On the other hand, if the value of L has reached the prescribed value A (step S304: Yes), the collocation propagation unit 209 outputs, to the proposal word string identification unit 210, the word strings in which the input keywords are associated with the obtained collocation words, together with the integrated collocation probability of each word string.
Next, when the word strings are input from the collocation propagation unit 209, the proposal word string identification unit 210 refers to the zero collocation dictionary 304 and judges whether there is, among the input word strings, a word string that simultaneously contains the two keyword candidates constituting a zero collocation keyword group (step S306). When there is a word string that simultaneously contains the two keyword candidates constituting a zero collocation keyword group (step S306: Yes), the proposal word string identification unit 210 deletes that word string (step S307). On the other hand, when there is no such word string among the input word strings (step S306: No), the process of step S307 is skipped. Next, the proposal word string identification unit 210 identifies, as proposal word strings, the top N word strings in descending order of integrated collocation probability among the word strings remaining after excluding the word strings deleted in step S307 (step S308). The proposal word string identification unit 210 then outputs the identified proposal word strings to the display unit 111 (step S309).
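Putting the propagation and filtering sketches together, steps S302 to S309 could be driven as in the following illustrative glue code, which reuses the two helper functions sketched above (so it is not self-contained on its own); the default parameter values mirror the worked example later in this description:

```python
def suggest_proposals(input_keywords, colloc_dict, zero_dict,
                      m1=3, delta=0.4, a_levels=1, n=4):
    """Steps S302-S309 in one pass: propagate collocations from the input keywords
    identified in step S301, drop word strings that contain a zero collocation
    keyword group, and return the top-N proposal word strings.
    Reuses propagate_collocations() and identify_proposal_strings() sketched above."""
    word_strings = propagate_collocations(input_keywords, colloc_dict, m1, delta, a_levels)
    return identify_proposal_strings(word_strings, zero_dict, n)
```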
Like the retrieval assisting system 100 of the 1st embodiment, the retrieval assisting system 200 of the present embodiment can use, as its hardware configuration, the hardware configuration of an ordinary computer comprising a controller such as a CPU, internal storage devices such as a ROM and a RAM, input devices such as a keyboard and a mouse, a display device such as a liquid crystal panel, and external storage devices such as a hard disk, a CD, a DVD and a flash memory. In this case, the functional configuration of the retrieval assisting system 200 of the present embodiment described above can be realized by executing a prescribed program. The index 301, the collocation dictionary 302, the PAT tree 303 and the zero collocation dictionary 304 can be stored in, for example, external storage devices such as a hard disk and a flash memory.
Next, a concrete example of the processing of the retrieval assisting system 200 of the present embodiment will be described for the case where retrieval is performed with the document set 300 illustrated in Fig. 2 as the retrieval target. The following is an example in which the collocation dictionary 302 illustrated in Fig. 10, the PAT tree 303 illustrated in Fig. 11 and the zero collocation dictionary 304 illustrated in Fig. 17 have been generated from the document set 300 of Fig. 2.
First, when the user inputs the character string "a" ("a" being the first letter for inputting "奥", the leading character shared by "Olympics" and the like) using an input device such as a keyboard, the input receiving unit 107 accepts the input of this character string and outputs the input character string "a" to the input keyword recognition unit 108. The input keyword recognition unit 108 receives the input character string "a" from the input receiving unit 107, searches the PAT tree 303 illustrated in Fig. 11 and performs the process of identifying input keywords. To obtain the node matching the input character string "a", the input keyword recognition unit 108 first explores the PAT tree 303 illustrated in Fig. 11 in order from the root, and obtains the node "奥" as the node matching "a". The input keyword recognition unit 108 then determines the group of leaves reached by exploring from this node "奥", obtains the character strings represented by these leaves as the character strings following "奥", and thereby obtains the keyword candidates that supplement the input character string "a". In this case, "Olympics", "Orion" and "Olympus" are obtained. The input keyword recognition unit 108 identifies, as input keywords, the top M0 of the obtained keyword candidates in descending order of frequency of occurrence. Here, M0 = 3. In this case, the input keyword recognition unit 108 identifies "Olympics", "Orion" and "Olympus" as input keywords and outputs these input keywords to the collocation propagation unit 209.
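The completion performed here can be approximated by an ordinary prefix lookup over the keyword candidates; the sketch below stands in for the PAT tree 303 (a real PAT/Patricia tree compresses single-child paths, which is omitted for brevity), and the frequencies in the usage comment are placeholders rather than values from the patent:

```python
def complete_input(prefix, keyword_freq, m0):
    """Return up to M0 keyword candidates beginning with the given prefix,
    ordered by descending occurrence frequency (step S301).

    keyword_freq : dict mapping keyword candidate -> occurrence frequency
    """
    matches = [kw for kw in keyword_freq if kw.startswith(prefix)]
    matches.sort(key=lambda kw: keyword_freq[kw], reverse=True)
    return matches[:m0]

# Illustrative use (placeholder frequencies):
# complete_input("O", {"Olympics": 3, "Orion": 4, "Olympus": 2, "Tokyo": 2}, m0=3)
# -> ["Orion", "Olympics", "Olympus"]
```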
Collocation communication portion 209 from input keyword recognition portion 108 accept input keyword " Olympics ", " Orion " and " Olympus ", with reference to the Collocation Dictionary 302 illustrated in Figure 10, trying to achieve will as the dictionary of entry using these input keywords Element.Here, trying to achieve dictionary key element shown below.
(Orion, beggar, 0.75)
(Orion, little to, 0.5)
(Orion, rock, 0.5)
(Orion, secondary youth, 0.5)
(Orion, world cup, 0.5)
(Orion, publication, 0.4)
(Orion, sickle storehouse, 0.33)
(Orion, China fir field, 0.33)
(Orion, Japan, 0.25)
(Orion, in mansion, 0.25)
(Olympics, Japan, 0.67)
(Olympics, hole river, 0.5)
(Olympics, Zhi Pu, 0.5)
(Olympics, Kawasaki, 0.5)
(Olympics, the spring son, 0.5)
(Olympics, secondary youth, 0.5)
(Olympics, the world, 0.5)
(Olympics, track and field, 0.5)
(Olympics, Taro, 0.5)
(Olympics, publication, 0.4)
(Olympics, beggar, 0.4)
(Olympics, in mansion, 0.25)
(Olympus, sickle storehouse, 0.5)
(Olympus, Tokyo, 0.5)
(Olympus, in mansion, 0.33)
(Olympus, publication, 0.2)
(Olympus, beggar, 0.2)
Next, the collocation propagation unit 209 obtains, as 1st-level collocation words, the top M1 collocation words among the collocation words of the obtained dictionary elements in descending order of collocation probability with the entry (the input keyword). Here, M1 = 3. For collocation words whose collocation probabilities with the entry are identical, the collocation word to be obtained can be determined on the basis of, for example, descending order of the frequency of occurrence of the entry or descending order of the frequency of occurrence of the collocation word. In this case, "beggar", whose collocation probability with the input keyword "Orion" is 0.75, "Japan", whose collocation probability with the input keyword "Olympics" is 0.67, and "little to", whose collocation probability with the input keyword "Orion" is 0.5, are obtained as 1st-level collocation words.
Furthermore, since there may be 2nd-level collocation words having a collocation relation with the 1st-level collocation words, the collocation propagation unit 209 refers to the collocation dictionary 302 again and obtains the dictionary elements whose entries are the 1st-level collocation words. Here, the dictionary elements shown below are obtained.
(beggar, Orion, 0.75)
(beggar, in mansion, 0.6)
(beggar, sickle storehouse, 0.4)
(beggar, China fir field, 0.4)
(beggar, Olympics, 0.4)
(Japan, Olympics, 0.67)
(Japan, publication, 0.6)
(Japan, Taro, 0.33)
(Japan, hole river, 0.33)
(Japan, Zhi Pu, 0.33)
(little to, China fir field, 0.5)
(little to, Tokyo, 0.5)
(little to, Orion, 0.5)
Next, the collocation propagation unit 209 obtains, as 2nd-level collocation words, the keyword candidates among the collocation words of the obtained dictionary elements whose collocation probability with the entry (the 1st-level collocation word) is greater than or equal to the threshold δ. Here, the threshold δ = 0.4. In this case, for the 1st-level collocation word "beggar", "Orion", "in mansion", "sickle storehouse" and "China fir field", whose collocation probabilities are greater than or equal to 0.4, are obtained as 2nd-level collocation words; however, "Orion" duplicates the input keyword and is therefore excluded from the processing targets. For the 1st-level collocation word "Japan", "Olympics" and "publication", whose collocation probabilities are greater than or equal to 0.4, are obtained as 2nd-level collocation words; however, "Olympics" duplicates the input keyword and is therefore excluded from the processing targets. For the 1st-level collocation word "little to", "China fir field", "Tokyo" and "Orion", whose collocation probabilities are greater than or equal to 0.4, are obtained as 2nd-level collocation words; however, "Orion" duplicates the input keyword and is therefore excluded from the processing targets.
As a result of the above processing, (Orion, beggar, in mansion), (Orion, beggar, sickle storehouse), (Orion, beggar, China fir field), (Olympics, Japan, publication), (Orion, little to, China fir field) and (Orion, little to, Tokyo) are obtained as the word strings in which the input keywords, the 1st-level collocation words and the 2nd-level collocation words are associated.
Next, the collocation propagation unit 209 confirms whether the number of collocation words connected to the 1st-level collocation words has reached the preset prescribed value A. Here, for simplicity, A = 1. In this case, the collocation propagation unit 209 stops the process at the stage at which the 2nd-level collocation words have been obtained, obtains the integrated collocation probability of each word string obtained by the processing so far, and outputs (Orion, beggar, in mansion, 0.45), (Orion, beggar, sickle storehouse, 0.45), (Orion, beggar, China fir field, 0.3), (Olympics, Japan, publication, 0.4), (Orion, little to, China fir field, 0.25) and (Orion, little to, Tokyo, 0.25) to the proposal word string identification unit 210.
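As a quick check of these figures, each integrated collocation probability is the product of the pairwise collocation probabilities along the chain, read off the dictionary elements listed above (the values in the text appear to be rounded to two decimals):

```python
# (Orion, beggar, in mansion): 0.75 between Orion and beggar, 0.6 between beggar and in mansion
print(round(0.75 * 0.6, 2))   # -> 0.45
# (Orion, little to, China fir field): 0.5 and 0.5
print(round(0.5 * 0.5, 2))    # -> 0.25
```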
When (Orion, beggar, in mansion, 0.45), (Orion, beggar, sickle storehouse, 0.45), (Orion, beggar, China fir field, 0.3), (Olympics, Japan, publication, 0.4), (Orion, little to, China fir field, 0.25) and (Orion, little to, Tokyo, 0.25) are received as word strings from the collocation propagation unit 209, the proposal word string identification unit 210 first refers to the zero collocation dictionary 304 and, when there is among these word strings a word string that simultaneously contains the two keyword candidates constituting a zero collocation keyword group, deletes that word string. In the zero collocation dictionary 304 illustrated in Fig. 17, there is a dictionary element having "Orion" as the entry and "Tokyo" as the zero collocation word; since (Orion, little to, Tokyo, 0.25) among the above word strings contains both "Orion" and "Tokyo", (Orion, little to, Tokyo, 0.25) is deleted.
Next, the proposal word string identification unit 210 identifies, as proposal word strings, the top N word strings in descending order of integrated collocation probability among the remaining word strings that were not deleted, namely (Orion, beggar, in mansion, 0.45), (Orion, beggar, sickle storehouse, 0.45), (Orion, beggar, China fir field, 0.3), (Olympics, Japan, publication, 0.4) and (Orion, little to, China fir field, 0.25). Here, N = 4. For word strings whose integrated collocation probabilities are identical, the N proposal word strings can be determined on the basis of, for example, descending order of the frequency of occurrence of the input keyword, descending order of the frequency of occurrence of the 1st-level collocation word, or descending order of the frequency of occurrence of the 2nd-level collocation word. In this case, the proposal word string identification unit 210 identifies (Orion, beggar, in mansion), (Orion, beggar, sickle storehouse), (Orion, beggar, China fir field) and (Olympics, Japan, publication) as proposal word strings and outputs these proposal word strings to the display unit 111.
The proposal word strings that the proposal word string identification unit 210 outputs to the display unit 111 are all word strings for which a retrieval result can certainly be obtained (that is, the number of hits is not zero) when the user selects them. For example, if (Orion, beggar, in mansion) is selected, "document 2" is obtained from the document set 300 illustrated in Fig. 2; if (Orion, beggar, sickle storehouse) is selected, "document 2" is obtained from the document set 300 illustrated in Fig. 2; if (Orion, beggar, China fir field) is selected, "document 3" is obtained from the document set 300 illustrated in Fig. 2; and if (Olympics, Japan, publication) is selected, "document 1" is obtained from the document set 300 illustrated in Fig. 2.
When the proposal word strings (Orion, beggar, in mansion), (Orion, beggar, sickle storehouse), (Orion, beggar, China fir field) and (Olympics, Japan, publication) are received from the proposal word string identification unit 210, the display unit 111 displays these proposal word strings and presents them to the user. Then, when the user performs an operation of selecting one of the proposal word strings displayed on the display unit 111, for example (Olympics, Japan, publication), using an input device such as a mouse, this selection operation is accepted by the input receiving unit 107, and the keyword candidates "Olympics", "Japan" and "publication" contained in the proposal word string are output to the search unit 112. On the other hand, if, while the proposal word strings are displayed on the display unit 111, the user continues inputting the character string without selecting any proposal word string, for example inputs a character string such as "奥l" ("l" being the first letter for inputting the character that follows "奥"), the input keyword recognition unit 108 identifies "Orion" as the input keyword, and thereafter the same processing is performed.
When "Olympics", "Japan" and "publication" are received from the input receiving unit 107, the search unit 112 generates a retrieval expression containing all of these keyword candidates as search keywords and performs retrieval on the document set 300. As a result, "document 1" is obtained from the document set 300 illustrated in Fig. 2, and information about "document 1" is displayed on the display unit 111 as the retrieval result.
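A minimal sketch of this final retrieval step, assuming a simple in-memory inverted index built from the document set (the index itself and the document identifiers are illustrative; the conjunctive semantics follow the description that all keyword candidates of the proposal word string must be contained):

```python
def retrieve(proposal_words, inverted_index):
    """Return the identifiers of documents containing every keyword in the proposal word string.

    inverted_index : dict mapping keyword candidate -> set of document identifiers
    """
    postings = [inverted_index.get(w, set()) for w in proposal_words]
    if not postings:
        return set()
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting            # conjunction (AND) over all search keywords
    return result

# e.g. retrieve(("Olympics", "Japan", "publication"), index) would be expected to yield
# {"document 1"} for the document set 300 of Fig. 2, assuming "index" is built from it.
```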
As described above in detail with a specific example, with the retrieval assisting system 200 of the present embodiment, as with the retrieval assisting system 100 of the 1st embodiment, the user only has to specify the document set 300 to be retrieved; the system then detects, among the keyword candidates contained in the document set 300, collocation keyword groups, which are combinations of keyword candidates whose collocation probability is large, generates the collocation dictionary 302 based on the collocation keyword groups, and generates the PAT tree 303, which is a character string supplement rule for supplementing an input character string to obtain the keyword candidates constituting the collocation keyword groups as input keywords. Then, when the user inputs a character string, proposal word strings in which the input keywords corresponding to that input character string are associated with the collocation words reached by propagating collocation relations from the input keywords are presented to the user, and when the user selects a presented proposal word string, retrieval of the documents that contain all the keyword candidates in the proposal word string is performed on the document set 300 to be retrieved. Therefore, according to the retrieval assisting system 200 of the present embodiment, without requiring complicated operations of the user in advance, candidates of search keywords matched to the document set 300 to be retrieved can be proposed to the user, and retrieval of documents can be appropriately supported.
Furthermore, according to the retrieval assisting system 200 of the present embodiment, combinations of two keyword candidates whose collocation probability is zero, i.e., zero collocation keyword groups, are detected and the zero collocation dictionary 304 is generated. The proposal word string identification unit 210 then refers to the zero collocation dictionary 304, excludes the word strings that simultaneously contain the two keyword candidates constituting a zero collocation keyword group, and outputs the remaining word strings to the display unit 111 as proposal word strings. Therefore, according to the retrieval assisting system 200 of the present embodiment, only proposal word strings for which a retrieval result can certainly be obtained are presented to the user, and retrieval can be appropriately supported without the user performing useless retrievals.
The functional configurations of the retrieval assisting systems 100 and 200 of the 1st and 2nd embodiments described above can be realized, for example when a computer is used as the hardware configuration of the retrieval assisting system 100 or 200, by having the computer execute a prescribed program. In that case, the program executed by the computer serving as the retrieval assisting system 100 or 200 is provided as a computer program product recorded, as a file in an installable or executable format, on a computer-readable recording medium such as a CD-ROM (Compact Disk Read Only Memory), a flexible disk (FD), a CD-R (Compact Disk Recordable) or a DVD (Digital Versatile Disc).
The program executed by the computer serving as the retrieval assisting system 100 or 200 may also be stored on another computer connected to a network such as the Internet and provided by being downloaded via the network. Furthermore, the program executed by the computer serving as the retrieval assisting system 100 or 200 may be provided or distributed via a network such as the Internet. The program executed by the computer serving as the retrieval assisting system 100 or 200 may also be provided by being incorporated in advance in a ROM or the like inside the computer.
The program executed by the computer serving as the retrieval assisting system 100 or 200 has a module configuration including the main constituent elements of the retrieval assisting systems 100 and 200 (the keyword candidate extraction unit 101, the collocation probability calculation unit 103, the collocation keyword group detection unit 104, the zero collocation keyword group detection unit 201, the collocation dictionary generation unit 105, the PAT tree generation unit 106, the zero collocation dictionary generation unit 202, the input keyword recognition unit 108, the collocation propagation units 109 and 209, the proposal word string identification units 110 and 210, the display unit 111 and the search unit 112); as actual hardware, for example, a CPU (processor) reads the program from the storage medium and executes it, whereby the above constituent elements are loaded onto a main storage device and created on the main storage device. Part or all of the main constituent elements of the retrieval assisting systems 100 and 200 of the embodiments may also be realized by dedicated hardware such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array).
According to the retrieval assisting system of at least one of the embodiments described above, because the system comprises the main constituent elements described above, candidates of search keywords suited to the document set to be retrieved can be proposed to the user without requiring complicated operations of the user in advance, and retrieval of documents can be appropriately supported.
Although several embodiments of the invention have been described above, these embodiments are presented merely as examples and are not intended to limit the scope of the invention. These novel embodiments can be carried out in various other forms, and various omissions, substitutions and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the scope of the invention described in the claims and its equivalents.

Claims (7)

1. A retrieval assisting system, characterized by comprising:
an extraction unit that extracts keyword candidates from a document set to be retrieved;
a calculation unit that calculates, for a combination of two extracted keyword candidates, a collocation probability, which is the probability that one keyword candidate appears together with the other keyword candidate in the same document of the document set;
a 1st detection unit that detects a collocation keyword group, which is a combination of two keyword candidates whose collocation probability satisfies a 1st condition;
a 1st generation unit that generates a collocation dictionary, the collocation dictionary being a set of dictionary elements each having one keyword candidate of the collocation keyword group as an entry and the other keyword candidate as a collocation word;
a 2nd generation unit that generates a character string supplement rule, the character string supplement rule being a rule for supplementing an input character string to obtain a keyword candidate contained in the collocation keyword group;
a 1st identification unit that identifies, as an input keyword, a keyword candidate obtained by supplementing the input character string in accordance with the character string supplement rule;
a collocation propagation unit that refers to the collocation dictionary and repeats a process of obtaining a collocation word of a dictionary element whose entry is the input keyword and obtaining a collocation word, different from the input keyword, of a dictionary element whose entry is the obtained collocation word;
a 2nd identification unit that identifies, as a proposal word string, a word string satisfying a 2nd condition among word strings in which the input keyword and the collocation words obtained by the process of the collocation propagation unit are associated;
a presentation unit that presents the proposal word string; and
a search unit that, when a presented proposal word string has been selected, generates a retrieval expression in accordance with the selected proposal word string and performs retrieval on the document set.
2. The retrieval assisting system according to claim 1, characterized in that,
when collocation words of dictionary elements whose entry is the input keyword are taken as 1st-level collocation words and collocation words of dictionary elements whose entry is an (L-1)th-level collocation word are taken as Lth-level collocation words, the collocation propagation unit repeats a process of obtaining the Lth-level collocation words while incrementing L by 1, and stops the process when an integrated collocation probability, which is a value obtained by sequentially integrating the collocation probability between the input keyword and the 1st-level collocation word and the collocation probabilities between the (L-1)th-level collocation words and the Lth-level collocation words, becomes less than or equal to a 1st threshold, wherein L is a natural number greater than or equal to 2.
3. The retrieval assisting system according to claim 2, characterized in that
the 2nd identification unit identifies, as the proposal word strings, the top N word strings in descending order of the integrated collocation probability among the word strings in which the input keyword and the collocation words obtained during the period in which the collocation propagation unit continued the process are associated, wherein N is a preset natural number.
4. The retrieval assisting system according to claim 1, characterized in that,
when collocation words of dictionary elements whose entry is the input keyword are taken as 1st-level collocation words and collocation words of dictionary elements whose entry is an (L-1)th-level collocation word are taken as Lth-level collocation words, the collocation propagation unit repeats a process of obtaining the Lth-level collocation words while incrementing L by 1, and stops the process when L reaches a preset prescribed value, wherein L is a natural number greater than or equal to 2.
5. The retrieval assisting system according to claim 1, characterized by
further comprising:
a 2nd detection unit that detects a zero collocation keyword group, which is a combination of two keyword candidates whose collocation probability is zero; and
a 3rd generation unit that generates a zero collocation dictionary, the zero collocation dictionary being a set of dictionary elements each having one keyword candidate of the zero collocation keyword group as an entry and the other keyword candidate as a zero collocation word,
wherein, when there is, among the word strings in which the input keyword and the collocation words obtained by the process of the collocation propagation unit are associated, a word string that simultaneously contains the two keyword candidates constituting a zero collocation keyword group, the 2nd identification unit refers to the zero collocation dictionary, excludes that word string, and identifies, as the proposal word string, a word string satisfying the 2nd condition among the remaining word strings.
6. The retrieval assisting system according to claim 1, characterized in that
the 1st detection unit repeats a process of obtaining the number of combinations of two keyword candidates whose collocation probability is larger than a 2nd threshold and, when the obtained number is less than a 3rd threshold, reducing the 2nd threshold by a prescribed amount and obtaining the number of combinations of two keyword candidates whose collocation probability is larger than the 2nd threshold reduced by the prescribed amount, and detects, as the collocation keyword groups, the combinations of two keyword candidates at the time when the obtained number becomes greater than or equal to the 3rd threshold.
7. A retrieval support method executed in a retrieval assisting system, the retrieval support method being characterized by comprising the following steps:
a step in which an extraction unit of the retrieval assisting system extracts keyword candidates from a document set to be retrieved;
a step in which a calculation unit of the retrieval assisting system calculates a collocation probability for a combination of two extracted keyword candidates, the collocation probability being the probability that one keyword candidate appears together with the other keyword candidate in the same document of the document set;
a step in which a detection unit of the retrieval assisting system detects a collocation keyword group, the collocation keyword group being a combination of two keyword candidates whose collocation probability satisfies a 1st condition;
a step in which a 1st generation unit of the retrieval assisting system generates a collocation dictionary, the collocation dictionary being a set of dictionary elements each having one keyword candidate of the collocation keyword group as an entry and the other keyword candidate as a collocation word;
a step in which a 2nd generation unit of the retrieval assisting system generates a character string supplement rule, the character string supplement rule being a rule for supplementing an input character string to obtain a keyword candidate contained in the collocation keyword group;
a step in which a 1st identification unit of the retrieval assisting system identifies, as an input keyword, a keyword candidate obtained by supplementing the input character string in accordance with the character string supplement rule;
a step in which a collocation propagation unit of the retrieval assisting system refers to the collocation dictionary and repeats a process of obtaining a collocation word of a dictionary element whose entry is the input keyword and obtaining a collocation word, different from the input keyword, of a dictionary element whose entry is the obtained collocation word;
a step in which a 2nd identification unit of the retrieval assisting system identifies, as a proposal word string, a word string satisfying a 2nd condition among word strings in which the input keyword and the collocation words obtained by the process of the collocation propagation unit are associated;
a step in which a presentation unit of the retrieval assisting system presents the proposal word string; and
a step in which a search unit of the retrieval assisting system, when a presented proposal word string has been selected, generates a retrieval expression in accordance with the selected proposal word string and performs retrieval on the document set.
CN201210082643.6A 2012-03-19 2012-03-26 Retrieval assisting system and retrieval support method Active CN103324646B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP062595/2012 2012-03-19
JP2012062595A JP5426710B2 (en) 2012-03-19 2012-03-19 Search support device, search support method and program

Publications (2)

Publication Number Publication Date
CN103324646A CN103324646A (en) 2013-09-25
CN103324646B true CN103324646B (en) 2017-03-01

Family

ID=49193393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210082643.6A Active CN103324646B (en) 2012-03-19 2012-03-26 Retrieval assisting system and retrieval support method

Country Status (2)

Country Link
JP (1) JP5426710B2 (en)
CN (1) CN103324646B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6173896B2 (en) * 2013-12-10 2017-08-02 株式会社日立製作所 Data processing method and data processing server
JP5952343B2 (en) * 2014-06-11 2016-07-13 ヤフー株式会社 SEARCH DEVICE, SEARCH METHOD, AND SEARCH PROGRAM
JP6072739B2 (en) * 2014-08-28 2017-02-01 ヤフー株式会社 Extraction apparatus, extraction method and extraction program
JP6149836B2 (en) * 2014-09-30 2017-06-21 ダイキン工業株式会社 Human resource search device
JP6056829B2 (en) * 2014-09-30 2017-01-11 ダイキン工業株式会社 Recommendation creation device
WO2017158926A1 (en) * 2016-03-18 2017-09-21 三菱電機株式会社 Control logic diagram creation support device
JP6077163B2 (en) * 2016-06-09 2017-02-08 ヤフー株式会社 SEARCH DEVICE, SEARCH METHOD, AND SEARCH PROGRAM
JP6622236B2 (en) * 2017-03-06 2019-12-18 株式会社日立製作所 Idea support device and idea support method
US10817551B2 (en) * 2017-04-25 2020-10-27 Panasonic Intellectual Property Management Co., Ltd. Method for expanding word, word expanding apparatus, and non-transitory computer-readable recording medium
US11314794B2 (en) 2018-12-14 2022-04-26 Industrial Technology Research Institute System and method for adaptively adjusting related search words

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3717808B2 (en) * 2001-06-29 2005-11-16 株式会社日立製作所 Information retrieval system
US10515374B2 (en) * 2005-03-10 2019-12-24 Adobe Inc. Keyword generation method and apparatus
JP4841940B2 (en) * 2005-11-28 2011-12-21 株式会社日立製作所 How to display the station name search system
JP2010003015A (en) * 2008-06-18 2010-01-07 Hitachi Software Eng Co Ltd Document search system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03273360A (en) * 1990-03-23 1991-12-04 Hitachi Ltd Method and device for machine translation
JPH0756948A (en) * 1993-08-09 1995-03-03 Fuji Xerox Co Ltd Information retrieval device
JPH10207910A (en) * 1997-01-16 1998-08-07 Fuji Xerox Co Ltd Related word dictionary preparing device
JP2001249933A (en) * 2000-03-06 2001-09-14 Nippon Telegr & Teleph Corp <Ntt> Retrieval word input complementing method and device and recording medium having program for executing the method stored thereon
CN101601038A (en) * 2007-08-03 2009-12-09 松下电器产业株式会社 Related word presentation device

Also Published As

Publication number Publication date
JP2013196358A (en) 2013-09-30
JP5426710B2 (en) 2014-02-26
CN103324646A (en) 2013-09-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant