CN109033244A - Search result ordering method and device - Google Patents
- Publication number
- CN109033244A CN201810729232.9A
- Authority
- CN
- China
- Prior art keywords
- candidate
- correlation
- described search
- question
- answer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present invention provide a search result ordering method and device. The method includes: obtaining a user request and candidate results from a first ranking result, where the user request includes a search question and the candidate results include candidate questions and the candidate answer corresponding to each candidate question; obtaining a first correlation metric between the search question and the candidate question; obtaining a second correlation metric between the search question and the candidate answer; and reordering the first ranking result according to the first correlation metric and the second correlation metric to obtain a second ranking result. Because more specific correlation metrics are added in the second ranking, the ranking result is not limited by a single ranking method, so accurate answer rankings can be provided more effectively and conveniently, and certain special cases can be handled.
Description
Technical field
The present invention relates to the technical field of automatic question answering, and in particular to a search result ordering method and device.
Background
With the rapid development of the Internet, a large number of search needs related to medical knowledge have emerged. Medical intelligent question answering services have been derived to meet these search needs.
In medical automatic question answering, because of the particularity of medicine and the rigor required of answers, the main existing approach is to rank existing answer content by relevance and return answers accordingly. However, methods that rely on a single relevance ranking are one-sided and limited, lack a comprehensive measure of question-answer relevance, and therefore struggle to give accurate ranking results. Question answering methods from other fields cannot be directly extended to the medical field either.
Scheme (1) ranks based on question-to-question information and ignores the key information contained in the answers; obtaining good ranking results depends heavily on the quality of the question-answer groups in the original question-answer library.
Scheme (2) ranks based on question-to-answer information and ignores the key information contained in the questions. In the medical field, a slight deviation in the question may lead to an entirely different answer, so the ranking may be inaccurate.
Scheme (3) ranks using a method that fuses the question and the answer. Although it contains information from both, any single ranking method emphasizes certain aspects of the ranking result and cannot cope with the more complex scenarios in medical intelligent question answering.
Summary of the invention
Embodiments of the present invention provide a search result ordering method and device to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a search result ordering method, comprising:
obtaining a user request and candidate results from a first ranking result, wherein the user request includes a search question, and the candidate results include candidate questions and the candidate answer corresponding to each candidate question;
obtaining a first correlation metric between the search question and the candidate question;
obtaining a second correlation metric between the search question and the candidate answer; and
reordering the first ranking result according to the first correlation metric and the second correlation metric to obtain a second ranking result.
With reference to the first aspect, in a first implementation of the first aspect, reordering the first ranking result according to the first correlation metric and the second correlation metric to obtain the second ranking result comprises:
determining, according to the first correlation metric, the candidate question-answer groups included in a high-priority list;
determining, according to the second correlation metric, the candidate question-answer groups included in a low-priority list; and
merging the candidate question-answer groups in the high-priority list and the low-priority list, with high priority first and low priority last, to obtain the second ranking result.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect, determining the candidate question-answer groups included in the high-priority list according to the first correlation metric comprises:
if at least one first correlation metric of a candidate question-answer group is higher than a set threshold, adding the candidate question-answer group to the high-priority list.
With reference to the first implementation of the first aspect, in a third implementation of the first aspect, determining the candidate question-answer groups included in the low-priority list according to the second correlation metric comprises:
if at least one second correlation metric of a candidate question-answer group is higher than a set threshold, adding the candidate question-answer group to the low-priority list.
With reference to the first aspect, in a fourth implementation of the first aspect, obtaining the first correlation metric between the search question and the candidate question comprises at least one of the following:
calculating the word-level TF-IDF similarity between the search question and the candidate question;
calculating the character-level TF-IDF similarity between the search question and the candidate question;
calculating the pinyin-level TF-IDF similarity between the search question and the candidate question;
calculating the deep question similarity between the search question and the candidate question;
calculating the word vector similarity between the search question and the candidate question;
calculating the latent semantic indexing similarity between the search question and the candidate question.
With reference to the first aspect, in a fifth implementation of the first aspect, obtaining the second correlation metric between the search question and the candidate answer comprises at least one of the following:
calculating the deep question-answer correlation between the search question and the candidate answer;
calculating the word-level TF-IDF correlation between the search question and the candidate answer;
calculating the character-level TF-IDF correlation between the search question and the candidate answer;
calculating the pinyin-level TF-IDF correlation between the search question and the candidate answer;
calculating the word vector correlation between the search question and the candidate answer;
calculating the latent semantic indexing correlation between the search question and the candidate answer.
In a second aspect, an embodiment of the present invention provides a search result ordering device, comprising:
a first ranking module, configured to obtain a user request and candidate results from a first ranking result, wherein the user request includes a search question, and the candidate results include candidate questions and the candidate answer corresponding to each candidate question;
a first correlation module, configured to obtain a first correlation metric between the search question and the candidate question;
a second correlation module, configured to obtain a second correlation metric between the search question and the candidate answer; and
a second ranking module, configured to reorder the first ranking result according to the first correlation metric and the second correlation metric to obtain a second ranking result.
With reference to the second aspect, in a first implementation of the second aspect, the second ranking module comprises:
a high-priority submodule, configured to determine, according to the first correlation metric, the candidate question-answer groups included in a high-priority list;
a low-priority submodule, configured to determine, according to the second correlation metric, the candidate question-answer groups included in a low-priority list; and
a merge-ranking submodule, configured to merge the candidate question-answer groups in the high-priority list and the low-priority list, with high priority first and low priority last, to obtain the second ranking result.
With reference to the first implementation of the second aspect, in a second implementation of the second aspect, the high-priority submodule is further configured to add a candidate question-answer group to the high-priority list if at least one first correlation metric of that group is higher than a set threshold.
With reference to the first implementation of the second aspect, in a third implementation of the second aspect, the low-priority submodule is further configured to add a candidate question-answer group to the low-priority list if at least one second correlation metric of that group is higher than a set threshold.
With reference to the second aspect, in a fourth implementation of the second aspect, the first correlation module comprises at least one of the following submodules:
a first word-level submodule, configured to calculate the word-level TF-IDF similarity between the search question and the candidate question;
a first character-level submodule, configured to calculate the character-level TF-IDF similarity between the search question and the candidate question;
a first pinyin-level submodule, configured to calculate the pinyin-level TF-IDF similarity between the search question and the candidate question;
a deep question submodule, configured to calculate the deep question similarity between the search question and the candidate question;
a first word vector submodule, configured to calculate the word vector similarity between the search question and the candidate question;
a first latent semantic indexing submodule, configured to calculate the latent semantic indexing similarity between the search question and the candidate question.
With reference to the second aspect, in a fifth implementation of the second aspect, the second correlation module comprises at least one of the following submodules:
a deep question-answer submodule, configured to calculate the deep question-answer correlation between the search question and the candidate answer;
a second word-level submodule, configured to calculate the word-level TF-IDF correlation between the search question and the candidate answer;
a second character-level submodule, configured to calculate the character-level TF-IDF correlation between the search question and the candidate answer;
a second pinyin-level submodule, configured to calculate the pinyin-level TF-IDF correlation between the search question and the candidate answer;
a second word vector submodule, configured to calculate the word vector correlation between the search question and the candidate answer;
a second latent semantic indexing submodule, configured to calculate the latent semantic indexing correlation between the search question and the candidate answer.
In a third aspect, an embodiment of the present invention provides a search result ordering device. The functions of the device can be implemented in hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.
In one possible design, the structure of the search result ordering device includes a processor and a memory. The memory stores a program that supports the search result ordering device in executing the search result ordering method described above, and the processor is configured to execute the program stored in the memory. The search result ordering device may also include a communication interface, used for communication between the search result ordering device and other devices or communication networks.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing the computer software instructions used by the search result ordering device, including the program involved in executing the search result ordering method described above.
One of the above technical solutions has the following advantage or beneficial effect: applying a reordering technique on top of the main ranking effectively avoids the drawbacks of attending to a single aspect and of extracting one-sided, limited relevance features.
Another of the above technical solutions has the following advantage or beneficial effect: the reordering technique is a core module in medical intelligent question answering. Adding a reordering module further optimizes the ranking results of medical intelligent question answering. In other words, given answers that have already been ranked, the positions of some of the results are adjusted so that more suitable answers move forward and unsuitable answers move back, thereby optimizing the ranking result.
The above summary is for the purpose of the specification only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments and features described above, further aspects, embodiments and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals denote the same or similar components or elements throughout the several drawings. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 is a flow chart of the search result ordering method according to an embodiment of the present invention.
Fig. 2 is a flow chart of the search result ordering method according to an embodiment of the present invention.
Fig. 3 is a flow chart of the search result ordering method according to an embodiment of the present invention.
Fig. 4 is a block diagram of the search result ordering device according to an embodiment of the present invention.
Fig. 5 is a block diagram of the search result ordering device according to an embodiment of the present invention.
Fig. 6 is a structural block diagram of the search result ordering device according to an embodiment of the present invention.
Detailed description of the embodiments
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments can be modified in various ways without departing from the spirit or scope of the present invention. The drawings and description are therefore to be regarded as illustrative in nature and not restrictive.
Fig. 1 is a flow chart of the search result ordering method according to an embodiment of the present invention.
As shown in Fig. 1, the search result ordering method may include the following steps:
Step S110: obtain a user request and candidate results from a first ranking result, where the user request includes a search question and the candidate results include candidate questions and the candidate answer corresponding to each candidate question.
Step S120: obtain a first correlation metric between the search question and the candidate question.
Step S130: obtain a second correlation metric between the search question and the candidate answer.
Step S140: reorder the first ranking result according to the first correlation metric and the second correlation metric to obtain a second ranking result.
In the field of intelligent question answering, a user can enter the question to be asked (i.e., the search question) into a search engine according to their needs. The candidate results retrieved for the search question may include several question-answer groups (candidate questions and their corresponding candidate answers). These candidate question-answer groups can then be preliminarily ranked in various ways, for example:
1) Preliminary ranking based on question and question. The candidate question and the search question are encoded, and ranking is performed according to the similarity between the candidate question and the search question.
2) Preliminary ranking based on question and answer. The candidate answer and the search question are encoded, and ranking is performed according to the similarity between the candidate answer and the search question.
3) Ranking based on the fusion of question and answer. The candidate question, the candidate answer and the search question are encoded, and ranking is performed according to the combined similarity.
After the first ranking, a first ranking result is obtained, from which the user request and multiple candidate results can be obtained. The user request may include the search question entered by the user, and each candidate result may include one candidate question and its corresponding one or more candidate answers.
For the multiple candidate results in the first ranking result, the first correlation metric between the search question and the candidate question, and the second correlation metric between the search question and the candidate answer, can be calculated. The candidate results are then reordered by combining both metrics, to obtain a ranking result that is more relevant to the search question and more accurate.
In one possible implementation, as shown in Fig. 2, step S140 includes:
Step S210: determine, according to the first correlation metric, the candidate question-answer groups included in the high-priority list.
Step S220: determine, according to the second correlation metric, the candidate question-answer groups included in the low-priority list.
Step S230: merge the candidate question-answer groups in the high-priority list and the low-priority list, with high priority first and low priority last, to obtain the second ranking result.
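The merge in step S230 can be sketched as follows. This is a minimal illustration only: the function name `merge_by_priority`, the tuple representation of a question-answer group, and the choice to drop a group from the low-priority part when it already appears in the high-priority part are assumptions, not details given in the text.

```python
from typing import List, Tuple

QAGroup = Tuple[str, str]  # (candidate question, candidate answer)


def merge_by_priority(high: List[QAGroup], low: List[QAGroup]) -> List[QAGroup]:
    """Place high-priority groups first, then low-priority groups, skipping repeats."""
    merged: List[QAGroup] = []
    seen = set()
    for group in high + low:
        if group not in seen:  # assumption: a group keeps its earliest position
            seen.add(group)
            merged.append(group)
    return merged


high_priority = [("q2", "a2"), ("q5", "a5")]
low_priority = [("q1", "a1"), ("q2", "a2")]
second_ranking = merge_by_priority(high_priority, low_priority)
```

Within each list the existing order is preserved, so the relative order produced by the first ranking is not disturbed inside a priority level.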
In one possible implementation, step S120 includes at least one of the following:
calculating the word-level TF-IDF (Term Frequency-Inverse Document Frequency) similarity between the search question and the candidate question;
calculating the character-level TF-IDF similarity between the search question and the candidate question;
calculating the pinyin-level TF-IDF similarity between the search question and the candidate question;
calculating the deep question similarity between the search question and the candidate question;
calculating the word vector similarity between the search question and the candidate question;
calculating the latent semantic indexing similarity between the search question and the candidate question.
For example, the search question and the candidate question can be segmented into words, and the word-level TF-IDF similarity is then calculated from the word segmentation result. The search question and the candidate question can be split into characters, and the character-level TF-IDF similarity is then calculated from the character split. The pinyin of the search question and of the candidate question can be obtained separately, and the pinyin-level TF-IDF similarity is then calculated from the pinyin.
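As a rough sketch of the word-level and character-level variants, the example below builds TF-IDF vectors over the candidate questions and compares them to the search question by cosine similarity. The smoothed IDF formula, the use of cosine similarity here, the whitespace-based word segmentation (real Chinese text would need a word segmenter), and all names are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def idf_table(docs: List[List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency of every token seen in the documents."""
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + len(docs)) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def tfidf_vector(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """TF-IDF weights for one text; tokens absent from the IDF table are ignored."""
    return {tok: n * idf[tok] for tok, n in Counter(tokens).items() if tok in idf}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def word_tokens(text: str) -> List[str]:
    return text.split()  # assumes the text is already segmented into words


def char_tokens(text: str) -> List[str]:
    return [ch for ch in text if not ch.isspace()]


def tfidf_similarities(query: str, candidates: List[str],
                       tokenize: Callable[[str], List[str]]) -> List[float]:
    """Similarity of the query to each candidate at the granularity given by tokenize."""
    docs = [tokenize(c) for c in candidates]
    idf = idf_table(docs)
    query_vec = tfidf_vector(tokenize(query), idf)
    return [cosine(query_vec, tfidf_vector(doc, idf)) for doc in docs]


search_question = "persistent cough with phlegm"
candidate_questions = ["cough with a lot of phlegm", "headache after running"]
word_scores = tfidf_similarities(search_question, candidate_questions, word_tokens)
char_scores = tfidf_similarities(search_question, candidate_questions, char_tokens)
```

Only the tokenizer changes between the two granularities, which is why the same routine can serve several of the listed metrics.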
The advantages of calculating pinyin TF-IDF similarity are as follows:
Pinyin is one of the important differences between Chinese and English: every Chinese text corresponds to a unique pinyin sequence. Most users use a pinyin input method to enter Chinese characters; that is, they first type the pinyin of a character and then choose among the several characters that correspond to that pinyin. This operation can lead to wrong selections. When the same pinyin corresponds to different characters, for example "life" (生活) and "lighting a fire" (生火), which are both "shenghuo", the user may select the wrong homophone. In addition, because pinyin input is so widely used, users sometimes know only how a word is pronounced and not how it is written, which also affects the accuracy of the input. In the medical intelligent question answering scenario, the medical search requests entered by all kinds of Internet users are often not well-formed text and may contain many wrong characters. Representing the text in pinyin before calculating text similarity can therefore reduce, to some extent, the impact of wrong characters.
Pinyin TF-IDF can be calculated at the character level. For example, for a text S that includes Chinese characters, the Chinese characters in S are converted to their pinyin representation (ignoring tones), while the non-Chinese characters in S are kept as they are. Each individual pinyin syllable can be regarded as an independent character. For example, the Chinese text "coughing with a lot of sputum" (咳嗽痰多) is converted into the four characters "ke", "sou", "tan" and "duo". The pinyin TF-IDF similarity can then be calculated using the IDF features of the characters, the TF-IDF features of the text, cosine similarity and the like.
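The conversion described above can be sketched as follows. The lookup table `PINYIN` is a hypothetical stand-in that covers only the characters used in this example (a real system would need a full character-to-pinyin dictionary), and the similarity uses raw term frequencies with the IDF weighting left out for brevity.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical toneless pinyin table covering only the characters used below.
PINYIN: Dict[str, str] = {
    "咳": "ke", "嗽": "sou", "痰": "tan", "多": "duo",
    "生": "sheng", "活": "huo", "火": "huo",
}


def pinyin_tokens(text: str, table: Dict[str, str] = PINYIN) -> List[str]:
    """Map each known Chinese character to its pinyin; keep other characters unchanged."""
    return [table.get(ch, ch) for ch in text if not ch.isspace()]


def cosine_tf(a: List[str], b: List[str]) -> float:
    """Cosine similarity of raw term-frequency vectors (IDF weighting omitted)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(n * cb[tok] for tok, n in ca.items())
    norm = math.sqrt(sum(n * n for n in ca.values())) * math.sqrt(sum(n * n for n in cb.values()))
    return dot / norm if norm else 0.0


tokens = pinyin_tokens("咳嗽痰多")
# The homophones 生活 and 生火 differ as characters but coincide as pinyin.
character_similarity = cosine_tf(list("生活"), list("生火"))
pinyin_similarity = cosine_tf(pinyin_tokens("生活"), pinyin_tokens("生火"))
```

In this toy case the character-level comparison sees only one shared character, while the pinyin-level comparison treats the two strings as the same, which is the effect the passage relies on for wrongly chosen homophones.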
Deep question similarity is also referred to as deep QQ similarity. To implement deep QQ similarity, methods such as question clustering can be used to obtain several other questions Q' that are similar to each question Q, and a model is trained using pairwise learning to rank. The search question and the candidate question are then input into the trained model to obtain the deep QQ similarity.
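The text does not specify the training objective. One common choice for pairwise training is a margin (hinge) loss over triples of a question, a similar question and a dissimilar question; the sketch below assumes that objective, assumes that dissimilar questions are sampled from outside the cluster of Q, and uses a word-overlap function as a stand-in for the neural scoring model. All names are hypothetical.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (question Q, similar question Q', dissimilar question)


def pairwise_hinge_loss(score: Callable[[str, str], float],
                        triples: List[Triple], margin: float = 1.0) -> float:
    """Mean of max(0, margin - score(Q, Q') + score(Q, Q_neg)) over the triples."""
    losses = [max(0.0, margin - score(q, pos) + score(q, neg)) for q, pos, neg in triples]
    return sum(losses) / len(losses) if losses else 0.0


def overlap_score(a: str, b: str) -> float:
    """Stand-in scorer (word overlap ratio); a real system would use a trained model."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


triples = [("cough with phlegm", "phlegm and cough", "knee pain when running")]
loss_small_margin = pairwise_hinge_loss(overlap_score, triples, margin=0.5)
loss_large_margin = pairwise_hinge_loss(overlap_score, triples, margin=1.0)
```

The loss is zero once the similar question outscores the dissimilar one by at least the margin, so training pushes the model to order pairs correctly rather than to predict absolute scores.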
In one possible implementation, step S130 includes at least one of the following:
calculating the deep question-answer correlation between the search question and the candidate answer;
calculating the word-level TF-IDF correlation between the search question and the candidate answer;
calculating the character-level TF-IDF correlation between the search question and the candidate answer;
calculating the pinyin-level TF-IDF correlation between the search question and the candidate answer;
calculating the word vector correlation between the search question and the candidate answer;
calculating the latent semantic indexing correlation between the search question and the candidate answer.
Deep QA (question-answer) correlation can mine the semantic relationship between the user's search question Q and the candidate answer A. The correlation between Q and A is calculated using deep learning, in order to adjust the ranking result obtained from question-to-question similarity.
For example, in the medical intelligent question answering scenario, in addition to matching the text similarity between the user's search question Q_u and the candidate question Q_i, ranking accuracy can be further improved by matching the association between question and answer. On the one hand, two questions may differ entirely in their text yet be semantically identical or very similar. If the answers corresponding to the two questions are identical or very similar, then even when Q_u and Q_i cannot be matched, a match can still be made through the association between Q_u and A_i. On the other hand, question-answer groups in the question-answer resource library may themselves be mismatched: it is difficult, or very costly, to make every question and answer in the library match exactly, so Q_i and its corresponding A_i in the library may not match strictly. In this case, the ranking result can also be fine-tuned using the deep QA correlation.
In one possible implementation, determining the candidate question-answer groups included in the high-priority list according to the first correlation metric includes:
if at least one first correlation metric of a candidate question-answer group is higher than a set threshold, adding the candidate question-answer group to the high-priority list.
In one possible implementation, determining the candidate question-answer groups included in the low-priority list according to the second correlation metric includes:
if at least one second correlation metric of a candidate question-answer group is higher than a set threshold, adding the candidate question-answer group to the low-priority list.
Each correlation metric may have its own set threshold, and the thresholds of different correlation metrics may differ. The first correlation metric mainly reflects the text similarity between question and question. The second correlation metric mainly reflects the correlation between question and answer. In embodiments of the present invention, the number and types of first and second correlation metrics can be selected according to the actual application scenario. Each correlation metric between the search question and a candidate question-answer group is then compared with its threshold, so that the candidate question-answer groups are classified and stored in different priority lists.
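A per-metric threshold check of this kind might look like the sketch below. The metric names, the threshold values and the scores are all invented for illustration; only the rule "at least one metric above its own threshold" comes from the text.

```python
from typing import Dict


def exceeds_any_threshold(metrics: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    """True if at least one metric is above its own threshold.

    Metrics that have no configured threshold are skipped.
    """
    return any(value > thresholds[name] for name, value in metrics.items() if name in thresholds)


# Hypothetical thresholds, one per first correlation metric.
first_metric_thresholds = {"word_tfidf": 0.80, "char_tfidf": 0.85, "pinyin_tfidf": 0.90}

# Hypothetical metric values for two candidate question-answer groups.
group_a = {"word_tfidf": 0.62, "char_tfidf": 0.91, "pinyin_tfidf": 0.40}
group_b = {"word_tfidf": 0.30, "char_tfidf": 0.45, "pinyin_tfidf": 0.50}

high_priority = [
    name for name, metrics in (("A", group_a), ("B", group_b))
    if exceeds_any_threshold(metrics, first_metric_thresholds)
]
```

Keeping the thresholds in a separate mapping makes it straightforward to add or remove metrics for a given application scenario without touching the selection logic.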
In one example, multiple metrics can be compared in a certain order. One metric is compared first, and the question-answer groups that meet the condition are put into the corresponding priority list; the groups that do not meet the condition are compared on another metric, and so on.
For example, if 10 of 100 question-answer groups have a word-level similarity to the search question that is higher than the set threshold, these 10 groups are added to the high-priority result list. The character-level similarity between the remaining 90 question-answer groups and the search question is then compared with the set threshold, and 20 further groups are obtained and added to the high-priority list. The remaining metrics are handled in the same way and are not described again.
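The sequential comparison can be sketched as below, under the assumption that each metric is available as a function from a group identifier to a score. The identifiers, scores and thresholds are made up for the example.

```python
from typing import Callable, List, Sequence, Tuple

Check = Tuple[Callable[[str], float], float]  # (metric score for a group id, threshold)


def sequential_selection(groups: Sequence[str],
                         checks: Sequence[Check]) -> Tuple[List[str], List[str]]:
    """Apply the metrics in order; only groups not yet selected are checked again."""
    selected: List[str] = []
    remaining: List[str] = list(groups)
    for score, threshold in checks:
        selected += [g for g in remaining if score(g) > threshold]
        remaining = [g for g in remaining if score(g) <= threshold]
    return selected, remaining


# Hypothetical precomputed similarities to the search question.
word_sim = {"g1": 0.90, "g2": 0.30, "g3": 0.20}
char_sim = {"g1": 0.95, "g2": 0.85, "g3": 0.10}

high_priority, unresolved = sequential_selection(
    ["g1", "g2", "g3"],
    [(word_sim.__getitem__, 0.8), (char_sim.__getitem__, 0.8)],
)
```

Because a group leaves the pool as soon as one metric accepts it, later and possibly more expensive metrics are evaluated on fewer groups.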
In another example, the metrics can be compared separately and the results then deduplicated.
For example, the word-level similarity between the 100 question-answer groups and the search question is compared, and the 10 groups whose word-level similarity is higher than the set threshold are selected. The character-level similarity between the same 100 groups and the search question is compared, and the 40 groups whose character-level similarity is higher than the set threshold are selected. After deduplication of these 40 groups, the resulting 30 groups are added to the high-priority list (alternatively, they can be added to the high-priority list first and deduplicated afterwards).
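The alternative of comparing every metric against all groups and deduplicating afterwards can be sketched as follows. As before, the group identifiers, scores and thresholds are invented, and the order-preserving deduplication via `dict.fromkeys` is one possible choice rather than something the text prescribes.

```python
from typing import Callable, List, Sequence, Tuple

Check = Tuple[Callable[[str], float], float]  # (metric score for a group id, threshold)


def select_then_deduplicate(groups: Sequence[str], checks: Sequence[Check]) -> List[str]:
    """Evaluate every metric on all groups, then drop repeated groups, keeping first positions."""
    selected: List[str] = []
    for score, threshold in checks:
        selected += [g for g in groups if score(g) > threshold]
    return list(dict.fromkeys(selected))  # dicts preserve insertion order


# Hypothetical precomputed similarities to the search question.
word_sim = {"g1": 0.90, "g2": 0.30, "g3": 0.20, "g4": 0.85}
char_sim = {"g1": 0.95, "g2": 0.85, "g3": 0.10, "g4": 0.40}

high_priority = select_then_deduplicate(
    ["g1", "g2", "g3", "g4"],
    [(word_sim.__getitem__, 0.8), (char_sim.__getitem__, 0.8)],
)
```

Here "g1" passes both metrics but appears only once in the final list.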
Further optimizing and adjusting the ranking result using multiple correlation metrics moves more suitable answers forward and unsuitable answers back, thereby optimizing the ranking result.
In one example, based on the above similarity between the search question Q_u and the candidate question Q_i, and the correlation between the search question Q_u and the candidate answer A_i, the method shown in Fig. 3 processes each question-answer group (Q_i, A_i) in the previous ranking result in ranked order from front to back. The steps are as follows:
Step S301: calculate the word-level TF-IDF similarity between Q_u and Q_i. If the similarity is higher than a certain threshold, add (Q_i, A_i) to the high-priority result list; if the similarity is lower than a certain threshold, discard the question-answer group; otherwise, go to step S302.
For example, two thresholds Y1 and Y2 can be set for the word-level TF-IDF similarity, with Y1 greater than Y2. If the word-level TF-IDF similarity between Q_u and the Q_i of a question-answer group is greater than Y1, the group is put into the high-priority list. If it is less than Y2, the group is discarded, which excludes groups that are clearly irrelevant and reduces the number of subsequent comparisons. Groups between Y1 and Y2 can be compared on other correlation metrics. The thresholds and comparisons for the other correlation metrics in this example are set up similarly and are not described again below.
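The two-threshold decision can be written as a small three-way classification. The function name `triage` and the returned labels are hypothetical; the text does not say how a similarity exactly equal to Y1 or Y2 is treated, so this sketch passes such values on to the next metric.

```python
def triage(similarity: float, y1: float, y2: float) -> str:
    """Classify one similarity value against an upper threshold y1 and a lower threshold y2."""
    if y1 <= y2:
        raise ValueError("y1 must be greater than y2")
    if similarity > y1:
        return "high_priority"
    if similarity < y2:
        return "discard"
    return "next_metric"  # between the thresholds: decided by a later metric


# Hypothetical thresholds Y1 = 0.8 and Y2 = 0.2.
decisions = [triage(s, y1=0.8, y2=0.2) for s in (0.9, 0.5, 0.1)]
```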
Step S302, Q is calculateduWith QiCharacter rank TF-IDF similitude, if similitude be higher than a certain threshold value, will
(Qi, Ai) the supreme prioritized results list of addition;If similitude is lower than a certain threshold value, the question and answer group is abandoned;Otherwise, it enters step
S303。
Step S303, Q is calculateduWith QiPhonetic transcriptions of Chinese characters TF-IDF similitude, if similitude be higher than a certain threshold value, will
(Qi, Ai) the supreme prioritized results list of addition;If similitude is lower than a certain threshold value, the question and answer group is abandoned;Otherwise, it enters step
S304。
Step S304, Q is calculateduWith QiDepth QQ similitude, if similitude is higher than a certain threshold value, by (Qi, Ai) be added to
High prioritized results list;If similitude is lower than a certain threshold value, the question and answer group is abandoned;Otherwise, S305 is entered step.
Step S305: Compute the word-vector similarity between Q_u and Q_i. If the similarity is above a certain threshold, add (Q_i, A_i) to the high-priority result list; if the similarity is below a certain threshold, discard the question-answer pair; otherwise, proceed to step S306.
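One common way to obtain a word-vector similarity between two questions is to average the word vectors of each question and take the cosine of the two averages. The sketch below assumes that approach; the patent does not state how the word-vector similarity is computed. The three-dimensional vectors in `EMBEDDINGS` are invented for illustration and stand in for vectors from a trained embedding model.

```python
import math
from typing import Optional, Tuple

Vector = Tuple[float, ...]

# Hypothetical toy embeddings; real vectors would come from a trained model.
EMBEDDINGS = {
    "fever": (0.9, 0.1, 0.0),
    "temperature": (0.8, 0.2, 0.1),
    "high": (0.1, 0.9, 0.0),
    "elevated": (0.2, 0.8, 0.1),
    "car": (0.0, 0.1, 0.9),
}


def sentence_vector(text: str) -> Optional[Vector]:
    """Average the vectors of all known words; None if no word is known."""
    vecs = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(dim))


def word_vector_similarity(q_u: str, q_i: str) -> float:
    a, b = sentence_vector(q_u), sentence_vector(q_i)
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


close = word_vector_similarity("high fever", "elevated temperature")
far = word_vector_similarity("high fever", "car")
```

Unlike the TF-IDF metrics, this metric can score "high fever" and "elevated temperature" as similar even though they share no words or characters.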
Step S306: Compute the LSI (Latent Semantic Indexing) similarity between Q_u and Q_i. If the similarity is above a certain threshold, add (Q_i, A_i) to the high-priority result list; if the similarity is below a certain threshold, discard the question-answer pair; otherwise, proceed to step S307.
Step S307: Compute the deep QA (question-answer) correlation between Q_u and A_i. If the maximum deep correlation among the candidate results (those that did not enter the high-priority result list) is above a certain threshold, add the candidate results to the low-priority result list; otherwise, set the low-priority result list to empty. Then execute step S308.
Step S308: Merge the two result lists on the principle that high-priority results come first and low-priority results come after. The merged ranking is the final ranking result.
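Steps S307 and S308 can be sketched as follows. The function names are hypothetical. The text of step S307 leaves some details open, so two assumptions are made here: when the maximum deep QA correlation exceeds the threshold, all remaining candidates are added to the low-priority list, and they are ordered by descending correlation.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (candidate question Q_i, candidate answer A_i)


def low_priority_list(scored: List[Tuple[Pair, float]], threshold: float) -> List[Pair]:
    """Build the low-priority list from pairs that did not enter the high-priority list.

    `scored` holds each remaining pair with its deep QA correlation to the search question.
    """
    if not scored or max(score for _, score in scored) <= threshold:
        return []  # no remaining candidate is convincing enough: the list stays empty
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    return [pair for pair, _ in ranked]


def merge(high: List[Pair], low: List[Pair]) -> List[Pair]:
    """Step S308: high-priority results first, low-priority results after."""
    return list(high) + list(low)


high = [("q1", "a1")]
remaining = [(("q2", "a2"), 0.4), (("q3", "a3"), 0.9)]
final_ranking = merge(high, low_priority_list(remaining, threshold=0.5))
```

If no remaining candidate reaches the threshold, the final ranking consists of the high-priority list alone.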
It should be noted that the order of steps S301 to S308 may be adjusted as required. Different metrics may be selected for re-ranking according to the actual application scenario, both for the similarity and correlation between the user's search question and the candidate questions and for the similarity and correlation between the user's search question and the candidate answers; this is not limited in the embodiments of the present invention.
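Because the order and choice of metrics can vary, the question-to-question stage can be written as a loop over a configurable list of stages rather than as hard-coded steps. The sketch below assumes this design; `Stage`, `cascade`, and the two toy scoring functions are hypothetical stand-ins for the metrics named in the text, and the thresholds are arbitrary.

```python
from typing import Callable, List, NamedTuple, Sequence, Tuple

Pair = Tuple[str, str]  # (candidate question, candidate answer)


class Stage(NamedTuple):
    name: str
    score: Callable[[str, str], float]  # (search question, candidate question) -> similarity
    upper: float                        # above this: high-priority list
    lower: float                        # below this: discard


def cascade(query: str, candidates: Sequence[Pair],
            stages: Sequence[Stage]) -> Tuple[List[Pair], List[Pair]]:
    """Return (high-priority pairs, pairs left undecided after every stage)."""
    high, undecided = [], []
    for pair in candidates:
        for stage in stages:
            s = stage.score(query, pair[0])
            if s > stage.upper:
                high.append(pair)
                break
            if s < stage.lower:
                break  # clearly irrelevant: discard without further comparisons
        else:
            undecided.append(pair)  # every stage fell between its two thresholds
    return high, undecided


def word_jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def char_jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower()) - {" "}, set(b.lower()) - {" "}
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


stages = [Stage("word", word_jaccard, 0.8, 0.1), Stage("char", char_jaccard, 0.9, 0.3)]
candidates = [("flu symptoms", "A1"), ("symptoms of flu", "A2"),
              ("car repair", "A3"), ("flu shots", "A4")]
high, undecided = cascade("flu symptoms", candidates, stages)
```

Reordering or replacing entries in `stages` changes the pipeline without touching `cascade`. The undecided pairs would then be passed to the question-to-answer correlation step.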
In the embodiments of the present invention, a re-ranking method is added after the main ranking method. In scenarios such as medical intelligent question answering, this effectively addresses the problem that the ranking results produced by the main ranking method alone are not comprehensive (for example, one-sided or limited) and therefore fail to provide a precise ranking. Because many specific correlation metrics can be added during re-ranking, the ranking results take more factors into account, so that an accurate ordering of answers can be provided better and more conveniently, and certain specific medical questions can be handled.
FIG. 4 is a block diagram of a search result ranking device according to an embodiment of the present invention. As shown in FIG. 4, the device includes:
a first sorting module 41, configured to obtain a user request and candidate results from a first ranking result, where the user request includes a search question, and the candidate results include candidate questions and the candidate answer corresponding to each candidate question;
a first correlation module 42, configured to obtain a first correlation metric between the search question and the candidate questions;
a second correlation module 43, configured to obtain a second correlation metric between the search question and the candidate answers; and
a second sorting module 45, configured to re-rank the first ranking result according to the first correlation metric and the second correlation metric to obtain a second ranking result.
In one possible implementation, the second sorting module 45 further includes:
a high-priority submodule 451, configured to determine, according to the first correlation metric, the candidate question-answer pairs included in a high-priority list;
a low-priority submodule 452, configured to determine, according to the second correlation metric, the candidate question-answer pairs included in a low-priority list; and
a merge-and-sort submodule 453, configured to merge the candidate question-answer pairs in the high-priority list and the low-priority list, with high-priority pairs first and low-priority pairs after, to obtain the second ranking result.
In one possible implementation, the high-priority submodule 451 is further configured to add a candidate question-answer pair to the high-priority list if at least one first correlation metric of that pair is higher than a set threshold.
In one possible implementation, the low-priority submodule 452 is further configured to add a candidate question-answer pair to the low-priority list if at least one second correlation metric of that pair is higher than a set threshold.
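The behavior of submodules 451 to 453 can be sketched as below. This is an illustrative reading, not the patent's implementation: the `Candidate` type and `second_ranking` function are hypothetical, a single threshold is shared by all metrics of the same kind, and a pair that qualifies for both lists is kept only in its high-priority position. The patent text does not specify either of the last two points.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    question: str
    answer: str
    first_metrics: Tuple[float, ...]   # search question vs. candidate question
    second_metrics: Tuple[float, ...]  # search question vs. candidate answer


def second_ranking(candidates: Sequence[Candidate],
                   first_threshold: float,
                   second_threshold: float) -> List[Candidate]:
    # High-priority submodule: at least one first correlation metric above the threshold.
    high = [c for c in candidates
            if any(m > first_threshold for m in c.first_metrics)]
    # Low-priority submodule: at least one second correlation metric above the threshold.
    low = [c for c in candidates
           if any(m > second_threshold for m in c.second_metrics)]
    # Merge submodule: high first, low after, without repeating pairs already listed.
    return high + [c for c in low if c not in high]


a = Candidate("qa", "aa", (0.9, 0.1), (0.2,))
b = Candidate("qb", "ab", (0.1,), (0.8,))
c = Candidate("qc", "ac", (0.1,), (0.1,))
d = Candidate("qd", "ad", (0.9,), (0.9,))
ranking = second_ranking([b, a, c, d], first_threshold=0.5, second_threshold=0.5)
```

In this example `a` and `d` enter the high-priority list, `b` enters only the low-priority list, and `c` appears in neither, so it is absent from the second ranking result.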
In one possible implementation, the first correlation module 42 includes at least one of the following submodules:
a first word-level submodule, configured to compute the word-level TF-IDF similarity between the search question and the candidate question;
a first character-level submodule, configured to compute the character-level TF-IDF similarity between the search question and the candidate question;
a first pinyin-level submodule, configured to compute the pinyin-level TF-IDF similarity between the search question and the candidate question;
a deep question submodule, configured to compute the deep question similarity between the search question and the candidate question;
a first word-vector submodule, configured to compute the word-vector similarity between the search question and the candidate question;
a first latent semantic indexing submodule, configured to compute the latent semantic indexing similarity between the search question and the candidate question.
In one possible implementation, the second correlation module 43 includes at least one of the following submodules:
a deep question-answer submodule, configured to compute the deep question-answer correlation between the search question and the candidate answer;
a second word-level submodule, configured to compute the word-level TF-IDF correlation between the search question and the candidate answer;
a second character-level submodule, configured to compute the character-level TF-IDF correlation between the search question and the candidate answer;
a second pinyin-level submodule, configured to compute the pinyin-level TF-IDF correlation between the search question and the candidate answer;
a second word-vector submodule, configured to compute the word-vector correlation between the search question and the candidate answer;
a second latent semantic indexing submodule, configured to compute the latent semantic indexing correlation between the search question and the candidate answer.
For the functions of the modules in the devices of the embodiments of the present invention, refer to the corresponding descriptions in the method above; they are not repeated here.
FIG. 6 is a structural block diagram of a search result ranking device according to an embodiment of the present invention. As shown in FIG. 6, the device includes a memory 910 and a processor 920, and the memory 910 stores a computer program that can run on the processor 920. When executing the computer program, the processor 920 implements the search result ranking method of the above embodiments. There may be one or more memories 910 and one or more processors 920.
The device further includes:
a communication interface 930, configured to communicate with external devices for data exchange.
The memory 910 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
If the memory 910, the processor 920, and the communication interface 930 are implemented independently, they may be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 6, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 910, the processor 920, and the communication interface 930 are integrated on one chip, they may communicate with one another through internal interfaces.
An embodiment of the present invention provides a computer-readable storage medium storing a computer program; when executed by a processor, the program implements any of the methods in the above embodiments.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine different embodiments or examples described in this specification, and the features of different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means two or more, unless otherwise specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (14)
1. A search result ranking method, characterized by comprising:
obtaining a user request and candidate results from a first ranking result, wherein the user request includes a search question, and the candidate results include candidate questions and the candidate answer corresponding to each candidate question;
obtaining a first correlation metric between the search question and the candidate questions;
obtaining a second correlation metric between the search question and the candidate answers; and
re-ranking the first ranking result according to the first correlation metric and the second correlation metric to obtain a second ranking result.
2. The method according to claim 1, characterized in that re-ranking the first ranking result according to the first correlation metric and the second correlation metric to obtain the second ranking result comprises:
determining, according to the first correlation metric, the candidate question-answer pairs included in a high-priority list;
determining, according to the second correlation metric, the candidate question-answer pairs included in a low-priority list; and
merging the candidate question-answer pairs in the high-priority list and the low-priority list, with high-priority pairs first and low-priority pairs after, to obtain the second ranking result.
3. The method according to claim 2, characterized in that determining, according to the first correlation metric, the candidate question-answer pairs included in the high-priority list comprises:
if at least one first correlation metric of a candidate question-answer pair is higher than a set threshold, adding the candidate question-answer pair to the high-priority list.
4. The method according to claim 2, characterized in that determining, according to the second correlation metric, the candidate question-answer pairs included in the low-priority list comprises:
if at least one second correlation metric of a candidate question-answer pair is higher than a set threshold, adding the candidate question-answer pair to the low-priority list.
5. The method according to claim 1, characterized in that obtaining the first correlation metric between the search question and the candidate questions comprises at least one of the following:
computing the word-level TF-IDF similarity between the search question and the candidate question;
computing the character-level TF-IDF similarity between the search question and the candidate question;
computing the pinyin-level TF-IDF similarity between the search question and the candidate question;
computing the deep question similarity between the search question and the candidate question;
computing the word-vector similarity between the search question and the candidate question;
computing the latent semantic indexing similarity between the search question and the candidate question.
6. The method according to claim 1, characterized in that obtaining the second correlation metric between the search question and the candidate answers comprises at least one of the following:
computing the deep question-answer correlation between the search question and the candidate answer;
computing the word-level TF-IDF correlation between the search question and the candidate answer;
computing the character-level TF-IDF correlation between the search question and the candidate answer;
computing the pinyin-level TF-IDF correlation between the search question and the candidate answer;
computing the word-vector correlation between the search question and the candidate answer;
computing the latent semantic indexing correlation between the search question and the candidate answer.
7. A search result ranking device, characterized by comprising:
a first sorting module, configured to obtain a user request and candidate results from a first ranking result, wherein the user request includes a search question, and the candidate results include candidate questions and the candidate answer corresponding to each candidate question;
a first correlation module, configured to obtain a first correlation metric between the search question and the candidate questions;
a second correlation module, configured to obtain a second correlation metric between the search question and the candidate answers; and
a second sorting module, configured to re-rank the first ranking result according to the first correlation metric and the second correlation metric to obtain a second ranking result.
8. The device according to claim 7, characterized in that the second sorting module comprises:
a high-priority submodule, configured to determine, according to the first correlation metric, the candidate question-answer pairs included in a high-priority list;
a low-priority submodule, configured to determine, according to the second correlation metric, the candidate question-answer pairs included in a low-priority list; and
a merge-and-sort submodule, configured to merge the candidate question-answer pairs in the high-priority list and the low-priority list, with high-priority pairs first and low-priority pairs after, to obtain the second ranking result.
9. The device according to claim 8, characterized in that the high-priority submodule is further configured to add a candidate question-answer pair to the high-priority list if at least one first correlation metric of the candidate question-answer pair is higher than a set threshold.
10. The device according to claim 8, characterized in that the low-priority submodule is further configured to add a candidate question-answer pair to the low-priority list if at least one second correlation metric of the candidate question-answer pair is higher than a set threshold.
11. The device according to claim 7, characterized in that the first correlation module comprises at least one of the following submodules:
a first word-level submodule, configured to compute the word-level TF-IDF similarity between the search question and the candidate question;
a first character-level submodule, configured to compute the character-level TF-IDF similarity between the search question and the candidate question;
a first pinyin-level submodule, configured to compute the pinyin-level TF-IDF similarity between the search question and the candidate question;
a deep question submodule, configured to compute the deep question similarity between the search question and the candidate question;
a first word-vector submodule, configured to compute the word-vector similarity between the search question and the candidate question;
a first latent semantic indexing submodule, configured to compute the latent semantic indexing similarity between the search question and the candidate question.
12. The device according to claim 7, characterized in that the second correlation module comprises at least one of the following submodules:
a deep question-answer submodule, configured to compute the deep question-answer correlation between the search question and the candidate answer;
a second word-level submodule, configured to compute the word-level TF-IDF correlation between the search question and the candidate answer;
a second character-level submodule, configured to compute the character-level TF-IDF correlation between the search question and the candidate answer;
a second pinyin-level submodule, configured to compute the pinyin-level TF-IDF correlation between the search question and the candidate answer;
a second word-vector submodule, configured to compute the word-vector correlation between the search question and the candidate answer;
a second latent semantic indexing submodule, configured to compute the latent semantic indexing correlation between the search question and the candidate answer.
13. A search result ranking device, characterized in that the device comprises:
one or more processors; and
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 6.
14. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810729232.9A CN109033244B (en) | 2018-07-05 | 2018-07-05 | Search result ordering method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109033244A (en) | 2018-12-18 |
CN109033244B (en) | 2020-10-16 |
Family
ID=65522449
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810729232.9A Active CN109033244B (en) | 2018-07-05 | 2018-07-05 | Search result ordering method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109033244B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8412514B1 (en) * | 2005-10-27 | 2013-04-02 | At&T Intellectual Property Ii, L.P. | Method and apparatus for compiling and querying a QA database |
CN108170739A (en) * | 2017-12-18 | 2018-06-15 | 深圳前海微众银行股份有限公司 | Problem matching process, terminal and computer readable storage medium |
CN108153876A (en) * | 2017-12-26 | 2018-06-12 | 爱因互动科技发展(北京)有限公司 | Intelligent answer method and system |
Non-Patent Citations (1)
Title |
---|
周亦鹏 等: "《软件人主题分析和信息检索技术》", 31 August 2012, 北京邮电大学出版社 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110825864A (en) * | 2019-11-13 | 2020-02-21 | 北京香侬慧语科技有限责任公司 | Method and device for obtaining answers to questions |
CN110851484A (en) * | 2019-11-13 | 2020-02-28 | 北京香侬慧语科技有限责任公司 | Method and device for obtaining multi-index question answers |
CN113761084A (en) * | 2020-06-03 | 2021-12-07 | 北京四维图新科技股份有限公司 | POI search ranking model training method, ranking device, method and medium |
CN113761084B (en) * | 2020-06-03 | 2023-08-08 | 北京四维图新科技股份有限公司 | POI search ranking model training method, ranking device, method and medium |
CN112784600A (en) * | 2021-01-29 | 2021-05-11 | 北京百度网讯科技有限公司 | Information sorting method and device, electronic equipment and storage medium |
CN112784600B (en) * | 2021-01-29 | 2024-01-16 | 北京百度网讯科技有限公司 | Information ordering method, device, electronic equipment and storage medium |
CN113326420A (en) * | 2021-06-15 | 2021-08-31 | 北京百度网讯科技有限公司 | Question retrieval method, device, electronic equipment and medium |
CN113326420B (en) * | 2021-06-15 | 2023-10-27 | 北京百度网讯科技有限公司 | Question retrieval method, device, electronic equipment and medium |
US11977567B2 (en) | 2021-06-15 | 2024-05-07 | Beijing Baidu Netcom Science Technology Co., Ltd. | Method of retrieving query, electronic device and medium |
CN115203598A (en) * | 2022-07-20 | 2022-10-18 | 贝壳找房(北京)科技有限公司 | Information sorting method, electronic device and storage medium in real estate field |
CN115203598B (en) * | 2022-07-20 | 2023-09-19 | 贝壳找房(北京)科技有限公司 | Information ordering method in real estate field, electronic equipment and storage medium |
CN116013488A (en) * | 2023-03-27 | 2023-04-25 | 中国人民解放军总医院第六医学中心 | Intelligent security management system for medical records with self-adaptive data rearrangement function |
Also Published As
Publication number | Publication date |
---|---|
CN109033244B (en) | 2020-10-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109033244A (en) | Search result ordering method and device | |
Wang et al. | K-adapter: Infusing knowledge into pre-trained models with adapters | |
US10628472B2 (en) | Answering questions via a persona-based natural language processing (NLP) system | |
US10380149B2 (en) | Question sentence generating device and computer program | |
RU2701110C2 (en) | Studying and using contextual rules of extracting content to eliminate ambiguity of requests | |
US11481417B2 (en) | Generation and utilization of vector indexes for data processing systems and methods | |
US11468238B2 (en) | Data processing systems and methods | |
WO2018018626A1 (en) | Conversation oriented machine-user interaction | |
CN116134432A (en) | System and method for providing answers to queries | |
CN108846138B (en) | Question classification model construction method, device and medium fusing answer information | |
US11455357B2 (en) | Data processing systems and methods | |
CN114341841A (en) | Building answers to queries by using depth models | |
WO2023236253A1 (en) | Document retrieval method and apparatus, and electronic device | |
WO2022005573A1 (en) | Interactive search training | |
WO2021092272A1 (en) | Qa-bots for information search in documents using paraphrases | |
CN110717008B (en) | Search result ordering method and related device based on semantic recognition | |
JP2017151588A (en) | Image evaluation learning device, image evaluation device, image searching device, image evaluation learning method, image evaluation method, image searching method, and program | |
CN109657043B (en) | Method, device and equipment for automatically generating article and storage medium | |
WO2016009321A1 (en) | System for searching, recommending, and exploring documents through conceptual associations and inverted table for storing and querying conceptual indices | |
US20190318220A1 (en) | Dispersed template-based batch interaction with a question answering system | |
CN113571196A (en) | Method and device for constructing medical training sample and method for retrieving medical text | |
CN107784112A (en) | Short text data Enhancement Method, system and detection authentication service platform | |
Secker et al. | AISIID: An artificial immune system for interesting information discovery on the web | |
US10474726B2 (en) | Generation of digital documents | |
EA002016B1 (en) | A method of searching for fragments with similar text and/or semantic contents in electronic documents stored on a data storage devices |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||