CN109147934A - Interrogation data recommendation method, device, computer equipment and storage medium - Google Patents
- Publication number
- CN109147934A (application CN201810724291.7A)
- Authority
- CN
- China
- Prior art keywords
- question
- answer
- words
- feature
- interrogation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application relates to a medical inquiry ("interrogation") data recommendation method, device, computer equipment and storage medium. The method includes: obtaining the current question to be answered, segmenting it into words, and extracting feature words from the segmentation result to obtain a first feature word set corresponding to the question; obtaining the second feature word set corresponding to each index node in a pre-established index; calculating the cosine similarity between the first feature word set and each second feature word set as a first similarity, sorting the index nodes by the first similarity results, and selecting a preset number of index nodes as target index nodes to obtain a target index node set; obtaining the question-answer pair corresponding to each target index node from the inquiry database; and calculating a second similarity between the current question to be answered and the question in each question-answer pair, sorting the question-answer pairs by the second similarity results to select target question-answer pairs, and recommending inquiry data according to the selected target question-answer pairs.
Description
Technical field
This application relates to the field of online medical inquiry technology, and in particular to an inquiry data recommendation method, device, computer equipment and storage medium.
Background
With the rapid development of Internet technology, Internet-based online medical inquiry and online health consultation are favored by more and more people. In both, every user who asks a question expects the doctor's answer as quickly as possible.
In the conventional approach, after seeing a user's question the doctor has to think it over, compose and write a reply, and finally click send before the user can see the answer, which makes inquiries inefficient.
Summary of the invention
In view of the above technical problem, it is necessary to provide an inquiry data recommendation method, device, computer equipment and storage medium that can improve inquiry efficiency.
An inquiry data recommendation method, comprising:
Obtaining the current question to be answered, segmenting it into words, and extracting feature words from the segmentation result to obtain a first feature word set corresponding to the current question to be answered;
Obtaining a second feature word set corresponding to each index node in a pre-established index;
Calculating a first similarity between the first feature word set and the second feature word set of each index node, sorting the index nodes by the first similarity results, and selecting a preset number of index nodes as target index nodes to obtain a target index node set;
Obtaining, from the inquiry database, the question-answer pair corresponding to each target index node in the target index node set;
Calculating a second similarity between the current question to be answered and the question in each question-answer pair, sorting the question-answer pairs by the second similarity results to select target question-answer pairs, and recommending inquiry data according to the selected target question-answer pairs.
In one embodiment, before the step of obtaining the current question to be answered, the method includes:
Obtaining the inquiry information set corresponding to each past inquiry, and preprocessing the inquiry information set;
Extracting question-answer pairs from the preprocessed inquiry information set, and performing feature extraction on the extracted question-answer pairs;
Storing each question-answer pair together with its corresponding features in the inquiry database;
Building an index on the inquiry database according to the features.
In one embodiment, extracting question-answer pairs from the preprocessed inquiry information set includes:
Obtaining the user identifier corresponding to each piece of inquiry information in the inquiry information set, the user identifier being either an inquiring-user identifier or a doctor-user identifier;
Filtering the inquiry information associated with doctor-user identifiers according to preset rules;
Extracting question-answer pairs from the filtered inquiry information set according to punctuation marks and interrogative words.
In one embodiment, performing feature extraction on the extracted question-answer pairs includes:
Segmenting the question in each extracted question-answer pair to obtain the word set corresponding to the question;
Matching each word in the word set against each word in a pre-established feature dictionary, and taking a word as an extracted feature when the match succeeds.
In one embodiment, the step of calculating the first similarity between the first feature word set of the current question to be answered and the second feature word set of each index node includes:
Calculating a feature weight for each feature word in the first feature word set to obtain a first calculation result, and selecting keywords according to the first calculation result to obtain a first keyword set corresponding to the current question to be answered;
Calculating a feature weight for each feature word in each second feature word set to obtain a second calculation result, and selecting keywords according to the second calculation result to obtain a second keyword set corresponding to each index node;
Obtaining, from the first keyword set and the second keyword sets, a first term-frequency vector corresponding to the current question to be answered and a second term-frequency vector corresponding to each index node;
Calculating the cosine of the angle between the first term-frequency vector and each second term-frequency vector to obtain the first similarity.
In one embodiment, calculating a feature weight for each feature word in the first feature word set to obtain the first calculation result includes:
Calculating the initial feature weight of each feature word in the first feature word set using the term frequency-inverse document frequency (TF-IDF) algorithm;
When a feature word in the first feature word set satisfies a preset adjustment rule, adjusting its initial feature weight according to the preset adjustment rule to obtain the final feature weight;
When a feature word in the first feature word set does not satisfy the preset adjustment rule, taking its initial feature weight as the final feature weight.
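The weighting steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the smoothed IDF formula, the `boosted_words` set and the multiplicative `factor` are hypothetical, because the text does not specify the preset adjustment rule.

```python
import math
from collections import Counter

def tfidf_weights(feature_words, corpus):
    """Initial weight of each feature word: term frequency times inverse document frequency.

    feature_words: list of feature words of one question.
    corpus: list of word sets, one per historical question (assumed layout).
    """
    counts = Counter(feature_words)
    total = len(feature_words)
    n_docs = len(corpus)
    weights = {}
    for word, count in counts.items():
        tf = count / total
        df = sum(1 for doc in corpus if word in doc)
        # Smoothed IDF; the exact variant is an assumption.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weights[word] = tf * idf
    return weights

def adjust_weights(weights, boosted_words, factor=2.0):
    """Hypothetical adjustment rule: scale the weight of words in boosted_words.

    Words that do not satisfy the rule keep their initial weight.
    """
    return {w: (v * factor if w in boosted_words else v) for w, v in weights.items()}
```

A caller would first compute `tfidf_weights` for the question's feature words and then pass the result through `adjust_weights` to obtain the final feature weights.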
An inquiry data recommendation device, the device comprising:
A first feature word set obtaining module, configured to obtain the current question to be answered, segment it into words, and extract feature words from the segmentation result to obtain the first feature word set corresponding to the current question to be answered;
A second feature word set obtaining module, configured to obtain the second feature word set corresponding to each index node in a pre-established index;
A target index node set obtaining module, configured to calculate the first similarity between the first feature word set of the current question to be answered and the second feature word set of each index node, sort the index nodes by the first similarity results, and select a preset number of index nodes as target index nodes to obtain the target index node set;
A question-answer pair obtaining module, configured to obtain, from the inquiry database, the question-answer pair corresponding to each target index node in the target index node set;
A recommendation module, configured to calculate the second similarity between the current question to be answered and the question in each question-answer pair, sort the question-answer pairs by the second similarity results to select target question-answer pairs, and recommend inquiry data according to the selected target question-answer pairs.
In one embodiment, the device further includes:
A preprocessing module, configured to obtain the inquiry information set corresponding to each past inquiry and preprocess the inquiry information set;
A feature extraction module, configured to extract question-answer pairs from the preprocessed inquiry information set and perform feature extraction on the extracted question-answer pairs;
A storage module, configured to store each question-answer pair together with its corresponding features in the inquiry database;
An index building module, configured to build an index on the inquiry database according to the features.
A computer device, including a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the above inquiry data recommendation method when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above inquiry data recommendation method when executed by a processor.
The above inquiry data recommendation method, device, computer equipment and storage medium first obtain the feature word set corresponding to the question to be answered, then calculate the first similarity between that feature word set and the feature word set of each node in the index, and select the nodes with the highest similarity as target nodes. The question-answer pairs corresponding to these nodes are then looked up, the second similarity between the question to be answered and the question in each question-answer pair is calculated, and the question-answer pairs with the highest similarity are selected as target question-answer pairs, on the basis of which inquiry data is recommended. Through these two rounds of sorting, the application accurately locates the question-answer pairs most similar to the question to be answered and recommends according to them, so that accurate answers are automatically recommended to the doctor during an inquiry, improving inquiry efficiency.
Brief description of the drawings
Fig. 1 is an application scenario diagram of the inquiry data recommendation method in one embodiment;
Fig. 2 is a flow diagram of the inquiry data recommendation method in one embodiment;
Fig. 3 is a flow diagram of the steps before step S202 in one embodiment;
Fig. 4 is a flow diagram of step S304 in one embodiment;
Fig. 5 is a flow diagram of step S206 in one embodiment;
Fig. 6 is a flow diagram of step S502 in one embodiment;
Fig. 7 is a structural block diagram of the inquiry data recommendation device in one embodiment;
Fig. 8 is a structural block diagram of the inquiry data recommendation device in another embodiment;
Fig. 9 is an internal structure diagram of the computer device in one embodiment.
Detailed description
To make the objectives, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
The inquiry data recommendation method provided by this application can be applied in the application environment shown in Fig. 1, in which the inquiry terminal 102 and the doctor terminal 104 each communicate with the server 106 over a network. After receiving the question to be answered sent by the inquiry terminal, the server 106 segments the current question to be answered, extracts feature words from the segmentation result, and obtains the first feature word set corresponding to the question. It obtains the second feature word set corresponding to each index node in the pre-established index database, calculates the first similarity between the first feature word set and the second feature word set of each index node, sorts the index nodes by the first similarity results, and selects a preset number of index nodes as target index nodes to obtain the target index node set. It then looks up the question-answer pair corresponding to each target index node in the inquiry information database, calculates the second similarity between the current question to be answered and the question in each question-answer pair, sorts the question-answer pairs by the second similarity results to select target question-answer pairs, and recommends inquiry data to the doctor terminal according to the selected target question-answer pairs. The recommended inquiry data may be the entire target question-answer pair, or only the reply in the target question-answer pair.
The inquiry terminal 102 and the doctor terminal 104 may be, but are not limited to, personal computers, laptops, smartphones and tablet computers. The server 106 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, an inquiry data recommendation method is provided. Taking its application to the server in Fig. 1 as an example, the method includes the following steps:
Step S202: obtain the current question to be answered, segment it into words, and extract feature words from the segmentation result to obtain the first feature word set corresponding to the current question to be answered.
Specifically, the question to be answered is the inquiry question that the inquiring user enters at the inquiry terminal. When the user enters an inquiry question at the inquiry terminal during an inquiry, the server receives the question sent by the inquiry terminal and segments it to obtain a segmentation result, that is, the sequence of individual words produced by segmentation. For example, segmenting "I have a stomachache, what should I do" may yield: I / stomachache / what should I do.
To segment the current question to be answered, the question can first be split into complete sentences at punctuation marks, and each resulting sentence is then segmented. One option is a string-matching segmentation method: forward maximum matching, which segments the character string of a sentence from left to right; reverse maximum matching, which segments it from right to left; shortest-path segmentation, which requires the number of words cut from the string to be minimal; or bidirectional maximum matching, which performs forward and reverse matching at the same time. Another option is word-sense segmentation, a segmentation method based on machine speech judgment that uses syntactic and semantic information to handle ambiguity. A third option is statistical segmentation: based on phrase statistics from the historical search records of the current user or of users in general, two adjacent characters that appear together with high frequency can be segmented as one phrase.
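Of the string-matching methods listed above, forward maximum matching is the simplest to sketch. The following Python example is a minimal illustration, assuming a small in-memory `dictionary` set and a `max_len` limit on word length; both names, and the sample sentence in the usage note, are hypothetical and not taken from the application.

```python
def forward_max_match(sentence, dictionary, max_len=4):
    """Segment a sentence left to right, always taking the longest dictionary word that fits.

    A single character is emitted as its own token when no dictionary word matches.
    """
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first and shrink until a dictionary word is found.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens
```

With the illustrative dictionary `{"我", "胃痛", "怎么办"}`, calling `forward_max_match("我胃痛怎么办", dictionary)` splits the sentence into three tokens corresponding to "I", "stomachache" and "what should I do". Reverse maximum matching would scan from the right end of the sentence instead.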
Further, the server extracts feature words from the segmentation result. In one embodiment, this means matching each word in the segmentation result, one by one, against each word in a pre-established feature dictionary and taking the matched words as feature words. In one embodiment, two words match when they are identical. In another embodiment, two words match when the similarity between them exceeds a preset threshold; for example, two near-synonymous words for "stomachache" can be treated as mutually matching words. The feature dictionary can be built from authoritative descriptions of various diseases obtained from existing medical databases, including professional information such as each disease's introduction, symptoms, complications, therapeutic drugs and common examinations. It can also be built from medical information about various drugs, such as the disease types a drug mainly treats. The medical data can also be specific types of information (for example, the treatment plans, drugs, departments and clinical manifestations associated with various diseases) obtained in real time or periodically, using tools such as web crawlers, from open medical data sources on the Internet (for example, questions, answers and discussions about various diseases on forums, or new medical cases and medical question-answer texts).
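The two matching modes described above (identical words, or similarity above a preset threshold) can be sketched as follows. This is an illustrative Python example only: the application does not say how word similarity is measured, so the use of `difflib.SequenceMatcher` and the `threshold` parameter are assumptions.

```python
from difflib import SequenceMatcher

def extract_feature_words(tokens, feature_dictionary, threshold=1.0):
    """Return the tokens that match an entry of the feature dictionary.

    threshold=1.0 only accepts identical words; a lower threshold also accepts
    near matches, scored here with difflib's similarity ratio (an assumption).
    """
    features = []
    for token in tokens:
        for entry in feature_dictionary:
            if token == entry or SequenceMatcher(None, token, entry).ratio() >= threshold:
                features.append(token)
                break  # one matching dictionary entry is enough
    return features
```

For instance, `extract_feature_words(["I", "stomachache", "what"], ["stomachache", "fever"])` keeps only `"stomachache"`. Lowering `threshold` to 0.8 would also accept a spelling variant such as `"stomach-ache"`.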
Step S204: obtain the second feature word set corresponding to each index node in the pre-established index.
Specifically, question-answer pairs are extracted from historical inquiry data in advance, and features are then extracted from the question-answer pairs. The extracted features include at least the feature words of the question in each pair, and these feature words form the second feature word set. Each question-answer pair and its corresponding features are saved in the same row of a data table in the inquiry database, and an index is finally built on the inquiry database according to the column that holds the features. Each index node in the index contains an index value and a pointer: the index value includes at least the second feature word set of the question-answer pair, and the pointer refers to a memory region that records a reference to the corresponding row of data stored on disk. A question-answer pair is an information pair made up of an inquiring user's question and the doctor's answer to it. It may consist of one question from the inquiring user and one answer from the doctor, one question and several answers, several consecutive questions and one answer, or several consecutive questions and several consecutive answers.
In this embodiment, the server traverses the index nodes in the index one by one and reads the index value of each node to obtain the second feature word set corresponding to each index node.
Step S206: calculate the first similarity between the first feature word set of the current question to be answered and the second feature word set of each index node, sort the index nodes by the first similarity results, and select a preset number of index nodes as target index nodes to obtain the target index node set.
Specifically, the first similarity characterizes how similar the first feature word set is to a second feature word set. In one embodiment, the first similarity can be the cosine similarity. To calculate the cosine similarity between the first feature word set of the current question to be answered and the second feature word set of any index node, keywords can be extracted from each set to obtain the first keyword set for the question and the second keyword set for the index node; a term-frequency vector is then calculated for each keyword set, and finally the cosine of the angle between the two term-frequency vectors gives the cosine similarity.
Further, the server sorts the index nodes of the index database by cosine similarity and, according to the sorting result, selects a preset number of index nodes as target index nodes to obtain the target index node set. In one embodiment, the server sorts the index nodes in descending order of cosine similarity and takes the top N1 index nodes as target index nodes, where N1 is a preset value that can be set and adjusted based on experience.
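The cosine similarity over term-frequency vectors and the top-N1 selection described in step S206 can be sketched as follows. This is a minimal Python illustration; representing the index as a dictionary from node identifier to keyword list is an assumption made for the example.

```python
import math
from collections import Counter

def cosine_similarity(words_a, words_b):
    """Cosine of the angle between the term-frequency vectors of two keyword lists."""
    freq_a, freq_b = Counter(words_a), Counter(words_b)
    vocabulary = set(freq_a) | set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in vocabulary)
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty keyword list is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_index_nodes(query_keywords, index_nodes, n1):
    """Sort index nodes by descending first similarity and keep the top n1 node ids.

    index_nodes: dict mapping a node id to that node's keyword list (assumed layout).
    """
    scored = [(cosine_similarity(query_keywords, kws), node_id)
              for node_id, kws in index_nodes.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [node_id for _, node_id in scored[:n1]]
```

Keeping only the top N1 nodes at this stage limits how many question-answer pairs need the more expensive string comparison in the second ranking stage.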
Step S208: obtain, from the inquiry database, the question-answer pair corresponding to each target index node in the target index node set.
Specifically, each index node in the index stores a pointer to the corresponding row of the table in the inquiry database. The row of data corresponding to an index node can be obtained through this pointer, and since the question-answer pair is one column of that row, the question-answer pair can be retrieved through the index node.
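The relationship between an index node (index value plus pointer) and a table row can be pictured with the following Python sketch. It is a simplified in-memory model under stated assumptions: `IndexNode`, `build_index`, `fetch_qa_pairs` and the placeholder rows are hypothetical names and data, and a real database index would live on disk rather than in a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class IndexNode:
    feature_words: Tuple[str, ...]  # index value: the second feature word set
    row_id: int                     # pointer: reference to a row of the table

def build_index(table: Dict[int, dict]) -> List[IndexNode]:
    """Create one index node per table row from the row's feature column."""
    return [IndexNode(tuple(row["features"]), row_id) for row_id, row in table.items()]

def fetch_qa_pairs(target_nodes: List[IndexNode], table: Dict[int, dict]) -> list:
    """Follow each node's pointer to its row and read the question-answer column."""
    return [table[node.row_id]["qa_pair"] for node in target_nodes]

# Made-up placeholder rows standing in for the inquiry database table.
inquiry_table = {
    0: {"qa_pair": ("question 0", "answer 0"), "features": ["stomachache"]},
    1: {"qa_pair": ("question 1", "answer 1"), "features": ["headache", "fever"]},
}
index = build_index(inquiry_table)
```

Given a target node such as `index[1]`, `fetch_qa_pairs([index[1]], inquiry_table)` returns the question-answer pair stored in row 1 without scanning the other rows.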
Step S210: calculate the second similarity between the current question to be answered and the question in each question-answer pair, sort the question-answer pairs by the second similarity results to select target question-answer pairs, and recommend inquiry data according to the selected target question-answer pairs.
Specifically, the second similarity characterizes how similar the current question to be answered is to the question in each question-answer pair. In one embodiment, the second similarity can be a string similarity, calculated as follows. After obtaining the question-answer pair for each target index node, the server first calculates the edit distance between the current question to be answered and the question in each obtained question-answer pair. The edit distance is the minimum number of single-character edits (substitution, insertion or deletion) required to change one string into the other. The string similarity is then calculated from the edit distance with the formula: Similarity = (Max(x, y) - Levenshtein) / Max(x, y), where x is the string length of the question to be answered, y is the string length of the question in the question-answer pair, and Levenshtein is the edit distance.
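The formula above can be illustrated with a short Python sketch. The dynamic-programming edit distance shown here is the standard Levenshtein algorithm; returning 1.0 for two empty strings is an assumption added to avoid division by zero.

```python
def levenshtein(a, b):
    """Minimum number of single-character substitutions, insertions or deletions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]

def string_similarity(question, candidate):
    """Similarity = (Max(x, y) - Levenshtein) / Max(x, y), with x and y the string lengths."""
    longest = max(len(question), len(candidate))
    if longest == 0:
        return 1.0  # assumption: two empty strings count as identical
    return (longest - levenshtein(question, candidate)) / longest
```

As a worked example, "kitten" and "sitting" have an edit distance of 3 and a maximum length of 7, so their similarity is (7 - 3) / 7, roughly 0.57.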
Further, the server sorts the question-answer pairs obtained in step S208 by string similarity, selects a preset number of them as target question-answer pairs according to the sorting result, and recommends inquiry data according to these target question-answer pairs. In one embodiment, the server sorts the question-answer pairs obtained in step S208 in descending order of string similarity and takes the top N2 question-answer pairs as target question-answer pairs, where N2 is a preset value that can be adjusted based on experience.
In one embodiment, when recommending inquiry data according to the target question-answer pairs, the server may recommend all target question-answer pairs to the doctor terminal, any one of them, or the top-ranked one. The application does not limit how the recommendation is made.
In another embodiment, the server may directly select the answers in the target question-answer pairs and recommend them to the doctor terminal: the answers of all target question-answer pairs, the answer of any one pair, or the answer of the top-ranked pair. The application does not limit how the recommendation is made.
In the above inquiry data recommendation method, the server first obtains the feature word set corresponding to the question to be answered, then calculates the first similarity between that set and the feature word set of each index node in the index, and selects the nodes with the highest similarity as target nodes. It then looks up the question-answer pairs corresponding to these nodes, calculates the second similarity between the question to be answered and the question in each pair, and selects the pairs with the highest string similarity as target question-answer pairs, on the basis of which inquiry data is recommended. Through these two rounds of sorting, the application accurately locates the question-answer pairs most similar to the question to be answered and recommends according to them, so that accurate answers are automatically recommended to the doctor during an inquiry, improving inquiry efficiency.
In one embodiment, as shown in Fig. 3, the following steps precede step S202:
Step S302: obtain the inquiry information set corresponding to each past inquiry, and preprocess the inquiry information set.
Specifically, past inquiries are the inquiries completed before the current time, and an inquiry information set is the set of information in one complete inquiry, made up of the inquiring user's inquiry messages and the doctor user's replies.
In this embodiment, preprocessing includes sentence splitting, coreference resolution, context processing and the like. Sentence splitting cuts a message into single sentences. Coreference resolution works out what the pronouns in a sentence refer to, and can be computed using syntactic analysis and edit distance. Context processing completes a sentence from its context. For example, in the exchange "D: Are you dizzy? U: Yes", the reply "Yes" is expanded to "I am dizzy", which makes the meaning of the second sentence more complete. Context processing relies on syntactic analysis and sentence-pattern judgment.
Step S304: extract question-answer pairs from the preprocessed inquiry information set, and perform feature extraction on the extracted question-answer pairs.
Specifically, in one complete inquiry the inquiring user usually asks several questions, and the doctor answers after each one. Each question asked by the inquiring user and the doctor's reply to that question form one question-answer pair. Extracting question-answer pairs means pulling these pairs out of the inquiry information corresponding to one complete inquiry.
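The pairing idea described above can be sketched in Python as follows. This is a deliberately simplified illustration: it pairs each run of consecutive user messages with the run of doctor messages that immediately follows, and it assumes messages arrive as chronological `(role, text)` tuples with roles `"user"` and `"doctor"`. The filtering of doctor messages and the use of punctuation and interrogative words mentioned elsewhere in the application are not modeled.

```python
from itertools import groupby

def extract_qa_pairs(messages):
    """Pair each run of user questions with the doctor replies that directly follow it.

    messages: chronological list of (role, text) tuples (assumed layout).
    Returns a list of (questions, answers) tuples, each element a list of texts.
    """
    # Collapse consecutive messages from the same role into one run.
    runs = [(role, [text for _, text in group])
            for role, group in groupby(messages, key=lambda m: m[0])]
    pairs = []
    for (role_a, texts_a), (role_b, texts_b) in zip(runs, runs[1:]):
        if role_a == "user" and role_b == "doctor":
            pairs.append((texts_a, texts_b))
    return pairs
```

Grouping by runs covers the four shapes of question-answer pair the description allows: one or several consecutive questions, answered by one or several consecutive replies.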
Further, the server performs feature extraction on the extracted question-answer pairs. In one embodiment, feature extraction can be the extraction of keywords from the question in each pair. In another embodiment, the extracted features can be, for example, the number of single sentences, the number of adjectives, or the interrogative words in the question-answer pair.
Step S306: store the question-answer pairs and their corresponding features, in correspondence, in the interrogation database.
Specifically, the server stores each question-answer pair together with its corresponding features in the interrogation database; that is, the question-answer pair and its features are stored in different columns of the same row of a database table.
In one embodiment, the interrogation user communicates with the doctor through instant messages during an interrogation, and each message carries the user identifier of its sender, which is either an interrogation user identifier or a clinician user identifier. Specifically, information sent from the interrogation terminal carries the interrogation user identifier, and information sent from the doctor terminal carries the clinician user identifier. Therefore, when the server obtains the interrogation information of previous interrogations, it can obtain the corresponding user identifiers at the same time, and then store each question-answer pair, its user identifier, and its features in the interrogation database in one-to-one correspondence.
Step S308: build an index on the interrogation database according to the features.
Specifically, the server builds the index on the column of the interrogation database that holds the features. Each node in the index corresponds to one data row in the interrogation database, which contains at least a question-answer pair and its corresponding features.
In one embodiment, the server may also build the index according to both the user identifier and the features.
In this embodiment, features are extracted from the interrogation information and an index is built, so that computing the similarity between the question to be answered and each question-answer pair no longer requires traversing the whole database. The computation only needs the question to be answered and the index values, which significantly improves computational efficiency.
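As a rough illustration of steps S306 and S308, the sketch below stores question-answer pairs and a feature in columns of the same row and builds an index on the feature column. It uses Python's built-in `sqlite3` module with an in-memory database. The table name `qa_pairs`, the column names, the index name, and the sample rows are all assumptions made for the example, not details given in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per question-answer pair; the feature sits in its own column.
conn.execute(
    "CREATE TABLE qa_pairs ("
    " id INTEGER PRIMARY KEY,"
    " user_id TEXT,"
    " question TEXT,"
    " answer TEXT,"
    " feature TEXT)"
)
sample_rows = [  # made-up sample data
    ("u1", "Why do I cough at night?", "Answer 1", "cough"),
    ("u2", "What causes my headache?", "Answer 2", "headache"),
    ("u3", "I cough after smoking, why?", "Answer 3", "cough"),
]
conn.executemany(
    "INSERT INTO qa_pairs (user_id, question, answer, feature)"
    " VALUES (?, ?, ?, ?)",
    sample_rows,
)
# Index on the feature column, so lookups need not scan every row.
conn.execute("CREATE INDEX idx_feature ON qa_pairs (feature)")

def find_by_feature(feature):
    """Return (question, answer) rows whose feature equals the given word."""
    cursor = conn.execute(
        "SELECT question, answer FROM qa_pairs WHERE feature = ? ORDER BY id",
        (feature,),
    )
    return cursor.fetchall()
```

A composite index such as `(user_id, feature)` would correspond to the variant that indexes on both the user identifier and the features.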
In one embodiment, as shown in Fig. 4, extracting question-answer pairs from the preprocessed interrogation information includes:
Step S304A: obtain the user identifier corresponding to each piece of interrogation information in the interrogation information set, the user identifier being an interrogation user identifier or a clinician user identifier.
Specifically, each piece of interrogation information corresponds to one user identifier: for a message sent by the interrogation terminal, the user identifier is the interrogation user identifier; for a message sent by the doctor terminal, it is the clinician user identifier.
Step S304B: filter the interrogation information corresponding to the clinician user identifier according to preset rules.
Specifically, the preset rules include at least: filtering out messages that end with an interrogative word, and messages that match a preset polite phrase. Interrogative words may be, for example, "what should I do", "what", "why", and so on. Preset polite phrases are sentences configured in advance on the doctor terminal to save reply time, for example "Please wait a moment" or "Hello, I am not on duty right now".
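The filtering rule can be sketched as below. The word lists, the role labels `"patient"` and `"doctor"`, and the function names are assumptions for illustration. The check mirrors the ending-based rule described above, which suits the original Chinese phrasing better than English word order.

```python
# Hypothetical lists; a real deployment would configure these per doctor terminal.
INTERROGATIVES = ("what should i do", "what", "why")
CANNED_PHRASES = {"please wait a moment", "hello, i am not on duty right now"}

def _normalize(text):
    """Lower-case the text and drop surrounding spaces and end punctuation."""
    return text.strip().rstrip("?？.。!！ ").lower()

def keep_doctor_message(text):
    """Return False for canned polite phrases and messages ending in an interrogative."""
    normalized = _normalize(text)
    if normalized in CANNED_PHRASES:
        return False
    for word in INTERROGATIVES:
        if normalized == word or normalized.endswith(" " + word):
            return False
    return True

def filter_doctor_messages(messages):
    """messages: list of (role, text). Only doctor messages are filtered."""
    return [
        (role, text)
        for role, text in messages
        if role != "doctor" or keep_doctor_message(text)
    ]
```

Messages from the interrogation user pass through untouched, since the rule applies only to information carrying the clinician user identifier.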
Step S304C: extract question-answer pairs from the filtered interrogation information set according to punctuation marks and interrogative words.
Specifically, the filtered interrogation information is traversed starting from the first piece of interrogation information, and the user identifier of each piece is obtained in turn. When the user identifier is an interrogation user identifier, the server judges whether the information contains a question sentence. If so, the question sentence is taken as one question of a question-answer pair. Then, starting from the first subsequent piece of interrogation information that corresponds to a clinician user identifier, all consecutive pieces corresponding to the clinician user identifier are collected until the next piece corresponding to an interrogation user identifier appears, and the collected pieces are taken as the answer to the question sentence, forming a question-answer pair. An extracted question-answer pair may consist of one question with one answer, one question with several consecutive answers, several consecutive questions with one answer, or several consecutive questions with several consecutive answers. Which combination occurs depends on the actual interrogation, and the application places no restriction on it.
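The traversal described above can be sketched as follows. The role labels, the interrogative list, and the function names are assumptions; the input is taken to be already filtered, and questions and answers are kept as lists so that all four question/answer combinations are representable.

```python
QUESTION_MARKS = ("?", "？")
INTERROGATIVES = ("what", "why", "how")  # hypothetical list

def is_question(text):
    """Judge by end punctuation first, then by interrogative words."""
    stripped = text.strip()
    if stripped.endswith(QUESTION_MARKS):
        return True
    return any(word in INTERROGATIVES for word in stripped.lower().split())

def extract_qa_pairs(messages):
    """messages: list of (role, text) with role 'patient' or 'doctor'.

    Returns a list of (questions, answers) tuples, each a list of strings.
    """
    pairs = []
    questions, answers = [], []
    for role, text in messages:
        if role == "patient":
            if answers:
                # A new patient turn closes the pair that was being collected.
                pairs.append((questions, answers))
                questions, answers = [], []
            if is_question(text):
                questions.append(text)
        elif questions:
            # Doctor message following at least one open question.
            answers.append(text)
    if questions and answers:
        pairs.append((questions, answers))
    return pairs
```

Doctor messages that arrive before any question has been asked are ignored, because they do not answer anything.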
In one embodiment, performing feature extraction on the extracted question-answer pairs includes: segmenting the question in each extracted question-answer pair into words to obtain the word set corresponding to the question; and matching each word in the word set against each word in a pre-established feature dictionary, and, when a match succeeds, taking the word as an extracted feature.
Specifically, the server may first segment the question in the extracted question-answer pair to obtain the word set corresponding to the question. To do so, the question may first be split by rule into complete sentences according to punctuation marks, and each resulting sentence is then segmented into words. Segmentation may use a string-matching method, such as: the forward maximum matching method, which segments the character string of a sentence from left to right; the reverse maximum matching method, which segments it from right to left; the shortest-path method, which requires the number of words cut from the character string to be minimal; or the bidirectional maximum matching method, which performs forward and reverse matching at the same time. Each sentence may also be segmented by a word-sense method, a segmentation approach based on machine speech judgment that uses syntactic and semantic information to resolve ambiguity. Each sentence may also be segmented by a statistical method: based on phrase statistics from the historical search records of the current user or of public users, if two adjacent characters are found to occur together frequently, they can be segmented as one phrase.
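Of the string-matching methods listed, forward maximum matching is the simplest to illustrate. The sketch below is a minimal version; the vocabulary, the `max_len` parameter, and the function name are assumptions for the example.

```python
def forward_max_match(sentence, vocabulary, max_len=4):
    """Segment `sentence` left to right, preferring the longest vocabulary word."""
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first and shrink until it is in the
        # vocabulary; a single character is always accepted as a fallback.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words

# Hypothetical vocabulary: headache, cough, runny nose.
VOCABULARY = {"头痛", "咳嗽", "流鼻涕"}
```

Reverse maximum matching follows the same idea but scans from the end of the sentence, and the bidirectional variant runs both and compares the results.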
Further, each word in the word set obtained by segmentation is matched one by one against each word in the pre-established feature dictionary, and the matched words are taken as feature words. In one embodiment, a match means the two words are identical. In another embodiment, a match means the similarity between the two words exceeds a preset threshold; for example, two synonymous expressions for "stomach-ache" can be treated as matching words. The feature dictionary may be built from authoritative professional information on various diseases obtained from existing medical databases, including each disease's introduction, symptoms, complications, therapeutic drugs, and common examinations. It may also be built from the medical information of various drugs, such as the types of disease a drug mainly treats. The medical data may further be specific types of information (for example, the treatment plans, therapeutic drugs, departments, and clinical manifestations corresponding to different diseases) obtained in real time or periodically by tools such as web crawlers from open medical data sources on the Internet (for example, question-and-answer threads and discussions about various diseases on major forums, or newly published medical cases and medical question-and-answer texts).
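Both matching variants, exact and similarity-based, can be sketched as below. The text does not say how word similarity is measured; this sketch assumes the character-level ratio from Python's `difflib.SequenceMatcher` as a stand-in, and the dictionary entries and threshold are made up.

```python
from difflib import SequenceMatcher

def match_features(words, feature_dictionary, threshold=None):
    """Return the words that match an entry of the feature dictionary.

    With threshold=None only identical words match; otherwise a word also
    matches when its similarity to an entry exceeds the threshold.
    """
    features = []
    for word in words:
        for entry in feature_dictionary:
            exact = word == entry
            fuzzy = (
                threshold is not None
                and SequenceMatcher(None, word, entry).ratio() > threshold
            )
            if exact or fuzzy:
                features.append(word)
                break
    return features
```

A semantic similarity measure, for instance one based on a synonym table, could replace the character-level ratio without changing the surrounding loop.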
In one embodiment, as shown in Fig. 5, the step of separately calculating the first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node includes:
Step S502: calculate a feature weight for each feature word in the first feature word set to obtain a first calculation result, and select keywords according to the first calculation result to obtain the first keyword set corresponding to the current question to be answered.
Specifically, a feature weight characterizes the importance of a feature: the larger the feature weight, the more important the feature word and the better it represents the meaning of the word set. In one embodiment, the feature weight of each feature word may be calculated with the term frequency-inverse document frequency (TF-IDF) algorithm. In this embodiment, the first calculation result is obtained after the feature weights are calculated, where the first calculation result refers to the weight value of each feature word in the first feature word set. The feature words can be sorted by weight value, and keywords are then selected according to the sorting result to obtain the first keyword set.
In one embodiment, the server may sort the feature words in the first feature word set in descending order of feature weight, and then select a preset number of top-ranked feature words as keywords to obtain the first keyword set.
Step S504: calculate a feature weight for each feature word in the second feature word set to obtain a second calculation result, and select keywords according to the second calculation result to obtain the second keyword set corresponding to each index node.
Specifically, the term frequency-inverse document frequency algorithm may be used to calculate a feature weight for each feature word in the second feature word set to obtain the second calculation result, where the second calculation result refers to the feature weight value of each feature word in the second feature word set. The feature words can be sorted by weight value, and keywords are then selected according to the sorting result to obtain the second keyword set.
In one embodiment, the server may sort the feature words in the second feature word set in descending order of feature weight, and then select a preset number of top-ranked feature words as keywords to obtain the second keyword set.
Step S506: obtain, according to the first keyword set and the second keyword set, the first word frequency vector corresponding to the current question to be answered and the second word frequency vector corresponding to each index node.
Specifically, the first keyword set and the second keyword set are merged to obtain their union, the word frequency of each keyword of the union is calculated in the first feature word set and in the second feature word set, and the first word frequency vector and the second word frequency vector are generated from these word frequencies. For example, suppose the first feature word set is cough/smoking/insomnia with keyword set {cough, smoking}, and the second feature word set is headache/cough/runny nose/catching a cold with keyword set {headache, runny nose}. Merging the two keyword sets gives {cough, smoking, headache, runny nose}. The word frequencies of these words in the first feature word set are: cough 1, smoking 1, headache 0, runny nose 0. Their word frequencies in the second feature word set are: cough 1, smoking 0, headache 1, runny nose 1. The resulting first word frequency vector is [1, 1, 0, 0] and the second word frequency vector is [1, 0, 1, 1].
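The keyword selection and the construction of the two word frequency vectors can be sketched as follows. The function names are assumptions, and the weights passed to `top_keywords` are whatever the weighting step produced.

```python
def top_keywords(weights, k):
    """Pick the k feature words with the largest weights (ties broken by name)."""
    ranked = sorted(weights, key=lambda word: (-weights[word], word))
    return ranked[:k]

def frequency_vectors(first_words, second_words, first_keywords, second_keywords):
    """Build word frequency vectors over the union of the two keyword sets."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    union = list(dict.fromkeys(list(first_keywords) + list(second_keywords)))
    first_vector = [first_words.count(word) for word in union]
    second_vector = [second_words.count(word) for word in union]
    return union, first_vector, second_vector
```

Applied to the example above, the union is [cough, smoking, headache, runny nose] and the vectors come out as [1, 1, 0, 0] and [1, 0, 1, 1].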
Step S508: calculate the cosine of the angle between each first word frequency vector and each second word frequency vector to obtain the first similarity.
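Specifically, the cosine similarity is calculated as:
$$\cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$
where n (n ≥ 2) is the dimension of the word frequency vectors, A_i is the i-th component of the first word frequency vector, and B_i is the i-th component of the second word frequency vector.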
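A direct transcription of the standard cosine similarity into code might look like this. The function name and the choice to return 0.0 for an all-zero vector are assumptions of the sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length word frequency vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat an all-zero vector as dissimilar
    return dot / (norm_a * norm_b)
```

For the vectors [1, 1, 0, 0] and [1, 0, 1, 1], the dot product is 1 and the norms are √2 and √3, so the similarity is 1/√6, roughly 0.41.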
In this embodiment, keywords are extracted from the feature word sets and turned into word frequency vectors in order to calculate the cosine similarity of the two feature word sets. Compared with calculating the similarity between the question to be answered and the question-answer pair as two whole documents, this reduces the amount of computation and improves computational efficiency.
In one embodiment, as shown in Fig. 6, calculating a feature weight for each feature word in the first feature word set to obtain the first calculation result includes:
Step S602: calculate the initial feature weight of each feature word in the first feature word set using the term frequency-inverse document frequency algorithm.
Specifically, the term frequency TF is calculated first, with reference to the following formula:
TF = (number of times a word occurs in the document) / (total number of words in the document);
Then, the inverse document frequency IDF is calculated, with reference to the following formula:
Finally, the initial feature weight is calculated: W = TF * IDF.
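The weight computation can be sketched as below. The TF part follows the formula given above. The IDF formula is not reproduced in the text here, so the sketch assumes one common smoothed form, log(N / (1 + df)), where N is the number of documents in the corpus and df is the number of documents containing the word. Function names and the sample corpus are also assumptions.

```python
import math

def term_frequency(word, document):
    """Occurrences of `word` divided by the total number of words."""
    if not document:
        return 0.0
    return document.count(word) / len(document)

def inverse_document_frequency(word, corpus):
    """Assumed smoothed form: log(N / (1 + number of documents containing word))."""
    containing = sum(1 for document in corpus if word in document)
    return math.log(len(corpus) / (1 + containing))

def tf_idf(word, document, corpus):
    """Initial feature weight W = TF * IDF."""
    return term_frequency(word, document) * inverse_document_frequency(word, corpus)
```

With this smoothing, a word that appears in every document gets a slightly negative IDF; other variants avoid that by adding 1 to the logarithm or to the numerator.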
Step S604: judge in turn whether each feature word in the first feature word set satisfies a preset adjustment rule; if so, go to step S606; if not, go to step S608.
Step S606: adjust the initial feature weight of the feature word according to the adjustment rule to obtain the final feature weight.
Step S608: take the initial feature weight as the final feature weight.
Specifically, the preset adjustment rule is a manually set rule for adjusting the feature weights of feature words. In one embodiment, the preset adjustment rule may be: when two feature words appear at the same time and the difference between their feature weights is less than a preset threshold, the weight of one of the words is adjusted so that the difference is no less than the preset threshold. For example, when "headache" and "hand pain" both appear as feature words and the difference between their feature weights is less than 0.2, the feature weight of "headache" is adjusted so that the difference between the feature weights of "headache" and "hand pain" is greater than 0.2. The purpose is to increase the weight of the feature word for the symptom with the greater impact, thereby improving accuracy when keywords are selected.
In this embodiment, adjusting the feature weights improves the accuracy of keyword selection.
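Steps S604 to S608 can be sketched with a small rule table. The rule format `(boosted_word, other_word, min_gap)` and the function name are assumptions; the sketch raises the boosted word just far enough that the gap equals the threshold.

```python
def adjust_weights(weights, rules):
    """Apply pairwise adjustment rules to a {feature word: weight} mapping.

    rules: list of (boosted_word, other_word, min_gap). When both words are
    present and their weights differ by less than min_gap, the boosted word
    is raised to other_word's weight plus min_gap. Words that match no rule
    keep their initial weight.
    """
    adjusted = dict(weights)  # leave the caller's mapping untouched
    for boosted, other, min_gap in rules:
        if boosted in adjusted and other in adjusted:
            if abs(adjusted[boosted] - adjusted[other]) < min_gap:
                adjusted[boosted] = adjusted[other] + min_gap
    return adjusted
```

For the example above, the rule would be `("headache", "hand pain", 0.2)`; a weight pair of 0.55 and 0.5 would become 0.7 and 0.5.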
It should be understood that although the steps in the flowcharts of Figs. 2-6 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 2-6 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times. These sub-steps or stages are also not necessarily executed sequentially, but may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 7, an interrogation data recommendation device 700 is provided, including:
A first feature word set obtaining module 702, configured to obtain the current question to be answered, segment it into words, and extract feature words according to the segmentation result to obtain the first feature word set corresponding to the current question to be answered;
A second feature word set obtaining module 704, configured to obtain the second feature word set corresponding to each index node in a pre-established index;
A target index node set obtaining module 706, configured to separately calculate the first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node, sort the index nodes according to the first similarity calculation results, and select a preset number of index nodes as target index nodes to obtain a target index node set;
A question-answer pair obtaining module 708, configured to obtain from the interrogation database the question-answer pair corresponding to each target index node in the target index node set;
A recommendation module 710, configured to separately calculate the second similarity between the current question to be answered and the question of each question-answer pair, sort the question-answer pairs according to the second similarity calculation results, select target question-answer pairs, and perform interrogation data recommendation according to the selected target question-answer pairs.
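The two-stage flow carried out by modules 706 to 710 can be sketched as a coarse ranking over index nodes followed by a finer ranking over the retrieved questions. In this sketch the similarity functions are stand-ins chosen for brevity (Jaccard overlap for the first stage, `difflib` ratio for the second), not the measures described in the method; the node layout, parameter names, and sample data are likewise assumptions.

```python
from difflib import SequenceMatcher

def jaccard(a, b):
    """Stand-in first similarity: overlap of two feature word sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def text_similarity(a, b):
    """Stand-in second similarity: character-level ratio of two questions."""
    return SequenceMatcher(None, a, b).ratio()

def recommend(query_features, query_text, index_nodes, coarse_k=2, final_k=1):
    """index_nodes: list of dicts with 'features', 'question', 'answer'."""
    # Stage 1: rank index nodes by the first similarity, keep the top coarse_k.
    coarse = sorted(
        index_nodes,
        key=lambda node: jaccard(query_features, node["features"]),
        reverse=True,
    )[:coarse_k]
    # Stage 2: rank the retrieved pairs by the second similarity.
    fine = sorted(
        coarse,
        key=lambda node: text_similarity(query_text, node["question"]),
        reverse=True,
    )
    return fine[:final_k]

NODES = [  # made-up sample data
    {"features": ["cough", "fever"], "question": "I have a cough and a fever, what should I do?", "answer": "A1"},
    {"features": ["cough"], "question": "Why do I cough at night?", "answer": "A2"},
    {"features": ["headache"], "question": "What causes my headache?", "answer": "A3"},
]
```

Only the coarse stage touches every index node; the more expensive text comparison runs on the small candidate set that survives it.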
In one embodiment, as shown in Fig. 8, the device further includes:
A preprocessing module 802, configured to obtain the interrogation information sets corresponding to previous interrogations and preprocess the interrogation information sets;
A feature extraction module 804, configured to extract question-answer pairs from the preprocessed interrogation information set and perform feature extraction on the extracted question-answer pairs;
A storage module 806, configured to store the question-answer pairs and their corresponding features, in correspondence, in the interrogation database;
An index building module 808, configured to build an index on the interrogation database according to the features.
In one embodiment, the feature extraction module 804 is further configured to: obtain the user identifier corresponding to each piece of interrogation information in the interrogation information set, the user identifier being an interrogation user identifier or a clinician user identifier; filter the interrogation information corresponding to the clinician user identifier according to preset rules; and extract question-answer pairs from the filtered interrogation information set according to punctuation marks and interrogative words.
In one embodiment, the feature extraction module 804 is further configured to segment the question in each extracted question-answer pair into words to obtain the word set corresponding to the question, match each word in the word set against each word in a pre-established feature dictionary, and, when a match succeeds, take the word as an extracted feature.
In one embodiment, the target index node set obtaining module 706 is further configured to: calculate a feature weight for each feature word in the first feature word set to obtain a first calculation result, and select keywords according to the first calculation result to obtain the first keyword set corresponding to the current question to be answered; calculate a feature weight for each feature word in the second feature word set to obtain a second calculation result, and select keywords according to the second calculation result to obtain the second keyword set corresponding to each index node; obtain, according to the first keyword set and the second keyword set, the first word frequency vector corresponding to the current question to be answered and the second word frequency vector corresponding to each index node; and calculate the cosine of the angle between each first word frequency vector and each second word frequency vector to obtain the first similarity.
In one embodiment, the target index node set obtaining module 706 is further configured to: calculate the initial feature weight of each feature word in the first feature word set using the term frequency-inverse document frequency algorithm; when a feature word in the first feature word set satisfies the preset adjustment rule, adjust its initial feature weight according to the preset adjustment rule to obtain the final feature weight; and when a feature word in the first feature word set does not satisfy the preset adjustment rule, take its initial feature weight as the final feature weight.
For specific limitations on the interrogation data recommendation device, refer to the limitations on the interrogation data recommendation method above, which are not repeated here. Each module in the above interrogation data recommendation device may be implemented wholly or partly by software, hardware, or a combination thereof. The modules may be embedded in or independent of the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 8. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores data such as question-answer pairs and their corresponding features. The network interface of the computer device communicates with external terminals through a network connection. When executed by the processor, the computer program implements an interrogation data recommendation method.
Those skilled in the art will understand that the structure shown in Fig. 8 is only a block diagram of the part of the structure relevant to the solution of the application, and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program: obtaining the current question to be answered, segmenting it into words, and extracting feature words according to the segmentation result to obtain the first feature word set corresponding to the current question to be answered; obtaining the second feature word set corresponding to each index node in a pre-established index; separately calculating the first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node, sorting the index nodes according to the first similarity calculation results, and selecting a preset number of index nodes as target index nodes to obtain a target index node set; obtaining from the interrogation database the question-answer pair corresponding to each target index node in the target index node set; and separately calculating the second similarity between the current question to be answered and the question of each question-answer pair, sorting the question-answer pairs according to the second similarity calculation results, selecting target question-answer pairs, and performing interrogation data recommendation according to the selected target question-answer pairs.
In one embodiment, before the step of obtaining the current question to be answered, the processor further implements the following steps when executing the computer program: obtaining the interrogation information sets corresponding to previous interrogations, and preprocessing the interrogation information sets; extracting question-answer pairs from the preprocessed interrogation information set, and performing feature extraction on the extracted question-answer pairs; storing the question-answer pairs and their corresponding features, in correspondence, in the interrogation database; and building an index on the interrogation database according to the features.
In one embodiment, extracting question-answer pairs from the preprocessed interrogation information includes: obtaining the user identifier corresponding to each piece of interrogation information in the interrogation information set, the user identifier being an interrogation user identifier or a clinician user identifier; filtering the interrogation information corresponding to the clinician user identifier according to preset rules; and extracting question-answer pairs from the filtered interrogation information set according to punctuation marks and interrogative words.
In one embodiment, performing feature extraction on the extracted question-answer pairs includes: segmenting the question in each extracted question-answer pair into words to obtain the word set corresponding to the question; and matching each word in the word set against each word in a pre-established feature dictionary, and, when a match succeeds, taking the word as an extracted feature.
In one embodiment, the step of separately calculating the first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node includes: calculating a feature weight for each feature word in the first feature word set to obtain a first calculation result, and selecting keywords according to the first calculation result to obtain the first keyword set corresponding to the current question to be answered; calculating a feature weight for each feature word in the second feature word set to obtain a second calculation result, and selecting keywords according to the second calculation result to obtain the second keyword set corresponding to each index node; obtaining, according to the first keyword set and the second keyword set, the first word frequency vector corresponding to the current question to be answered and the second word frequency vector corresponding to each index node; and calculating the cosine of the angle between each first word frequency vector and each second word frequency vector to obtain the first similarity.
In one embodiment, calculating a feature weight for each feature word in the first feature word set to obtain the first calculation result includes: calculating the initial feature weight of each feature word in the first feature word set using the term frequency-inverse document frequency algorithm; when a feature word in the first feature word set satisfies the preset adjustment rule, adjusting its initial feature weight according to the preset adjustment rule to obtain the final feature weight; and when a feature word in the first feature word set does not satisfy the preset adjustment rule, taking its initial feature weight as the final feature weight.
In one embodiment, a kind of computer readable storage medium is provided, computer program is stored thereon with, is calculated
Machine program performs the steps of acquisition currently wait answer a question when being executed by processor, to currently being segmented wait answer a question,
Feature Words are extracted according to word segmentation result, are obtained currently wait corresponding fisrt feature set of words of answering a question;What acquisition pre-established
The corresponding second feature set of words of each index node in index;It calculates separately currently wait corresponding fisrt feature word of answering a question
Gather the first similarity between second feature set of words corresponding with each index node, according to the first similarity calculation result
It is ranked up the index node to choose preset quantity to each index node as target index node, obtains target index section
Point set;From obtaining the corresponding question and answer pair of each target index node in target index node set in interrogation database;Respectively
It calculates currently wait answer a question with each question and answer to the second similarity between corresponding problem, according to the second similarity calculation knot
Fruit chooses target question and answer pair to being ranked up to each question and answer, according to the target question and answer of selection to progress interrogation data recommendation.
In one embodiment, before the step of obtaining currently wait answer a question, when computer program is executed by processor
It also performs the steps of and obtains the corresponding interrogation information aggregate of all previous interrogation, interrogation information aggregate is pre-processed;To pre- place
Interrogation information aggregate after reason extracts question and answer pair, and to the question and answer of extraction to progress feature extraction;By question and answer pair and question and answer to right
The feature correspondence answered is stored to interrogation database;Interrogation Database is indexed according to feature.
In one embodiment, to pretreated interrogation information extraction question and answer pair, comprising: obtain in interrogation information aggregate
The corresponding user identifier of each interrogation information, user identifier are that interrogation user identifier or clinician user identify;To clinician user
Corresponding interrogation information is identified to be filtered according to default rule;To filtered interrogation information aggregate, according to punctuation mark
Question and answer pair are extracted with interrogative.
In one embodiment, to the question and answer of extraction to carrying out feature extraction, comprising: to the question and answer of extraction to the problems in
It is segmented, obtains the corresponding set of words of problem;By word each in set of words respectively with the feature dictionary that pre-establishes
In each word matched, when successful match, using word as extract feature.
In one embodiment, the step of separately calculating the first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node comprises: calculating a feature weight for each feature word in the first feature word set to obtain a first calculation result, and selecting keywords according to the first calculation result to obtain the first keyword set corresponding to the current question to be answered; calculating a feature weight for each feature word in the second feature word set to obtain a second calculation result, and selecting keywords according to the second calculation result to obtain the second keyword set corresponding to each index node; obtaining, from the first keyword set and the second keyword set, the first word frequency vector corresponding to the current question to be answered and the second word frequency vector corresponding to each index node; and separately calculating the cosine of the angle between each first word frequency vector and each second word frequency vector to obtain the first similarity.
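The cosine step above can be illustrated with a short sketch. It assumes the keyword sets have already been selected, builds both word frequency vectors over the union of the two keyword lists, and computes cos θ = (a · b) / (|a| |b|). The function names and the dictionary-shaped index are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def term_frequency_vectors(query_keywords: List[str],
                           node_keywords: List[str]) -> Tuple[List[int], List[int]]:
    """Build two word frequency vectors over a shared, sorted vocabulary."""
    vocabulary = sorted(set(query_keywords) | set(node_keywords))
    q_counts, n_counts = Counter(query_keywords), Counter(node_keywords)
    return ([q_counts[t] for t in vocabulary],
            [n_counts[t] for t in vocabulary])


def cosine_similarity(a: List[int], b: List[int]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_index_nodes(query_keywords: List[str],
                     index: Dict[str, List[str]],
                     top_n: int) -> List[Tuple[str, float]]:
    """Score every index node against the query and keep the top_n nodes."""
    scored = []
    for node_id, node_keywords in index.items():
        v1, v2 = term_frequency_vectors(query_keywords, node_keywords)
        scored.append((node_id, cosine_similarity(v1, v2)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Scoring against index nodes first, rather than against every stored question, is what lets the later question-level comparison run on a much smaller candidate set.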
In one embodiment, calculating a feature weight for each feature word in the first feature word set to obtain the first calculation result comprises: calculating the initial feature weight of each feature word in the first feature word set using the term frequency-inverse document frequency (TF-IDF) algorithm; when a feature word in the first feature word set satisfies a preset adjustment rule, adjusting the initial feature weight of that feature word according to the preset adjustment rule to obtain its final feature weight; and when a feature word in the first feature word set does not satisfy the preset adjustment rule, taking its initial feature weight as its final feature weight.
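The weighting scheme above might look like the following sketch. The smoothed IDF formula is one common TF-IDF variant, not necessarily the one used here, and the adjustment rule (doubling the weight of words in a hypothetical `BOOSTED_WORDS` set) is purely an assumption, since the text does not define the preset adjustment rule.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical adjustment rule: boost words considered clinically important.
BOOSTED_WORDS = {"fever"}
BOOST_FACTOR = 2.0


def tf_idf_weights(feature_words: List[str],
                   corpus: List[List[str]]) -> Dict[str, float]:
    """Initial weights: term frequency times a smoothed inverse document frequency."""
    counts = Counter(feature_words)
    total = len(feature_words)
    n_docs = len(corpus)
    weights = {}
    for word, count in counts.items():
        tf = count / total
        docs_with_word = sum(1 for doc in corpus if word in doc)
        idf = math.log((1 + n_docs) / (1 + docs_with_word)) + 1.0
        weights[word] = tf * idf
    return weights


def final_weights(feature_words: List[str],
                  corpus: List[List[str]]) -> Dict[str, float]:
    """Adjust the initial weight when the rule applies; otherwise keep it."""
    final = {}
    for word, weight in tf_idf_weights(feature_words, corpus).items():
        if word in BOOSTED_WORDS:
            final[word] = weight * BOOST_FACTOR
        else:
            final[word] = weight
    return final


def top_keywords(weights: Dict[str, float], k: int) -> List[str]:
    """Select the k highest-weighted words; ties are broken alphabetically."""
    ordered = sorted(weights.items(), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in ordered[:k]]
```

The keyword selection in `top_keywords` corresponds to choosing keywords from the calculation result; the cut-off `k` is likewise an assumed parameter.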
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (10)
1. An interrogation data recommendation method, the method comprising:
obtaining a current question to be answered, segmenting the current question to be answered into words, and extracting feature words according to the segmentation result to obtain a first feature word set corresponding to the current question to be answered;
obtaining a second feature word set corresponding to each index node in a pre-established index;
separately calculating a first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node, and sorting the index nodes according to the first similarity calculation results to select a preset number of index nodes as target index nodes, thereby obtaining a target index node set;
obtaining, from an interrogation database, the question-answer pairs corresponding to each target index node in the target index node set; and
separately calculating a second similarity between the current question to be answered and the question of each question-answer pair, sorting the question-answer pairs according to the second similarity calculation results to select target question-answer pairs, and recommending interrogation data according to the selected target question-answer pairs.
2. The method according to claim 1, characterized in that, before the step of obtaining the current question to be answered, the method comprises:
obtaining the interrogation information set corresponding to previous interrogations, and preprocessing the interrogation information set;
extracting question-answer pairs from the preprocessed interrogation information set, and performing feature extraction on the extracted question-answer pairs;
storing the question-answer pairs and the features corresponding to the question-answer pairs in association in the interrogation database; and
building an index for the interrogation database according to the features.
3. The method according to claim 2, characterized in that extracting question-answer pairs from the preprocessed interrogation information comprises:
obtaining the user identifier corresponding to each piece of interrogation information in the interrogation information set, the user identifier being either an interrogation-user identifier or a clinician-user identifier;
filtering, according to a preset rule, the interrogation information corresponding to clinician-user identifiers; and
extracting question-answer pairs from the filtered interrogation information set according to punctuation marks and interrogative words.
4. The method according to claim 2 or 3, characterized in that performing feature extraction on the extracted question-answer pairs comprises:
segmenting the question in each extracted question-answer pair into words to obtain the word set corresponding to the question; and
matching each word in the word set against each word in a pre-established feature dictionary and, when a match succeeds, taking the word as an extracted feature.
5. The method according to claim 1, characterized in that the step of separately calculating the first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node comprises:
calculating a feature weight for each feature word in the first feature word set to obtain a first calculation result, and selecting keywords according to the first calculation result to obtain the first keyword set corresponding to the current question to be answered;
calculating a feature weight for each feature word in the second feature word set to obtain a second calculation result, and selecting keywords according to the second calculation result to obtain the second keyword set corresponding to each index node;
obtaining, from the first keyword set and the second keyword set, the first word frequency vector corresponding to the current question to be answered and the second word frequency vector corresponding to each index node; and
separately calculating the cosine of the angle between each first word frequency vector and each second word frequency vector to obtain the first similarity.
6. The method according to claim 5, characterized in that calculating a feature weight for each feature word in the first feature word set to obtain the first calculation result comprises:
calculating the initial feature weight of each feature word in the first feature word set using the term frequency-inverse document frequency algorithm;
when a feature word in the first feature word set satisfies a preset adjustment rule, adjusting the initial feature weight of that feature word according to the preset adjustment rule to obtain its final feature weight; and
when a feature word in the first feature word set does not satisfy the preset adjustment rule, taking its initial feature weight as its final feature weight.
7. An interrogation data recommendation device, characterized in that the device comprises:
a first feature word set obtaining module, configured to obtain a current question to be answered, segment the current question to be answered into words, and extract feature words according to the segmentation result to obtain a first feature word set corresponding to the current question to be answered;
a second feature word set obtaining module, configured to obtain a second feature word set corresponding to each index node in a pre-established index;
a target index node set obtaining module, configured to separately calculate a first similarity between the first feature word set corresponding to the current question to be answered and the second feature word set corresponding to each index node, and to sort the index nodes according to the first similarity calculation results to select a preset number of index nodes as target index nodes, thereby obtaining a target index node set;
a question-answer pair obtaining module, configured to obtain, from an interrogation database, the question-answer pairs corresponding to each target index node in the target index node set; and
a recommendation module, configured to separately calculate a second similarity between the current question to be answered and the question of each question-answer pair, sort the question-answer pairs according to the second similarity calculation results to select target question-answer pairs, and recommend interrogation data according to the selected target question-answer pairs.
8. The device according to claim 7, characterized in that the device further comprises:
a preprocessing module, configured to obtain the interrogation information set corresponding to previous interrogations and preprocess the interrogation information set;
a feature extraction module, configured to extract question-answer pairs from the preprocessed interrogation information set and perform feature extraction on the extracted question-answer pairs;
a storage module, configured to store the question-answer pairs and the features corresponding to the question-answer pairs in association in the interrogation database; and
an index building module, configured to build an index for the interrogation database according to the features.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 6.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810724291.7A CN109147934B (en) | 2018-07-04 | 2018-07-04 | Inquiry data recommendation method, device, computer equipment and storage medium |
PCT/CN2019/071525 WO2020007028A1 (en) | 2018-07-04 | 2019-01-14 | Medical consultation data recommendation method, device, computer apparatus, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109147934A (en) | 2019-01-04 |
CN109147934B (en) | 2023-04-11 |
Family
ID=64799920
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810724291.7A Active CN109147934B (en) | 2018-07-04 | 2018-07-04 | Inquiry data recommendation method, device, computer equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109147934B (en) |
WO (1) | WO2020007028A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105989040A (en) * | 2015-02-03 | 2016-10-05 | 阿里巴巴集团控股有限公司 | Intelligent question-answer method, device and system |
CN106503175A (en) * | 2016-11-01 | 2017-03-15 | 上海智臻智能网络科技股份有限公司 | The inquiry of Similar Text, problem extended method, device and robot |
CN107491655A (en) * | 2017-08-31 | 2017-12-19 | 康安健康管理咨询(常熟)有限公司 | Liver diseases information intelligent consultation method and system based on machine learning |
CN107980130A (en) * | 2017-11-02 | 2018-05-01 | 深圳前海达闼云端智能科技有限公司 | It is automatic to answer method, apparatus, storage medium and electronic equipment |
CN108108449A (en) * | 2017-12-27 | 2018-06-01 | 哈尔滨福满科技有限责任公司 | A kind of implementation method based on multi-source heterogeneous data question answering system and the system towards medical field |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104573028B (en) * | 2015-01-14 | 2019-01-25 | 百度在线网络技术(北京)有限公司 | Realize the method and system of intelligent answer |
CN109147934B (en) * | 2018-07-04 | 2023-04-11 | 平安科技(深圳)有限公司 | Inquiry data recommendation method, device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2020007028A1 (en) | 2020-01-09 |
CN109147934B (en) | 2023-04-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |