Detailed Description
To make the objects, features and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings of those embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.
The automatic answering method for loan applications in this embodiment can be applied in a terminal, which may be a mobile phone, a tablet computer, a PC (personal computer) or a similar device. Either an instant messaging APP (application) is installed on the terminal, or a browser is installed through which an answering website can be accessed. When the APP is running on the terminal and can communicate over the Internet, the user raises a question through the APP interface; alternatively, when the questioner accesses the answering website through the browser, the question is raised through a question window. The automatic answering system receives and answers the question, which is a loan-related question. It should be noted that the method in this embodiment is described using the loan business field as an example; the method can also be applied to question answering in other specialized business fields.
Referring to Fig. 1, which is a schematic flowchart of the automatic answering method for loan applications provided by one embodiment of the present application.
The method comprises the following steps:
S101: extract a question sentence from the received message;
The message sent by the user is received through the APP or the web page window, and the question sentence expressing a query is extracted from the content of the message.
S102: analyze the question sentence to obtain the current loan question;
After the question sentence is extracted, it is analyzed to obtain the current loan question, that is, to pinpoint the user's information need.
Specifically, the key entity words, topic words and focus words related to the loan business are extracted from the question sentence. Key entity words are the keywords that remain after function words such as auxiliary words, conjunctions and interjections are removed from the question sentence, for example pronouns, verbs and nouns. A topic word is the object of the query, i.e. the loan business object at which the question is directed; it is a loan business concept in the loan business field and is obtained by matching against the specialized knowledge base of that field. Focus words are attribute words related to the loan business, such as the name, type, features and functions of a loan product.
Further, the pronouns, nouns and interrogative words in the question sentence are analyzed, together with sentence constituents such as the predicate and the object, and the missing parts of the question sentence are supplemented according to the extraction and analysis results, thereby determining the current loan question raised by the user. In other words, this step works out, from the question sentence in the message, the question the user intends to ask.
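The word-extraction part of step S102 can be illustrated with a minimal sketch. It assumes English text split on letters only, and the word lists below (`FUNCTION_WORDS`, `DOMAIN_TOPICS`, `ATTRIBUTE_WORDS`) are hypothetical stand-ins for a real lexicon and for the specialized knowledge base of the loan business field; the embodiment does not prescribe any of them.

```python
import re

# Hypothetical function-word list (auxiliary words, conjunctions, interjections).
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "or", "oh", "is", "for", "to"}
# Hypothetical stand-in for the loan-domain specialized knowledge base.
DOMAIN_TOPICS = {"mortgage", "loan", "overdraft"}
# Hypothetical attribute words (name, type, feature, function and so on).
ATTRIBUTE_WORDS = {"rate", "term", "type", "limit", "fee"}

def analyze_question(question: str) -> dict:
    """Split a question sentence into key entity, topic and focus words."""
    tokens = re.findall(r"[a-z]+", question.lower())
    # Key entity words: everything left after function words are removed.
    key_entities = [t for t in tokens if t not in FUNCTION_WORDS]
    # Topic words: key entity words that match the domain knowledge base.
    topics = [t for t in key_entities if t in DOMAIN_TOPICS]
    # Focus words: key entity words that name an attribute of the topic.
    focus = [t for t in key_entities if t in ATTRIBUTE_WORDS]
    return {"key_entities": key_entities, "topics": topics, "focus": focus}
```

The sketch covers only word extraction; the constituent analysis and the completion of missing sentence parts described above would need a syntactic parser and are not shown.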
S103: according to the current loan question, retrieve from multiple data sources in order of their priority, wherein the data source with the highest priority is the learning database;
The multiple data sources include: the learning database, encyclopedia websites, Q&A communities and search engines.
The data stored in the learning database is obtained by learning correct answers. The data generally refers to a corpus of questions and answers.
Correct answers may come from conversation records with users or from the specialized knowledge base of the loan business field, or may be question-and-answer content found directly on encyclopedia websites, Q&A communities and web pages and confirmed to be correct. Correct question-and-answer information obtained from conversation records with users, from the specialized knowledge base of the loan business field, and from retrieval on encyclopedia websites, Q&A communities and search engines may also be updated into the learning database periodically. The learning database can therefore be built and updated quickly and conveniently, which saves time.
It should be noted that when the learning database is first established, a corpus is collected from websites in the loan business field by crawler technology, and domain terms are extracted from the collected corpus. Further, corpora related to loan business data are obtained from encyclopedia websites and Q&A communities by crawler technology, and after analysis their content is extracted to form question-answer pairs, i.e. the correspondence between one question and one answer.
Because the learning database contains only answers that have been confirmed as correct, its priority is the highest. Encyclopedia websites are authoritative popular-science knowledge sites with a certain degree of structure; the information structure within their pages and their knowledge entries are clear, so question-answer pair information is easy to extract. A Q&A community is an interactive platform on which users ask and answer each other's questions: one user raises a question, and other users who know the answer can answer it. Because the asker can compare and rate the answers given by others, the credibility of answers on such pages is generally high, and since the information is already in question-answer form, question-answer pair information is easy to extract. The web pages returned by a search engine, in contrast, are organized in many different ways. Most commonly they are narrative articles in which the question and the answer are blended together in the body text, with the answer information narrated alongside events related to the question, so extracting question-answer pair information directly is somewhat difficult.
The priorities of the learning database, encyclopedia websites, Q&A communities and search engines decrease in that order. That is, the current loan question and its corresponding answer are first retrieved in the learning database. If they are not found, the key entity words in the current loan question are extracted, and the encyclopedia website is searched for questions containing the key entity words and their corresponding answers. If no question containing the key entity words and no corresponding answer is found, the Q&A community is searched for questions related to the current loan question and their answers. If no related question with a matching degree higher than the preset matching degree is found, the current loan question is searched on a search engine to obtain search results; the search results are corpora related to the current loan question, and processing these corpora yields the answer corresponding to the current loan question. Specifically, calls between the system and other systems such as the encyclopedia website, the Q&A community and the search engine are made through an API (Application Programming Interface).
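The priority-ordered fallback described above can be sketched as follows. This is a simplified illustration, not the implementation of the embodiment: each data source is modelled as a hypothetical `(name, lookup)` pair whose `lookup` function returns an answer or `None`, and the per-source differences (key entity word search, matching degree threshold, corpus processing) are assumed to be hidden inside each `lookup`.

```python
from typing import Callable, Optional, Sequence, Tuple

# A data source is a (name, lookup) pair; lookup returns an answer or None.
Source = Tuple[str, Callable[[str], Optional[str]]]

def retrieve_by_priority(
    question: str, sources: Sequence[Source]
) -> Optional[Tuple[str, str]]:
    """Query sources in descending priority and stop at the first answer.

    Returns (source_name, answer), or None if no source has an answer.
    """
    for name, lookup in sources:
        answer = lookup(question)
        if answer is not None:
            return name, answer
    return None

# Usage with in-memory dictionaries standing in for real data sources.
learning_db: dict = {}
encyclopedia = {"what is a mortgage": "A loan secured on property."}
sources = [
    ("learning_database", learning_db.get),  # highest priority
    ("encyclopedia", encyclopedia.get),
]
hit = retrieve_by_priority("what is a mortgage", sources)
```

Returning the source name together with the answer makes it straightforward to attach a per-source confidence level later.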
When an answer is obtained from retrieval in the learning database, the encyclopedia website or the Q&A community, the current loan question of this retrieval and the corresponding answer are stored in the learning database as a question-answer pair.
Further, the information retrieval method, and the method of extracting or selecting the answer, differ among the above data sources. Because the data in the learning database, encyclopedia websites and Q&A communities is already in question-answer pair form, retrieval there is performed mainly on questions. The retrieved content forms a candidate question set; the candidate question set is then matched against the current loan question, and the question with the highest matching degree is taken as the target question in the candidate question set. Obtaining the question in the candidate question set that is closest, or even identical, to the current loan question improves the accuracy of question confirmation and in turn raises the likelihood that the most accurate answer can be obtained from the target question. Preferably, a threshold is set for the matching degree of questions; if even the highest matching degree does not reach the threshold, retrieval moves to the data source at the next priority level.
Specifically, when the current loan question is retrieved in the learning database, the encyclopedia website or the Q&A community, multiple candidate questions are obtained and form a candidate question set. Each candidate question is matched against the current loan question, and a matching degree is calculated according to a preset matching rule. If the highest calculated matching degree is greater than or equal to a preset matching degree threshold, the question with the highest matching degree is taken as the target question; the target question is a question confirmed to be the same as or similar to the current loan question, and the answer corresponding to the target question is the final answer. If the highest calculated matching degree is below the preset matching degree threshold, retrieval continues in a data source whose priority is lower than that of the data source currently being searched. The preset matching rule may compare the similarity of the entity keywords in the two questions being matched, the similarity being the matching degree; the entity keywords may include synonyms and near-synonyms, and words other than entity keywords, such as interrogative words and function words, may also be compared. The preset matching degree threshold may be 70% or another value. Retrieving correct answers from the multiple data sources above gives the answers a broader knowledge base and a higher accuracy.
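As an illustration of one possible preset matching rule, the sketch below uses Jaccard similarity over entity keyword sets as the matching degree. The choice of Jaccard similarity and the `SYNONYMS` table are assumptions made for this example; the embodiment only requires that keyword similarity be compared, that synonyms may be taken into account, and that a threshold such as 70% be applied.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical synonym table mapping each word to a canonical form.
SYNONYMS: Dict[str, str] = {"borrowing": "loan", "credit": "loan"}

def normalize(keywords: Iterable[str]) -> Set[str]:
    """Replace each keyword by its canonical synonym, if one is known."""
    return {SYNONYMS.get(k, k) for k in keywords}

def matching_degree(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity of two keyword sets, in the range [0, 1]."""
    sa, sb = normalize(a), normalize(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def select_target(
    current: List[str],
    candidates: Dict[str, List[str]],
    threshold: float = 0.70,
) -> Optional[str]:
    """Return the best-matching candidate question, or None if the highest
    matching degree is below the threshold (the caller then falls back to
    the next data source)."""
    best_question, best_degree = None, 0.0
    for question, keywords in candidates.items():
        degree = matching_degree(current, keywords)
        if degree > best_degree:
            best_question, best_degree = question, degree
    return best_question if best_degree >= threshold else None
```

A `None` result corresponds to the case in which retrieval continues in the data source at the next lower priority.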
For search engine data, retrieval uses a preset search strategy and a preset algorithm, and the retrieved sentences form a candidate sentence set. Because the corpora returned by search engines come from diverse sources and vary widely in content and format, most sentences in the candidate sentence set cannot be output directly as answers. Because all web pages obtainable through the search engine serve as the data source, the response rate to questions is greatly improved and can in theory approach one hundred percent.
Specifically, the search engine search is divided into three stages: crawling the search result page by crawler technology; parsing the search result page and extracting the URLs (Uniform Resource Locators) of the web pages related to the question; and obtaining the question-related web page content through the extracted URLs.
First, a search request containing the information of the current loan question is constructed and sent to the search engine, and the search page is obtained by using htmlunit to simulate the functions of a browser. The URLs of the web pages related to the searched question are extracted from the search page by DOM tree parsing, and the web page content related to the searched question is obtained from those URLs.
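The URL extraction stage can be sketched as follows. The embodiment uses htmlunit, a Java library, to fetch and render the search page; this example swaps in Python's standard `html.parser` module and operates on HTML that has already been fetched. Treating a link as relevant when its anchor text contains one of the given keywords is an assumption made for illustration, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Sequence

class LinkCollector(HTMLParser):
    """Collect href values of <a> elements whose text mentions a keyword."""

    def __init__(self, keywords: Sequence[str]) -> None:
        super().__init__()
        self.keywords = [k.lower() for k in keywords]
        self.links: List[str] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Accumulate anchor text only while inside an <a> with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).lower()
            if any(k in text for k in self.keywords):
                self.links.append(self._href)
            self._href = None

def extract_result_urls(html: str, keywords: Sequence[str]) -> List[str]:
    """Return the URLs of links on a result page that relate to the question."""
    collector = LinkCollector(keywords)
    collector.feed(html)
    collector.close()
    return collector.links
```

Fetching each returned URL to obtain the page content is the third stage and is left out of the sketch.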
In addition, anti-crawling measures must be dealt with when the search results of the search engine are obtained by crawler technology. In this embodiment, the number of requests and the crawl frequency are reduced, and the anti-crawling strategy of the server is analyzed so that request rules can be constructed accordingly. For a single IP address, lowering the rate at which requests are issued makes detection by anti-crawling measures less likely; on the other hand, if thousands of proxy IP addresses are available, crawling can proceed quickly.
Further, the related web pages obtained through the search engine come from diverse data sources and vary widely in page format. A preset algorithm is therefore used when extracting the content of related web pages, so as to ensure both extraction speed and good extraction accuracy. The preset algorithm may be a general web page body text extraction algorithm based on the line block distribution function. The algorithm first filters out all HTML (HyperText Markup Language) tags of the page and retains only the text information, then locates the points at which the amount of text rises sharply and drops sharply, and extracts the content between the sharp rise and the sharp drop, thereby obtaining the body text of the web page. This improves the efficiency and accuracy of obtaining question-related content from the search engine.
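A simplified sketch of body text extraction based on the line block distribution is given below. It is an illustration under stated assumptions rather than the exact algorithm of the embodiment: the block width of 3 lines and the surge threshold of 40 characters are hypothetical parameters, tags are stripped with regular expressions, and a block length falling to zero is taken as the sharp drop.

```python
import re
from typing import List

def extract_main_text(html: str, block_width: int = 3, threshold: int = 40) -> str:
    """Extract the body text of a page from its line block distribution."""
    # Remove script and style elements, then every remaining tag,
    # while keeping the original line structure of the page.
    html = re.sub(r"(?is)<(script|style)\b.*?>.*?</\1>", "", html)
    text = re.sub(r"<[^>]+>", "", html)
    lines: List[str] = [line.strip() for line in text.split("\n")]

    # Block length at line i: characters in lines i .. i + block_width - 1.
    count = max(len(lines) - block_width + 1, 0)
    blocks = [
        sum(len(line) for line in lines[i:i + block_width])
        for i in range(count)
    ]

    body: List[str] = []
    i = 0
    while i < len(blocks):
        if blocks[i] > threshold:  # sharp rise: body text starts here
            j = i
            while j < len(blocks) and blocks[j] > 0:  # until the sharp drop
                j += 1
            body.extend(line for line in lines[i:j] if line)
            i = j
        else:
            i += 1
    return "\n".join(body)
```

Short navigation and footer lines produce block lengths below the threshold and are skipped, while runs of long lines are kept as body text.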
By extracting key values as described above and retrieving in encyclopedia websites, Q&A communities and search engines, the number of current question-answer pairs can be expanded based on the retrieved content, which improves the answer output rate for questions.
Further, the number of current question-answer pairs can be expanded by a generative adversarial network (GAN). Specifically, the saved question-answer pairs are obtained, deep learning is performed on them by the generative adversarial network to obtain new question-answer pairs, and the new question-answer pairs obtained by learning are stored in the learning database. In this way, thousands of question-answer pairs can be expanded to hundreds of thousands or millions of question-answer pairs, which greatly improves the answer output rate for questions.
Further, adding the new question-answer pairs to the learning database for training enables the learning database to learn by itself.
Further, different confidence levels are marked on the retrieved results according to the priority of the data source: the higher the priority, the higher the confidence level.
S104: extract or select the correct answer from the retrieval results and output the answer.
Because the data in the learning database, encyclopedia websites and Q&A communities is in question-answer pair form, once the target question for the current loan question has been matched, the answer corresponding to the target question is looked up and can be output as the final answer. A confidence level is marked on the answer for reference; it may be the confidence level marked on the retrieval result. That is, different confidence levels are set for the multiple data sources, and when the answer is output, the data source of the answer is identified and the confidence level of that data source is marked as the confidence level of the answer.
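Marking an answer with the confidence level of its data source can be sketched as a simple lookup. The numeric values in `SOURCE_CONFIDENCE` are hypothetical; the embodiment only requires that a data source with a higher priority receives a higher confidence level.

```python
# Hypothetical confidence values, decreasing with data source priority.
SOURCE_CONFIDENCE = {
    "learning_database": 0.95,
    "encyclopedia": 0.85,
    "qa_community": 0.75,
    "search_engine": 0.60,
}

def label_answer(answer: str, source: str) -> dict:
    """Attach the confidence level of the originating data source to an answer."""
    return {
        "answer": answer,
        "source": source,
        "confidence": SOURCE_CONFIDENCE[source],
    }
```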
Most candidate sentences in the candidate sentence set obtained from content found through the search engine cannot be output directly as answers, so answer extraction is first performed on each candidate sentence. Answer extraction refines the retrieved text information: wrong answers that appear relevant on the surface but do not actually match semantically are filtered out by matching calculation. Extraction may also go further to the word or phrase level, i.e. the answer word or phrase is accurately extracted from the answer using deep structural analysis of the language. The extracted answers are taken as candidate answers, and the best answer selected from the candidate answers is output as the final answer. A confidence level is marked on the answer for reference; it may be the confidence level marked on the retrieval result.
Preferably, before the final answer is output, the question and its multiple corresponding answers are matched using deep learning restricted to a defined domain, and a threshold on the number of output answers is set. The multiple answers are sorted by confidence level; answers ranked beyond the number threshold are discarded, and the higher-ranked answers within the number threshold are output, each marked with a confidence level for reference, which may be the confidence level marked on the retrieval result. The deep learning may analyze sentences with a deep neural network; for example, a CNN (convolutional neural network) and an RNN (recurrent neural network) model the question sentence and the answer, and the high-level semantic representations of the question and the answer are passed to a multilayer perceptron for question-answer matching. The defined domain is the loan business field.
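The sorting and truncation of answers by confidence level can be sketched as follows. The function name `top_answers` and the default number threshold of 3 are assumptions for illustration; the neural question-answer matching that produces the scores is not shown.

```python
from typing import List, Tuple

def top_answers(
    scored: List[Tuple[str, float]], max_answers: int = 3
) -> List[Tuple[str, float]]:
    """Sort (answer, confidence) pairs by confidence, highest first, and
    keep at most max_answers of them; lower-ranked answers are discarded."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranked[:max_answers]
```

Because `sorted` is stable, answers with equal confidence keep their original relative order.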
It should be noted that the confidence level may also be obtained by a verification algorithm based on statistical machine learning, which verifies the question and the answer at the syntactic and semantic levels to obtain confidence levels for different answers, for example by taking the confidence output by the classifier in the algorithm as the confidence level.
In this embodiment, the current loan question is obtained by analyzing the question sentence, and retrieval is performed in multiple data sources in order of their priority according to the current loan question. The data source with the highest priority is the learning database, which is obtained by learning past correct answers and whose retrieval results are accurate, so obtaining answers from its retrieval results improves the accuracy of question answering. Because multiple data sources are available for retrieval, the likelihood that no answer is output is reduced and the user experience is improved.
The embodiment of the present application also provides an automatic answering system for loan applications. Referring to Fig. 2, the automatic answering system is built into the terminal of the embodiment shown in Fig. 1. The system includes: an extraction module 201, an analysis module 202, a retrieval module 203 and an answer generation module 204.
The extraction module 201 is configured to extract a question sentence from the received message;
The analysis module 202 is configured to analyze the question sentence to obtain the current loan question;
The retrieval module 203 is configured to retrieve, according to the current loan question, from multiple data sources in order of their priority, wherein the data source with the highest priority is the learning database, and the data in the learning database is obtained by learning correct answers;
The answer generation module 204 is configured to extract or select the correct answer from the retrieval results and output the answer.
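The way the four modules cooperate can be sketched as a small pipeline. The class name `AutoAnswerSystem` and the callable signatures are hypothetical; each field stands in for one of modules 201 to 204, whose internals are described elsewhere in this specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class AutoAnswerSystem:
    extract: Callable[[str], Optional[str]]         # extraction module 201
    analyze: Callable[[str], str]                   # analysis module 202
    retrieve: Callable[[str], List[str]]            # retrieval module 203
    generate: Callable[[List[str]], Optional[str]]  # answer generation module 204

    def answer(self, message: str) -> Optional[str]:
        """Run a received message through the four modules in order."""
        question_sentence = self.extract(message)
        if question_sentence is None:
            return None  # the message contains no question sentence
        loan_question = self.analyze(question_sentence)
        results = self.retrieve(loan_question)
        return self.generate(results)
```

Passing the modules in as callables keeps the sketch independent of any particular retrieval or matching technique.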
For details of this embodiment, refer to the description of the embodiment shown in Fig. 1 above.
In this embodiment, the current loan question is obtained by analyzing the question sentence, and retrieval is performed in multiple data sources in order of their priority according to the current loan question. The data source with the highest priority is the learning database, which is obtained by learning past correct answers and whose retrieval results are accurate, so obtaining answers from its retrieval results improves the accuracy of question answering. Because multiple data sources are available for retrieval, the likelihood that no answer is output is reduced and the user experience is improved.
Referring to Fig. 3, the automatic answering system for loan applications provided by another embodiment of the present application differs from the automatic answering system for loan applications in the embodiment shown in Fig. 2 as follows:
Further, the multiple data sources include: the learning database, encyclopedia websites, Q&A communities and search engines.
The retrieval module 203 further comprises:
A retrieval submodule 2031, configured to retrieve the current loan question and its answer in the learning database;
An extraction submodule 2031, configured to extract the key entity words in the current loan question if the current loan question and its answer are not retrieved;
The retrieval submodule 2031 is further configured to retrieve, in the encyclopedia website, questions containing the key entity words and their answers;
The retrieval submodule 2031 is further configured to retrieve, in the Q&A community, questions related to the current loan question and their answers if no question containing the key entity words and no answer is retrieved;
The retrieval submodule 2031 is further configured to search for the current loan question through the search engine if no related question with a matching degree higher than the preset matching degree is retrieved, and to take the corpora obtained by the search as the retrieval results.
Further, the retrieval submodule 2031 is also configured to obtain multiple candidate questions when the current loan question is retrieved in the learning database, the encyclopedia website and the Q&A community.
The automatic answering system for loan applications further includes:
A matching module 301, configured to match each of the multiple candidate questions against the current loan question and calculate a matching degree according to a preset matching rule, and, if the highest calculated matching degree is greater than or equal to a preset matching degree threshold, to take the question with the highest matching degree as the target question;
The matching module 301 is further configured to trigger the retrieval submodule 2031 to retrieve in a data source whose priority is lower than that of the data source currently being searched if the highest calculated matching degree is below the preset matching degree threshold.
Further, the retrieval submodule 2031 is also configured, when searching for the current loan question through the search engine, to construct a search request containing the current loan question and send the search request to the search engine;
to obtain the search page by simulating a browser with htmlunit, to parse out, by DOM tree parsing, the uniform resource locators of the web pages in the search page that are related to the current loan question, and to obtain the content of the web pages related to the current loan question according to the uniform resource locators;
and to obtain the body text from the content of the web pages according to the general web page body text extraction algorithm based on the line block distribution function.
Further, the automatic answering system for loan applications also includes:
An update module 302, configured to periodically update into the learning database the correct question-and-answer information obtained from conversation records with users, from the specialized knowledge base of the loan business field, and from retrieval on encyclopedia websites, Q&A communities and search engines;
A setting module 303, configured to set different confidence levels for the multiple data sources;
A marking module 304, configured to identify the data source of the answer when the answer is output, and to mark the confidence level of the identified data source as the confidence level of the answer;
A question-answer pair learning module 305, configured to obtain the saved question-answer pairs, perform deep learning on the saved question-answer pairs by a generative adversarial network, and store the question-answer pairs obtained by learning in the learning database.
For the related details, refer to the description of the embodiments shown in Fig. 1 and Fig. 2 above.
In this embodiment, the current loan question is obtained by analyzing the question sentence, and retrieval is performed in multiple data sources in order of their priority according to the current loan question. The data source with the highest priority is the learning database, which is obtained by learning past correct answers and whose retrieval results are accurate, so obtaining answers from its retrieval results improves the accuracy of question answering. Because multiple data sources are available for retrieval, the likelihood that no answer is output is reduced and the user experience is improved.
Further, the embodiment of the present application also provides a terminal, namely the terminal described in the preceding embodiments. The terminal includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the automatic answering method for loan applications described in the embodiment shown in Fig. 1 above is implemented.
The embodiment of the present application also provides a computer-readable storage medium, which may be provided in the terminal of the above embodiments and may be the memory of the above embodiments. Specifically, it may be hard disk drive storage, non-volatile memory (such as flash memory or other electrically programmable, limited-erase memory used to form a solid-state drive), volatile memory (such as static or dynamic random access memory), and so on. A computer program is stored on the computer-readable storage medium, and when the program is executed by a processor, the automatic answering method for loan applications described in the embodiment shown in Fig. 1 above is implemented.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
The above is a description of the automatic answering method and automatic answering system for loan applications provided by the present application. Those skilled in the art may, in accordance with the ideas of the embodiments of the present application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.