CN108121800B - Information generation method and device based on artificial intelligence - Google Patents

Publication number
CN108121800B
Authority
CN
China
Prior art keywords
target
question
preset
similarity
professional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711396776.XA
Other languages
Chinese (zh)
Other versions
CN108121800A (en
Inventor
于佃海
陈立玮
贺文嵩
周晓
刘琼琼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems

Abstract

Embodiments of the present application disclose an information generation method and device based on artificial intelligence. One embodiment of the method comprises: acquiring an answer acquisition request sent by a terminal, wherein the answer acquisition request comprises a target application identifier and a target question input by a user; determining, from a set of professional question-answer libraries and according to a preset correspondence between application identifiers and the professional question-answer libraries in the set, a target professional question-answer library corresponding to the target application identifier, wherein the target professional question-answer library comprises preset questions related to a target professional field, and a target similarity calculation model is set in association with the target professional question-answer library; selecting at least one candidate preset question from the target professional question-answer library according to keywords in the target question; and, for each candidate preset question, generating the similarity between the candidate preset question and the target question based on the target similarity calculation model. This embodiment enriches the ways in which information is generated.

Description

Information generation method and device based on artificial intelligence
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to the technical field of internet, and particularly relates to an information generation method and device based on artificial intelligence.
Background
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. It is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Research in the field includes robotics, language recognition, image recognition, natural language processing, and expert systems, among others.
An intelligent question-answering system is a new type of information service system. Building on functions such as knowledge processing and semantic recognition, it can analyze user intent and answer users' questions quickly and accurately. Because such a system can converse with users in place of a human, and offers broad knowledge coverage and fast responses, it is popular with users.
Disclosure of Invention
The embodiment of the application provides an information generation method and device based on artificial intelligence.
In a first aspect, an embodiment of the present application provides an information generation method based on artificial intelligence, including: acquiring an answer acquisition request sent by a terminal, wherein the answer acquisition request comprises a target application identifier and a target question input by a user; determining, from a set of professional question-answer libraries and according to a preset correspondence between application identifiers and the professional question-answer libraries in the set, a target professional question-answer library corresponding to the target application identifier, wherein the target professional question-answer library comprises preset questions related to a target professional field, a target similarity calculation model is set in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between a preset question and the target question; selecting at least one candidate preset question from the target professional question-answer library according to keywords in the target question; and, for each candidate preset question in the at least one candidate preset question, generating the similarity between the candidate preset question and the target question based on the target similarity calculation model.
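The keyword-based selection of candidate preset questions in the method above can be sketched as follows. This is a minimal illustration only: the application does not say how keywords are extracted or matched, so the whitespace tokenization, the stop-word list, and the names `extract_keywords` and `select_candidates` are all assumptions.

```python
# Hypothetical stop words; a real system would use a proper keyword extractor.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "are", "how", "what", "does", "do"}


def extract_keywords(text):
    """Lowercase, strip basic punctuation, and drop stop words (assumed rule)."""
    tokens = (tok.strip("?.,!\"'").lower() for tok in text.split())
    return {tok for tok in tokens if tok and tok not in STOP_WORDS}


def select_candidates(target_question, preset_questions):
    """Return the preset questions that share at least one keyword with the target."""
    target_keywords = extract_keywords(target_question)
    return [q for q in preset_questions if target_keywords & extract_keywords(q)]


# Illustrative library contents, not taken from the application.
library = [
    "How many articles does the patent law have?",
    "What is the term of a trademark registration?",
    "How is criminal liability determined?",
]
candidates = select_candidates("patent law article count", library)
```

Only the candidates returned by this cheap filter would then be passed to the more expensive similarity calculation model.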
In some embodiments, the target similarity calculation model comprises a word vector submodel and a similarity calculation submodel, wherein the input of the similarity calculation submodel comprises the output of the word vector submodel, the word vector submodel represents the correspondence between input text and word vectors, and the similarity calculation submodel represents the correspondence between a pair of word vectors and the similarity of the input texts corresponding to that pair.
In some embodiments, the target similarity calculation model represents the correspondence between a pair consisting of the target question and a preset question and the similarity between them; and generating, for each candidate preset question in the at least one candidate preset question, the similarity between the candidate preset question and the target question based on the target similarity calculation model comprises: for each candidate preset question in the at least one candidate preset question, importing the candidate preset question and the target question into the target similarity calculation model to generate the similarity between the candidate preset question and the target question.
In some embodiments, importing, for each candidate preset question in the at least one candidate preset question, the candidate preset question and the target question into the target similarity calculation model to generate the similarity comprises: importing the target question into the word vector submodel to generate a first word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, acquiring a second word vector, generated in advance by the word vector submodel, corresponding to the candidate preset question; and importing the first word vector and the second word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
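The benefit of generating the second word vectors in advance is that only the target question has to be vectorized at request time. A minimal sketch of that split is shown below. It is an assumption-laden stand-in: a bag-of-words count vector replaces the trained word vector submodel, cosine similarity replaces the trained submodel that maps a pair of word vectors to a similarity, and all function names are hypothetical.

```python
import math
from collections import Counter


def word_vector(text):
    """Stand-in for the word vector submodel: a bag-of-words count vector."""
    tokens = (tok.strip("?.,!").lower() for tok in text.split())
    return Counter(tok for tok in tokens if tok)


def cosine_similarity(vec_a, vec_b):
    """Stand-in for the submodel that scores a pair of word vectors."""
    dot = sum(count * vec_b[tok] for tok, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Second word vectors: computed once, ahead of time, for every preset question.
preset_questions = ["how many articles in the patent law", "how to open a bank account"]
precomputed = {q: word_vector(q) for q in preset_questions}


def score_candidates(target_question, candidates):
    """Vectorize the target question once, then score it against cached vectors."""
    first_vector = word_vector(target_question)
    return {q: cosine_similarity(first_vector, precomputed[q]) for q in candidates}


scores = score_candidates("patent law articles", preset_questions)
```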
In some embodiments, the target professional question-answer library further includes preset answers set in association with the preset questions, and the target similarity calculation model represents the correspondence between, on the one hand, the target question, a preset question, and the preset answer set in association with that preset question and, on the other hand, the similarity between the preset question and the target question; and generating, for each candidate preset question in the at least one candidate preset question, the similarity between the candidate preset question and the target question based on the target similarity calculation model comprises: for each candidate preset question in the at least one candidate preset question, importing the candidate preset question, the preset answer set in association with the candidate preset question, and the target question into the target similarity calculation model to generate the similarity between the candidate preset question and the target question.
In some embodiments, importing, for each candidate preset question in the at least one candidate preset question, the candidate preset question, the preset answer set in association with the candidate preset question, and the target question into the target similarity calculation model to generate the similarity comprises: importing the target question into the word vector submodel to generate a third word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, acquiring a fourth word vector, generated in advance by the word vector submodel, corresponding to the candidate preset question together with the preset answer set in association with it; and importing the third word vector and the fourth word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
In some embodiments, the target similarity calculation model is trained by the following steps: acquiring a universal sample set, wherein the universal sample comprises a universal corpus pair and indicating information, and the indicating information is used for indicating that the corpora in the corpus pair express the same meaning or do not express the same meaning; training a pre-established initial first neural network by using the universal sample set to obtain an initial second neural network; acquiring a target professional sample set, wherein the target professional sample comprises a target professional corpus pair and indicating information; and training the initial second neural network by using the target professional sample set to obtain the target similarity calculation model.
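The two-stage procedure above (train on general samples first, then continue training on target professional samples) can be sketched as follows. This is only an illustration of the staging: a one-feature logistic regression over token overlap stands in for the neural networks, and the sample pairs, function names, and hyperparameters are all assumptions rather than anything specified in the application.

```python
import math


def jaccard(text_a, text_b):
    """Token-overlap feature used by the stand-in model."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def predict(params, text_a, text_b):
    """Probability that the two texts express the same meaning."""
    w, b = params
    return 1.0 / (1.0 + math.exp(-(w * jaccard(text_a, text_b) + b)))


def train(params, samples, epochs=2000, lr=0.1):
    """Plain SGD on log loss, starting from the given parameters."""
    w, b = params
    for _ in range(epochs):
        for text_a, text_b, label in samples:
            error = predict((w, b), text_a, text_b) - label
            w -= lr * error * jaccard(text_a, text_b)
            b -= lr * error
    return w, b


# Each sample: (corpus A, corpus B, 1 if same meaning else 0). Illustrative data.
general_samples = [
    ("how are you", "how are you today", 1),
    ("what time is it", "where is the station", 0),
]
professional_samples = [
    ("patent law article count", "number of articles in patent law", 1),
    ("patent law article count", "criminal law penalty range", 0),
]

initial_first = (0.0, 0.0)                                   # untrained model
initial_second = train(initial_first, general_samples)       # stage 1: general samples
target_model = train(initial_second, professional_samples)   # stage 2: professional samples
```

Starting stage 2 from the stage 1 parameters, rather than from scratch, is what lets a small professional sample set adjust a model that has already learned general paraphrase behavior.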
In some embodiments, the target similarity calculation model is trained by the following steps: acquiring a target professional sample set, wherein the target professional sample comprises a target professional corpus pair and indicating information, and the indicating information is used for indicating that the corpora in the corpus pair express the same meaning or do not express the same meaning; and training an initial third neural network by using the target professional sample set to obtain the target similarity calculation model.
In some embodiments, generating, for each candidate preset question in the at least one candidate preset question, the similarity between the candidate preset question and the target question based on the target similarity calculation model includes: for each candidate preset question in the at least one candidate preset question, generating a first similarity between the candidate preset question and the target question by using the target similarity calculation model; generating a second similarity between the candidate preset question and the target question by using a general similarity calculation model, wherein the general similarity calculation model is used for determining the similarity between a preset question and the target question; and performing a weighted summation of the first similarity and the second similarity according to preset weights to obtain the similarity between the candidate preset question and the target question.
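The weighted summation can be written in a few lines. In the sketch below the weight value 0.7 and the constraint that the two weights sum to 1 are assumptions; the application only says that preset weights are used.

```python
def combine_similarity(first_similarity, second_similarity, target_weight=0.7):
    """Weighted sum of the professional-model score and the general-model score.

    target_weight applies to the first similarity; the remainder applies to
    the second. Requiring the weights to sum to 1 is an assumption.
    """
    if not 0.0 <= target_weight <= 1.0:
        raise ValueError("target_weight must be in [0, 1]")
    return target_weight * first_similarity + (1.0 - target_weight) * second_similarity


combined = combine_similarity(first_similarity=0.9, second_similarity=0.5)
```

With these inputs the combined score is approximately 0.78, closer to the professional model's score because it carries the larger weight.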
In some embodiments, the general similarity calculation model is trained by the following steps: acquiring a universal sample set, wherein each universal sample comprises a universal corpus pair and indicating information, and the indicating information indicates whether the corpora in the corpus pair express the same meaning; and training a pre-established initial fourth neural network by using the universal sample set to obtain the general similarity calculation model.
In some embodiments, the above method further comprises: selecting, from the at least one candidate preset question, a preset number of candidate preset questions in descending order of similarity as the questions to be displayed.
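Selecting a preset number of candidates in descending order of similarity is a top-k selection. A minimal sketch, assuming the similarities are held in a dictionary keyed by candidate question (the function name and sample scores are hypothetical):

```python
import heapq


def select_questions_to_display(similarities, preset_number=3):
    """Return the preset_number candidates with the highest similarity, best first."""
    return heapq.nlargest(preset_number, similarities, key=similarities.get)


similarities = {"q1": 0.42, "q2": 0.91, "q3": 0.15, "q4": 0.77}
to_display = select_questions_to_display(similarities, preset_number=2)
```

If fewer candidates exist than the preset number, `heapq.nlargest` simply returns all of them in ranked order.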
In some embodiments, the above method further comprises: acquiring a preset answer which is set in association with the question to be displayed; and sending the question to be displayed and the preset answer to the terminal.
In some embodiments, the above method further comprises: sending the questions to be displayed to the terminal, wherein the terminal displays the questions to be displayed to the user, receives confirmation information input by the user indicating which question to be displayed matches the target question, and returns the confirmation information; receiving the confirmation information; and returning to the terminal the preset answer associated with the question to be displayed that is indicated by the confirmation information.
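The confirmation exchange amounts to two server-side steps with a user choice in between. The sketch below simulates it in memory; the answer strings, function names, and the way the user's choice is represented are placeholders, since the application does not define a message format.

```python
# Hypothetical preset question -> preset answer pairs from a target library.
PRESET_ANSWERS = {
    "how many patent laws?": "preset answer A",
    "how many criminal laws?": "preset answer B",
}


def questions_to_send(ranked_questions, preset_number=2):
    """Server, step 1: send the top-ranked questions for the user to confirm."""
    return ranked_questions[:preset_number]


def answer_for_confirmation(confirmed_question):
    """Server, step 2: return the preset answer for the confirmed question."""
    if confirmed_question not in PRESET_ANSWERS:
        raise KeyError("confirmation does not refer to a known preset question")
    return PRESET_ANSWERS[confirmed_question]


shown = questions_to_send(["how many patent laws?", "how many criminal laws?"])
confirmed = shown[0]  # stands in for the choice the user makes on the terminal
answer = answer_for_confirmation(confirmed)
```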
In a second aspect, an embodiment of the present application provides an artificial intelligence-based information generating apparatus, where the apparatus includes: a first acquisition unit, configured to acquire an answer acquisition request sent by a terminal, wherein the answer acquisition request comprises a target application identifier and a target question input by a user; a determining unit, configured to determine, from a set of professional question-answer libraries and according to a preset correspondence between application identifiers and the professional question-answer libraries in the set, a target professional question-answer library corresponding to the target application identifier, wherein the target professional question-answer library comprises preset questions related to a target professional field, a target similarity calculation model is set in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between a preset question and the target question; a first selection unit, configured to select at least one candidate preset question from the target professional question-answer library according to keywords in the target question; and a generating unit, configured to generate, for each candidate preset question in the at least one candidate preset question, the similarity between the candidate preset question and the target question based on the target similarity calculation model.
In some embodiments, the target similarity calculation model comprises a word vector submodel and a similarity calculation submodel, wherein the input of the similarity calculation submodel comprises the output of the word vector submodel, the word vector submodel represents the correspondence between input text and word vectors, and the similarity calculation submodel represents the correspondence between a pair of word vectors and the similarity of the input texts corresponding to that pair.
In some embodiments, the target similarity calculation model represents the correspondence between a pair consisting of the target question and a preset question and the similarity between them; and the generating unit is further configured to: for each candidate preset question in the at least one candidate preset question, import the candidate preset question and the target question into the target similarity calculation model to generate the similarity between the candidate preset question and the target question.
In some embodiments, the generating unit is further configured to: import the target question into the word vector submodel to generate a first word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, acquire a second word vector, generated in advance by the word vector submodel, corresponding to the candidate preset question; and import the first word vector and the second word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
In some embodiments, the target professional question-answer library further includes preset answers set in association with the preset questions, and the target similarity calculation model represents the correspondence between, on the one hand, the target question, a preset question, and the preset answer set in association with that preset question and, on the other hand, the similarity between the preset question and the target question; and the generating unit is further configured to: for each candidate preset question in the at least one candidate preset question, import the candidate preset question, the preset answer set in association with the candidate preset question, and the target question into the target similarity calculation model to generate the similarity between the candidate preset question and the target question.
In some embodiments, the generating unit is further configured to: import the target question into the word vector submodel to generate a third word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, acquire a fourth word vector, generated in advance by the word vector submodel, corresponding to the candidate preset question together with the preset answer set in association with it; and import the third word vector and the fourth word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
In some embodiments, the target similarity calculation model is trained by the following steps: acquiring a universal sample set, wherein the universal sample comprises a universal corpus pair and indicating information, and the indicating information is used for indicating that the corpora in the corpus pair express the same meaning or do not express the same meaning; training a pre-established initial first neural network by using the universal sample set to obtain an initial second neural network; acquiring a target professional sample set, wherein the target professional sample comprises a target professional corpus pair and indicating information; and training the initial second neural network by using the target professional sample set to obtain the target similarity calculation model.
In some embodiments, the target similarity calculation model is trained by the following steps: acquiring a target professional sample set, wherein the target professional sample comprises a target professional corpus pair and indicating information, and the indicating information is used for indicating that the corpora in the corpus pair express the same meaning or do not express the same meaning; and training an initial third neural network by using the target professional sample set to obtain the target similarity calculation model.
In some embodiments, the generating unit is further configured to: for each candidate preset question in the at least one candidate preset question, generate a first similarity between the candidate preset question and the target question by using the target similarity calculation model; generate a second similarity between the candidate preset question and the target question by using a general similarity calculation model, wherein the general similarity calculation model is used for determining the similarity between a preset question and the target question; and perform a weighted summation of the first similarity and the second similarity according to preset weights to obtain the similarity between the candidate preset question and the target question.
In some embodiments, the general similarity calculation model is trained by the following steps: acquiring a universal sample set, wherein each universal sample comprises a universal corpus pair and indicating information, and the indicating information indicates whether the corpora in the corpus pair express the same meaning; and training a pre-established initial fourth neural network by using the universal sample set to obtain the general similarity calculation model.
In some embodiments, the above apparatus further comprises: a second selection unit, configured to select, from the at least one candidate preset question, a preset number of candidate preset questions in descending order of similarity as the questions to be displayed.
In some embodiments, the above apparatus further comprises: a second acquisition unit, configured to acquire the preset answer set in association with each question to be displayed; and a first sending unit, configured to send the question to be displayed and the preset answer to the terminal.
In some embodiments, the above apparatus further comprises: a second sending unit, configured to send the questions to be displayed to the terminal, wherein the terminal displays the questions to be displayed to the user, receives confirmation information input by the user indicating which question to be displayed matches the target question, and returns the confirmation information; a receiving unit, configured to receive the confirmation information; and a returning unit, configured to return to the terminal the preset answer associated with the question to be displayed that is indicated by the confirmation information.
In a third aspect, an embodiment of the present application provides a server, where the server includes: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, which when executed by a processor, implements the method according to the first aspect.
According to the information generation method and device based on artificial intelligence provided by the embodiments of the present application, an answer acquisition request sent by a terminal is acquired, wherein the answer acquisition request comprises a target application identifier and a target question input by a user; a target professional question-answer library corresponding to the target application identifier is determined from a set of professional question-answer libraries according to a preset correspondence between application identifiers and the professional question-answer libraries in the set, wherein the target professional question-answer library comprises preset questions related to a target professional field, a target similarity calculation model is set in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between a preset question and the target question; at least one candidate preset question is selected from the target professional question-answer library according to keywords in the target question; and, for each candidate preset question in the at least one candidate preset question, the similarity between the candidate preset question and the target question is generated based on the target similarity calculation model, which improves the accuracy of the generated information.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of an artificial intelligence based information generation method according to the present application;
FIG. 3 is a schematic diagram of an application scenario of an artificial intelligence based information generation method according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of an artificial intelligence based information generation method according to the present application;
FIG. 5 is a flow diagram of yet another embodiment of an artificial intelligence based information generation method according to the present application;
FIG. 6 is a flow diagram of yet another embodiment of an artificial intelligence based information generation method according to the present application;
FIG. 7 is an exemplary flow chart according to one implementation of the method shown in FIG. 6;
FIG. 8 is a flow diagram of yet another embodiment of an artificial intelligence based information generation method according to the present application;
FIG. 9 is an exemplary flow chart according to one implementation of the method shown in FIG. 8;
FIG. 10 is a schematic block diagram of one embodiment of an artificial intelligence based information generating apparatus according to the present application;
FIG. 11 is a block diagram of a computer system suitable for use in implementing a server according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the artificial intelligence based information generation method or apparatus of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various communication client applications, such as an intelligent question and answer application, a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a backend server providing support for intelligent question-and-answer applications on the terminal devices 101, 102, 103. The backend server may analyze and otherwise process data such as the received answer acquisition request, and feed back a processing result (e.g., a matched question and/or answer) to the terminal device.
It should be noted that the artificial intelligence based information generating method provided in the embodiment of the present application is generally executed by the server 105, and accordingly, the artificial intelligence based information generating apparatus is generally provided in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of an artificial intelligence based information generation method in accordance with the present application is illustrated. The information generation method based on artificial intelligence comprises the following steps:
Step 201, an answer acquisition request sent by a terminal is acquired.
In this embodiment, an electronic device on which the artificial intelligence based information generation method runs (for example, the server shown in fig. 1) may acquire the answer acquisition request sent by a terminal either locally or from another electronic device.
In this embodiment, the electronic device may receive the answer acquisition request directly from another electronic device; alternatively, it may receive the answer acquisition request from another electronic device, store the request locally, and then read the request from local storage.
In this embodiment, the answer acquisition request includes a target application identifier and a target question input by a user.
In this embodiment, the terminal may be a terminal with which the user inputs a question.
In this embodiment, the application identifier may identify the application in which the user inputs the question. By way of example, the application may be a medical application, a banking application, or the like. When using the application, the user can input a question in a question consultation window preset in the application, and the application may send the question to the server through the terminal.
Optionally, the question input by the user may be a question relating to a professional field. As an example, the question input by the user may be "how many total patent laws?".
Step 202, according to the preset corresponding relation between the application identifier and the professional question-answer library in the professional question-answer library set, determining a target professional question-answer library corresponding to the target application identifier from the professional question-answer library set.
In this embodiment, the electronic device may determine, from the professional question and answer library set, a target professional question and answer library corresponding to the target application identifier according to a preset correspondence between the application identifier and a professional question and answer library in the professional question and answer library set.
By way of example, the specialized question-answer libraries of the set of specialized question-answer libraries may include, but are not limited to, one or more of the following: a professional question and answer library in the legal field, a professional question and answer library in the medical field, a professional question and answer library in the financial field and the like.
As an example, if the target application identifier of the answer obtaining request is an application identifier of a legal application, the professional question and answer library in the legal field may be determined as the target professional question and answer library.
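As an illustration, the lookup in step 202 can be sketched as two mappings: one from application identifiers to fields, and one from fields to libraries. The identifiers `legal_app`, `medical_app` and `banking_app`, the class `ProfessionalQALibrary` and the function `find_target_library` are hypothetical names chosen for this sketch; the embodiment does not prescribe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class ProfessionalQALibrary:
    """One professional question-and-answer library (hypothetical structure)."""
    field_name: str
    # Maps each preset question to its associated preset answer.
    preset_qa: dict = field(default_factory=dict)


# Hypothetical professional question-and-answer library set, keyed by field.
LIBRARY_SET = {
    "legal": ProfessionalQALibrary(
        "legal", {"How many articles does the Patent Law have?": "67 articles"}
    ),
    "medical": ProfessionalQALibrary("medical"),
    "financial": ProfessionalQALibrary("financial"),
}

# Preset correspondence between application identifiers and libraries
# (the identifiers are assumptions made for this sketch).
APP_TO_FIELD = {
    "legal_app": "legal",
    "medical_app": "medical",
    "banking_app": "financial",
}


def find_target_library(target_app_id):
    """Return the library that corresponds to the target application identifier."""
    try:
        return LIBRARY_SET[APP_TO_FIELD[target_app_id]]
    except KeyError:
        raise LookupError(
            f"no professional Q&A library is registered for {target_app_id!r}"
        ) from None
```

A request carrying the identifier `legal_app` would then be served from the legal library, and an unknown identifier fails explicitly instead of falling back to an unrelated field.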
In this embodiment, the target professional question-and-answer library includes preset questions related to the target professional field.
Optionally, the target professional question and answer library further includes preset answers set in association with preset questions.
As an example, a professional question-and-answer library in the legal field may include preset questions related to the legal field, for example "How many articles does the Criminal Law have?", "How many articles does the Patent Law have?", and the like.
In this embodiment, the target professional question-answering library is provided with a target similarity calculation model in association.
As an example, the professional question-and-answer library in the legal field is associated with a similarity calculation model for legal questions, the professional question-and-answer library in the medical field is associated with a similarity calculation model for medical questions, and the professional question-and-answer library in the financial field is associated with a similarity calculation model for financial questions.
It should be noted that, for a given field, the similarity calculation model may be trained specifically on the corpus of that field. A model trained in this way matches questions in that field better, thereby improving the accuracy of question matching.
In the present embodiment, the target similarity calculation model described above is used to determine the similarity between the preset problem and the target problem.
It should be noted that the target similarity calculation model is not used for determining the similarity between the target question and the answer, but is used for determining the similarity between the preset question in the target professional question-answer library and the target question.
Step 203, selecting at least one candidate preset question from the target professional question-answer library according to the keywords in the target question.
In this embodiment, the electronic device may select at least one candidate preset question from the target professional question-and-answer library according to the keyword in the target question.
As an example, the target question may first be segmented into words, and keywords may then be extracted from the word segmentation result.
As an example, the keywords of the target question "How many articles does the Patent Law have in total?" may be "Patent Law, how many, articles".
Optionally, one or more preset questions that share more keywords with the target question may be selected from the target professional question-and-answer library by means of an inverted index to serve as candidate preset questions.
As an example, for the target question "How many articles does the Patent Law have in total?", three candidate preset questions may be selected: "What is the number of articles in the Patent Law?", "What is the effect of the Patent Law?" and "What is the number of words in the Patent Law?".
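As an illustration, candidate selection through an inverted index can be sketched as follows. Whitespace splitting and the small `STOPWORDS` set are crude stand-ins for real word segmentation and keyword extraction, and the function names and the `limit` parameter are assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a proper keyword extractor.
STOPWORDS = {"the", "does", "is", "of", "in", "what", "a"}


def extract_keywords(question):
    """Rough stand-in for word segmentation followed by keyword extraction."""
    tokens = question.lower().replace("?", " ").split()
    return [t for t in tokens if t not in STOPWORDS]


def build_inverted_index(preset_questions):
    """Map each keyword to the indices of the preset questions that contain it."""
    index = defaultdict(set)
    for i, question in enumerate(preset_questions):
        for keyword in extract_keywords(question):
            index[keyword].add(i)
    return index


def select_candidates(target_question, preset_questions, index, limit=3):
    """Return up to `limit` preset questions sharing the most keywords with the target."""
    hits = Counter()
    for keyword in set(extract_keywords(target_question)):
        for i in index.get(keyword, ()):
            hits[i] += 1
    # Sort by keyword overlap (descending), then by position for a stable order.
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [preset_questions[i] for i, _ in ranked[:limit]]
```

Because the index is built once for the whole library, each request only touches the posting lists of the keywords that actually occur in the target question, and preset questions with no shared keyword are never scored.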
Step 204, for each candidate preset question in the at least one candidate preset question, generating the similarity between the candidate preset question and the target question based on the target similarity calculation model.
In this embodiment, the electronic device may generate, for each candidate preset question of the at least one candidate preset question, a similarity between the candidate preset question and the target question based on the target similarity calculation model.
In some optional implementation manners of this embodiment, a predetermined number of candidate preset questions are selected as the questions to be presented according to the order from high similarity to low similarity from among the at least one candidate preset question.
As an example, for the target question "How many articles does the Patent Law have in total?", two questions to be presented may be selected: "What is the number of articles in the Patent Law?" and "What is the number of words in the Patent Law?".
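As an illustration, selecting a predetermined number of questions to be presented amounts to sorting the candidates by similarity. The function name `select_questions_to_present` and the similarity values used in the usage note are assumptions made for this sketch.

```python
def select_questions_to_present(similarities, count):
    """Return the `count` candidate questions with the highest similarity.

    `similarities` maps each candidate preset question to its similarity
    with the target question.
    """
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [question for question, _ in ranked[:count]]
```

With hypothetical similarities of 0.95, 0.30 and 0.60 for the three candidates and `count=2`, the question about the number of articles and the question about the number of words would be kept, in that order.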
In some optional implementations of this embodiment, the method further includes: acquiring a preset answer which is set in association with the question to be displayed; and sending the question to be displayed and the preset answer to the terminal.
As an example, the electronic device may obtain the preset answer set in association with the question to be presented "What is the number of articles in the Patent Law?", which may be, for example, "67 articles". The electronic device may also obtain the preset answer set in association with the question to be presented "What is the number of words in the Patent Law?", which may be, for example, "about nine thousand".
In some optional implementations of this embodiment, the method further includes: sending the question to be displayed to the terminal, wherein the terminal displays the question to be displayed to a user, receives confirmation information which is input by the user and used for indicating the question to be displayed and matched with the target question, and returns the confirmation information; receiving the confirmation information; and returning the preset answer associated with the question to be displayed and indicated by the confirmation information to the terminal.
As an example, the electronic device may send the questions to be presented "What is the number of articles in the Patent Law?" and "What is the number of words in the Patent Law?" to the terminal. The terminal displays the received questions to be presented. The terminal then receives confirmation information input by the user, confirming that "What is the number of articles in the Patent Law?" is the question to be presented that matches the target question. The terminal sends the confirmation information to the server. The server receives the confirmation information and returns the preset answer "67 articles" associated with "What is the number of articles in the Patent Law?" to the terminal. The terminal displays the preset answer "67 articles".
As an example, the target similarity calculation model may be a correspondence table that stores question pairs together with their similarities. After the electronic device receives the target question and determines the candidate preset questions, it can search the large number of stored question pairs for the pair that matches the target question and a candidate preset question, and determine the similarity stored for that pair as the similarity between the candidate preset question and the target question.
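As an illustration, the correspondence-table variant can be sketched as a dictionary keyed by question pairs. The class name `PairTableModel`, the text normalisation, the order-independent key and the default value for unknown pairs are all assumptions made for this sketch and are not specified by the embodiment.

```python
def normalize(question):
    """Lower-case the question and collapse punctuation and whitespace (an assumption)."""
    return " ".join(question.lower().replace("?", " ").split())


class PairTableModel:
    """Similarity 'model' backed by a table of stored question pairs."""

    def __init__(self, scored_pairs):
        # Each entry is (question_a, question_b, similarity); the key ignores order.
        self._table = {
            frozenset((normalize(a), normalize(b))): score
            for a, b, score in scored_pairs
        }

    def similarity(self, target_question, candidate_question, default=0.0):
        """Look up the stored similarity, or return `default` if the pair is unknown."""
        key = frozenset((normalize(target_question), normalize(candidate_question)))
        return self._table.get(key, default)
```

A table of this kind only answers for pairs that were stored beforehand, so in practice it covers frequently asked formulations rather than arbitrary new wording.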
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the artificial intelligence based information generation method according to the present embodiment. In the application scenario of fig. 3, a user first uses a terminal 301 to initiate an answer obtaining request 303 to a server 302. The server can then obtain the content of the answer obtaining request in the background, for example the target application identifier "legal application" and the target question "How many articles does the Patent Law have in total?". Next, the server determines the target professional question-and-answer library "legal field" corresponding to the target application identifier "legal application" from the professional question-and-answer library set, according to the preset correspondence between application identifiers and the professional question-and-answer libraries in the set. The server may then select at least one candidate preset question from the target professional question-and-answer library according to the keywords "Patent Law, how many, articles" in the target question, for example the three candidate preset questions "What is the number of articles in the Patent Law?", "What is the effect of the Patent Law?" and "What is the number of words in the Patent Law?". The server may then generate, for each candidate preset question, the similarity between the candidate preset question and the target question based on the target similarity calculation model. For example, the similarity between "What is the number of articles in the Patent Law?" and the target question may be 95%; the similarity between "What is the effect of the Patent Law?" and the target question may be 30%; and the similarity between "What is the number of words in the Patent Law?" and the target question may be 60%.
Finally, the server may return the answer 304 corresponding to the candidate preset question with the highest similarity, for example "67 articles", to the terminal.
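As an illustration, the last step of the scenario, returning the answer of the most similar candidate, can be sketched as follows. The function name `best_answer` is an assumption; the similarity values and the answers for the article count and word count follow the examples in this embodiment, while the answer text for the "effect" question is a placeholder because the embodiment does not give one.

```python
def best_answer(similarities, preset_answers):
    """Return the most similar candidate question and its preset answer."""
    best_question = max(similarities, key=similarities.get)
    return best_question, preset_answers[best_question]


similarities = {
    "What is the number of articles in the Patent Law?": 0.95,
    "What is the effect of the Patent Law?": 0.30,
    "What is the number of words in the Patent Law?": 0.60,
}
preset_answers = {
    "What is the number of articles in the Patent Law?": "67 articles",
    # Placeholder: this answer is not given in the embodiment.
    "What is the effect of the Patent Law?": "(placeholder answer)",
    "What is the number of words in the Patent Law?": "about nine thousand",
}
question, answer = best_answer(similarities, preset_answers)
```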
According to the method provided by the embodiment of the application, an answer obtaining request sent by a terminal is obtained, wherein the answer obtaining request comprises a target application identifier and a target question input by a user; determining a target professional question-answer library corresponding to the target application identifier from the professional question-answer library set according to a corresponding relation between a preset application identifier and a professional question-answer library in the professional question-answer library set, wherein the target professional question-answer library comprises preset problems related to a target professional field, a target similarity calculation model is arranged in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between the preset problems and the target problems; selecting at least one candidate preset question from the target professional question-answer library according to the keywords in the target question; for each candidate preset problem in the at least one candidate preset problem, the similarity between the candidate preset problem and the target problem is generated based on the target similarity calculation model, and the accuracy of the generated information is improved.
Referring to FIG. 4, a flow diagram 400 is shown illustrating one embodiment of an artificial intelligence based information generation method according to the present application. The information generation method based on artificial intelligence comprises the following steps:
Step 401, obtaining an answer obtaining request sent by a terminal.
In this embodiment, an electronic device (for example, the server shown in fig. 1) on which the artificial intelligence based information generation method runs may obtain the answer obtaining request sent by a terminal either locally or from another electronic device.
Step 402, according to the corresponding relation between the preset application identification and the professional question-answer library in the professional question-answer library set, determining a target professional question-answer library corresponding to the target application identification from the professional question-answer library set.
In this embodiment, the electronic device may determine, from the professional question and answer library set, a target professional question and answer library corresponding to the target application identifier according to a preset correspondence between the application identifier and a professional question and answer library in the professional question and answer library set.
In this embodiment, the target similarity calculation model is used to represent the correspondence between the target problem and the preset problem and the similarity between the target problem and the preset problem.
Step 403, selecting at least one candidate preset question from the target professional question-answer library according to the keywords in the target question.
In this embodiment, the electronic device may select at least one candidate preset question from the target professional question-and-answer library according to the keyword in the target question.
Step 404, for each candidate preset question in the at least one candidate preset question, importing the candidate preset question and the target question into the target similarity calculation model, and generating the similarity between the candidate preset question and the target question.
In some optional implementations of the present embodiment, the target similarity calculation model may include a word vector sub-model and a similarity calculation sub-model.
In this implementation, the word vector sub-model is used to represent the correspondence between input text and word vectors.
In this implementation, the similarity calculation sub-model is used to represent the correspondence between word vectors and the similarity between the input texts corresponding to those word vectors.
In this implementation, the word vector sub-model may be obtained by training an initial model using a text sample set. Here, a text sample includes a text and a word vector. The initial model may be a bag-of-words model, a convolutional neural network model, a long short-term memory model, etc.
By way of example, an initial model may refer to a model that is untrained or not fully trained. The initial model may be provided with initial parameters that are continuously adjusted during the training process.
In this implementation, the similarity calculation sub-model may be obtained by training an initial neural network using a word vector sample set. Here, a word vector sample includes a word vector pair and the similarity between the word vectors of the pair.
In this implementation, the initial neural network may be any of various neural networks, such as a convolutional neural network, a recurrent neural network, a long short-term memory neural network, and the like.
By way of example, an initial neural network may refer to a neural network that is untrained or not fully trained. Each layer of the initial neural network may be provided with initial parameters, which are continuously adjusted during the training process.
As an example, the initial neural network may be any of various types of untrained or not fully trained artificial neural networks, or a model obtained by combining several such networks. For example, the initial neural network may be an untrained convolutional neural network, an untrained recurrent neural network, or a model obtained by combining an untrained convolutional neural network, an untrained recurrent neural network, and an untrained fully-connected layer.
In this implementation, the word vector sub-model and the similarity calculation sub-model may be trained together or separately.
In this implementation, the input of the similarity calculation sub-model includes the output of the word vector sub-model.
As an example, the input of the similarity calculation sub-model may include the word vector corresponding to the target question output by the word vector sub-model, and may further include the word vector corresponding to the candidate preset question. It should be noted that the word vector corresponding to the candidate preset question need not be output by the word vector sub-model.
In this implementation, step 404 may be implemented as follows: importing the target question into the word vector sub-model to generate a first word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, obtaining a second word vector corresponding to the candidate preset question, generated in advance by the word vector sub-model; and importing the first word vector and the second word vector into the similarity calculation sub-model to generate the similarity between the candidate preset question and the target question.
It should be noted that, for each preset question in the target question-answering library, before implementing the method shown in this embodiment, the word vector submodel may be used to generate the word vector of the preset question in advance, so as to avoid calculating the word vector of the candidate preset question during the implementation of the method shown in this embodiment, thereby increasing the calculation speed.
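As an illustration, the split between a word vector sub-model and a similarity calculation sub-model, with vectors of the preset questions computed in advance, can be sketched as follows. Bag-of-words counts and cosine similarity are simple stand-ins swapped in for the trained sub-models described above; the function names are assumptions made for this sketch.

```python
import math
from collections import Counter


def word_vector(text):
    """Stand-in for the word vector sub-model: sparse bag-of-words counts."""
    return Counter(text.lower().replace("?", " ").split())


def similarity_submodel(vec_a, vec_b):
    """Stand-in for the similarity calculation sub-model: cosine similarity."""
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def precompute_vectors(preset_questions):
    """Offline step: one 'second word vector' per preset question."""
    return {question: word_vector(question) for question in preset_questions}


def score_candidates(target_question, candidates, cache):
    """Online step: only the target question is vectorised per request."""
    first_vector = word_vector(target_question)
    return {c: similarity_submodel(first_vector, cache[c]) for c in candidates}
```

The cache returned by `precompute_vectors` is built once for the whole library, so the per-request cost is one vectorisation plus one similarity evaluation per candidate.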
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the artificial intelligence based information generation method in the present embodiment highlights the step of calculating the similarity using the candidate preset problem and the above target problem as inputs. Therefore, the scheme described in the embodiment can enrich the way of generating information.
Referring to FIG. 5, a flow 500 of one embodiment of an artificial intelligence based information generation method according to the present application is shown. The information generation method based on artificial intelligence comprises the following steps:
Step 501, obtaining an answer obtaining request sent by a terminal.
In this embodiment, an electronic device (for example, the server shown in fig. 1) on which the artificial intelligence based information generation method runs may obtain the answer obtaining request sent by a terminal either locally or from another electronic device.
Step 502, determining a target professional question-answer library corresponding to the target application identifier from the professional question-answer library set according to the preset corresponding relation between the application identifier and the professional question-answer library in the professional question-answer library set.
In this embodiment, the target specialized question-and-answer library further includes a preset answer set in association with a preset question.
In this embodiment, the target similarity calculation model is used to represent a corresponding relationship between a target question, a preset answer associated with the preset question, and a similarity between the preset question and the target question.
Step 503, selecting at least one candidate preset question from the target professional question-answer library according to the keywords in the target question.
In this embodiment, the electronic device may select at least one candidate preset question from the target professional question-and-answer library according to the keyword in the target question.
Step 504, for each candidate preset question of at least one candidate preset question, importing the candidate preset question, a preset answer associated with the candidate preset question, and the target question into the target similarity calculation model, and generating a similarity between the candidate preset question and the target question.
It should be noted that, by using the preset answer as auxiliary information of the candidate preset question and inputting the auxiliary information into the target similarity calculation model together with the candidate preset question, the candidate preset question more similar to the target question can be determined.
As an example, the preset answer may repeat a keyword in the question. For example, for the target question "How many articles does the Patent Law have in total?", using the preset answer "67 articles" as auxiliary information can raise the similarity between the candidate preset question "What is the number of articles in the Patent Law?" and the target question.
It should be noted that, when the preset answer is input into the target similarity calculation model together with the candidate preset question as auxiliary information, the similarity between the candidate preset question and the target question may be improved because the preset answer matches the preset question more closely.
As an example, some preset questions, although having different meanings, can be answered with the same answer, which generally occurs because the answer content is large and covers several different aspects. In this case, a more similar question may be found due to a more appropriate answer.
In some optional implementations of the present embodiment, the target similarity calculation model may include a word vector sub-model and a similarity calculation sub-model.
In this implementation, the input of the similarity calculation sub-model includes the output of the word vector sub-model.
As an example, the input of the similarity calculation sub-model may include the word vector corresponding to the target question output by the word vector sub-model, and may further include the word vector corresponding to the candidate preset question and its preset answer. It should be noted that the word vector corresponding to the candidate preset question and the preset answer need not be output by the word vector sub-model.
In this implementation, the word vector submodel is used to represent the correspondence between the input text and the word vector.
In this implementation, the similarity calculation sub-model is used to represent the correspondence between word vectors and the similarity between the input texts corresponding to those word vectors.
In some optional implementations of the embodiment, step 504 may be implemented as follows: importing the target question into the word vector sub-model to generate a third word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, obtaining a fourth word vector, generated in advance by the word vector sub-model, corresponding to the candidate preset question and the preset answer set in association with it; and importing the third word vector and the fourth word vector into the similarity calculation sub-model to generate the similarity between the candidate preset question and the target question.
In this implementation, each preset question and its associated preset answer are spliced together in advance to obtain a question-answer splicing result, and the word vector sub-model is then used to generate the word vector of the question-answer splicing result in advance.
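As an illustration, splicing each preset question with its preset answer before vectorisation can be sketched as follows. The bag-of-words `word_vector` function is a simple stand-in for the trained word vector sub-model, and the separator and function names are assumptions made for this sketch.

```python
from collections import Counter


def splice(question, answer, separator=" "):
    """Concatenate a preset question and its preset answer into one text."""
    return f"{question}{separator}{answer}"


def word_vector(text):
    """Stand-in for the word vector sub-model: sparse bag-of-words counts."""
    return Counter(text.lower().replace("?", " ").split())


def precompute_spliced_vectors(preset_qa):
    """Offline step: one 'fourth word vector' per question-answer splicing result."""
    return {
        question: word_vector(splice(question, answer))
        for question, answer in preset_qa.items()
    }
```

In this toy representation, the answer "67 articles" adds a second occurrence of "articles" to the vector of its question, which is the kind of keyword repetition that lets the answer act as auxiliary information.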
As can be seen from fig. 5, compared with the embodiment corresponding to fig. 2, the flow 500 of the artificial intelligence based information generating method in the present embodiment highlights a step of generating the similarity using the candidate preset question, the preset answer, and the target question as input. Therefore, the scheme described in the embodiment can introduce the preset answers as auxiliary information and determine the similarity, so that the information generation mode is enriched, and the accuracy of the generated information can be further improved.
Referring to FIG. 6, a flow 600 of one embodiment of an artificial intelligence based information generation method according to the present application is shown. The information generation method based on artificial intelligence comprises the following steps:
Step 601, obtaining an answer obtaining request sent by a terminal.
In this embodiment, an electronic device (for example, the server shown in fig. 1) on which the artificial intelligence based information generation method runs may obtain the answer obtaining request sent by a terminal either locally or from another electronic device.
Step 602, according to the preset corresponding relationship between the application identifier and the professional question-answer library in the professional question-answer library set, determining a target professional question-answer library corresponding to the target application identifier from the professional question-answer library set.
In the present embodiment, referring to fig. 7, the target similarity calculation model is obtained by the training of the flow 700 shown in fig. 7:
Step 701, acquiring a general sample set.
In this implementation, a general sample includes a general corpus pair and indication information, and the indication information is used to indicate whether or not the corpora in the corpus pair express the same meaning.
By way of example, a general corpus pair may include "What day of the week is it today?" and "Which day of the week is today?", with the indication information indicating that both express the same meaning.
Step 702, training a pre-established initial first neural network by using the general sample set to obtain an initial second neural network.
In this implementation, the initial first neural network may be any of various neural networks, such as a convolutional neural network, a recurrent neural network, a long short-term memory neural network, and the like.
As an example, the initial first neural network may refer to a neural network that is untrained or not fully trained. Each layer of the initial first neural network may be provided with initial parameters, which are continuously adjusted during the training process.
As an example, the initial first neural network may be any of various types of untrained or not fully trained artificial neural networks, or a model obtained by combining several such networks. For example, the initial first neural network may be an untrained convolutional neural network, an untrained recurrent neural network, or a model obtained by combining an untrained convolutional neural network, an untrained recurrent neural network, and an untrained fully-connected layer.
Step 703, acquiring a target professional sample set.
Here, the target professional sample includes a target professional corpus pair and indication information.
As an example, a target professional corpus pair may include "What is the number of articles in the Patent Law?" and "How many articles does the Patent Law have in total?", with the indication information indicating that both express the same meaning.
Step 704, training the initial second neural network by using the target professional sample set to obtain the target similarity calculation model.
It should be noted that, because the general model is fine-tuned (warm-started) with the professional corpus, the target similarity calculation model can recognize both general-language sentences and professional-language sentences, which improves recognition accuracy.
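As an illustration, the two-stage training of flow 700 can be sketched with a deliberately tiny model. A logistic regression over a single token-overlap feature is swapped in here for the neural networks described above, purely to show the warm start: the second training stage begins from the parameters produced by the first. All function names, sample sentences, the learning rate and the number of epochs are assumptions made for this sketch.

```python
import math


def overlap_feature(a, b):
    """Jaccard overlap of the two token sets; a toy stand-in for learned features."""
    ta = set(a.lower().replace("?", " ").split())
    tb = set(b.lower().replace("?", " ").split())
    return len(ta & tb) / len(ta | tb)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(params, a, b):
    """Probability that the two sentences express the same meaning."""
    weight, bias = params
    return sigmoid(weight * overlap_feature(a, b) + bias)


def mean_loss(params, samples):
    """Mean binary cross-entropy over (sentence_a, sentence_b, label) samples."""
    total = 0.0
    for a, b, label in samples:
        p = predict(params, a, b)
        total -= math.log(p) if label else math.log(1.0 - p)
    return total / len(samples)


def train(params, samples, epochs=300, lr=0.5):
    """Full-batch gradient descent that starts from `params`, not from scratch."""
    weight, bias = params
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for a, b, label in samples:
            err = predict((weight, bias), a, b) - label
            grad_w += err * overlap_feature(a, b)
            grad_b += err
        weight -= lr * grad_w / len(samples)
        bias -= lr * grad_b / len(samples)
    return (weight, bias)


# Hypothetical samples: (sentence_a, sentence_b, 1 if same meaning else 0).
GENERAL_SAMPLES = [
    ("What day of the week is it today?", "Which day of the week is today?", 1),
    ("What day of the week is it today?", "How is the weather today?", 0),
]
PROFESSIONAL_SAMPLES = [
    ("How many articles does the Patent Law have?",
     "How many articles are there in the Patent Law in total?", 1),
    ("How many articles does the Patent Law have?",
     "What is the effect of the Patent Law?", 0),
]

# Stages 701-702: train on the general sample set from initial parameters.
general_params = train((0.0, 0.0), GENERAL_SAMPLES)
# Stages 703-704: continue training on the professional sample set (warm start).
target_params = train(general_params, PROFESSIONAL_SAMPLES)
```

The only structural point of the sketch is the second call to `train`, which receives `general_params` rather than fresh initial parameters; a real system would apply the same pattern to the weights of a neural network.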
Step 603, selecting at least one candidate preset question from the target professional question-answer library according to the keywords in the target question.
In this embodiment, the electronic device may select at least one candidate preset question from the target professional question-and-answer library according to the keyword in the target question.
Step 604, for each candidate preset question in the at least one candidate preset question, generating the similarity between the candidate preset question and the target question by using the target similarity calculation model.
In this embodiment, the electronic device may generate, for each of the at least one candidate preset question, the similarity between the candidate preset question and the target question by using the target similarity calculation model.
As can be seen from fig. 6 and 7, compared with the embodiment corresponding to fig. 2, the flow 600 of the artificial intelligence based information generation method in this embodiment highlights the step of training the model by using the general sample set and then the target professional sample set. Therefore, the scheme described in the embodiment can improve the accuracy of the information output by the target similarity calculation model.
Referring to FIG. 8, a flow diagram 800 illustrating one embodiment of an artificial intelligence based information generation method in accordance with the present application is shown. The information generation method based on artificial intelligence comprises the following steps:
Step 801, obtaining an answer obtaining request sent by a terminal.
In this embodiment, an electronic device (for example, the server shown in fig. 1) on which the artificial intelligence based information generation method runs may obtain the answer obtaining request sent by a terminal either locally or from another electronic device.
Step 802, determining a target professional question-answer library corresponding to the target application identifier from the professional question-answer library set according to the preset corresponding relation between the application identifier and the professional question-answer library in the professional question-answer library set.
In this embodiment, the electronic device may determine, from the professional question and answer library set, a target professional question and answer library corresponding to the target application identifier according to a preset correspondence between the application identifier and a professional question and answer library in the professional question and answer library set.
In this embodiment, referring to fig. 9, the target similarity calculation model is obtained by a process 900:
Step 901, acquiring a target professional sample set.
Here, a target professional sample includes a target professional corpus pair and indication information indicating whether or not the corpora in the corpus pair express the same meaning.
Step 902, training an initial third neural network by using the target professional sample set to obtain the target similarity calculation model.
It should be noted that the target similarity calculation model obtained by training the initial third neural network with the target professional sample set, i.e., the target professional corpus, has a better processing capability for the problems related to the target specialty.
In this implementation, the initial third neural network may be any of various neural networks, such as a convolutional neural network, a recurrent neural network, a long short-term memory neural network, and the like.
By way of example, the initial third neural network may refer to a neural network that is untrained or not fully trained. Each layer of the initial third neural network may be provided with initial parameters, which are continuously adjusted during the training process.
As an example, the initial third neural network may be any of various types of untrained or not fully trained artificial neural networks, or a model obtained by combining several such networks. For example, the initial third neural network may be an untrained convolutional neural network, an untrained recurrent neural network, or a model obtained by combining an untrained convolutional neural network, an untrained recurrent neural network, and an untrained fully-connected layer.
Step 804, for each candidate preset question in the at least one candidate preset question, generating a first similarity between the candidate preset question and the target question by using the target similarity calculation model.
In this embodiment, the electronic device may generate, for each candidate preset question of the at least one candidate preset question, a first similarity between the candidate preset question and the target question using the target similarity calculation model.
Step 805, for each candidate preset question in the at least one candidate preset question, generating a second similarity between the candidate preset question and the target question by using the general similarity calculation model.
In this embodiment, the electronic device may generate, for each candidate preset question of the at least one candidate preset question, a second similarity between the candidate preset question and the target question using the general similarity calculation model.
In this embodiment, the above-mentioned general similarity calculation model is used to determine the similarity between a preset question and the target question.
In this embodiment, the general similarity calculation model is obtained by training through the following steps: acquiring a universal sample set, wherein the universal sample comprises a universal corpus pair and indicating information, and the indicating information is used for indicating whether the corpora in the corpus pair express the same meaning; and training a pre-established initial fourth neural network by using the universal sample set to obtain the universal similarity calculation model.
It should be noted that the general similarity calculation model obtained by training the initial fourth neural network with the universal sample set, i.e., the universal corpus, has a better processing capability for questions expressed in general language.
In this embodiment, the initial fourth neural network may be any of various neural networks, such as a convolutional neural network, a recurrent neural network, a long short-term memory neural network, and the like.
As an example, the initial fourth neural network may be a neural network that is untrained or whose training has not been completed. Each layer of the initial fourth neural network may be provided with initial parameters, which may be adjusted continuously during the training process.
As an example, the initial fourth neural network may be any of various types of untrained or not fully trained artificial neural networks, or a model obtained by combining several such networks. For example, the initial fourth neural network may be an untrained convolutional neural network, an untrained recurrent neural network, or a model obtained by combining an untrained convolutional neural network, an untrained recurrent neural network, and an untrained fully-connected layer.
Step 806, for each candidate preset question in the at least one candidate preset question, performing weighted summation on the first similarity and the second similarity according to a preset weight to obtain the similarity between the candidate preset question and the target question.
In this embodiment, the electronic device may perform, for each candidate preset question of the at least one candidate preset question, a weighted summation of the first similarity and the second similarity according to a preset weight, so as to obtain a similarity between the candidate preset question and the target question.
It should be noted that, by using both the general similarity calculation model and the target similarity calculation model to determine the final similarity, both the similarity of the two questions in general language and their similarity in professional language are taken into account, which can improve the accuracy of the generated similarity.
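As a non-limiting illustration of steps 804 to 806, the weighted summation can be sketched as follows. The two models are passed in as plain callables, and the weight values 0.7 and 0.3 are assumed for illustration; this application only states that the weights are preset.

```python
from typing import Callable, Dict, List

# A similarity model maps (candidate preset question, target question) to a score.
SimilarityModel = Callable[[str, str], float]


def final_similarities(
    target_question: str,
    candidates: List[str],
    target_model: SimilarityModel,
    general_model: SimilarityModel,
    target_weight: float = 0.7,   # assumed preset weight for the first similarity
    general_weight: float = 0.3,  # assumed preset weight for the second similarity
) -> Dict[str, float]:
    """Combine the first and second similarity of every candidate by weighted sum."""
    scores: Dict[str, float] = {}
    for candidate in candidates:
        first = target_model(candidate, target_question)    # step 804
        second = general_model(candidate, target_question)  # step 805
        scores[candidate] = target_weight * first + general_weight * second  # step 806
    return scores
```

A larger weight on the first similarity favors the professional model; the appropriate ratio would depend on how much professional training data is available.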
As can be seen from fig. 8 and fig. 9, compared with the embodiment corresponding to fig. 2, the flow 800 of the artificial intelligence based information generation method in this embodiment highlights the step of determining the final similarity by using both the general similarity calculation model and the target similarity calculation model. Therefore, the scheme described in this embodiment enriches the ways in which information can be generated and can improve the accuracy of the generated similarity.
With further reference to fig. 10, as an implementation of the method shown in the above-mentioned figures, the present application provides an embodiment of an artificial intelligence based information generating apparatus, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied in various electronic devices.
As shown in fig. 10, the artificial intelligence based information generating apparatus 1000 according to the present embodiment includes: a first obtaining unit 1001, a determining unit 1002, a first selection unit 1003, and a generating unit 1004. The first obtaining unit is configured to obtain an answer acquisition request sent by a terminal, wherein the answer acquisition request includes a target application identifier and a target question input by a user; the determining unit is configured to determine, from a professional question-answer library set, a target professional question-answer library corresponding to the target application identifier according to a preset correspondence between application identifiers and professional question-answer libraries in the professional question-answer library set, wherein the target professional question-answer library includes preset questions related to a target professional field, a target similarity calculation model is set in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between a preset question and the target question; the first selection unit is configured to select at least one candidate preset question from the target professional question-answer library according to the keywords in the target question; and the generating unit is configured to generate, for each candidate preset question in the at least one candidate preset question, the similarity between the candidate preset question and the target question based on the target similarity calculation model.
In some optional implementations of this embodiment, the target similarity calculation model includes a word vector submodel and a similarity calculation submodel, where an input of the similarity calculation submodel includes an output of the word vector submodel, the word vector submodel is used to represent a correspondence between an input text and a word vector, and the similarity calculation submodel is used to represent a correspondence between a word vector pair and the similarity of the input texts corresponding to the word vector pair.
In some optional implementations of this embodiment, the target similarity calculation model is used to represent a correspondence between, on the one hand, the target question and a preset question and, on the other hand, the similarity between the target question and the preset question; and the generating unit is further configured to: for each candidate preset question in the at least one candidate preset question, import the candidate preset question and the target question into the target similarity calculation model, and generate the similarity between the candidate preset question and the target question.
In some optional implementations of this embodiment, the generating unit is further configured to: import the target question into the word vector submodel to generate a first word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, acquire a second word vector corresponding to the candidate preset question, the second word vector having been generated in advance by the word vector submodel; and import the first word vector and the second word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
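As a non-limiting illustration, the following sketch precomputes the second word vectors for the preset questions once and computes only the first word vector at request time. The hashed bag-of-words vector and the cosine similarity used here are assumed stand-ins for the trained word vector submodel and the similarity calculation submodel; the dimension and the sample questions are likewise hypothetical.

```python
import math
import zlib
from typing import Dict, List

DIM = 64  # assumed vector dimension


def word_vector_submodel(text: str) -> List[float]:
    """Stand-in word vector submodel: hashed bag-of-words counts."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    return vec


def similarity_submodel(u: List[float], v: List[float]) -> float:
    """Stand-in similarity calculation submodel: cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


PRESET_QUESTIONS = [
    "how do i reset my password",
    "how do i change my phone number",
    "what is the loan interest rate",
]

# Second word vectors, generated in advance for every preset question.
PRECOMPUTED: Dict[str, List[float]] = {
    q: word_vector_submodel(q) for q in PRESET_QUESTIONS
}


def score_candidates(target_question: str, candidates: List[str]) -> Dict[str, float]:
    """Compute the first word vector once, then compare with cached vectors."""
    first_vector = word_vector_submodel(target_question)
    return {q: similarity_submodel(first_vector, PRECOMPUTED[q]) for q in candidates}
```

Caching the second word vectors means that only one pass through the word vector submodel is needed per request, regardless of the number of candidates.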
In some optional implementations of this embodiment, the target professional question-answer library further includes preset answers set in association with the preset questions, and the target similarity calculation model is used for representing a correspondence between, on the one hand, the target question, a preset question, and the preset answer set in association with the preset question and, on the other hand, the similarity between the preset question and the target question; and the generating unit is further configured to: for each candidate preset question in the at least one candidate preset question, import the candidate preset question, the preset answer set in association with the candidate preset question, and the target question into the target similarity calculation model, and generate the similarity between the candidate preset question and the target question.
In some optional implementations of this embodiment, the generating unit is further configured to: import the target question into the word vector submodel to generate a third word vector corresponding to the target question; for each candidate preset question in the at least one candidate preset question, acquire a fourth word vector, generated in advance by the word vector submodel, corresponding to the candidate preset question and the answer set in association with the candidate preset question; and import the third word vector and the fourth word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
In some optional implementations of the present embodiment, the target similarity calculation model is obtained by training through the following steps: acquiring a universal sample set, wherein the universal sample comprises a universal corpus pair and indicating information, and the indicating information is used for indicating that the corpora in the corpus pair express the same meaning or do not express the same meaning; training a pre-established initial first neural network by using the universal sample set to obtain an initial second neural network; acquiring a target professional sample set, wherein the target professional sample comprises a target professional corpus pair and indicating information; and training the initial second neural network by using the target professional sample set to obtain the target similarity calculation model.
In some optional implementations of the present embodiment, the target similarity calculation model is obtained by training through the following steps: acquiring a target professional sample set, wherein the target professional sample comprises a target professional corpus pair and indicating information, and the indicating information is used for indicating that the corpora in the corpus pair express the same meaning or do not express the same meaning; and training an initial third neural network by using the target professional sample set to obtain the target similarity calculation model.
In some optional implementations of this embodiment, the generating unit is further configured to: for each candidate preset question in the at least one candidate preset question, generate a first similarity between the candidate preset question and the target question by using the target similarity calculation model; generate a second similarity between the candidate preset question and the target question by using a general similarity calculation model, wherein the general similarity calculation model is used for determining the similarity between a preset question and the target question; and perform, according to a preset weight, weighted summation on the first similarity and the second similarity to obtain the similarity between the candidate preset question and the target question.
In some optional implementation manners of this embodiment, the general similarity calculation model is obtained by training through the following steps: acquiring a universal sample set, wherein the universal sample comprises a universal corpus pair and indicating information, and the indicating information is used for indicating whether the corpora in the corpus pair express the same meaning; and training a pre-established initial fourth neural network by using the universal sample set to obtain the universal similarity calculation model.
In some optional implementations of this embodiment, the apparatus further includes: and a second selecting unit (not shown) configured to select a predetermined number of candidate preset questions as the questions to be presented according to the order from high similarity to low similarity from the at least one candidate preset question.
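As a non-limiting illustration, selecting a predetermined number of candidate preset questions in descending order of similarity can be sketched as below. The function name and the default of three questions are assumptions.

```python
from typing import Dict, List


def select_questions_to_present(
    similarities: Dict[str, float], predetermined_number: int = 3
) -> List[str]:
    """Return the highest-scoring candidate preset questions, best first."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [question for question, _ in ranked[:predetermined_number]]
```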
In some optional implementations of this embodiment, the apparatus further includes: a second obtaining unit (not shown) for obtaining a preset answer set in association with the question to be presented; a first sending unit (not shown) for sending the question to be presented and the preset answer to the terminal.
In some optional implementations of this embodiment, the apparatus further includes: a second sending unit (not shown) configured to send the question to be presented to the terminal, where the terminal presents the question to be presented to a user, receives confirmation information input by the user and indicating a question to be presented that matches the target question, and returns the confirmation information; a receiving unit (not shown) for receiving the above-mentioned confirmation information; a returning unit (not shown) for returning the preset answer associated with the question to be presented, indicated by the confirmation information, to the terminal.
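As a non-limiting illustration, the server-side handling of the confirmation information can be sketched as follows. It is assumed here that the confirmation information is the zero-based index of the presented question that the user confirmed; this application does not specify its encoding, and the function and parameter names are hypothetical.

```python
from typing import Dict, List


def answer_for_confirmation(
    questions_to_present: List[str],
    confirmation_index: int,
    preset_answers: Dict[str, str],
) -> str:
    """Return the preset answer associated with the question the user confirmed."""
    if not 0 <= confirmation_index < len(questions_to_present):
        raise IndexError("confirmation does not refer to a presented question")
    confirmed_question = questions_to_present[confirmation_index]
    return preset_answers[confirmed_question]
```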
In this embodiment, specific processes of the first obtaining unit 1001, the determining unit 1002, the first selecting unit 1003, and the generating unit 1004 and technical effects thereof may refer to relevant descriptions of step 201, step 202, step 203, and step 204 in the corresponding embodiment of fig. 2, which are not described herein again.
It should be noted that, for details of implementation and technical effects of each unit in the artificial intelligence based information generating apparatus provided in this embodiment, reference may be made to descriptions of other embodiments in this application, and details are not described herein again.
Referring now to fig. 11, a block diagram of a computer system 1100 suitable for implementing a server according to embodiments of the present application is shown. The server shown in fig. 11 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 11, the computer system 1100 includes a Central Processing Unit (CPU) 1101, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 1102 or a program loaded from a storage section 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data necessary for the operation of the system 1100 are also stored. The CPU 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to the bus 1104.
The following components are connected to the I/O interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output section 1107 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 1108 including a hard disk and the like; and a communication section 1109 including a network interface card such as a LAN card, a modem, or the like. The communication section 1109 performs communication processing via a network such as the internet. A drive 1110 is also connected to the I/O interface 1105 as necessary. A removable medium 1111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1110 as necessary, so that a computer program read out therefrom is installed into the storage section 1108 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 1109 and/or installed from the removable medium 1111. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 1101.
It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including a first obtaining unit, a determining unit, a first selection unit, and a generating unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the first obtaining unit may also be described as a "unit that obtains an answer acquisition request sent by a terminal".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquiring an answer acquisition request sent by a terminal, wherein the answer acquisition request comprises a target application identifier and a target question input by a user; determining a target professional question-answer library corresponding to the target application identifier from the professional question-answer library set according to a corresponding relation between a preset application identifier and a professional question-answer library in the professional question-answer library set, wherein the target professional question-answer library comprises preset problems related to a target professional field, a target similarity calculation model is arranged in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between the preset problems and the target problems; selecting at least one candidate preset question from the target professional question-answer library according to the keywords in the target question; for each candidate preset question in the at least one candidate preset question, generating the similarity between the candidate preset question and the target question based on the target similarity calculation model.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (14)

1. An artificial intelligence based information generation method, comprising:
acquiring an answer acquisition request sent by a terminal, wherein the answer acquisition request comprises a target application identifier and a target question input by a user;
determining, from a professional question-answer library set, a target professional question-answer library corresponding to the target application identifier according to a preset correspondence between application identifiers and professional question-answer libraries in the professional question-answer library set, wherein the target professional question-answer library comprises preset questions related to a target professional field, a target similarity calculation model is set in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between a preset question and the target question;
selecting at least one candidate preset question from the target professional question-answer library according to the keywords in the target question;
for each candidate preset question in the at least one candidate preset question, generating a similarity between the candidate preset question and the target question based on the target similarity calculation model;
wherein the generating, for each candidate preset question in the at least one candidate preset question, a similarity between the candidate preset question and the target question based on the target similarity calculation model comprises:
generating a first similarity between each candidate preset question and the target question by using the target similarity calculation model, wherein the target similarity calculation model is obtained by training an initial third neural network with a target professional sample set, the target professional samples in the target professional sample set comprise target professional corpus pairs and indication information, and the indication information is used for indicating whether or not the corpora in the corpus pair express the same meaning;
generating a second similarity between each candidate preset question and the target question by using a general similarity calculation model, wherein the general similarity calculation model is used for determining the similarity between a preset question and the target question;
and performing, according to a preset weight, weighted summation on the first similarity and the second similarity to obtain the similarity between each candidate preset question and the target question.
2. The method of claim 1, wherein the target similarity calculation model comprises a word vector submodel and a similarity calculation submodel, wherein an input of the similarity calculation submodel comprises an output of the word vector submodel, the word vector submodel is used for characterizing a correspondence between input text and word vectors, and the similarity calculation submodel is used for characterizing a correspondence between a word vector pair and the similarity of the input texts to which the word vector pair corresponds.
3. The method according to claim 2, wherein the target similarity calculation model is used for characterizing the correspondence between, on the one hand, the target question and a preset question and, on the other hand, the similarity between the target question and the preset question; and
for each candidate preset question in the at least one candidate preset question, generating a similarity between the candidate preset question and the target question based on the target similarity calculation model, including:
and for each candidate preset question in the at least one candidate preset question, importing the candidate preset question and the target question into the target similarity calculation model, and generating the similarity between the candidate preset question and the target question.
4. The method according to claim 3, wherein the importing, for each candidate preset question of the at least one candidate preset question, the candidate preset question and the target question into the target similarity calculation model to generate the similarity between the candidate preset question and the target question comprises:
importing the target question into the word vector submodel to generate a first word vector corresponding to the target question;
for each candidate preset question in the at least one candidate preset question, acquiring a second word vector corresponding to the candidate preset question, the second word vector having been generated in advance by the word vector submodel; and importing the first word vector and the second word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
5. The method according to claim 2, wherein the target professional question-answer library further comprises preset answers set in association with the preset questions, and the target similarity calculation model is used for characterizing the correspondence between, on the one hand, the target question and the preset answers set in association with the preset questions and, on the other hand, the similarity between the preset questions and the target question; and
for each candidate preset question in the at least one candidate preset question, generating a similarity between the candidate preset question and the target question based on the target similarity calculation model, including:
and for each candidate preset question in the at least one candidate preset question, importing the candidate preset question, a preset answer set in association with the candidate preset question and the target question into the target similarity calculation model, and generating the similarity between the candidate preset question and the target question.
6. The method according to claim 5, wherein the importing, for each candidate preset question of the at least one candidate preset question, the candidate preset question, a preset answer set in association with the candidate preset question, and the target question into the target similarity calculation model to generate a similarity between the candidate preset question and the target question comprises:
importing the target question into the word vector submodel to generate a third word vector corresponding to the target question;
for each candidate preset question in the at least one candidate preset question, acquiring a fourth word vector, generated in advance by the word vector submodel, corresponding to the candidate preset question and the answer set in association with the candidate preset question; and importing the third word vector and the fourth word vector into the similarity calculation submodel to generate the similarity between the candidate preset question and the target question.
7. The method according to any one of claims 1-6, wherein the target similarity calculation model is trained by:
acquiring a universal sample set, wherein the universal sample comprises a universal corpus pair and indicating information, and the indicating information is used for indicating that the corpora in the corpus pair express the same meaning or do not express the same meaning;
training a pre-established initial first neural network by using the universal sample set to obtain an initial second neural network;
acquiring a target professional sample set, wherein the target professional sample comprises a target professional corpus pair and indicating information;
and training the initial second neural network by using the target professional sample set to obtain the target similarity calculation model.
8. The method of claim 1, wherein the generic similarity calculation model is trained by:
acquiring a universal sample set, wherein the universal sample comprises a universal corpus pair and indicating information, and the indicating information is used for indicating whether the corpora in the corpus pair express the same meaning;
and training a pre-established initial fourth neural network by using the universal sample set to obtain the universal similarity calculation model.
9. The method according to any one of claims 1-6, wherein the method further comprises:
and selecting, from the at least one candidate preset question, a preset number of candidate preset questions as the questions to be displayed in descending order of similarity.
10. The method of claim 9, wherein the method further comprises:
acquiring a preset answer which is set in association with the question to be displayed;
and sending the question to be displayed and the preset answer to the terminal.
11. The method of claim 9, wherein the method further comprises:
sending the question to be displayed to the terminal, wherein the terminal displays the question to be displayed to a user, receives confirmation information input by the user, the confirmation information indicating the question to be displayed that matches the target question, and returns the confirmation information;
receiving the confirmation information;
and returning, to the terminal, a preset answer associated with the question to be displayed that is indicated by the confirmation information.
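The confirmation exchange of claim 11 can be sketched from the server side as two steps: build the payload of questions sent to the terminal, then resolve the user's confirmation to a preset answer. The payload shape, the `confirmed_index` field, and the in-memory answer table below are illustrative assumptions; the claims do not specify a message format.

```python
# Hypothetical preset answers stored in association with preset questions.
PRESET_ANSWERS = {
    "how to file a claim": "Submit the claim form in the app under Claims.",
    "how long does a claim take": "Most claims are processed within 10 days.",
}

def build_display_payload(questions_to_display):
    """Payload sent to the terminal so it can show the questions to the user."""
    return {"questions": list(questions_to_display)}

def answer_for_confirmation(confirmation, questions_to_display, preset_answers):
    """Return the preset answer for the question the user confirmed, or None."""
    index = confirmation.get("confirmed_index")
    if not isinstance(index, int) or not 0 <= index < len(questions_to_display):
        return None  # malformed or out-of-range confirmation
    return preset_answers.get(questions_to_display[index])

questions = ["how to file a claim", "how long does a claim take"]
payload = build_display_payload(questions)
# The terminal returns which displayed question matches the user's target question.
answer = answer_for_confirmation({"confirmed_index": 1}, questions, PRESET_ANSWERS)
```

Claim 10 describes the simpler variant in which the question and its preset answer are sent together, with no confirmation round trip.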
12. An artificial intelligence based information generating apparatus comprising:
a first obtaining unit, configured to obtain an answer obtaining request sent by a terminal, wherein the answer obtaining request comprises a target application identifier and a target question input by a user;
a determining unit, configured to determine, from a professional question-answer library set, the target professional question-answer library corresponding to the target application identifier according to a preset correspondence between application identifiers and the professional question-answer libraries in the professional question-answer library set, wherein the target professional question-answer library comprises preset questions related to a target professional field, a target similarity calculation model is set in association with the target professional question-answer library, and the target similarity calculation model is used for determining the similarity between a preset question and the target question;
a first selection unit, configured to select at least one candidate preset question from the target professional question-answer library according to keywords in the target question;
a generating unit, configured to generate, for each candidate preset question in the at least one candidate preset question, a similarity between the candidate preset question and the target question based on the target similarity calculation model;
wherein the generating unit is further configured to:
generate a first similarity between each candidate preset question and the target question by using the target similarity calculation model, wherein the target similarity calculation model is obtained by training an initial third neural network with a target professional sample set, each target professional sample in the target professional sample set comprises a target professional corpus pair and indication information, and the indication information is used for indicating whether the corpora in the corpus pair express the same meaning;
generate a second similarity between each candidate preset question and the target question by using a general similarity calculation model, wherein the general similarity calculation model is used for determining the similarity between a preset question and the target question; and
perform a weighted summation of the first similarity and the second similarity according to preset weights to obtain the similarity between each candidate preset question and the target question.
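The flow recited in claim 12 (look up the library by application identifier, pick candidates by keyword, then combine a domain-specific similarity and a general similarity by weighted summation) can be sketched as follows. Everything concrete here is an assumption for illustration: simple token-overlap measures stand in for the two trained similarity models, and the library contents, stop-word list, identifier `insurance_app`, and weight of 0.7 are hypothetical.

```python
STOPWORDS = {"how", "do", "i", "a", "to", "the", "is", "what"}

def tokens(text):
    return set(text.lower().split())

def jaccard(a, b):
    """Stand-in for the general similarity calculation model."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def overlap_coefficient(a, b):
    """Stand-in for a target (domain-specific) similarity calculation model."""
    ta, tb = tokens(a), tokens(b)
    smaller = min(len(ta), len(tb))
    return len(ta & tb) / smaller if smaller else 0.0

# Hypothetical correspondence: application identifier -> library and its model.
LIBRARIES = {
    "insurance_app": {
        "questions": [
            "how to file a claim",
            "how long does a claim take",
            "what is the premium rate",
        ],
        "target_model": overlap_coefficient,
    },
}

def select_candidates(target_question, preset_questions):
    """Keep preset questions that share at least one keyword with the target."""
    keywords = tokens(target_question) - STOPWORDS
    return [q for q in preset_questions if keywords & tokens(q)]

def combined_similarity(first, second, weight_first=0.7):
    """Weighted sum of the first (target) and second (general) similarity."""
    return weight_first * first + (1.0 - weight_first) * second

def score_candidates(app_id, target_question, weight_first=0.7):
    library = LIBRARIES[app_id]
    candidates = select_candidates(target_question, library["questions"])
    return {
        q: combined_similarity(
            library["target_model"](q, target_question),
            jaccard(q, target_question),
            weight_first,
        )
        for q in candidates
    }

scores = score_candidates("insurance_app", "how do I file a claim")
```

Weighting the domain-specific score more heavily is one plausible choice; the claims only require that the weights be preset.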
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-11.
14. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-11.
CN201711396776.XA 2017-12-21 2017-12-21 Information generation method and device based on artificial intelligence Active CN108121800B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711396776.XA CN108121800B (en) 2017-12-21 2017-12-21 Information generation method and device based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711396776.XA CN108121800B (en) 2017-12-21 2017-12-21 Information generation method and device based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN108121800A CN108121800A (en) 2018-06-05
CN108121800B true CN108121800B (en) 2021-12-21

Family

ID=62231088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711396776.XA Active CN108121800B (en) 2017-12-21 2017-12-21 Information generation method and device based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN108121800B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846126B (en) * 2018-06-29 2021-07-27 Beijing Baidu Netcom Science and Technology Co Ltd Generation of associated problem aggregation model, question-answer type aggregation method, device and equipment
CN110209782B (en) * 2018-09-25 2023-08-25 Tencent Technology (Shenzhen) Co Ltd Question-answering model and answer sentence generation method and device, medium and electronic equipment
CN110990541A (en) * 2018-09-30 2020-04-10 Beijing Gridsum Technology Co Ltd Method and device for realizing question answering
CN111159363A (en) * 2018-11-06 2020-05-15 Aisino Corp Knowledge base-based question answer determination method and device
CN109697228A (en) * 2018-12-13 2019-04-30 Ping An Technology (Shenzhen) Co Ltd Intelligent answer method, apparatus, computer equipment and storage medium
CN109829048B (en) * 2019-01-23 2023-06-23 Ping An Technology (Shenzhen) Co Ltd Electronic device, interview assisting method, and computer-readable storage medium
CN111854748B (en) * 2019-04-09 2022-11-22 Beijing Voyager Technology Co Ltd Positioning system and method
CN110457432B (en) * 2019-07-04 2023-05-30 Ping An Technology (Shenzhen) Co Ltd Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium
CN111079938B (en) * 2019-11-28 2020-11-03 Baidu Online Network Technology (Beijing) Co Ltd Question-answer reading understanding model obtaining method and device, electronic equipment and storage medium
CN111475630A (en) * 2020-03-31 2020-07-31 Lenovo (Beijing) Ltd Information processing method and device and electronic equipment
CN111488442A (en) * 2020-04-09 2020-08-04 Shenzhen Zhuiyi Technology Co Ltd Data processing method, device, server and storage medium
CN111986771A (en) * 2020-09-03 2020-11-24 Ping An International Smart City Technology Co Ltd Medical prescription query method and device, electronic equipment and storage medium
CN114416953B (en) * 2022-01-20 2023-10-31 Beijing Baidu Netcom Science and Technology Co Ltd Question-answering processing method, question-answering model training method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1928864A (en) * 2006-09-22 2007-03-14 Zhejiang University FAQ based Chinese natural language ask and answer method
CN104573000A (en) * 2015-01-07 2015-04-29 Beijing Unisound Information Technology Co Ltd Sequential learning based automatic questions and answers device and method
CN104850539A (en) * 2015-05-28 2015-08-19 Ningbo Boyan Information Technology Co Ltd Natural language understanding method and travel question-answering system based on same
CN105843897A (en) * 2016-03-23 2016-08-10 Qingdao Haier Software Co Ltd Vertical domain-oriented intelligent question and answer system
CN105893523A (en) * 2016-03-31 2016-08-24 East China Normal University Method for calculating problem similarity with answer relevance ranking evaluation measurement

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10489712B2 (en) * 2016-02-26 2019-11-26 Oath Inc. Quality-based scoring and inhibiting of user-generated content
JP2017204018A (en) * 2016-05-09 2017-11-16 Fujitsu Ltd Search processing method, search processing program and information processing device

Also Published As

Publication number Publication date
CN108121800A (en) 2018-06-05

Similar Documents

Publication Publication Date Title
CN108121800B (en) Information generation method and device based on artificial intelligence
CN107273503B (en) Method and device for generating parallel text in same language
CN107193792B (en) Method and device for generating article based on artificial intelligence
CN107491534B (en) Information processing method and device
CN107766940B (en) Method and apparatus for generating a model
CN107491547B (en) Search method and device based on artificial intelligence
CN107346336B (en) Information processing method and device based on artificial intelligence
CN108416310B (en) Method and apparatus for generating information
CN107241260B (en) News pushing method and device based on artificial intelligence
CN111428010B (en) Man-machine intelligent question-answering method and device
CN109034069B (en) Method and apparatus for generating information
CN109543058B (en) Method, electronic device, and computer-readable medium for detecting image
CN109635094B (en) Method and device for generating answer
CN108121699B (en) Method and apparatus for outputting information
CN110969012A (en) Text error correction method and device, storage medium and electronic equipment
CN109582825B (en) Method and apparatus for generating information
CN110009059B (en) Method and apparatus for generating a model
CN109801527B (en) Method and apparatus for outputting information
EP3832475A1 (en) Sentence processing method and system and electronic device
CN109190123B (en) Method and apparatus for outputting information
CN111666416A (en) Method and apparatus for generating semantic matching model
CN111738010A (en) Method and apparatus for generating semantic matching model
WO2020052061A1 (en) Method and device for processing information
CN111897950A (en) Method and apparatus for generating information
CN110232920B (en) Voice processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant