CN116595132A - Intelligent question-answering method, device, electronic equipment and medium - Google Patents
Intelligent question-answering method, device, electronic equipment and medium
- Publication number: CN116595132A
- Application number: CN202310342206.1A
- Authority
- CN
- China
- Prior art keywords: corpus, question, questions, preprocessed, matching model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/332—Information retrieval of unstructured textual data; Query formulation
- G06F16/3335—Syntactic pre-processing, e.g. stopword elimination, stemming
- G06F16/3344—Query execution using natural language analysis
- G06F16/3346—Query execution using probabilistic model
- G06F18/22—Pattern recognition; Matching criteria, e.g. proximity measures
- G06F40/216—Natural language analysis; Parsing using statistical methods
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/30—Semantic analysis
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/08—Neural network learning methods
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The disclosure provides an intelligent question-answering method, and relates to the technical field of artificial intelligence. The intelligent question-answering method comprises the following steps: acquiring a question to be answered, preprocessing it, and extracting a target keyword from the preprocessed question; extracting, from the third corpus, the fourth corpus stored under the target keyword name; inputting the preprocessed question and the fourth corpus into a trained text matching model to obtain the semantic similarity between each question in the fourth corpus and the question to be answered; determining the question with the highest semantic similarity in the fourth corpus as the target question matching the question to be answered; and determining, in the first corpus, the answer corresponding to the target question and taking it as the answer to the question to be answered. The disclosure also provides an intelligent question-answering apparatus, an electronic device, a storage medium and a program product.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence, and more particularly, to an intelligent question-answering method, apparatus, electronic device, storage medium, and program product.
Background
Banks develop and maintain numerous applications, many of which depend on one another, so the support manager of one application often needs to ask questions of other support managers. Each support manager must both complete his or her own work and answer the questions posed by others, and therefore often cannot answer questions in real time, which reduces working efficiency.
Currently, the mainstream text matching technique extracts keywords with TF-IDF (Term Frequency-Inverse Document Frequency) and then measures text similarity by cosine similarity. However, because the applications, technologies and data maintained by a bank have numerous aliases, such a similarity calculation can only match identical keywords shared by the two questions. Besides cosine similarity, text similarity can also be computed with an LSTM (Long Short-Term Memory) network, but this method weakens the information carried by long texts. In addition, question retrieval typically runs a similarity calculation over the full corpus; when the corpus contains too many questions, the computation time becomes excessively long.
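As a concrete illustration of this prior-art baseline, the following is a minimal sketch of TF-IDF vectors compared by cosine similarity; the two example questions and all parameters are illustrative assumptions.

```python
# Prior-art baseline: TF-IDF keyword weighting + cosine similarity.
# The score is high only when the two questions share literal keywords,
# which is exactly the limitation described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

questions = ["how do I reset the batch job password",
             "procedure for resetting the credentials of batch processing"]
tfidf = TfidfVectorizer().fit_transform(questions)
print(cosine_similarity(tfidf[0], tfidf[1])[0, 0])  # near 0 despite similar intent
```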
As a result, existing text matching methods generally take too long to run, which degrades question-answering efficiency.
Disclosure of Invention
In view of the foregoing, the present disclosure provides an intelligent question-answering method, apparatus, electronic device, storage medium, and program product that can reduce operation time.
According to a first aspect of the present disclosure, there is provided an intelligent question-answering method, including: acquiring a question to be answered, preprocessing the question, and extracting a target keyword from the preprocessed question; extracting, from the third corpus, the fourth corpus stored under the target keyword name; inputting the preprocessed question and the fourth corpus into a trained text matching model to obtain the semantic similarity between each question in the fourth corpus and the question to be answered; determining the question with the highest semantic similarity in the fourth corpus as the target question matching the question to be answered; and determining, in the first corpus, the answer corresponding to the target question and taking it as the answer to the question to be answered.
According to an embodiment of the present disclosure, the text matching model is trained by: acquiring a first corpus and a second corpus, wherein the first corpus comprises a plurality of groups of mutually corresponding first questions and first answers, and the second corpus includes a plurality of first questions and, for each first question, second questions semantically similar to it; preprocessing the second corpus, extracting keywords from the preprocessed second corpus, and adding questions containing the same keyword to a third corpus named after that keyword; and training a preset text matching model with the preprocessed second corpus to obtain the trained text matching model.
According to an embodiment of the present disclosure, preprocessing the second corpus includes performing the following operations for each question in the second corpus: segmenting the question into a plurality of words; deleting stop words from the plurality of words; and converting each remaining word into a word vector.
According to an embodiment of the present disclosure, extracting keywords from the preprocessed second corpus and adding questions containing the same keyword to the third corpus named after that keyword includes: extracting keywords from the preprocessed second corpus to obtain at least one keyword for each question; and, for each of the at least one keyword, adding the first question containing the keyword, together with the second questions semantically similar to it, to the third corpus named after that keyword.
According to an embodiment of the present disclosure, training the preset text matching model with the preprocessed second corpus includes: obtaining any two questions in the preprocessed second corpus and processing each of them into a word embedding matrix through a word embedding layer; passing the word embedding matrices through a position encoding layer to position-encode the words of each of the two questions; and feeding the position-encoded word embedding matrices into a Transformer layer, a dropout layer, a batch normalization layer, a gated recurrent unit, a linear subtraction layer and a linear output layer to obtain the semantic similarity of the two questions.
According to an embodiment of the present disclosure, passing the word embedding matrices through the position encoding layer to position-encode the words of each of the two questions includes: for each of the two questions, encoding its words alternately with sine and cosine functions according to the position at which each word appears in the question.
According to an embodiment of the present disclosure, training the preset text matching model with the preprocessed second corpus to obtain the trained text matching model includes: obtaining any two questions in the preprocessed second corpus and adding a label to them, the label representing the ground-truth value of the semantic similarity of the two questions; determining a semantic similarity score for the two questions with the text matching model; determining the difference between the semantic similarity score and the label according to a loss function; and, if the difference meets a preset condition, adjusting the parameters of the text matching model according to the difference and returning, for another two questions in the preprocessed second corpus, to the operation of determining their semantic similarity score with the text matching model.
According to an embodiment of the present disclosure, determining the semantic similarity score of the two questions with the text matching model includes: inputting the two questions into the text matching model to obtain first time-series data and second time-series data; average-pooling the first and second time-series data to obtain a first vector and a second vector; and calculating the difference between the first vector and the second vector and determining the semantic similarity score of the two questions from that difference.
A second aspect of the present disclosure provides an intelligent question-answering apparatus, comprising: a question processing module for acquiring a question to be answered, preprocessing the question, extracting a target keyword from the preprocessed question, and extracting, from the third corpus, the fourth corpus stored under the target keyword name; a target question determining module for inputting the preprocessed question and the fourth corpus into a trained text matching model to obtain the semantic similarity between each question in the fourth corpus and the question to be answered, and determining the question with the highest semantic similarity in the fourth corpus as the target question matching the question to be answered; and an answer determining module for determining, in the first corpus, the answer corresponding to the target question and taking it as the answer to the question to be answered.
According to an embodiment of the present disclosure, the trained text matching model includes: a corpus acquisition unit for acquiring a first corpus and a second corpus, wherein the first corpus comprises a plurality of groups of mutually corresponding first questions and first answers, and the second corpus includes a plurality of first questions and second questions semantically similar to each first question; a keyword extraction unit for preprocessing the second corpus, extracting keywords from the preprocessed second corpus, and adding questions containing the same keyword to the third corpus named after that keyword; and a model training unit for training the preset text matching model with the preprocessed second corpus to obtain the trained text matching model.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the intelligent question-answering method described above.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the intelligent question-answering method described above.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the intelligent question-answering method described above.
With the intelligent question-answering method, apparatus, electronic device, storage medium and program product provided by the embodiments of the present disclosure, after the user submits a question to be answered, keywords are extracted from it and the similarity calculation is confined to the corpus named after those keywords, so that a matching question is found efficiently in the third corpus and a target question matching the question to be answered is obtained; finally, the answer to the target question is matched from the first corpus, which stores questions together with answers, and returned as the best answer to the question to be answered. The disclosure can thereby significantly reduce computation time and improve question-answering accuracy.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
fig. 1 schematically illustrates an application scenario suitable for an intelligent question-answering method and apparatus according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a method of intelligent question-answering according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a training process of a text matching model according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of a third corpus adding process according to an embodiment of the disclosure;
FIG. 5 schematically illustrates a network structure diagram of a text matching model according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart of a text matching model training process according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a flow chart of a semantic similarity score determination process according to an embodiment of the present disclosure;
fig. 8 schematically illustrates a block diagram of an intelligent question-answering apparatus according to an embodiment of the present disclosure;
FIG. 9 schematically illustrates a block diagram of a text matching model according to an embodiment of the disclosure;
fig. 10 schematically illustrates a block diagram of an electronic device adapted to implement an intelligent question-answering method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression like "at least one of A, B and C" is used, it should generally be interpreted according to the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" includes, but is not limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together).
Some of the block diagrams and/or flowchart illustrations are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, when executed by the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). Additionally, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon, the computer program product being for use by or in connection with an instruction execution system.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and application of users' personal information all comply with the applicable laws and regulations, necessary security measures are taken, and public order and good morals are not violated.
In the technical scheme of the disclosure, the authorization or consent of the user is obtained before the personal information of the user is obtained or acquired.
In carrying out the disclosed concept, the inventors found that, in order to meet customers' personalized needs, commercial banks need an operation-and-maintenance question-answering robot that can answer questions in real time from a corpus. Because questions and answers are subject to high accuracy requirements, a question-answering robot built on one-to-one correspondence between questions and answers works best. However, the applications, technologies and data maintained by commercial banks have numerous aliases, and input questions differ in structure and wording, so they often cannot be matched with the correct answer and remain unanswered.
In view of this situation, embodiments of the present disclosure provide an intelligent question-answering method, apparatus, electronic device, storage medium and program product in the field of artificial intelligence technology. The intelligent question-answering method comprises the following steps: acquiring a question to be answered, preprocessing it, and extracting a target keyword from the preprocessed question; extracting, from the third corpus, the fourth corpus stored under the target keyword name; inputting the preprocessed question and the fourth corpus into a trained text matching model to obtain the semantic similarity between each question in the fourth corpus and the question to be answered; determining the question with the highest semantic similarity in the fourth corpus as the target question matching the question to be answered; and determining, in the first corpus, the answer corresponding to the target question and taking it as the answer to the question to be answered.
Fig. 1 schematically illustrates an application scenario suitable for the intelligent question-answering method and apparatus according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.
As shown in fig. 1, an application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients and social platform software (by way of example only).
The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that, the intelligent question-answering method provided by the embodiments of the present disclosure may be generally executed by the server 105. Accordingly, the intelligent question and answer apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The intelligent question-answering method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the intelligent question answering apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The intelligent question-answering method according to the embodiments of the present disclosure will be described in detail below with reference to fig. 2 to 7 based on the application scenario described in fig. 1.
Fig. 2 schematically illustrates a flow chart of an intelligent question-answering method according to an embodiment of the present disclosure.
As shown in fig. 2, the intelligent question-answering method of this embodiment may include operations S210 to S230.
In operation S210, a question to be answered is acquired and preprocessed, a target keyword is extracted from the preprocessed question, and the fourth corpus stored under the target keyword name is extracted from the third corpus. The third corpus here is the third corpus generated during training of the text matching model (see operation S322 below).
The question to be answered may be, for example, a question input by a user. This operation extracts the target keyword corresponding to that question and the fourth corpus stored under the target keyword name.
In operation S220, the preprocessed question and the fourth corpus are input into a trained text matching model to obtain the semantic similarity between each question in the fourth corpus and the question to be answered, and the question with the highest semantic similarity in the fourth corpus is determined as the target question matching the question to be answered. The trained text matching model is obtained by the training method of the text matching model described in this embodiment.
In this operation, the trained text matching model produces one semantic similarity between the preprocessed question and each question in the fourth corpus, so the number of semantic similarities equals the number of questions in the fourth corpus. The question with the highest semantic similarity is then extracted from the fourth corpus as the target question matching the question to be answered.
In operation S230, the answer corresponding to the target question is determined in the first corpus and taken as the answer to the question to be answered.
That is, the target question is looked up in the first corpus, and the answer corresponding to it is returned to the requesting user.
Through the embodiment of the present disclosure, after the user submits a question to be answered, keywords are extracted from it and the similarity calculation is confined to the corpus named after those keywords, so that a matching question is found efficiently in the third corpus and a target question matching the question to be answered is obtained; finally, the answer to the target question is matched from the first corpus, which stores questions together with answers, as the best answer to the question to be answered. This significantly reduces computation time and improves question-answering accuracy.
Fig. 3 schematically illustrates a flowchart of a training process of a text matching model according to an embodiment of the present disclosure.
As shown in fig. 3, the trained text matching model of operation S220 is trained by the following operations S321 to S323.
In operation S321, a first corpus and a second corpus are obtained, where the first corpus includes a plurality of sets of first questions and first answers corresponding to each other; the second corpus includes a plurality of first questions, and second questions semantically similar to each of the first questions.
For example, first questions and their corresponding first answers can be collected from the daily question answering of each application and stored, in question-answer format, in a question-answer corpus (i.e., the first corpus). Meanwhile, the plurality of first questions in the first corpus and the second questions semantically similar to them are stored in another question corpus (i.e., the second corpus).
In the first corpus, each question corresponds to a unique answer. In the second corpus, each question corresponds to one or more semantically similar questions. The second corpus also stores the semantic similarity of each pair of first and second questions.
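To make the two corpora concrete, a minimal sketch of one possible in-memory layout follows; the patent does not prescribe a storage format, so the dictionary structure and the sample entries are assumptions.

```python
# Hypothetical layout of the two corpora described above.
# first_corpus: each first question maps to its unique first answer.
first_corpus = {
    "how to restart the batch job": "run the restart script on node A",
}
# second_corpus: each first question maps to its semantically similar
# second questions, each paired with a labeled semantic similarity.
second_corpus = {
    "how to restart the batch job": [
        ("how can the batch job be restarted", 1.0),
        ("how to stop the batch job", 0.0),
    ],
}
```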
In operation S322, the second corpus is preprocessed, keyword extraction is performed on the preprocessed second corpus, and the problem including the same keyword is added to the third corpus under the keyword name.
In this operation, a corpus, i.e., a third corpus, is created and named after each extracted keyword. Then, based on the keywords of all questions, the second corpus is partitioned, and questions containing the same keyword are stored in the third corpus named after that keyword.
In operation S323, training the preset text matching model by using the preprocessed second corpus, to obtain a trained text matching model.
In the embodiment of the disclosure, a text matching model based on a Transformer and a gated recurrent unit is built to compute the semantic similarity between any two questions.
According to the embodiment of the disclosure, first, after a first corpus and a second corpus are obtained, the second corpus containing a plurality of similar problems is segmented through preprocessing and keyword extraction, so that a plurality of third corpuses are obtained. The text matching model is then trained using the preprocessed second corpus as training data. Based on the trained text matching model, the semantic similarity of any two problems can be solved.
In an embodiment of the present disclosure, preprocessing the second corpus in the operation S322 includes, for each problem in the second corpus, performing the following operations: dividing the problem into a plurality of words; deleting the stop word in the plurality of words; each term of the plurality of terms is converted into a term vector.
For example, the second corpus is first cleaned, and each question is segmented with the Chinese word segmentation tool jieba to obtain the words of the question.
Then, common stop words are removed using a preset stop-word list. Stop words are words that are often ignored in text processing because they usually contribute little to the meaning of the text; common stop words include pronouns, prepositions, conjunctions, articles and the like.
Each word is then converted into a word vector using Word2Vec (the vectors together forming a word embedding matrix). Specifically, a vocabulary (vocab) maps each word to an index, and the index is used to look up the corresponding word vector.
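A minimal sketch of this preprocessing step follows, assuming jieba for segmentation and gensim's Word2Vec for the word vectors; the stop-word list and the sample questions are placeholders.

```python
# Preprocessing: segment each question, drop stop words, train word vectors.
import jieba
from gensim.models import Word2Vec

STOP_WORDS = {"的", "了", "和", "甚至"}  # assumed stop-word list

def preprocess(question: str) -> list[str]:
    words = jieba.lcut(question)                      # segment into words
    return [w for w in words if w not in STOP_WORDS]  # delete stop words

corpus = [preprocess(q) for q in ["如何重启批量作业", "批量作业怎么重启"]]
w2v = Word2Vec(sentences=corpus, vector_size=64, min_count=1)
vector = w2v.wv[corpus[0][0]]  # word vector for one remaining word
```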
Fig. 4 schematically illustrates a flow chart of a third corpus adding process according to an embodiment of the disclosure.
As shown in fig. 4, in the embodiment of the present disclosure, the extracting of the keyword from the second corpus after preprocessing in the operation S322, and adding the problem including the same keyword to the third corpus under the keyword name may include operations S421 to S422.
In operation S421, keyword extraction is performed on the preprocessed second corpus, so as to obtain at least one keyword of each question.
In operation S422, for each keyword of the at least one keyword, a first question containing the keyword and a second question semantically similar to the first question are added to a third corpus under the keyword name.
With embodiments of the present disclosure, one or more keywords are extracted for each question in the preprocessed second corpus. Based on the keywords of all questions, the second corpus is partitioned, and questions containing the same keyword, together with their similar questions, are stored in the third corpus named after that keyword. This reduces the number of model calculations and speeds up question answering.
It should be noted that, since the different third corpora are distinguished by the extracted keywords, different third corpora may contain the same question.
In the embodiment of the present disclosure, extracting keywords from the preprocessed second corpus in operation S322 includes: extracting keywords from the preprocessed second corpus using the term frequency-inverse document frequency (TF-IDF) algorithm.
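A sketch of keyword extraction and corpus partitioning follows, assuming jieba's built-in TF-IDF keyword extractor; third_corpora is a hypothetical name for the keyword-indexed collection of third corpora.

```python
# Partition the second corpus into third corpora keyed by TF-IDF keywords.
from collections import defaultdict
import jieba.analyse

third_corpora: dict[str, list[str]] = defaultdict(list)

def add_question(first_question: str, similar_questions: list[str]) -> None:
    # A question with several keywords is stored under each of them,
    # so different third corpora may contain the same question.
    for kw in jieba.analyse.extract_tags(first_question, topK=2):
        third_corpora[kw].append(first_question)
        third_corpora[kw].extend(similar_questions)
```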
In an embodiment of the present disclosure, adding questions containing the same keyword to the third corpus named after that keyword in operation S322 further includes: building a two-dimensional index for each question in the third corpora, the two-dimensional index comprising the index of the third corpus and the index of the question within that third corpus.
Because each keyword corresponds to one third corpus, a two-dimensional index is built for every question: the first dimension is the index of a particular third corpus, and the second dimension is the index of the question within that corpus. Each question in a third corpus therefore carries a two-dimensional index that distinguishes both the corpus and the question.
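The following sketch shows one way such a two-dimensional index could be realized; the tuple layout (corpus_index, question_index) is an assumed encoding.

```python
# Two-dimensional index over the third corpora: the first dimension
# identifies the corpus, the second the question within it.
corpora = [["q-a", "q-b"], ["q-b", "q-c"]]  # two third corpora

index = {}
for ci, corpus in enumerate(corpora):        # first dimension: corpus
    for qi, question in enumerate(corpus):   # second dimension: question
        index[(ci, qi)] = question

print(index[(1, 0)])  # "q-b": the same question may appear in several
                      # corpora under different two-dimensional indices
```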
Fig. 5 schematically illustrates a network structure diagram of a text matching model according to an embodiment of the present disclosure.
As shown in fig. 5, in the embodiment of the present disclosure, the preset text matching model includes: a word embedding layer (embedding), a position encoding layer (positional encoding), a Transformer layer, a dropout layer, a batch normalization layer, a gated recurrent unit (GRU), a linear subtraction layer, and a linear output layer (Linear), wherein the Transformer layer includes a multi-head self-attention layer, a first residual connection and normalization layer, a feed-forward neural network, and a second residual connection and normalization layer.
In the deep learning field, this layer is also known as a Transformer layer or Transformer block. In the Transformer layer, the relationships between the words of each question are extracted through the multi-head self-attention mechanism. For example, the multi-head self-attention layer may be set to 16 heads, i.e., after the 16 heads have each computed self-attention there are 16 result vectors. Compared with single-head self-attention, the multi-head mechanism helps the network capture richer features and information.
Based on this network structure and in conjunction with fig. 5, in the embodiment of the present disclosure, training the preset text matching model with the preprocessed second corpus in operation S323 includes: obtaining any two questions in the preprocessed second corpus and processing each of them into a word embedding matrix through the word embedding layer; passing the word embedding matrices through the position encoding layer to position-encode the words of each question; and feeding the position-encoded word embedding matrices into the Transformer layer, the dropout layer, the batch normalization layer, the gated recurrent unit, the linear subtraction layer and the linear output layer to obtain the semantic similarity of the two questions.
In this embodiment, a first neural network based on the Transformer and a second neural network based on the gated recurrent unit are connected in series to form the preset text matching model for computing the semantic similarity between any two questions. Thanks to the self-attention mechanism in the text matching model, the present disclosure captures more relationships between words than computing text similarity by cosine similarity, and, compared with text similarity computed by an LSTM (Long Short-Term Memory) model, it better captures and remembers the information of long texts.
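For concreteness, the following is a condensed PyTorch sketch of the fig. 5 architecture. The hyper-parameters (vocabulary size, embedding dimension, dropout rate) are assumptions, since the disclosure fixes only the layer order and the 16-head example; the linear subtraction layer is modeled as the absolute difference described later in operations S732a to S732c.

```python
# Sketch of the fig. 5 text matching model: embedding -> Transformer ->
# dropout -> batch normalization -> GRU -> subtraction -> linear output.
import torch
import torch.nn as nn

class TextMatcher(nn.Module):
    def __init__(self, vocab_size=10000, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)        # word embedding
        self.encoder = nn.TransformerEncoderLayer(            # Transformer layer
            d_model=d_model, nhead=16, batch_first=True)      # 16 attention heads
        self.dropout = nn.Dropout(0.1)                        # dropout layer
        self.norm = nn.BatchNorm1d(d_model)                   # batch normalization
        self.gru = nn.GRU(d_model, d_model, batch_first=True) # gated recurrent unit
        self.out = nn.Linear(d_model, 1)                      # linear output layer

    def encode(self, ids):
        # One question -> time-series data -> average-pooled vector.
        # Position encoding is omitted here; see the sketch after
        # formulas (1) and (2) below.
        x = self.embed(ids)
        x = self.dropout(self.encoder(x))
        x = self.norm(x.transpose(1, 2)).transpose(1, 2)
        x, _ = self.gru(x)
        return x.mean(dim=1)                  # average pooling over time

    def forward(self, ids1, ids2):
        gap = torch.abs(self.encode(ids1) - self.encode(ids2))  # subtraction
        return torch.sigmoid(self.out(gap)).squeeze(-1)         # similarity score
```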
As for the position encoding layer: since self-attention is computed in parallel in the Transformer layer, the input itself carries no position information, so position encoding is required to capture the relative positional relationships between words. With position encoding, even if the words of a question are shuffled, the word-to-word positional relationships are still known.
In an embodiment of the present disclosure, passing the word embedding matrix through the position encoding layer to position-encode the words of each of the two questions includes: for each question, encoding its words alternately with sine and cosine functions according to the position at which each word appears in the question.
For example, at the position encoding layer, each question in the preprocessed second corpus is embedded with its corresponding position encoding, which may be performed with reference to the following formulas:

$PE_{(pos,2i)} = \sin\left(pos / 10000^{2i/d_{model}}\right)$ (1)

$PE_{(pos,2i+1)} = \cos\left(pos / 10000^{2i/d_{model}}\right)$ (2)

In formulas (1) and (2), $pos$ denotes the position of each word in the question; $i$ denotes the index of the dimension in the embedding vector, $i$ being a non-negative integer; $d_{model}$ is a hyper-parameter of the Transformer, namely the output dimension of all its layers; $PE_{(pos,2i)}$ and $PE_{(pos,2i+1)}$ denote the encodings of the even- and odd-numbered embedding dimensions, respectively.
According to formulas (1) and (2), the words of a question are embedded alternately with sine and cosine functions: based on the position at which each word appears, the sine function is applied to the even dimensions of its word vector and the cosine function to the odd dimensions.
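A sketch of this sinusoidal position encoding follows, implementing formulas (1) and (2); max_len and d_model are illustrative values.

```python
# Sinusoidal position encoding per formulas (1) and (2).
import math
import torch

def positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    pe = torch.zeros(max_len, d_model)
    pos = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
    div = torch.exp(torch.arange(0, d_model, 2).float()
                    * (-math.log(10000.0) / d_model))  # 10000^(-2i/d_model)
    pe[:, 0::2] = torch.sin(pos * div)  # even dimensions: sine
    pe[:, 1::2] = torch.cos(pos * div)  # odd dimensions: cosine
    return pe  # added to the word embedding matrix before the Transformer

enc = positional_encoding(max_len=50, d_model=128)
```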
Fig. 6 schematically illustrates a flow chart of a text matching model training process according to an embodiment of the present disclosure.
As shown in fig. 6, in the embodiment of the present disclosure, training the preset text matching model using the preprocessed second corpus in operation S323 to obtain a trained text matching model may include operations S631 to S634.
In operation S631, any two questions in the preprocessed second corpus are obtained, labels are added to the two questions, and the labels represent actual values of semantic similarity of the two questions.
In operation S632, a text matching model is used to determine a semantic similarity score for the two questions.
In operation S633, a difference between the semantic similarity score and the tag is determined according to the loss function.
In operation S634, if the difference meets the preset condition, the parameters of the text matching model are adjusted according to the difference, and the process returns, for another two questions in the preprocessed second corpus, to the operation of determining their semantic similarity score with the text matching model.
For example, the data in the preprocessed second corpus are split into a training set and a test set at a ratio of 8:2, and the preset text matching model is trained on the training set; the number of training iterations may be set to, for example, 1000. After 1000 training iterations, the trained text matching model is saved to a specified location.
The training set is used to train the model, and the test set is used to evaluate the quality of the trained text matching model. In the training set, each piece of training data consists of any two questions from the preprocessed second corpus together with a label representing the ground-truth semantic similarity of the two questions. For example, the label may be set to 0 or 1, where 1 indicates that the two questions have the same semantics and 0 indicates that they differ.
The text matching model then produces a semantic similarity score for the two questions: the closer the score is to 1, the closer their semantics; the closer the score is to 0, the more their semantics differ.
In an embodiment of the present disclosure, the loss function in operation S633 may be a binary cross-entropy function, and the optimizer may be Adam.
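Operations S631 to S634 can be wired together as in the simplified sketch below, which assumes the TextMatcher sketch above and uses toy tensors in place of real preprocessed questions; binary cross-entropy and Adam follow the embodiment just described.

```python
# Training step: score two questions, compare with the label, adjust.
import torch

model = TextMatcher()
criterion = torch.nn.BCELoss()                       # binary cross-entropy
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(ids1, ids2, label):
    # label: 1.0 if the two questions share the same semantics, else 0.0
    score = model(ids1, ids2)          # semantic similarity score (S632)
    loss = criterion(score, label)     # gap between score and label (S633)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                   # adjust model parameters (S634)
    return loss.item()

ids1 = torch.randint(0, 10000, (4, 12))  # toy batch of word indices
ids2 = torch.randint(0, 10000, (4, 12))
print(train_step(ids1, ids2, torch.ones(4)))
```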
Fig. 7 schematically illustrates a flowchart of a semantic similarity score determination process according to an embodiment of the present disclosure.
As shown in fig. 7, further, determining the semantic similarity score of the two questions using the text matching model in operation S632 may include operations S732a to S732c.
In operation S732a, the two questions are input into a text matching model to obtain first time-series data and second time-series data.
Time-series data are stored as 3D tensors: whenever the time (or sequence order) at which the data were collected matters, the data should be stored in a 3D tensor with a time axis.
In operation S732b, the first time-series data and the second time-series data are respectively averaged and pooled to obtain a first vector and a second vector.
In operation S732c, a gap between the first vector and the second vector is calculated, and a semantic similarity score for the two questions is determined based on the gap.
For example, denoting the first vector and the second vector by $x_1$ and $x_2$ respectively, the semantic similarity score of the two questions is obtained as $\mathrm{abs}(x_1 - x_2)$, i.e., the absolute value of the difference between the first vector $x_1$ and the second vector $x_2$.
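A standalone sketch of operations S732a to S732c follows; the tensor shapes are illustrative, and in the full model the pooled gap would still pass through the linear output layer to yield the final score.

```python
# Average-pool two pieces of time-series data and take their absolute
# difference, as in operations S732a-S732c.
import torch

seq1 = torch.randn(1, 12, 128)   # first time-series data  (batch, time, dim)
seq2 = torch.randn(1, 9, 128)    # second time-series data

x1 = seq1.mean(dim=1)            # average pooling -> first vector
x2 = seq2.mean(dim=1)            # average pooling -> second vector
gap = torch.abs(x1 - x2)         # abs(x1 - x2), basis of the similarity score
```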
Through the above disclosure, a trained text matching model can be obtained.
Through the embodiment of the present disclosure, after the trained text matching model has been obtained, when the user submits a question to be answered, the question is first preprocessed; for details, refer to the preprocessing in operation S322 above, which is not repeated here. The target keyword is extracted from the preprocessed question with the keyword extraction method of operation S322, e.g., the term frequency-inverse document frequency algorithm, likewise not repeated here. Each third corpus is then traversed with the extracted target keyword to obtain the fourth corpus stored under the target keyword name. The fourth corpus is thus the particular third corpus whose keyword corresponds to the target keyword.
The trained text matching model then computes the semantic similarity between the question to be answered and each question in the fourth corpus, and the question with the highest semantic similarity is determined as the target question matching the question to be answered. Finally, the target question is looked up in the first corpus, and the answer corresponding to it is returned to the requesting user.
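Putting the pieces together, inference might look like the sketch below. It reuses the hypothetical helpers from the earlier sketches (third_corpora, TextMatcher), and encode() is an assumed toy tokenizer, so every name here is an assumption rather than the patent's reference implementation.

```python
# End-to-end inference: keyword -> fourth corpus -> best match -> answer.
import jieba
import jieba.analyse
import torch

def encode(text: str) -> torch.Tensor:
    ids = [hash(w) % 10000 for w in jieba.lcut(text)]  # toy word indices
    return torch.tensor([ids])

def answer(question: str, first_corpus: dict, model) -> str | None:
    model.eval()  # batch normalization needs eval mode for single questions
    keywords = jieba.analyse.extract_tags(question, topK=1)
    fourth = third_corpora.get(keywords[0], []) if keywords else []
    if not fourth:
        return None
    with torch.no_grad():
        # Similarity is computed only against the fourth corpus rather
        # than the full corpus, which is what saves computation time.
        scores = [model(encode(question), encode(q)).item() for q in fourth]
    target = fourth[scores.index(max(scores))]  # highest semantic similarity
    return first_corpus.get(target)             # its stored answer
```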
Based on the intelligent question-answering method, the disclosure also provides an intelligent question-answering device. The device will be described in detail below in connection with fig. 8.
Fig. 8 schematically illustrates a block diagram of an intelligent question-answering apparatus according to an embodiment of the present disclosure.
As shown in fig. 8, the intelligent question-answering apparatus 800 of this embodiment includes a question processing module 810, a target question determining module 820, and an answer determining module 830.
The question processing module 810 is configured to acquire a question to be answered, preprocess it, extract a target keyword from the preprocessed question, and extract, from the third corpus, the fourth corpus stored under the target keyword name. In an embodiment, the question processing module 810 may be configured to perform operation S210 described above, which is not repeated here.
The target question determining module 820 is configured to input the preprocessed question and the fourth corpus into a trained text matching model to obtain the semantic similarity between each question in the fourth corpus and the question to be answered, and to determine the question with the highest semantic similarity in the fourth corpus as the target question matching the question to be answered. In an embodiment, the target question determining module 820 may be configured to perform operation S220 described above, which is not repeated here.
The answer determining module 830 is configured to determine, in the first corpus, the answer corresponding to the target question and to take it as the answer to the question to be answered. In an embodiment, the answer determining module 830 may be configured to perform operation S230 described above, which is not repeated here.
Fig. 9 schematically illustrates a block diagram of a text matching model according to an embodiment of the disclosure.
In an embodiment of the present disclosure, the trained text matching model 900 in the above-mentioned target question determining module 820 includes a corpus acquisition unit 910, a keyword extraction unit 920 and a model training unit 930.
A corpus obtaining unit 910, configured to obtain a first corpus and a second corpus, where the first corpus includes a plurality of sets of first questions and first answers corresponding to each other; the second corpus includes a plurality of first questions, and second questions semantically similar to each of the first questions. In an embodiment, the corpus obtaining unit 910 may be configured to perform the operation S321 described above, which is not described herein.
The keyword extraction unit 920 is configured to preprocess the second corpus, extract keywords from the preprocessed second corpus, and add questions containing the same keyword to the third corpus named after that keyword. In an embodiment, the keyword extraction unit 920 may be used to perform operation S322 described above, which is not repeated here.
The model training unit 930 is configured to train the preset text matching model by using the preprocessed second corpus, so as to obtain a trained text matching model. In an embodiment, the model training unit 930 may be used to perform the operation S323 described above, which is not described herein.
According to an embodiment of the present disclosure, any plurality of the question processing module 810, the target question determining module 820, the answer determining module 830, the corpus acquisition unit 910, the keyword extraction unit 920 and the model training unit 930 may be combined and implemented in one module, or any one of them may be split into multiple modules. Alternatively, at least some of the functionality of one or more of these modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the question processing module 810, the target question determining module 820, the answer determining module 830, the corpus acquisition unit 910, the keyword extraction unit 920 and the model training unit 930 may be implemented at least partially as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on substrate, a system on package, or an application specific integrated circuit (ASIC), or as hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or as any one of, or a suitable combination of, the three implementations of software, hardware and firmware. Alternatively, at least one of the question processing module 810, the target question determining module 820, the answer determining module 830, the corpus acquisition unit 910, the keyword extraction unit 920 and the model training unit 930 may be implemented at least partially as a computer program module which, when executed, performs the corresponding functions.
Fig. 10 schematically illustrates a block diagram of an electronic device adapted to implement an intelligent question-answering method according to an embodiment of the present disclosure.
As shown in fig. 10, an electronic device 1000 according to an embodiment of the present disclosure includes a processor 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. The processor 1001 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 1001 may also include on-board memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions of the method flows according to embodiments of the present disclosure.
In the RAM 1003, various programs and data necessary for the operation of the electronic apparatus 1000 are stored. The processor 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. The processor 1001 performs various operations of the method flow according to the embodiment of the present disclosure by executing programs in the ROM 1002 and/or the RAM 1003. Note that the program may be stored in one or more memories other than the ROM 1002 and the RAM 1003. The processor 1001 may also perform various operations of the method flow according to the embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the disclosure, the electronic device 1000 may also include an input/output (I/O) interface 1005, the input/output (I/O) interface 1005 also being connected to the bus 1004. The electronic device 1000 may also include one or more of the following components connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output portion 1007 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), etc., and a speaker, etc.; a storage portion 1008 including a hard disk or the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the internet. The drive 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is installed as needed in the drive 1010, so that a computer program read out therefrom is installed as needed in the storage section 1008.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs that, when executed, implement the intelligent question-answering method according to embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 1002 and/or RAM 1003 and/or one or more memories other than ROM 1002 and RAM 1003 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program that contains program code for performing the methods shown in the flowcharts. When the computer program product runs on a computer system, the program code causes the computer system to carry out the intelligent question-answering method provided by the embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 1001. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be carried on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may be transmitted and distributed in the form of a signal over a network medium, and downloaded and installed via the communication section 1009 and/or installed from the removable medium 1011. The program code contained in the computer program may be transmitted using any appropriate medium, including but not limited to wireless, wired, or any suitable combination of the foregoing.
According to embodiments of the present disclosure, program code for carrying out the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages and/or assembly/machine languages, including but not limited to Java, C++, Python, the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device and partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined in a variety of combinations and/or sub-combinations, even if such combinations or sub-combinations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be combined in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or sub-combinations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.
Claims (13)
1. An intelligent question-answering method, comprising:
acquiring a question to be answered, preprocessing the question to be answered, and extracting a target keyword from the preprocessed question to be answered; extracting, from the third corpus, a fourth corpus stored under the name of the target keyword;
inputting the preprocessed question to be answered and the fourth corpus into a trained text matching model to obtain a semantic similarity between each question in the fourth corpus and the question to be answered; determining the question with the highest semantic similarity in the fourth corpus as a target question matching the question to be answered; and
determining an answer corresponding to the target question in the first corpus, and determining that answer as the answer to the question to be answered.
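By way of illustration only, the flow recited in claim 1 can be sketched in Python as follows; the helper names (`preprocess`, `extract_keyword`, `text_matching_model`) and the dictionary layout of the corpora are assumptions made for this sketch and are not taken from the disclosure:

```python
# Illustrative sketch of the claim-1 flow; all helper names are assumed.
def answer_question(question, third_corpus, first_corpus,
                    preprocess, extract_keyword, text_matching_model):
    processed = preprocess(question)          # preprocess the incoming question
    keyword = extract_keyword(processed)      # extract the target keyword

    # The fourth corpus is the set of candidate questions stored under the
    # target keyword's name in the third corpus.
    fourth_corpus = third_corpus.get(keyword, [])
    if not fourth_corpus:
        return None                           # no candidates to match against

    # Score every candidate and take the question with the highest similarity.
    scores = [text_matching_model(processed, c) for c in fourth_corpus]
    target_question = fourth_corpus[scores.index(max(scores))]

    # Return the answer paired with the target question in the first corpus.
    return first_corpus.get(target_question)
```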
2. The method of claim 1, wherein the text matching model is trained by:
acquiring a first corpus and a second corpus, wherein the first corpus comprises a plurality of groups of mutually corresponding first questions and first answers, and the second corpus comprises a plurality of the first questions and second questions semantically similar to each of the first questions;
preprocessing the second corpus, extracting keywords from the preprocessed second corpus, and adding questions containing the same keyword to a third corpus under the name of that keyword; and
training a preset text matching model using the preprocessed second corpus to obtain the trained text matching model.
3. The method of claim 2, wherein the preprocessing of the second corpus comprises performing, for each question in the second corpus, the following operations:
segmenting the question into a plurality of words;
deleting stop words from the plurality of words; and
converting each of the remaining words into a word vector.
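A minimal sketch of these three preprocessing operations, assuming a whitespace tokenizer, a toy stop-word set and a pretrained word-vector lookup; a real system would use a proper word segmenter, particularly for Chinese text:

```python
import numpy as np

STOP_WORDS = {"the", "a", "an", "is", "of"}   # assumed stop-word list
DIM = 64                                      # assumed vector dimension
word_vectors: dict[str, np.ndarray] = {}      # assumed pretrained lookup

def preprocess(question: str) -> np.ndarray:
    words = question.lower().split()                   # divide into words
    words = [w for w in words if w not in STOP_WORDS]  # delete stop words
    # Convert each remaining word into a word vector; unknown words fall
    # back to a zero vector in this sketch.
    vecs = [word_vectors.get(w, np.zeros(DIM)) for w in words]
    return np.stack(vecs) if vecs else np.zeros((0, DIM))
```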
4. The method of claim 2, wherein the extracting of keywords from the preprocessed second corpus and the adding of questions containing the same keyword to the third corpus under the keyword name comprise:
extracting keywords from the preprocessed second corpus to obtain at least one keyword for each question; and
for each keyword of the at least one keyword, adding the first question containing the keyword, together with the second questions semantically similar to that first question, to the third corpus under the name of the keyword.
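This indexing step can be sketched as follows, assuming the second corpus is held as pairs of a first question and its semantically similar second questions, and that an `extract_keywords` callable is available; both assumptions are illustrative:

```python
from collections import defaultdict

def build_third_corpus(second_corpus, extract_keywords):
    """second_corpus: iterable of (first_question, [similar_questions]) pairs."""
    third_corpus = defaultdict(list)
    for first_q, similar_qs in second_corpus:
        for keyword in extract_keywords(first_q):
            # Group the first question and its semantically similar
            # questions under this keyword's name.
            third_corpus[keyword].append(first_q)
            third_corpus[keyword].extend(similar_qs)
    return third_corpus
```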
5. The method of claim 2, wherein the training of the preset text matching model using the preprocessed second corpus comprises:
acquiring any two questions from the preprocessed second corpus, and processing the two questions into word embedding matrices through a word embedding layer;
passing the word embedding matrices through a position coding layer to position-encode the plurality of words of each of the two questions; and
inputting the position-encoded word embedding matrices into a Transformer layer, a suppression layer, a batch normalization layer, a gated recurrent unit, a linear subtraction layer and a linear output layer to obtain the semantic similarity of the two questions.
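A minimal PyTorch sketch of this layer stack follows; the layer sizes are assumptions, the suppression layer is rendered here as dropout, the linear subtraction step is deferred to the claim-8 scoring sketch, and `positional_encoding` is the function sketched under claim 6 below:

```python
import torch
import torch.nn as nn

class TextMatchingEncoder(nn.Module):
    """Illustrative stack: embedding -> position coding -> Transformer ->
    suppression (dropout) -> batch normalization -> GRU -> linear output."""
    def __init__(self, vocab_size=10000, dim=128, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)        # word embedding layer
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.dropout = nn.Dropout(0.1)                    # "suppression" layer
        self.norm = nn.BatchNorm1d(dim)                   # batch normalization
        self.gru = nn.GRU(dim, hidden, batch_first=True)  # gated recurrent unit
        self.out = nn.Linear(hidden, hidden)              # linear output layer

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids)                          # (batch, seq, dim)
        x = x + positional_encoding(x.size(1), x.size(2))  # position coding
        x = self.transformer(x)
        x = self.dropout(x)
        x = self.norm(x.transpose(1, 2)).transpose(1, 2)   # normalize features
        x, _ = self.gru(x)                                 # time-series output
        return self.out(x)                                 # (batch, seq, hidden)
```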
6. The method of claim 5, wherein the passing of the word embedding matrices through the position coding layer to position-encode the plurality of words of each of the two questions comprises:
for each of the two questions, alternately encoding the plurality of words in the question using sine and cosine functions according to the position at which each word appears in the question.
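This alternating sine/cosine scheme matches the standard sinusoidal position coding; a short sketch, assuming an even embedding dimension:

```python
import torch

def positional_encoding(seq_len: int, dim: int) -> torch.Tensor:
    """Even dimensions use sine, odd dimensions cosine, driven by the
    position at which each word appears (dim is assumed even)."""
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # word positions
    i = torch.arange(0, dim, 2, dtype=torch.float32)
    angle = pos / torch.pow(10000.0, i / dim)                      # (seq_len, dim/2)
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(angle)   # sine on even indices
    pe[:, 1::2] = torch.cos(angle)   # cosine on odd indices
    return pe
```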
7. The method of claim 2, wherein the training of the preset text matching model using the preprocessed second corpus to obtain the trained text matching model comprises:
acquiring any two questions from the preprocessed second corpus and adding a label to the two questions, the label representing an actual value of the semantic similarity of the two questions;
determining a semantic similarity score for the two questions using the text matching model;
determining a difference between the semantic similarity score and the label according to a loss function; and
in a case where the difference meets a preset condition, adjusting parameters of the text matching model according to the difference, and returning, for another two questions in the preprocessed second corpus, to the operation of determining a semantic similarity score using the text matching model.
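A minimal sketch of this training loop, assuming mean squared error as the loss function, labelled question pairs, and the `similarity_score` helper sketched under claim 8 below; the actual loss function and stopping condition may differ:

```python
import torch

def train(model, question_pairs, labels, epochs=10, lr=1e-3, threshold=0.01):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for (q1, q2), label in zip(question_pairs, labels):
            score = similarity_score(model, q1, q2)
            loss = loss_fn(score, torch.tensor(float(label)))
            # While the difference still meets the preset condition
            # (here: exceeds a threshold), adjust the parameters and
            # move on to the next pair of questions.
            if loss.item() > threshold:
                opt.zero_grad()
                loss.backward()
                opt.step()
    return model
```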
8. The method of claim 7, wherein the determining of the semantic similarity score for the two questions using the text matching model comprises:
inputting the two questions into the text matching model to obtain first time-series data and second time-series data;
average-pooling the first time-series data and the second time-series data, respectively, to obtain a first vector and a second vector; and
calculating a difference between the first vector and the second vector, and determining the semantic similarity score of the two questions according to the difference.
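A sketch of this scoring step, reusing the encoder from the claim-5 sketch; mapping the vector difference to a score via exp(-distance) is an assumption of this sketch rather than the disclosure's formula:

```python
import torch

def similarity_score(model, ids1: torch.Tensor, ids2: torch.Tensor) -> torch.Tensor:
    seq1 = model.encode(ids1)           # first time-series data  (batch, seq, h)
    seq2 = model.encode(ids2)           # second time-series data
    v1 = seq1.mean(dim=1)               # average pooling -> first vector
    v2 = seq2.mean(dim=1)               # average pooling -> second vector
    diff = torch.norm(v1 - v2, dim=-1)  # difference between the two vectors
    return torch.exp(-diff).squeeze()   # larger score = more similar
```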
9. An intelligent question-answering device, comprising:
a to-be-answered question processing module configured to acquire a question to be answered, preprocess the question to be answered, extract a target keyword from the preprocessed question to be answered, and extract, from the third corpus, a fourth corpus stored under the name of the target keyword;
a target question determining module configured to input the preprocessed question to be answered and the fourth corpus into a trained text matching model to obtain a semantic similarity between each question in the fourth corpus and the question to be answered, and to determine the question with the highest semantic similarity in the fourth corpus as a target question matching the question to be answered; and
an answer determining module configured to determine an answer corresponding to the target question in the first corpus and to determine that answer as the answer to the question to be answered.
10. The apparatus of claim 9, wherein the trained text matching model is obtained by means of:
a corpus acquiring unit configured to acquire a first corpus and a second corpus, wherein the first corpus comprises a plurality of groups of mutually corresponding first questions and first answers, and the second corpus comprises a plurality of the first questions and second questions semantically similar to each of the first questions;
a keyword extracting unit configured to preprocess the second corpus, extract keywords from the preprocessed second corpus, and add questions containing the same keyword to a third corpus under the name of that keyword; and
a model training unit configured to train a preset text matching model using the preprocessed second corpus to obtain the trained text matching model.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
12. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-8.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310342206.1A CN116595132A (en) | 2023-03-31 | 2023-03-31 | Intelligent question-answering method, device, electronic equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310342206.1A CN116595132A (en) | 2023-03-31 | 2023-03-31 | Intelligent question-answering method, device, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116595132A true CN116595132A (en) | 2023-08-15 |
Family
ID=87605150
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310342206.1A Pending CN116595132A (en) | 2023-03-31 | 2023-03-31 | Intelligent question-answering method, device, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116595132A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117575020A (en) * | 2023-11-14 | 2024-02-20 | 平安创科科技(北京)有限公司 | Intelligent question-answering method, device, equipment and medium based on artificial intelligence |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11334635B2 (en) | Domain specific natural language understanding of customer intent in self-help | |
CN107273503B (en) | Method and device for generating parallel text in same language | |
CN112069302B (en) | Training method of conversation intention recognition model, conversation intention recognition method and device | |
US11429405B2 (en) | Method and apparatus for providing personalized self-help experience | |
CN111597830A (en) | Multi-modal machine learning-based translation method, device, equipment and storage medium | |
US11238132B2 (en) | Method and system for using existing models in connection with new model development | |
CN107862058B (en) | Method and apparatus for generating information | |
CN117351336A (en) | Image auditing method and related equipment | |
CN116595132A (en) | Intelligent question-answering method, device, electronic equipment and medium | |
CN117093687A (en) | Question answering method and device, electronic equipment and storage medium | |
CN117112829B (en) | Medical data cross-modal retrieval method and device and related equipment | |
CN117272937B (en) | Text coding model training method, device, equipment and storage medium | |
CN117764373A (en) | Risk prediction method, apparatus, device and storage medium | |
CN117275466A (en) | Business intention recognition method, device, equipment and storage medium thereof | |
CN116090471A (en) | Multitasking model pre-training method and device, storage medium and electronic equipment | |
CN115358817A (en) | Intelligent product recommendation method, device, equipment and medium based on social data | |
CN113807920A (en) | Artificial intelligence based product recommendation method, device, equipment and storage medium | |
CN118227910B (en) | Media resource aggregation method, device, equipment and storage medium | |
CN118536606B (en) | Man-machine interaction method and device and electronic equipment | |
CN116187346A (en) | Man-machine interaction method, device, system and medium | |
CN116737749A (en) | Data retrieval method, device, electronic equipment and storage medium | |
CN116824600A (en) | Company seal identification method and related equipment thereof | |
CN116992018A (en) | Data processing method, apparatus, device, readable storage medium, and program product | |
CN117216203A (en) | Interactive question-answering method, device, computer equipment and medium | |
CN116663570A (en) | Text translation method, device, computer equipment and medium based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||