CN113627198A - Reference answer generation method, system and device based on historical question-answer data and computer equipment - Google Patents



Publication number
CN113627198A
Authority
CN
China
Prior art keywords
question
answer
language
historical
questioning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110965077.2A
Other languages
Chinese (zh)
Inventor
邵睿
宋旸
蒋宏飞
吕少科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zuoyebang Education Technology Beijing Co Ltd
Original Assignee
Zuoyebang Education Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zuoyebang Education Technology Beijing Co Ltd
Priority to CN202110965077.2A
Publication of CN113627198A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9035 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention belongs to the field of education and provides a reference answer generation method, system and device based on historical question-answer data, and computer equipment, for online question answering between students and teachers. The method comprises the following steps: obtaining historical question-answer data, wherein the historical question-answer data comprises a plurality of question-answer pairs and each question-answer pair comprises a question segment and an answer segment; calculating a relevance score between the question segment and the answer segment in each question-answer pair; calculating a quality score of the answer segment in each question-answer pair; obtaining an online question segment and calculating its similarity to the question segment of each question-answer pair in the historical question-answer data; screening the question-answer pairs that match the online question segment according to the similarity; sorting the screened question-answer pairs by the quality score of their answer segments; and taking the answer segments of the top-ranked question-answer pairs as reference answers to the online question segment. When a user raises a question online, the method can provide a quick and accurate response based on historical question-answer data, which improves the user experience.

Description

Reference answer generation method, system and device based on historical question-answer data and computer equipment
Technical Field
The invention belongs to the field of education, is particularly suitable for online education, and specifically relates to a method and a device for generating reference answers based on historical question-answer data, and to computer equipment.
Background
In recent years, as participation in online courses has grown, interactive communication between students and teachers before and after class has become a necessary part of online courses. Because students far outnumber online teachers, some students cannot get timely and accurate answers to their questions, which indirectly harms the course experience and reduces their enthusiasm for learning. Adding more teachers to communicate with students as the number of students grows increases labor costs unnecessarily, requires the teachers to be trained in advance, and still leaves the accuracy of their answers unmonitored.
Therefore, a problem that currently needs to be solved is how to improve response accuracy and user experience, without adding manpower, when users raise repeated questions in their communication with teachers.
It is therefore necessary to provide a method for scoring and recommending reply content based on historical data to solve the above problems.
Disclosure of Invention
(I) Technical problem to be solved
The invention aims to solve the technical problems that, when online students interact with teachers, questions cannot be answered in time, responses are poorly targeted, repeated questions cannot be given preferred replies, and the user experience is poor.
(II) Technical solution
In order to solve the above technical problems, one aspect of the present invention provides a method for generating a reference answer based on historical question-answer data, for use in the field of online education, the method including the steps of: obtaining historical question-answer data, wherein the historical question-answer data comprises a plurality of question-answer pairs and each question-answer pair comprises a question segment and an answer segment; calculating a relevance score between the question segment and the answer segment in each question-answer pair; calculating a quality score of the answer segment in each question-answer pair; obtaining an online question segment and calculating its similarity to the question segment of each question-answer pair in the historical question-answer data; screening the question-answer pairs that match the online question segment according to the similarity; sorting the screened question-answer pairs by the quality score of their answer segments; and taking the answer segments of the top-ranked question-answer pairs as reference answers to the online question segment.
Optionally, calculating the relevance score between the question segment and the answer segment in each question-answer pair includes: setting a relevance score threshold, storing an answer segment when its relevance score is above the threshold, and discarding it when its relevance score is below the threshold.
According to a preferred embodiment of the invention, setting the relevance score threshold includes: matching the relevance of the question segment and the answer segment and assigning a score in the interval 0 to 1, wherein the score is 1 when the question segment and the answer segment are completely related and 0 when they are completely unrelated; and ranking the answer segments that correspond to the same question segment from high to low by score.
According to a preferred embodiment of the invention, obtaining an online question segment and calculating the similarity between the online question segment and the question segment of each question-answer pair in the historical question-answer data includes: performing word segmentation on the question segments and extracting their keywords; establishing a question-answer pair retrieval library; setting an inverted index according to the keywords of the question segments in the question-answer pairs; matching the similarity between the keywords of the question segments and the keywords of the online question segment; and extracting the question-answer pairs whose question segments are highly similar to the online question segment.
According to a preferred embodiment of the invention, setting the inverted index according to the keywords of the question segments in the question-answer pairs includes: segmenting the question segments of the question-answer pairs into words, taking the words as keys and the question-answer pairs as records, matching the similarity between the online question segment and the question segments, and extracting the answer segments of the matched question-answer pairs.
According to a preferred embodiment of the invention, the similarity between the online question segment and the question segments is matched, the matched question-answer pairs are extracted and then scored and sorted according to their answer segments, and when the similarity between the keywords of the online question segment and the keywords of a matched question segment is high, the answer segments corresponding to that question segment are displayed.
According to an optional embodiment of the invention, displaying the answer segments corresponding to the question segment includes: displaying the online question segment and the question segments that match it by similarity; displaying the question-answer pairs matched with those question segments; displaying the plurality of answer segments corresponding to the same question segment, sorted from high to low by quality score; and feeding the answer segments back to the user as candidate reference answers to the online question segment.
A second aspect of the present invention provides an online question-and-answer system, including: a client and an opposite end; and at least one server used for interaction between the client and the opposite end during online question answering, wherein the client or the at least one server receives question-answer data using the reference answer generation method based on historical question-answer data according to claim 1, and the at least one server transmits the generated question-answer data to the opposite end and displays it on the client and/or the opposite end.
A third aspect of the present invention provides a reference answer generating device based on historical question-answer data, the device including: an acquisition module for obtaining historical question-answer data, wherein the historical question-answer data comprises a plurality of question-answer pairs and each question-answer pair comprises a question segment and an answer segment; a first calculation module for calculating a relevance score between the question segment and the answer segment in each question-answer pair; a second calculation module for calculating a quality score of the answer segment in each question-answer pair; a matching module for obtaining an online question segment, calculating its similarity to the question segment of each question-answer pair in the historical question-answer data, and screening the question-answer pairs that match the online question segment according to the similarity; and a generation module for sorting the screened question-answer pairs by the quality score of their answer segments and taking the top-ranked answer segments as reference answers to the online question segment.
A fourth aspect of the present invention provides a computer program product comprising a computer program/instructions which, when executed by a processor, implement the reference answer generation method based on historical question-answer data according to the present invention.
(III) Advantageous effects
Compared with the prior art, the invention periodically acquires users' communication records on the platform as historical data. When a user asks a question again, the platform screens the historical data by content relevance and can quickly match a best response with both high matching similarity and high response quality and feed it back to the client. This avoids answering similar questions repeatedly and the manpower that repeated answers waste, ensures answer accuracy by screening for high-quality answers, and improves the user experience.
Drawings
FIG. 1 is a flowchart of reference answer generation based on historical question-answer data according to the present invention;
FIG. 2 is a flowchart of an example of reference answer generation based on historical question-answer data according to the present invention;
FIG. 3 is a schematic diagram of an example of an online question-and-answer system of the present invention;
FIG. 4 is a schematic diagram of a reference answer generating device based on historical question-answer data of the present invention;
FIG. 5 is a schematic diagram of an example of a reference answer generating device based on historical question-answer data of the present invention;
FIG. 6 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a computer program product according to an embodiment of the present invention.
Detailed Description
In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.
The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.
The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processor devices and/or microcontroller devices.
The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and thus a repetitive description thereof may be omitted hereinafter. It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms. That is, these terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include any and all combinations of one or more of the listed items.
To further speed up feedback on questions asked online, the invention provides a reference answer generation method based on historical question-answer data for use in online education. The method gives quick and accurate responses to the online questions raised by users, improves the user experience, and can itself be further optimized.
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
It should be noted that, for ease of understanding, the embodiments of the present invention are described in an online education scenario, but those skilled in the art will understand that the application of the present invention is not limited thereto. The invention can also be used in other scenarios in which a server selects a preferred reply to a user's question and feeds it back.
Fig. 1 is a flowchart of reference answer generation based on historical question-answer data according to embodiment 1 of the present invention.
As shown in fig. 1, the present invention provides a reference answer generation method based on historical question-answer data, the method including:
step S101, obtaining historical question-answer data, wherein the historical question-answer data comprises a plurality of question-answer pairs, and each question-answer pair comprises a question segment and an answer segment;
step S102, calculating a relevance score between the question segment and the answer segment in each question-answer pair;
step S103, calculating a quality score of the answer segment in each question-answer pair;
step S104, obtaining an online question segment, calculating the similarity between the online question segment and the question segment of each question-answer pair in the historical question-answer data, and screening out the question-answer pairs that match the online question segment according to the similarity;
and step S105, sorting the screened question-answer pairs by the quality score of their answer segments, and taking the answer segments of the top-ranked question-answer pairs as reference answers to the online question segment.
In this example, the online education application scenario includes a client and an opposite end. A reference answer system based on historical question-answer data runs on either or both of the client and the opposite end, and the method is used in online education through user interaction between them, such as asking questions and feeding back answers. The client and the opposite end are usually provided with a human-machine interaction interface, for example a display or display screen, a mouse and a keyboard, which usually offers a visual interface. On mobile smart clients such as mobile phones, the human-machine interaction interface is usually a touch screen and buttons. In either case, the user interacts with the client and the opposite end through the human-machine interaction interface.
First, in step S101, historical question-answer data is obtained, wherein the historical question-answer data includes a plurality of question-answer pairs, and each question-answer pair includes a question segment and an answer segment.
In this example, the user of the client is a student and the user of the opposite end is a teacher. During online question answering, the daily communication records between students and teachers are periodically obtained as historical question-answer data. The historical question-answer data includes a plurality of question-answer pairs, and each question-answer pair consists of a question segment and an answer segment.
As a preferred embodiment, the historical question-answer data is segmented into individual question-answer pairs by time interval. The interval can be set according to different requirements. For example, staff can use the average question-answer time between students and teachers as the interval for segmenting the current historical question-answer data: if teachers answer on average within 15 seconds after a student asks a question, the interval can be set to 15 seconds so that complete question-answer pairs are segmented. To segment the question-answer pairs more accurately, the data may instead be cut immediately after the teacher finishes responding to a student's question, so that each question-answer pair includes exactly one question segment and one answer segment.
It should be noted that the above description is only given by way of example, and the present invention is not limited thereto. In other examples, historical question and answer data may also be manually segmented by staff, and so on.
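The time-interval segmentation described for step S101 can be illustrated with a minimal sketch. The `Message` and `QAPair` types, the `split_into_pairs` function and the rule of pairing each student message with the next teacher message inside the interval are assumptions made for illustration; the source only states that a time interval such as 15 seconds is used.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str         # "student" or "teacher" (hypothetical labels)
    text: str
    timestamp: float  # seconds since the start of the chat log


@dataclass
class QAPair:
    question: str
    answer: str


def split_into_pairs(messages, max_gap=15.0):
    """Pair each student message with the next teacher reply within max_gap seconds."""
    pairs = []
    pending = None  # most recent student question that has no reply yet
    for msg in sorted(messages, key=lambda m: m.timestamp):
        if msg.role == "student":
            pending = msg
        elif msg.role == "teacher" and pending is not None:
            # Keep the pair only if the reply arrived inside the time interval.
            if msg.timestamp - pending.timestamp <= max_gap:
                pairs.append(QAPair(pending.text, msg.text))
            pending = None
    return pairs
```

With `max_gap=15.0`, a reply that arrives 10 seconds after the question forms a pair, while a reply that arrives 100 seconds later is left out of the historical pairs.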
In step S102, a relevance score between the question segment and the answer segment in each question-answer pair is calculated.
The segmented historical question-answer data comprises question segments and answer segments. In this step, the relevance score between the question segment and the answer segment is calculated to decide whether each question-answer pair is retained.
Specifically, each question-answer pair is scored using a content relevance model and a relevance score threshold is set: an answer segment is stored when its relevance score is above the threshold and discarded when its relevance score is below the threshold. The relevance of the question segment and the answer segment is matched and given a score in the interval 0 to 1, where 1 means the two are completely related and 0 means they are completely unrelated, and the answer segments corresponding to the same question segment are sorted from high to low by score. In an embodiment, the relevance score may be assigned manually by a teacher or according to a specific scoring standard. For example, the score may be based on how often keywords appear in the question segment and the answer segment: the frequency of the keywords in the question segment is combined with the frequency of those keywords in the answer segment, and the higher the frequency, the higher the score. The relevance score threshold can be set to 0.6, in which case question-answer pairs whose relevance score is above 0.6 are stored and those whose relevance score is below 0.6 are discarded. The threshold may be adjusted for different scenarios and other reasons and is not limited here.
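The threshold screening of step S102 can be sketched as follows. The scorer `keyword_overlap_score` is a toy stand-in chosen for illustration, not the content relevance model of the source; the function names and the choice to keep a pair whose score equals the threshold are likewise assumptions.

```python
def keyword_overlap_score(question_keywords, answer_keywords):
    """Toy relevance in [0, 1]: share of question keywords that also occur in the answer."""
    question_set = set(question_keywords)
    if not question_set:
        return 0.0
    return len(question_set & set(answer_keywords)) / len(question_set)


def filter_by_relevance(pairs, score_fn, threshold=0.6):
    """Keep pairs scoring at or above the threshold, sorted from high to low by score."""
    kept = []
    for question_keywords, answer_keywords in pairs:
        score = score_fn(question_keywords, answer_keywords)
        if score >= threshold:  # assumption: a score equal to the threshold is kept
            kept.append((question_keywords, answer_keywords, score))
    kept.sort(key=lambda item: item[2], reverse=True)
    return kept
```

Passing the scorer in as `score_fn` keeps the screening logic independent of how the score is produced, so a fine-tuned model could replace the toy scorer without changing the filter.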
The content relevance model is trained as follows: the daily communication records between students and teachers are periodically obtained as historical question-answer data; the question segment and the answer segment in each question-answer pair produced by segmenting the historical question-answer data are given a relevance score between 0 and 1; and the students' question segments and the teachers' answer segments are each vectorized and fed into a BERT model for fine-tuning.
In step S103, a quality score of the answer segment in each question-answer pair is calculated.
In this example, after screening by relevance score, question-answer pairs whose relevance score is below the set threshold are discarded, and the answer segments in the remaining question-answer pairs are scored for quality. The quality scoring rules can be set according to the application scenario and may consider content length, content complexity, number of paragraphs, tone words, the user's later satisfaction rating, and so on. For example, a scoring rule may treat each of the following as a bonus item: the answer segment uses "please" several times to guide the user's operation; the content length reaches a specified number of words; the reply spans 2 paragraphs or more; or the reply receives 5 stars in the subsequent satisfaction rating. The quality score of each answer segment is calculated from these bonus items, and the answer segments corresponding to the same question segment are sorted from high to low by quality score and stored.
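A rule-based quality score built from the bonus items listed above might look like the sketch below. The one-point weight per bonus item, the character-based length check and the default limits (`min_chars`, `min_paragraphs`, `min_please`) are hypothetical values chosen for illustration; the source does not specify them.

```python
def quality_score(answer, star_rating=None, min_chars=50, min_paragraphs=2, min_please=2):
    """Add one point for each bonus item that the answer segment satisfies."""
    score = 0
    # Bonus: "please" is used several times to guide the user's operation.
    if answer.lower().count("please") >= min_please:
        score += 1
    # Bonus: the content reaches the specified length (measured in characters here).
    if len(answer) >= min_chars:
        score += 1
    # Bonus: the reply spans the required number of non-empty paragraphs.
    paragraphs = [p for p in answer.split("\n") if p.strip()]
    if len(paragraphs) >= min_paragraphs:
        score += 1
    # Bonus: the user later gave a 5-star satisfaction rating.
    if star_rating == 5:
        score += 1
    return score


def rank_answers(answers):
    """Sort the answer segments of one question from high to low by quality score."""
    return sorted(answers, key=quality_score, reverse=True)
```

Because every rule contributes independently, a deployment could reweight or drop individual bonus items per application scenario without touching the ranking step.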
Further, the question segment corresponding to each retained answer segment is segmented into words, and the resulting words are the keywords recorded for that question segment. For example, if the student's question segment is "What is the thought of julian?", the keywords of the question segment are "julian" and/or "thought". To store the different keywords efficiently, a retrieval library and an inverted index are established on the server from the keywords of the question segments and answer segments that passed relevance screening among the question-answer pairs of the historical question-answer data.
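An inverted index of this kind maps each keyword to the question-answer pairs whose question segment contains it. The sketch below assumes that keywords have already been extracted by some word segmenter and that each pair has an identifier; the function names and the integer ids are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(question_keywords_by_pair):
    """Map each keyword to the set of question-answer pair ids that contain it."""
    index = defaultdict(set)
    for pair_id, keywords in question_keywords_by_pair.items():
        for keyword in keywords:
            index[keyword].add(pair_id)
    return index


def candidate_pairs(index, query_keywords):
    """Return ids of pairs that share at least one keyword with the online question."""
    found = set()
    for keyword in query_keywords:
        found |= index.get(keyword, set())
    return found
```

Looking up only the keywords of the online question avoids scanning every stored pair; the candidates returned here would still need to be scored for similarity.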
Next, in step S104, an online question segment is obtained, the similarity between the online question segment and the question segment of each question-answer pair in the historical question-answer data is calculated, and the question-answer pairs matching the online question segment are screened out according to the similarity.
Specifically, when a student asks a teacher a question online through the client, the server obtains the online question segment and calculates its similarity to the question segment of each question-answer pair in the historical question-answer data. The server segments the online question segment into words to extract its keywords, matches them against the keywords of the stored question segments, and extracts from the retrieval library the question-answer pairs whose question segments are highly similar to the online question segment.
Preferably, the inverted index is set according to the keywords of the question segments in the question-answer pairs: the question segments in the historical question-answer data are segmented into words, the words are used as keys and the question-answer pairs as records, the similarity between the online question segment and the question segments is matched, and the answer segments in the matched question-answer pairs are extracted.
Unlike the prior art, the similarity between the online question segment and the question segment of each question-answer pair in the historical question-answer data is calculated with an improved BM25 algorithm. BM25 is a widely used algorithm for evaluating the relevance between search terms and documents. Because the question segments raised by students are short, the parameter b of BM25 is modified, mainly by reducing it, so that document length penalizes the final score less. This improves the matching of similarity between the online question segment and the question segments.
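The effect of lowering b can be seen in a small sketch of the standard BM25 scoring formula. The source does not give the values it uses, so `k1=1.5` and `b=0.3` below are illustrative assumptions, as is the smoothed IDF variant; each stored question segment is treated as a "document" given as a list of keywords.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.3):
    """Score each tokenized document against the query with BM25.

    b controls length normalization: b=0 ignores document length, b=1 applies
    the full penalty. A reduced b (0.3 here, an assumed value) weakens it.
    """
    if not docs:
        return []
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        length_norm = 1.0 - b + b * len(d) / avgdl
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Smoothed IDF that stays non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores
```

With a smaller b, a question that is longer than average loses less score relative to a short one containing the same keyword, which matches the stated goal of weakening the length penalty.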
And step S105, sequencing the screened question-answer pairs according to the quality scores of the answer language segments of the question-answer pairs, and taking the answer language segments of the question-answer pairs which are sequenced at the front as the reference answers of the on-line question-answer language segments.
Specifically, after the server has matched the online question segment with historical question segments by similarity, it sorts the matched question-answer pairs according to the quality scores of their answer segments and displays the answer segments corresponding to the matched question segments.
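Step S105 amounts to a sort followed by a cut-off. The sketch below assumes each screened pair already carries a quality score; the field names, the sample data, and the cut-off `top_k` are hypothetical.

```python
def reference_answers(screened_pairs, top_k=2):
    """Sort screened pairs by answer quality and return the top-ranked answers."""
    ranked = sorted(screened_pairs, key=lambda pair: pair["quality"], reverse=True)
    return [pair["answer"] for pair in ranked[:top_k]]


# Hypothetical pairs that passed the similarity screening.
screened = [
    {"answer": "Isolate the variable.", "quality": 0.6},
    {"answer": "Use the quadratic formula.", "quality": 0.9},
    {"answer": "Try factoring first.", "quality": 0.75},
]
top = reference_answers(screened)
```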
In an embodiment, displaying the answer segments corresponding to the question segments includes:
displaying, at the corresponding end used by the teacher, the historical question segments matched by similarity with the online question segment, the question-answer pairs matched with those question segments, and the plurality of answer segments corresponding to the same question-answer pair, sorted from high to low by quality score. That is, the corresponding end of the teacher will display the following: the question posed by the student, the historical questions matching the student's question, and the historical replies corresponding to those historical questions, wherein the replies are ordered from high to low by quality score.
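The content shown at the teacher's end can be sketched as a small data structure. The function name `build_teacher_view`, the dictionary keys, and the sample data are assumptions for illustration only.

```python
from collections import defaultdict


def build_teacher_view(student_question, matched_pairs):
    """Group matched pairs by historical question; order replies by quality score."""
    grouped = defaultdict(list)
    for pair in matched_pairs:
        grouped[pair["question"]].append(pair)
    matches = []
    for question, pairs in grouped.items():
        ordered = sorted(pairs, key=lambda p: p["quality"], reverse=True)
        matches.append({
            "historical_question": question,
            "replies": [p["answer"] for p in ordered],
        })
    return {"student_question": student_question, "matches": matches}


# Hypothetical matched question-answer pairs with precomputed quality scores.
matched = [
    {"question": "solve quadratic equation", "answer": "Try factoring.", "quality": 0.6},
    {"question": "solve quadratic equation", "answer": "Use the quadratic formula.", "quality": 0.9},
    {"question": "roots of a quadratic", "answer": "Compute the discriminant.", "quality": 0.8},
]
view = build_teacher_view("how do I solve x^2 = 4", matched)
```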
As a preferred embodiment, the teacher may browse the answer segments that the server feeds back to the corresponding end and select among them; the teacher then feeds the preferred answer segment back to the student in response to the online question segment, and the answer segment fed back by the teacher is displayed on the student's client.
Referring to fig. 3, fig. 3 is a schematic diagram of an example of the live online system of the present invention.
According to a second aspect, the present invention further provides an online live broadcast system for online question answering, which includes: a client and an opposite end; and at least one server, wherein the server is used for interaction between the client and the opposite end during online question answering.
The client or the at least one server receives question-answer data using the reference answer generating method based on historical question-answer data of claim 1; the at least one server transmits the generated question-answer data to the opposite end, and the data is displayed at the client and/or the opposite end.
Preferably, the live online system may include client devices 301, 302, 303, corresponding end devices 311, 312, 313, a network 304 and a server 305. The network 304 serves to provide a medium for communication links between the client devices 301, 302, 303 and the server 305 and the corresponding end devices 311, 312, 313. Network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may interact with a server 305 over a network 304 using client devices 301, 302, 303 or corresponding end devices 311, 312, 313 to receive or send messages or the like. Various communication client applications, such as an online course application, a web browser application, a financial network platform application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the client devices 301, 302, 303 and the corresponding end devices 311, 312, 313.
The client devices 301, 302, 303 and the corresponding end devices 311, 312, 313 may be various electronic devices that have display screens and support online courses, including but not limited to smartphones, tablets, laptop computers, desktop computers, and the like. The client devices 301, 302, 303 may, for example, obtain historical question-answer data of a user, the historical question-answer data including a plurality of question-answer pairs. When question-answer data is acquired at the client, the client devices 301, 302, 303 may, for example, set different time intervals according to the requirements of the server and segment the data into separate question-answer pairs. The client devices 301, 302, 303 may also, for example, acquire a question segment submitted by the user at the client, receive the answer segment fed back by the server 305, and display that answer segment.
The corresponding end devices 311, 312, 313 may, for example, display the historical question segments matched by similarity with the online question segment; display the question-answer pairs matched with those question segments; and display the plurality of answer segments corresponding to the same question-answer pair, sorted from high to low by quality score, so that a reference answer for the online question segment can be selected and fed back to the user. The server 305 may be a server providing various services, for example: calculating the relevance score between the question segment and the answer segment in each question-answer pair fed back by the client devices 301, 302, 303 and the corresponding end devices 311, 312, 313; calculating the quality score of the answer segment in each question-answer pair; and obtaining online question segments from the client devices 301, 302, 303, calculating the similarity between each online question segment and the question segment of each question-answer pair in the historical question-answer data, and screening out the question-answer pairs matched with the online question segment according to the similarity. The server 305 may monitor the processing of the received user service application, and may also feed the answer segment selected at the corresponding end device, from the answer segments sorted by quality score, back to the client device.
For example, after the user sends an online question segment to the server 305 by using the client device 301 (or the client device 302 or 303), the corresponding end device 311 (or the corresponding end device 312 or 313) may screen the candidate answers and answer the question segment, and the server 305 may, for example, feed the answer segment selected at the corresponding end device 311 (or the corresponding end device 312 or 313) back to the client device 301 (or the client device 302 or 303).
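The round trip between the client device, the server 305, and the corresponding end device can be sketched as a small in-memory simulation. All function names and data are hypothetical; keyword overlap stands in for the similarity calculation, and `teacher_pick_first` stands in for the teacher's manual selection.

```python
def retrieve(history, online_question):
    """Server side: keep pairs sharing a word with the question, best quality first."""
    words = set(online_question.lower().split())
    hits = [h for h in history if words & set(h["question"].lower().split())]
    return sorted(hits, key=lambda h: h["quality"], reverse=True)


def teacher_pick_first(candidates):
    """Stand-in for the teacher's manual choice at the corresponding end."""
    return candidates[0]["answer"] if candidates else None


def handle_online_question(history, online_question, pick=teacher_pick_first):
    candidates = retrieve(history, online_question)  # done by the server
    chosen = pick(candidates)                        # done at the corresponding end
    return {
        "shown_to_teacher": [c["answer"] for c in candidates],
        "sent_to_client": chosen,                    # relayed back by the server
    }


# Hypothetical historical question-answer data held by the server.
history = [
    {"question": "how to solve quadratic equation",
     "answer": "Use the quadratic formula.", "quality": 0.9},
    {"question": "solve quadratic equation by factoring",
     "answer": "Factor, then set each factor to zero.", "quality": 0.7},
    {"question": "what is a prime number",
     "answer": "It has exactly two divisors.", "quality": 0.8},
]
result = handle_online_question(history, "quadratic equation help")
```

Passing a different `pick` function models a teacher who chooses an answer other than the top-ranked one.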
The server 305 may be a single physical server or may be composed of a plurality of servers. It should be noted that the reference answer generating method provided by the embodiment of the present disclosure may be executed by the server 305, the client devices 301, 302, 303 and/or the corresponding end devices 311, 312, 313, and accordingly, the reference answer generating apparatus may be disposed in the server 305, the client devices 301, 302, 303 and/or the corresponding end devices 311, 312, 313.
Compared with the prior art, by using the reference answer generating method based on historical question-answer data of the above embodiment, the system provided by the invention can give a quick and accurate response based on the historical question-answer data when a user poses an online question, which improves the user experience.
Embodiments of the apparatus of the present invention are described below, which may be used to perform method embodiments of the present invention. The details described in the device embodiments of the invention should be regarded as complementary to the above-described method embodiments; reference is made to the above-described method embodiments for details not disclosed in the apparatus embodiments of the invention.
According to a third aspect of the present invention, there is also provided a reference answer generating apparatus based on historical question-answer data. Fig. 4 is a schematic diagram of the reference answer generating apparatus based on historical question-answer data according to the present invention. As shown in fig. 4, the apparatus includes:
The acquisition module is used for acquiring historical question-answer data, wherein the historical question-answer data comprises a plurality of question-answer pairs, and each question-answer pair comprises a question segment and an answer segment. Daily communication records between students and teachers are acquired periodically to serve as the historical question-answer data.
The first calculation module is used for calculating the relevance score between the question segment and the answer segment in each question-answer pair, the segmented historical question-answer data comprising question segments and answer segments. A relevance score threshold is set, and the relevance score determines whether the current question-answer pair is retained: when the relevance score of an answer segment is above the threshold, the answer segment is saved, and when it is below the threshold, the answer segment is discarded. The relevance between a question segment and an answer segment is scored in the range 0 to 1: the score is 1 when the answer segment is completely relevant to the question segment and 0 when it is completely irrelevant. A plurality of answer segments corresponding to the same question segment are sorted from high to low by score.
The second calculation module is used for calculating the quality score of the answer segment in each question-answer pair. It is worth mentioning that the quality scoring rule may be set according to the application scenario, content length, content complexity, number of paragraphs, wording, the user's later-stage satisfaction score, and the like.
The matching module is used for obtaining an online question segment, calculating the similarity between the online question segment and the question segment of each question-answer pair in the historical question-answer data, and screening out the question-answer pairs matched with the online question segment according to the similarity. To this end, the module performs word segmentation on the question segments and extracts their keywords, establishes a question-answer pair retrieval library, sets an inverted index according to the keywords of the question segments in the question-answer pairs, matches the keywords of the stored question segments against those of the online question segment, and extracts the question-answer pairs whose question segments are highly similar to the online question segment.
The generation module is used for sorting the screened question-answer pairs according to the quality scores of their answer segments and taking the answer segments of the top-ranked question-answer pairs as the reference answers for the online question segment. That is, the online question segment is matched with the historical question segments by similarity, the matched question-answer pairs are extracted and sorted by the quality scores of their answer segments, and when the similarity between the keywords of the online question segment and the keywords of a matched question segment is high, the answer segments corresponding to that question segment are displayed.
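The two calculation modules can be sketched as follows. The patent does not fix how either score is computed, so everything here is a labeled assumption: relevance is approximated by the Jaccard overlap of word sets, the quality rule is a hypothetical weighted mix of content length, paragraph count, and a satisfaction rating, and a score exactly equal to the threshold is treated as retained.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str
    satisfaction: float = 0.0  # hypothetical later-stage user rating in [0, 1]


def relevance_score(question, answer):
    """Stand-in relevance in [0, 1]: Jaccard overlap of word sets (assumption)."""
    q = set(question.lower().split())
    a = set(answer.lower().split())
    if not q or not a:
        return 0.0
    return len(q & a) / len(q | a)


def filter_by_relevance(pairs, threshold):
    """Keep pairs at or above the threshold and discard the rest."""
    return [p for p in pairs if relevance_score(p.question, p.answer) >= threshold]


def quality_score(pair, max_len=200):
    """Hypothetical rule mixing content length, paragraph count and satisfaction."""
    length_part = min(len(pair.answer), max_len) / max_len
    paragraph_part = min(pair.answer.count("\n\n") + 1, 3) / 3
    return 0.4 * length_part + 0.2 * paragraph_part + 0.4 * pair.satisfaction


# Hypothetical historical pairs: the second answer is unrelated to its question.
pairs = [
    QAPair("what is a prime number", "a prime number has exactly two divisors", 0.9),
    QAPair("what is a prime number", "see you tomorrow", 0.2),
]
kept = filter_by_relevance(pairs, threshold=0.2)
scores = [quality_score(p) for p in kept]
```

In practice the weights and the relevance model would be chosen per application scenario, as the description of the second calculation module allows.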
In an embodiment, fig. 5 is a schematic diagram of an example of the reference answer generating apparatus based on historical question-answer data according to the present invention. As shown in fig. 5, the apparatus further includes a display module, configured to display the online question segment and the fed-back answer segment on the writing track in the client, and to display at the opposite end the question segments matched by similarity with the online question segment, the question-answer pairs matched with those question segments, and the plurality of answer segments corresponding to the same question-answer pair, sorted from high to low by quality score.
Those skilled in the art will appreciate that the modules in the above-described embodiments of the apparatus may be distributed as described in the apparatus, and may be correspondingly modified and distributed in one or more apparatuses other than the above-described embodiments. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
In the following, embodiments of the computer apparatus of the present invention are described, which may be seen as specific physical embodiments for the above-described embodiments of the method and apparatus of the present invention. The details described in the computer device embodiment of the invention should be considered as additions to the method or apparatus embodiment described above; for details which are not disclosed in the embodiments of the computer device of the invention, reference may be made to the above-described embodiments of the method or apparatus.
Fig. 6 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device includes a processor and a memory; the memory stores a computer executable program, and when the program is executed by the processor, the processor performs the method according to any one of the above embodiments, including but not limited to the method of fig. 1.
As shown in fig. 6, the computer device is in the form of a general purpose computing device. The processor can be one or more and can work together. The invention also does not exclude that distributed processing is performed, i.e. the processors may be distributed over different physical devices. The computer device of the present invention is not limited to a single entity, and may be a sum of a plurality of entity devices.
The memory stores a computer executable program, typically machine readable code. The computer readable program may be executed by the processor to enable a computer device to perform the method of the invention, or at least some of the steps of the method.
The memory may include volatile memory, such as Random Access Memory (RAM) and/or cache memory, and may also be non-volatile memory, such as read-only memory (ROM).
Optionally, in this embodiment, the computer device further includes an I/O interface, which is used for data exchange between the computer device and an external device. The I/O interface may be a local bus representing one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, and/or a memory storage device using any of a variety of bus architectures.
It should be understood that the computer device shown in fig. 6 is only one example of the present invention, and elements or components not shown in the above examples may also be included in the computer device of the present invention. For example, some computer devices also include display units such as display screens, and some computer devices also include human-computer interaction elements such as buttons, keyboards, and the like. The computer device can be considered to be covered by the present invention as long as the computer device can execute the computer readable program in the memory to implement the method of the present invention or at least part of the steps of the method.
FIG. 7 is a schematic diagram of a computer program product of an embodiment of the invention. As shown in fig. 7, the computer program product has stored therein a computer executable program which, when executed, implements the above-described reference answer generating method of the present invention. The computer program product may comprise a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. The computer program product may be transmitted, propagated, or transported by a computer to be used by or in connection with an instruction execution system, apparatus, or device. Program code embodied on the computer program product may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
From the above description of the embodiments, those skilled in the art will readily appreciate that the present invention can be implemented by hardware capable of executing a specific computer program, such as the system of the present invention and the electronic processing units, servers, clients, mobile phones, control units, processors, etc. included in the system. The invention may also be implemented by computer software for performing the method of the invention, e.g. control software executed by a microprocessor, an electronic control unit, a client, a server, etc. It should be noted that the computer software for executing the method of the present invention is not limited to execution by one specific hardware entity, and can also be realized in a distributed manner by non-specific hardware. The software product may be stored in a computer readable storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or may be distributed over a network, as long as it enables the computer device to perform the method according to the present invention.
While the foregoing detailed description has described the objects, aspects and advantages of the present invention in further detail, it should be appreciated that the present invention is not inherently related to any particular computer, virtual machine, or computer apparatus, as various general purpose devices may implement the present invention. The foregoing is merely a description of specific embodiments and is not intended to limit the invention; all modifications, changes and equivalents that come within the spirit and scope of the invention are intended to be covered.

Claims (10)

1. A method for generating reference answers based on historical question-answer data, the method comprising the steps of:
obtaining historical question-answer data, wherein the historical question-answer data comprises a plurality of question-answer pairs, and each question-answer pair comprises a question language segment and an answer language segment;
calculating the relevancy scores of the question language segments and the answer language segments in the question-answer pairs;
calculating the quality score of the answer speech segment in the question-answer pair;
obtaining on-line questioning sections, calculating the similarity between the on-line questioning sections and questioning sections of all questioning and answering pairs in the historical questioning and answering data, and screening out questioning and answering pairs matched with the on-line questioning sections according to the similarity;
and sorting the screened question-answer pairs according to the quality scores of the answer language segments of the question-answer pairs, and taking the answer language segments of the question-answer pairs which are sorted in the front as the reference answers of the on-line question-answer language segments.
2. The method for generating reference answers based on historical question-answer data according to claim 1, wherein the calculating the relevance scores of the question words and answer words in each question-answer pair comprises:
setting a relevance grade threshold, and storing the answer speech segments when the relevance grade of the answer speech segments is above the grade threshold;
and when the relevance score of the answer speech segment is smaller than a score threshold value, discarding the answer speech segment.
3. The reply content scoring and recommending method based on historical data according to claim 2, wherein a relevancy scoring threshold is set, the question speech segment and answer speech segment are matched in relevancy and scored, and the score is in the interval of 0-1, wherein,
when the question language segments and the answer language segments are completely related, the score is 1, if the question language segments and the answer language segments are completely unrelated, the score is 0, and the answer language segments corresponding to the same question language segments are sorted from high to low according to the scores.
4. The method for generating reference answers based on historical question-answer data according to claim 1, wherein obtaining an online question phrase segment, and calculating the similarity between the online question phrase segment and the question phrase segment of each question-answer pair in the historical question-answer data comprises:
segmenting words of the questioning sections, and extracting keywords of the questioning sections;
establishing a question-answer pair retrieval library, and setting an inverted index according to the keywords of the question segments in the question-answer pairs;
matching the similarity between the questioning section and the keywords of the on-line questioning section;
and extracting question-answer pairs with high similarity between the on-line question sentence sections and the question sentence sections.
5. The method for generating reference answers based on historical question-answer data according to claim 4, wherein the step of setting an inverted index for the keyword of the question field in the question-answer pair comprises the following steps:
segmenting the question segments of the question-answer pairs into words, taking the words as keywords, and taking the question-answer pairs as records;
and when the similarity of the question words and the question words is matched, extracting the answer words in the matched question-answer pair.
6. The method of claim 5, wherein the online question-answer field is matched with the similarity of the question-answer field, the matched question-answer pairs are extracted, the order is ranked according to the answer field quality scores of the question-answer pairs, and when the similarity between the keywords of the online question-answer field and the keywords of the matched question-answer field is high, the answer field corresponding to the question-answer field is displayed.
7. The method according to claim 6, wherein the displaying of the answer fields corresponding to the question fields comprises:
displaying the question language sections on the line and the question language sections matched with the similarity of the question language sections;
displaying question-answer pairs matched with the question sentence sections;
and displaying a plurality of answer language sections corresponding to the same question-answer pair, sequencing the answer language sections from high to low according to the quality scores, and feeding back the answer language sections as reference answer selections of the on-line question language sections to the user.
8. An on-line question-answering system, comprising:
a client and an opposite end;
at least one server for interaction between a client and an opposite end in on-line question answering, and the client or the at least one server receives question answering data using the reference answer generating method based on historical question answering data of claim 1; the at least one server transmits the generated question and answer data to the opposite end and displays it in the client and/or the opposite end.
9. A reference answering apparatus based on historical question-answering data, the reference answering apparatus comprising:
the acquisition module is used for acquiring historical question-answer data, wherein the historical question-answer data comprises a plurality of question-answer pairs, and each question-answer pair comprises a question language segment and an answer language segment;
the first calculation module is used for calculating the relevance scores of the question language segments and the answer language segments in the question-answer pairs;
the second calculation module is used for calculating the quality scores of the answer speech segments in the question-answer pairs;
the matching module is used for acquiring an online questioning section, calculating the similarity between the online questioning section and the questioning section of each questioning and answering pair in the historical questioning and answering data, and screening out the questioning and answering pairs matched with the online questioning and answering section according to the similarity;
and the generation module is used for sequencing the screened question-answer pairs according to the quality scores of the answer language sections of the question-answer pairs, and taking the answer language sections of the question-answer pairs which are sequenced in the front as the reference answers of the on-line question-answer language sections.
10. A computer program product comprising computer programs/instructions, characterized in that said computer programs/instructions, when executed by a processor, implement the historical question-answer data based reference answer generating method of any one of claims 1-7.
CN202110965077.2A 2021-08-20 2021-08-20 Reference answer generation method, system and device based on historical question-answer data and computer equipment Pending CN113627198A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110965077.2A CN113627198A (en) 2021-08-20 2021-08-20 Reference answer generation method, system and device based on historical question-answer data and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110965077.2A CN113627198A (en) 2021-08-20 2021-08-20 Reference answer generation method, system and device based on historical question-answer data and computer equipment

Publications (1)

Publication Number Publication Date
CN113627198A true CN113627198A (en) 2021-11-09

Family

ID=78387131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110965077.2A Pending CN113627198A (en) 2021-08-20 2021-08-20 Reference answer generation method, system and device based on historical question-answer data and computer equipment

Country Status (1)

Country Link
CN (1) CN113627198A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115098664A (en) * 2022-08-24 2022-09-23 中关村科学城城市大脑股份有限公司 Intelligent question answering method and device, electronic equipment and computer readable medium
CN115098664B (en) * 2022-08-24 2022-11-29 中关村科学城城市大脑股份有限公司 Intelligent question answering method and device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination