CN113111159A - Question and answer record generation method and device, electronic equipment and storage medium

Info

Publication number: CN113111159A
Application number: CN202110429297.3A
Authority: CN
Inventors: 朱章春
Original assignee: Kangjian Information Technology Shenzhen Co Ltd
Current assignee: Kangjian Information Technology Shenzhen Co Ltd
Priority date: 2021-04-21
Filing date: 2021-04-21
Publication date: 2021-07-13
Also published as: WO2022222942A1

Abstract

The invention relates to data analysis technology and discloses a question and answer record generation method, which comprises: extracting participles (segmented words) from a chat record and calculating the occurrence frequency of each participle; collecting the participles whose frequency is greater than a preset threshold to obtain a popular participle set, sorting the participles in the popular participle set, and generating a hot word list according to the sorting; selecting the participles in the hot word list one by one, and using each selected participle as a search word to search the chat record and obtain the questions corresponding to the search word; and classifying the retrieved questions by intention, answering the questions, generating a question and answer record, and pushing the question and answer record to the user side. The invention also relates to blockchain technology: the popular participle set can be stored in a node of a blockchain. The invention further provides a question and answer record generating device, an electronic device and a computer-readable storage medium. The invention can solve the problem of low efficiency when question and answer records are generated by matching questions against a question-answer database.

Description

Question and answer record generation method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of data analysis technologies, and in particular, to a method and an apparatus for generating a question and answer record, an electronic device, and a computer-readable storage medium.

Background

With the rapid development of science and technology, the process of raising and answering questions remains important, whether in a live broadcast room that relies heavily on manual operation, in a teacher's classroom, or in everyday communication, and the resulting question and answer records can serve as a reference for subsequent communication.

Existing question-answer record generation methods generally match each acquired question directly against an existing question-answer database and reply, without considering how frequently the question is asked. As a result, identical or similar questions are matched repeatedly, and generating question-answer records in this way is inefficient.

Disclosure of Invention

The invention provides a question and answer record generation method and device and a computer readable storage medium, and mainly aims to solve the problem of low efficiency in generating question and answer records by matching questions with a question and answer database.

In order to achieve the above object, the present invention provides a method for generating a question and answer record, comprising:

performing word segmentation processing on the obtained chat records, and counting the occurrence frequency of each word segmentation;

summarizing the participles with the frequency larger than a preset threshold value to obtain a popular participle set;

sorting the participles in the popular participle set according to the frequency to generate a hot word list;

sequentially selecting one participle according to the arrangement order of the participles in the hot word list, and searching the chat records with the selected participle as a search word to obtain a question corresponding to the search word;

acquiring an original question-answer data set, extracting each flow node in the original question-answer data set and corpus data corresponding to the flow node, and marking and combining the flow nodes of the corpus data to obtain a training corpus;

performing feature coding on the training corpus to obtain a training corpus vector, and training a preset multi-classification model by using the training corpus to obtain a question intention classification model;

inputting the retrieved questions into the question intention classification model to obtain question intentions, answering the questions according to the question intentions to generate question and answer records, and pushing the question and answer records to a client.

Optionally, the performing word segmentation processing on the obtained chat records includes:

preprocessing the chat records according to a preset rule to obtain initial chat records;

performing word segmentation processing on the initial chat records by using a word segmentation tool to obtain a word segmentation chat set;

and screening out the participles from the word segmentation chat set according to a preset keyword dictionary.

Optionally, before performing word segmentation processing on the obtained chat records, the method further includes:

identifying a user corresponding to the chat record;

judging whether the user passes identity verification;

if the user does not pass the identity verification, deleting the chat record of the user;

and if the user passes the identity verification, retaining the chat record corresponding to the user.

Optionally, the sequentially selecting one of the participles according to the arrangement order of the participles in the hot word list, and searching the chat records with the selected participle as a search word to obtain a question corresponding to the search word includes:

sequentially selecting one participle according to the arrangement order of the participles in the hot word list through a traversal operation, taking the selected participle as a search word, and vectorizing the search word to obtain a search word vector;

extracting chat keywords from the initial chat records, and vectorizing the chat keywords to obtain keyword vectors;

calculating the similarity between the search word vector and the keyword vector, and selecting the question corresponding to a keyword whose similarity is greater than or equal to a preset similarity threshold as the question corresponding to the search word.

Optionally, the performing feature coding on the training corpus to obtain a training corpus vector includes:

counting the training corpora in the original question-answer data set to obtain the total number of corpora;

constructing an initial matrix vector by taking the number of training corpora as the number of rows of a preset matrix and the total number of corpora as the number of columns of the preset matrix;

and setting the column position corresponding to each training corpus in the initial matrix vector to a first value, and setting the remaining column positions to a second value, to obtain the training corpus vector.

Optionally, the training a preset multi-classification model by using the training corpus to obtain a question intention classification model includes:

classifying the training corpus by using the preset multi-classification model to obtain one or more classification intentions;

labeling the intention categories in the original question-answer data set, and calculating the repetition degree between the intention categories and the classification intentions;

when the repetition degree is smaller than a preset classification threshold, iteratively updating the preset multi-classification model and classifying the training corpus again;

and when the repetition degree is greater than or equal to the preset classification threshold, obtaining the question intention classification model.

Optionally, the answering the question according to the question intention and generating a question-answer record includes:

selecting a corresponding preset question-answer database according to the question intention, performing matching processing on the question by using the question-answer database, and judging whether the question is matched with a question in the question-answer database;

if the question is matched with the question in the question-answer database, taking the answer corresponding to the question in the question-answer database as the answer of the question, and generating a question-answer record according to the question and the answer;

if the question is not matched with the question in the question-answer database, marking the question as an unanswered question and performing question answering on the unanswered question to obtain an answer of the unanswered question;

and generating a question-answer record according to the unanswered question and the answer of the unanswered question.

In order to solve the above problem, the present invention also provides a question-answer record generating apparatus, including:

the segmentation extraction module is used for performing segmentation processing on the obtained chat records and counting the occurrence frequency of each segmentation;

the hot word list generating module is used for summarizing the participles with the frequency greater than a preset threshold value to obtain a popular participle set, and sorting the participles in the popular participle set according to the frequency to generate a hot word list;

the record retrieval module is used for sequentially selecting one of the participles according to the arrangement order of the participles in the hot word list, and searching the chat records with the selected participle as a search word to obtain a question corresponding to the search word;

the corpus generating module is used for acquiring an original question-answer data set, extracting each flow node in the original question-answer data set and corpus data corresponding to the flow node, and marking and combining the flow nodes of the corpus data to obtain a corpus;

the model training module is used for performing feature coding on the training corpus to obtain a training corpus vector, and training a preset multi-classification model by using the training corpus to obtain a question intention classification model;

and the question-answer record generating module is used for inputting the retrieved questions into the question intention classification model to obtain question intentions, answering the questions according to the question intentions to generate question-answer records, and pushing the question-answer records to the client.

In order to solve the above problem, the present invention also provides an electronic device, including:

a memory storing at least one instruction; and

a processor that executes the instructions stored in the memory to implement the question and answer record generation method described above.

In order to solve the above problem, the present invention further provides a computer-readable storage medium, in which at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the above question and answer record generating method.

In the embodiments of the present invention, the participles whose frequency in the chat records is higher than a preset threshold are collected and sorted by frequency to generate a hot word list, and a question intention classification model is trained to classify the intention of the questions retrieved according to the participles. Because the hot word list contains the participles that are mentioned most frequently, questions can be answered in a targeted manner; the hot word list also reflects the questions of common concern. The question intention gives a rough estimate of the scope and field of a question, and answering according to the different user intentions improves the efficiency of answering questions. Therefore, the question and answer record generation method and device, the electronic device and the computer-readable storage medium provided by the invention can solve the problem of low efficiency when question and answer records are generated by matching questions against a question-answer database.

Drawings

Fig. 1 is a schematic flow chart of a method for generating a question and answer record according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of one step in the question and answer record generation method shown in Fig. 1;

Fig. 3 is a functional block diagram of a question-answer record generating device according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an electronic device for implementing the method for generating a question and answer record according to an embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The embodiment of the application provides a question and answer record generation method. The execution subject of the question and answer record generation method includes, but is not limited to, at least one of the electronic devices, such as a server and a terminal, that can be configured to execute the method provided by the embodiment of the present application. In other words, the question and answer record generation method may be executed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like.

Fig. 1 is a schematic flow chart of a method for generating a question and answer record according to an embodiment of the present invention. In this embodiment, the method for generating a question and answer record includes:

S1, performing word segmentation processing on the obtained chat records, and counting the occurrence frequency of each participle.

In the embodiment of the present invention, the chat records may be acquired from a preset IM (instant messaging) system, where the IM system is a customizable communication system that integrates a multi-person video conference function.

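In one embodiment of the present invention, the performing word segmentation processing on the obtained chat records includes:

preprocessing the chat records according to a preset rule to obtain initial chat records;

performing word segmentation processing on the initial chat records by using a word segmentation tool to obtain a word segmentation chat set;

and screening out the participles from the word segmentation chat set according to a preset keyword dictionary.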

In the embodiment of the present invention, the preset rule refers to removing special symbols and stop words from the chat records. The special symbols are symbols that are used infrequently and are difficult to input directly, such as certain mathematical symbols, unit symbols and tab symbols, for example @ and #. The stop words are words without practical meaning, such as the interjections "oh" and "ya". The stop words in the chat records can be removed with reference to a preset stop word list, which may be, for example, the stop word list of the Harbin Institute of Technology or the stop word list of the Machine Intelligence Laboratory of Sichuan University.

In one embodiment of the invention, a preset Jieba word segmentation tool can be used to segment the initial chat records to obtain a word segmentation chat set.

Further, in the embodiment of the present invention, the preset keyword dictionary includes professional terms in a preset field. Screening the participles out of the word segmentation chat set against this dictionary extracts the keywords that fit the actual application scenario and avoids extracting useless data that would cause data redundancy.

For example, the chat records include: "May I ask # whether hyperlipidemia can lead to cerebral infarction?" and "Doctor ya, how should cerebral infarction be treated?". Special symbols and stop words are removed from the chat records, here the special symbol "#" and the stop word "ya", and the processed chat records are segmented to obtain a word segmentation chat set: "may I ask / hyperlipidemia / can / lead to / cerebral infarction / ?" and "doctor / cerebral infarction / should / how / treat / ?". The occurrence frequency of each participle in the word segmentation chat set is then counted, and whether the participle appears in the preset keyword dictionary is judged. Here, "cerebral infarction" occurs 2 times and "hyperlipidemia" occurs 1 time.
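The counting and screening in this example can be sketched in Python as follows. This is a minimal illustration that assumes the chat records have already been segmented (for instance by a tool such as Jieba) and, as a simplification, filters special symbols and stop words at the token level rather than before segmentation. The constant names, the function `count_keyword_frequencies` and the sample tokens are hypothetical.

```python
from collections import Counter

# All names and sample values below are hypothetical, chosen for illustration.
SPECIAL_SYMBOLS = {"#", "@"}
STOP_WORDS = {"ya", "oh"}
KEYWORD_DICTIONARY = {"cerebral infarction", "hyperlipidemia"}


def count_keyword_frequencies(segmented_records):
    """Count occurrences of dictionary keywords in already-segmented records."""
    counts = Counter()
    for tokens in segmented_records:
        for token in tokens:
            # Preprocessing: drop special symbols and stop words.
            if token in SPECIAL_SYMBOLS or token in STOP_WORDS:
                continue
            # Screening: keep only participles found in the keyword dictionary.
            if token in KEYWORD_DICTIONARY:
                counts[token] += 1
    return counts


# Records assumed to be segmented already; the segmentation itself is not shown.
segmented_records = [
    ["may I ask", "#", "hyperlipidemia", "can", "lead to", "cerebral infarction"],
    ["doctor", "ya", "cerebral infarction", "should", "how", "treat"],
]
frequencies = count_keyword_frequencies(segmented_records)
print(frequencies)
```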

In addition, in another embodiment of the present invention, before performing the word segmentation process on the obtained chat records, the method further includes:

identifying a user corresponding to the chat record;

judging whether the user passes identity verification;

if the user does not pass the identity verification, deleting the chat record of the user;

and if the user passes the identity verification, retaining the chat record corresponding to the user.

In detail, identity verification checks whether the user corresponding to the chat record is on a predefined user list. Only the speech of users on the user list is used as chat records for subsequent processing; the speech of users not on the list is not adopted, because chat records obtained from such users have no practical reference value.
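A minimal sketch of this user-list check, assuming each chat record is a dictionary with hypothetical `user` and `text` keys (the record layout and the function name are assumptions, not part of the disclosure):

```python
def filter_verified_records(chat_records, user_list):
    """Keep records from users on the predefined list; drop all others."""
    return [record for record in chat_records if record["user"] in user_list]


# Hypothetical record layout: a dict with "user" and "text" keys.
chat_records = [
    {"user": "alice", "text": "How should cerebral infarction be treated?"},
    {"user": "mallory", "text": "Unrelated message"},
]
verified = filter_verified_records(chat_records, user_list={"alice", "bob"})
```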

S2, summarizing the participles with the frequency larger than the preset threshold value to obtain a popular participle set.

In the embodiment of the present invention, the preset threshold is the criterion for judging whether a participle is a popular participle. If the frequency of a participle is greater than the preset threshold, the participle is a popular participle; if the frequency is less than or equal to the preset threshold, the participle occurs relatively rarely and is not judged to be a popular participle.
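The threshold test can be sketched as below. The function name and the sample frequencies are hypothetical; the only point carried over from the text is that the comparison is strict, so a participle whose frequency equals the threshold is not selected.

```python
def select_popular_participles(frequencies, threshold):
    """Return the participles whose frequency is strictly greater than the threshold."""
    return {word: count for word, count in frequencies.items() if count > threshold}


popular = select_popular_participles(
    {"cerebral infarction": 2, "hyperlipidemia": 1}, threshold=1
)
```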

S3, sorting the participles in the popular participle set according to the frequency, and generating a hot word list.

In one embodiment of the present invention, the sorting the participles in the popular participle set according to the frequency to generate a hot word list includes:

acquiring the participles in the popular participle set and the corresponding frequency of the participles;

arranging the participles in descending order of their corresponding frequencies to obtain an initial list; intercepting a preset number of the participles from the initial list to generate a hot word list.

Wherein, the preset number may be 10.

In detail, intercepting the preset number of participles from the initial list further screens the initial list. Although the initial list is obtained by collecting and arranging the participles whose frequency is greater than the preset threshold, it may still contain many relatively low-frequency participles, so only the preset number of participles are intercepted from the initial list to generate the hot word list. In one application scenario of the invention, the hot word list helps the anchor in a live broadcast room see more intuitively the topics and questions that the audience wants to know about, and the anchor can answer the questions on the hot word list in order from top to bottom.
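The sorting and interception can be sketched as follows, assuming that "intercepting" means keeping the first entries of the descending list. The function name and the sample frequencies are hypothetical; the default of 10 follows the preset number mentioned above.

```python
def build_hot_word_list(popular, preset_number=10):
    """Sort participles by descending frequency and keep the first preset_number."""
    ranked = sorted(popular.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:preset_number]]


hot_word_list = build_hot_word_list(
    {"hyperlipidemia": 3, "cerebral infarction": 7, "heart disease": 5},
    preset_number=2,
)
```

Python's `sorted` is stable, so participles with equal frequency keep their original relative order in this sketch; the text does not specify a tie-breaking rule.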

S4, sequentially selecting one of the participles according to the arrangement order of the participles in the hot word list, and searching the chat records with the selected participle as a search word to obtain the question corresponding to the search word.

In the embodiment of the present invention, referring to Fig. 2, the sequentially selecting one of the participles according to the arrangement order of the participles in the hot word list, and searching the chat records with the selected participle as a search word to obtain a question corresponding to the search word includes:

S401, sequentially selecting one participle according to the arrangement order of the participles in the hot word list through a traversal operation, taking the selected participle as a search word, and vectorizing the search word to obtain a search word vector;

S402, extracting chat keywords from the initial chat records, and vectorizing the chat keywords to obtain keyword vectors;

S403, calculating the similarity between the search word vector and the keyword vector, and selecting the question corresponding to a keyword whose similarity is greater than or equal to a preset similarity threshold as the question corresponding to the search word.

Specifically, in the embodiment of the present invention, the initial chat records may include a plurality of chat keywords, where the chat keywords are medical terms. The similarity between the search word and each chat keyword is calculated, and as long as one chat keyword in a sentence of the initial chat records matches, that sentence is taken as a question corresponding to the search word.

In detail, when the question corresponding to a keyword whose similarity is greater than or equal to the preset similarity threshold is selected as the question corresponding to the search word, the question is a chat record presented in question form. If the matched keyword appears in a chat record that is not phrased as a question, that chat record is not treated as a question corresponding to the search word.

For example, the search word is "cerebral infarction", and the initial chat records are: "Cerebral infarction and heart disease are very common in modern society. Is there, after all, a truly effective method for treating cerebral infarction? The following three are the most effective." The chat keywords in the initial chat records are "cerebral infarction" and "heart disease". By calculating the similarity between the search word vector and the keyword vectors, the question containing "cerebral infarction" in the initial chat records is taken as the question corresponding to the search word.

The similarity between the search term vector and the keyword vector can be calculated by a variety of methods, including, but not limited to, the cosine similarity formula and the Euclidean distance.

Optionally, in an embodiment of the present invention, the calculating a similarity between the search term vector and the keyword vector includes:

calculating the similarity between the search term vector and the keyword vector by using the following formula:

$$\cos(a, b) = \frac{a \cdot b}{|a|\,|b|}$$

wherein cos(a, b) is the similarity, a is the search term vector, b is the keyword vector, and |a| and |b| are the moduli of the search term vector and the keyword vector, respectively.

The embodiment of the invention can carry out vectorization processing on the search word and the chat keyword according to a preset word2vec algorithm to obtain the search word vector and the keyword vector.
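The retrieval in S401 to S403 can be sketched with cosine similarity as below. The sketch assumes the vectors have already been produced (for instance by word2vec) and uses small hand-written vectors instead. Detecting a question by a trailing question mark is also an assumption, since the text only requires that the chat record be in question form. All function names, the threshold value and the sample data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_questions(search_word_vector, sentences, similarity_threshold=0.8):
    """Return sentences in question form with at least one sufficiently similar keyword.

    sentences: list of (text, keyword_vectors) pairs; this layout is an assumption.
    """
    matches = []
    for text, keyword_vectors in sentences:
        # Assumption: a chat record counts as a question if it ends with "?" or "？".
        if not text.rstrip().endswith(("?", "？")):
            continue
        if any(
            cosine_similarity(search_word_vector, vector) >= similarity_threshold
            for vector in keyword_vectors
        ):
            matches.append(text)
    return matches


# Toy two-dimensional vectors standing in for real word embeddings.
search_word_vector = [1.0, 0.0]
sentences = [
    ("Is there an effective treatment for cerebral infarction?", [[0.9, 0.1]]),
    ("Cerebral infarction is common in modern society.", [[1.0, 0.0]]),
    ("How is heart disease treated?", [[0.0, 1.0]]),
]
questions = retrieve_questions(search_word_vector, sentences)
```

In this toy data, the second sentence is skipped because it is not a question, and the third because its keyword vector is orthogonal to the search word vector.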

S5, obtaining an original question-answer data set, extracting each flow node in the original question-answer data set and corpus data corresponding to the flow nodes, and marking and combining the flow nodes of the corpus data to obtain a training corpus.

In the embodiment of the present invention, the original question-answer data set may be medical questions and corresponding answers included in a service scenario of an intelligent question-answer.

In detail, each flow node in the original question-answer data set may be one round of dialog between the medical expert and the questioner, and the corresponding corpus data is the medical expert's answer in that round of dialog.

The embodiment of the invention marks the flow nodes of the corpus data and then combines the flow nodes to obtain the training corpus.
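One possible reading of this marking and combining step is sketched below: each expert answer is tagged with the number of the dialog round (flow node) it came from, and the tagged items are collected into one list. The text does not fix the data layout, so the dictionary keys and the function name are assumptions.

```python
def build_training_corpus(expert_answers):
    """Tag each expert answer with its flow node (dialog round) and combine them."""
    return [
        {"flow_node": round_number, "corpus": answer}
        for round_number, answer in enumerate(expert_answers, start=1)
    ]


training_corpus = build_training_corpus(
    ["Yes, I can.", "The following are the most common treatment options."]
)
```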

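S6, performing feature coding on the training corpus to obtain a training corpus vector, and training a preset multi-classification model by using the training corpus to obtain a question intention classification model.

In the embodiment of the present invention, the feature coding of the training corpus may use methods such as one-hot encoding, target encoding, and Bayesian target encoding; here, preset one-hot encoding is used to perform feature coding on the training corpus to obtain the training corpus vector.

Specifically, the performing feature coding on the training corpus to obtain a training corpus vector includes:

counting the training corpora in the original question-answer data set to obtain the total number of corpora;

constructing an initial matrix vector by taking the number of training corpora as the number of rows of a preset matrix and the total number of corpora as the number of columns of the preset matrix;

and setting the column position corresponding to each training corpus in the initial matrix vector to a first value, and setting the remaining column positions to a second value, to obtain the training corpus vector.

For example, if the total number of corpora is 5, that is, assuming five rounds of dialog, the training corpus vector of the first round of corpus is [1, 0, 0, 0, 0].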
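A minimal sketch of this encoding, assuming the first value is 1 and the second value is 0 as in the example (the function name and parameter names are hypothetical):

```python
def one_hot_encode(total_corpora, first_value=1, second_value=0):
    """Build one row per corpus, with first_value at the corpus's own column."""
    return [
        [first_value if column == row else second_value for column in range(total_corpora)]
        for row in range(total_corpora)
    ]


corpus_vectors = one_hot_encode(5)
first_round_vector = corpus_vectors[0]
```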

In an embodiment of the present invention, the preset multi-classification model may be a shallow neural network or a support vector machine model. In the embodiment of the invention, the intention corresponding to each question can be identified by the question intention classification model obtained by training the multi-classification model.

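In detail, the training of the preset multi-classification model by using the training corpus to obtain a question intention classification model includes:

classifying the training corpus by using the preset multi-classification model to obtain one or more classification intentions;

labeling the intention categories in the original question-answer data set, and calculating the repetition degree between the intention categories and the classification intentions;

when the repetition degree is smaller than a preset classification threshold, iteratively updating the preset multi-classification model and classifying the training corpus again;

and when the repetition degree is greater than or equal to the preset classification threshold, obtaining the question intention classification model.

In detail, the embodiment of the present invention labels the intention categories in the original question-answer data set by judging the actual intention of the medical expert's answer in each round of dialog between the medical expert and the questioner. For example, the first round of dialog is: questioner: "Can you answer a question related to cerebral infarction?", medical expert: "Yes, I can." The intention category of the first round of dialog is confirming the field in which the medical expert answers questions. The second round of dialog is: questioner: "What are the effective treatment regimens for cerebral infarction?", medical expert: "The following are currently the most common treatment options for cerebral infarction. First, ……" The intention category of the second round of dialog is confirming the answer to a specific question.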
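The train-until-threshold loop can be sketched as follows. Two assumptions are made loudly here: a simple multi-class perceptron stands in for the preset multi-classification model (the text mentions a shallow neural network or a support vector machine), and the "repetition degree" is read as the fraction of samples whose predicted intention equals the labeled intention category. All names, the threshold and the toy data are hypothetical.

```python
def predict(weights, vector):
    """Return the index of the class with the highest linear score."""
    scores = [sum(w * x for w, x in zip(row, vector)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def repetition_degree(predicted, labeled):
    """Assumed definition: share of samples where prediction and label agree."""
    agree = sum(1 for p, y in zip(predicted, labeled) if p == y)
    return agree / len(labeled)


def train_intent_classifier(vectors, labels, num_classes, threshold=0.9, max_rounds=100):
    """Perceptron stand-in: update until the repetition degree reaches the threshold."""
    weights = [[0.0] * len(vectors[0]) for _ in range(num_classes)]
    for _ in range(max_rounds):
        predicted = [predict(weights, v) for v in vectors]
        if repetition_degree(predicted, labels) >= threshold:
            break  # model accepted as the question intention classifier
        # Iterative update: push weights toward the labeled class for each error.
        for vector, p, y in zip(vectors, predicted, labels):
            if p != y:
                for i, x in enumerate(vector):
                    weights[y][i] += x
                    weights[p][i] -= x
    return weights


# Toy data: four one-hot training corpus vectors and two intention categories.
vectors = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
labels = [0, 1, 1, 0]
weights = train_intent_classifier(vectors, labels, num_classes=2)
predictions = [predict(weights, v) for v in vectors]
```

The `max_rounds` cap is a safeguard added for the sketch so that the loop terminates even if the threshold is never reached.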

S7, inputting the retrieved questions into the question intention classification model to obtain question intentions, answering the questions according to the question intentions to generate question and answer records, and pushing the question and answer records to a client.

In the embodiment of the invention, the retrieved questions are input into the question intention classification model to obtain the question intentions, the question intentions can roughly estimate the range and the field of the questions, and the corresponding answers are carried out according to different user intentions, so that the question answering efficiency can be improved.

And solving the retrieved questions according to the question intentions to obtain corresponding answers, and summarizing the questions and the corresponding answers to generate the question-answer records.

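Specifically, the answering the question according to the question intention and generating a question-answer record includes:

selecting a corresponding preset question-answer database according to the question intention, performing matching processing on the question by using the question-answer database, and judging whether the question is matched with a question in the question-answer database;

if the question is matched with the question in the question-answer database, taking the answer corresponding to the question in the question-answer database as the answer of the question, and generating a question-answer record according to the question and the answer;

if the question is not matched with the question in the question-answer database, marking the question as an unanswered question and performing question answering on the unanswered question to obtain an answer of the unanswered question;

and generating a question-answer record according to the unanswered question and the answer of the unanswered question.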

The different question intentions correspond to different question and answer databases, and the question and answer databases comprise common questions corresponding to some related intentions and answers corresponding to the common questions.

In detail, if the question does not match any question in the question-answer database, the embodiment of the present invention marks the question as an unanswered question. An unanswered question is one for which no matching question is found in the question-answer database, and its answer may be obtained through manual answering.
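The database lookup with a manual fallback can be sketched as below. Exact string matching is an assumption made for brevity, since the text does not specify the matching method; the manual step is modeled as a callable. The function name, the record layout and the sample data are hypothetical.

```python
def answer_question(question, intention, qa_databases, manual_answer):
    """Look the question up in the database for its intention; fall back to manual answering."""
    database = qa_databases.get(intention, {})
    if question in database:  # assumption: exact match stands in for the matching step
        return {"question": question, "answer": database[question], "matched": True}
    # No match: the question is treated as unanswered and answered manually.
    return {"question": question, "answer": manual_answer(question), "matched": False}


# Hypothetical databases keyed by question intention.
qa_databases = {
    "treatment": {"How should cerebral infarction be treated?": "See a neurologist promptly."},
}
matched_record = answer_question(
    "How should cerebral infarction be treated?", "treatment", qa_databases,
    manual_answer=lambda q: "answered by a human expert",
)
unmatched_record = answer_question(
    "Can hyperlipidemia be inherited?", "treatment", qa_databases,
    manual_answer=lambda q: "answered by a human expert",
)
```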

Specifically, the pushing the question and answer record to the user side includes:

transmitting the question and answer records to a data push engine according to the transmission file of the question and answer records;

and pushing the question and answer record to a user side by using the data pushing engine.

In the embodiments of the present invention, the participles whose frequency in the chat records is higher than a preset threshold are collected and sorted by frequency to generate a hot word list, and a question intention classification model is trained to classify the intention of the questions retrieved according to the participles. Because the hot word list contains the participles that are mentioned most frequently, questions can be answered in a targeted manner; the hot word list also reflects the questions of common concern. The question intention gives a rough estimate of the scope and field of a question, and answering according to the different user intentions improves the efficiency of answering questions. Therefore, the question-answer record generation method provided by the invention can solve the problem of low efficiency when question and answer records are generated by matching questions against a question-answer database.

Fig. 3 is a functional block diagram of a question and answer record generating device according to an embodiment of the present invention.

The question-answer record generating device 100 according to the present invention may be installed in an electronic device. According to the functions implemented, the question-answer record generating device 100 may include a segmentation extraction module 101, a hot word list generating module 102, a record retrieval module 103, a corpus generating module 104, a model training module 105, and a question-answer record generating module 106. A module of the present invention, which may also be referred to as a unit, is a series of computer program segments that are stored in a memory of the electronic device, can be executed by a processor of the electronic device, and perform a fixed function.

In the present embodiment, the functions regarding the respective modules/units are as follows:

the segmentation extraction module 101 is configured to perform segmentation processing on the obtained chat records, and count the occurrence frequency of each segmentation;

the hot word list generating module 102 is configured to summarize the participles with the frequency greater than a preset threshold value to obtain a popular participle set, and to sort the participles in the popular participle set according to the frequency to generate a hot word list;

the record retrieval module 103 is configured to sequentially select one of the participles according to the arrangement order of the participles in the hot word list, and to search the chat records with the selected participle as a search word to obtain a question corresponding to the search word;

the corpus generating module 104 is configured to obtain an original question-answer data set, extract each flow node in the original question-answer data set and the corpus data corresponding to the flow node, and mark and combine the flow nodes of the corpus data to obtain a training corpus;

the model training module 105 is configured to perform feature coding on the training corpus to obtain a training corpus vector, and to train a preset multi-classification model by using the training corpus to obtain a question intention classification model;

the question-answer record generating module 106 is configured to input the retrieved question into the question intention classification model to obtain a question intention, answer the question according to the question intention, generate a question-answer record, and push the question-answer record to a client.

In detail, the specific implementation of each module of the question-answer record generating device 100 is as follows:

Step one: the participle extracting module 101 performs word segmentation on the obtained chat records and counts the occurrence frequency of each participle.
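Step one can be sketched as follows. This is a minimal illustration only: the function names `segment` and `count_frequencies` are hypothetical, and the regular-expression tokenizer is a stand-in for whatever word segmenter the embodiment actually uses (Chinese chat records would need a dedicated segmenter).

```python
import re
from collections import Counter

def segment(text):
    # Stand-in tokenizer (assumption): lowercase and split on non-word characters.
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]

def count_frequencies(chat_records):
    # Accumulate the occurrence frequency of every participle across all records.
    counts = Counter()
    for record in chat_records:
        counts.update(segment(record))
    return counts

# Hypothetical chat records used only for illustration.
chat_records = [
    "How do I renew my insurance policy?",
    "Where can I renew the policy online?",
    "Is the insurance claim form online?",
]
frequencies = count_frequencies(chat_records)
```

Here `frequencies` maps each participle to the number of times it occurs in the chat records, which is the input the hot word list step needs.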

In addition, in another embodiment of the present invention, before performing word segmentation on the obtained chat records, the participle extracting module 101 is further configured to perform:

identifying a user corresponding to the chat record;

judging whether the user passes identity verification;

Step two: the hot word list generating module 102 summarizes the participles with a frequency greater than the preset threshold to obtain a hot participle set, ranks the participles in the hot participle set, and generates the hot word list according to the ranking.

Step three: the participles in the hot participle set are sorted by frequency to generate the hot word list.

Wherein, the preset number may be 10.
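The thresholding and ranking described in steps two and three might look like the sketch below. The function name `build_hot_word_list` is hypothetical, the alphabetical tie-break is an assumption, and the optional `top_n` cap is only one possible reading of the "preset number", which the text does not define.

```python
def build_hot_word_list(frequencies, threshold, top_n=None):
    # Hot participle set: participles whose frequency exceeds the threshold.
    hot_set = {word: count for word, count in frequencies.items() if count > threshold}
    # Rank by descending frequency; ties broken alphabetically (assumption).
    ranked = sorted(hot_set, key=lambda word: (-hot_set[word], word))
    # Optionally keep only the first top_n entries (assumed meaning of "preset number").
    return ranked if top_n is None else ranked[:top_n]

# Hypothetical frequencies used only for illustration.
frequencies = {"policy": 5, "renew": 7, "claim": 3, "hello": 1, "refund": 5}
hot_words = build_hot_word_list(frequencies, threshold=2, top_n=10)
```

With these sample counts, "hello" falls below the threshold and is excluded, and the remaining participles are ordered from most to least frequent.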

Step four: one of the participles is selected in turn according to the order of the participles in the hot word list, and the selected participle is used as a search word to search the chat records and obtain the question corresponding to the search word.

In this embodiment of the present invention, the record retrieval module 103 sequentially selects one of the participles according to the order of the participles in the hot word list and searches the chat records using the selected participle as a search word, to obtain the question corresponding to the search word, as follows:

sequentially selecting one participle according to the arrangement sequence of the participles in the hot word list through traversal operation, taking the selected participle as a search word, performing vectorization processing on the search word to obtain a search word vector, extracting chat keywords in the initial chat record, and performing vectorization processing on the chat keywords to obtain a keyword vector;
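A possible shape for this traversal and vectorization is sketched below. The text here does not specify the vector encoding or how the two vectors are compared, so the character-bigram encoding, the cosine similarity, the `min_similarity` threshold, and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter

def vectorize(word):
    # Hypothetical encoding: counts of character bigrams with boundary markers.
    padded = f"#{word.lower()}#"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def retrieve_questions(hot_words, chat_keywords, min_similarity=0.8):
    # chat_keywords maps each chat message to its already-extracted keywords.
    results = {}
    for word in hot_words:  # traverse in hot-word-list order
        query = vectorize(word)
        results[word] = [
            message
            for message, keywords in chat_keywords.items()
            if any(cosine(query, vectorize(k)) >= min_similarity for k in keywords)
        ]
    return results

# Hypothetical data used only for illustration.
chat_keywords = {
    "How do I renew my policy?": ["renew", "policy"],
    "Where is the claim form?": ["claim", "form"],
}
matches = retrieve_questions(["renew", "claim"], chat_keywords)
```

Each search word is associated with the chat messages whose keyword vectors are close enough to the search word vector; a learned embedding could replace the bigram encoding without changing the surrounding logic.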

Step five: an original question-answer data set is acquired, each flow node in the original question-answer data set and the corpus data corresponding to that flow node are extracted, and the corpus data are labeled with their flow nodes and combined to obtain the training corpus.
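As a rough sketch of step five, assume the original question-answer data set is a list of dialogues, each a list of (flow node, utterance) pairs. That layout, the flow node names, and the function name `build_training_corpus` are all hypothetical.

```python
from collections import defaultdict

def build_training_corpus(qa_dataset):
    # Group the corpus data under the flow node it belongs to.
    by_node = defaultdict(list)
    for dialogue in qa_dataset:
        for flow_node, utterance in dialogue:
            by_node[flow_node].append(utterance.strip())
    # Label every utterance with its flow node and combine into one corpus.
    return [(text, node) for node, texts in by_node.items() for text in texts]

# Hypothetical data set used only for illustration.
qa_dataset = [
    [("greeting", "Hello, I need help"),
     ("symptom_inquiry", "What should I take for a headache?")],
    [("greeting", "Hi there"),
     ("appointment", "Can I book a doctor for tomorrow?")],
]
training_corpus = build_training_corpus(qa_dataset)
```

The result is a flat list of (text, label) pairs, which is the form a multi-classification model can be trained on.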

Step six: feature coding is performed on the training corpus to obtain training corpus vectors, and a preset multi-classification model is trained by using the training corpus to obtain a question intention classification model.
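The text does not name the feature coding or the multi-classification model. The sketch below substitutes a bag-of-words encoding and a multinomial naive Bayes classifier with add-one smoothing as simple stand-ins; the class name, intent labels, and training sentences are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def encode(text):
    # Bag-of-words feature coding (assumption): token -> count.
    return Counter(re.findall(r"\w+", text.lower()))

class NaiveBayesIntentClassifier:
    def fit(self, corpus):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in corpus:
            features = encode(text)
            self.class_counts[label] += 1
            self.token_counts[label].update(features)
            self.vocabulary.update(features)
        return self

    def predict(self, text):
        features = encode(text)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Add-one smoothing over the shared vocabulary.
            denom = sum(self.token_counts[label].values()) + len(self.vocabulary)
            score = math.log(n / total)
            for token, count in features.items():
                score += count * math.log((self.token_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training corpus of (text, intent label) pairs.
corpus = [
    ("how do I book an appointment", "appointment"),
    ("can I book a doctor tomorrow", "appointment"),
    ("what medicine helps a headache", "symptom_inquiry"),
    ("I have a headache and fever", "symptom_inquiry"),
    ("how do I pay my bill", "payment"),
    ("can I pay by card", "payment"),
]
model = NaiveBayesIntentClassifier().fit(corpus)
intent = model.predict("book a doctor")
```

Any multi-class classifier with a fit/predict interface could take the place of the naive Bayes model here; it was chosen only because it fits in a few lines without external libraries.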

Step seven: the question-answer record generating module 106 inputs the retrieved questions into the question intention classification model to obtain the question intentions, answers the questions according to the question intentions to generate a question-answer record, and pushes the question-answer record to a client.

The question-answer database contains common questions and the answers corresponding to those questions.
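Step seven can be illustrated with the sketch below. It assumes, for simplicity, that the question-answer database is keyed by question intention and that a fallback message is used when no entry exists; the `QARecord` type, the rule-based `classify` stand-in, and the sample answers are hypothetical, and the push to the client is left out.

```python
from dataclasses import dataclass

@dataclass
class QARecord:
    question: str
    intention: str
    answer: str

def generate_qa_records(questions, classify, qa_database,
                        fallback="No stored answer; forwarded to a human agent."):
    records = []
    for question in questions:
        intention = classify(question)                   # question intention
        answer = qa_database.get(intention, fallback)    # look up the stored answer
        records.append(QARecord(question, intention, answer))
    return records

def classify(question):
    # Stand-in for the trained question intention classification model (assumption).
    return "appointment" if "book" in question.lower() else "payment"

# Hypothetical database keyed by intention.
qa_database = {"appointment": "Open the app and choose 'Book a visit'."}
records = generate_qa_records(
    ["Can I book a doctor?", "How do I pay my bill?"], classify, qa_database
)
```

Each record pairs a retrieved question with its predicted intention and the answer found for it, so the list can be serialized and sent to the client in whatever format the deployment requires.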

Fig. 4 is a schematic structural diagram of an electronic device implementing a method for generating a question and answer record according to an embodiment of the present invention.

The electronic device 1 may include a processor 10, a memory 11, and a bus, and may further include a computer program, such as a question and answer record generating program 12, stored in the memory 11 and executable on the processor 10.

The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of the question and answer record generating program 12, but also to temporarily store data that has been output or is to be output.

The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (such as a question and answer record generating program) stored in the memory 11 and calling data stored in the memory 11.

The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.

Fig. 4 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in Fig. 4 does not limit the electronic device 1, which may comprise fewer or more components than shown, combine some components, or arrange the components differently.

For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that charge management, discharge management, power consumption management, and similar functions are implemented through the power management device. The power supply may also include one or more DC or AC power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described here.

Further, the electronic device 1 may include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is generally used to establish a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may further comprise a user interface, which may be a display or an input unit (such as a keyboard), and optionally a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used to display information processed in the electronic device 1 and to display a visualized user interface.

It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.

The question-answer record generating program 12 stored in the memory 11 of the electronic device 1 is a combination of a plurality of instructions, which when executed in the processor 10, can realize:

Specifically, for how the processor 10 implements these instructions, refer to the description of the relevant steps in the embodiments corresponding to Fig. 1 to Fig. 4, which is not repeated here.

Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).

The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

Blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, where each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
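The idea of data blocks linked by a cryptographic method can be illustrated with the sketch below, which stores a hot participle set as block data. It shows hash linking and tamper detection only; consensus, networking, and the platform layers are out of scope, and the block layout and function names are assumptions.

```python
import hashlib
import json

def block_hash(data, previous_hash):
    # Hash the block contents together with the previous block's hash.
    payload = json.dumps({"data": data, "previous_hash": previous_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def make_block(data, previous_hash):
    return {"data": data, "previous_hash": previous_hash,
            "hash": block_hash(data, previous_hash)}

def is_valid(chain):
    for i, block in enumerate(chain):
        # The stored hash must match the block's actual contents.
        if block["hash"] != block_hash(block["data"], block["previous_hash"]):
            return False
        # Each block must point at the hash of the block before it.
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False
    return True

# Hypothetical chain holding a hot participle set.
genesis = make_block({"hot_words": []}, previous_hash="0" * 64)
block1 = make_block({"hot_words": ["renew", "policy"]}, genesis["hash"])
chain = [genesis, block1]
```

Changing the data of any block without recomputing its hash makes `is_valid` return `False`, which is the anti-counterfeiting property the paragraph above refers to.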

Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A method for generating a question-answer record, the method comprising:

2. The method for generating a question-answer record according to claim 1, wherein performing word segmentation on the obtained chat records comprises:

3. The method of generating a question-answer record according to claim 1, wherein before performing the word segmentation process on the obtained chat records, the method further comprises:

identifying a user corresponding to the chat record;

judging whether the user passes identity verification;

4. The method for generating a question-answer record according to claim 1, wherein sequentially selecting one of the participles according to the order of the participles in the hot word list, and using the selected participle as a search word to search the chat records to obtain the question corresponding to the search word, comprises:

5. The method according to claim 1, wherein the performing feature coding on the corpus to obtain corpus vectors comprises:

6. The method for generating question-answer records according to claim 1, wherein the training of the preset multi-classification model by using the training corpus to obtain the question intention classification model comprises:

7. The method according to any one of claims 1 to 5, wherein the solving the question according to the question intention and generating a question-answer record includes:

8. A question-answer record generation apparatus, characterized in that the apparatus comprises:

the hot word list generating module is used for summarizing the participles with the frequency larger than a preset threshold value to obtain a hot participle set, and sequencing the participles in the hot participle set according to the frequency to generate a hot word list;

9. An electronic device, characterized in that the electronic device comprises:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the question-answer record generation method according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the question-answer record generation method according to any one of claims 1 to 7.