CN116431787A - Method, device, equipment and computer storage medium for determining reply information

Info

Publication number: CN116431787A
Application number: CN202310364255.5A
Authority: CN
Inventors: 赵康辉; 黄彩云; 周佳; 白国涛; 孙昊; 张毅
Original assignee: China Mobile Communications Group Co Ltd; China Mobile Information Technology Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; China Mobile Information Technology Co Ltd
Priority date: 2023-03-31
Filing date: 2023-03-31
Publication date: 2023-07-14

Abstract

The application discloses a method, a device, equipment and a computer storage medium for determining reply information. The method comprises the following steps: acquiring problem information; inputting the problem information into a pre-trained generation model, and determining reply information of the problem information through the pre-trained generation model; inputting the reply information into a classification discriminator, and, in the case that the judging result output by the classification discriminator is first indication information, searching a corpus for target problem information having a preset similarity with the problem information; calculating the similarity between the reply information and the target reply information corresponding to the target problem information; and outputting the reply information with the highest similarity in the case that the similarity is greater than a target value. In this way, the uncontrollability of replies is reduced and the reliability of replies is improved.

Description

Method, device, equipment and computer storage medium for determining reply information

Technical Field

The application belongs to the field of artificial intelligence, and particularly relates to a method, a device, equipment and a computer storage medium for determining reply information.

Background

Existing chat modules generally obtain a reply to a question in either a retrieval mode or a generation mode. The retrieval mode searches an existing corpus database for questions similar to the user's question and returns the replies of those similar questions to the user. The generation mode uses a trained generation model (such as GPT3, T5, or BART) to automatically generate a reply to the question input by the user at inference time. Replies obtained by retrieving similar questions or produced by a generation model are often not the replies the user wants; the obtained replies are uncontrollable, so their reliability is low.

Disclosure of Invention

The embodiments of the application provide a method, a device, equipment and a computer storage medium for determining reply information, which can reduce the uncontrollability of replies and improve the reliability of replies.

In a first aspect, an embodiment of the present application provides a method for determining reply information, where the method includes:

acquiring problem information;

inputting the problem information into a pre-trained generation model, determining reply information of the problem information through the pre-trained generation model, wherein the generation model is a model for generating corresponding reply information according to the input problem information, and the reply information at least comprises one piece;

inputting the reply information into the classification discriminator, and searching target problem information with preset similarity with the problem information in the corpus under the condition that the judging result output by the classification discriminator is first indication information;

calculating the similarity of the reply information and the target reply information corresponding to the target problem information;

and outputting the reply information with highest similarity under the condition that the similarity is larger than the target value.

In one possible implementation, the method further includes:

and outputting the preset reply information under the condition that the similarity is not greater than the target value.

In one possible implementation, the method further includes:

and outputting the preset reply information when the judging result output by the classification discriminator is the second indicating information.

In one possible implementation, before inputting the problem information into the pre-trained generative model, the method further comprises:

acquiring multiple rounds of corpus information, wherein the corpus information comprises a question information sample and a reply information sample;

inputting the multi-round corpus information into a generation model, and obtaining predicted reply information through the generation model;

and under the condition that the error between the predicted reply information and the real reply information is within a preset range, obtaining a trained generation model.

In one possible implementation, the multi-round corpus information is at least two sets of multi-round corpus information, and before the reply information is input to the classification discriminator, the method further includes:

acquiring the number of rounds of the at least two sets of multi-round corpus information, wherein the corpus information of one round comprises question information and reply information;

under the condition that the turns are larger than a preset threshold, positive samples are constructed according to a first preset rule by utilizing the same set of multi-turn corpus information, negative samples are constructed according to a second preset rule by utilizing at least two sets of multi-turn corpus information, the initial classification discriminators are trained to obtain the classification discriminators, the first preset rule is to add reply information after one piece of problem information or problem information after one piece of reply information for each sample, and the second preset rule is to include the problem information and/or the reply information in the at least two sets of multi-turn corpus information for each sample.

In one possible implementation, the method further includes:

converting the multi-round corpus information into a plurality of question-answer pairs;

and storing the question-answer pairs into a corpus, wherein the corpus comprises a plurality of question-answer pairs, and one question-answer pair comprises one piece of question information and at least one piece of reply information corresponding to that question information.

In a second aspect, an embodiment of the present application provides a device for determining reply information, where the device includes:

the acquisition module is used for acquiring the problem information;

the generating module is used for inputting the problem information into a pre-trained generating model, determining the reply information of the problem information through the pre-trained generating model, wherein the generating model is a model for generating corresponding reply information according to the input problem information, and the reply information at least comprises one piece of reply information;

the judging module is used for inputting the reply information into the classification discriminator, and searching the target problem information with preset similarity with the problem information in the corpus under the condition that the judging result output by the classification discriminator is the first indication information;

the calculation module is used for calculating the similarity of the reply information and the target reply information corresponding to the target problem information;

and the determining module is used for outputting the reply information with highest similarity under the condition that the similarity is larger than the target value.

In a third aspect, an embodiment of the present application provides an electronic device, including:

a processor and a memory storing computer program instructions;

the processor, when executing the computer program instructions, implements any one of the above methods for determining the reply information.

In a fourth aspect, embodiments of the present application provide a computer storage medium, where computer program instructions are stored, where the computer program instructions, when executed by a processor, implement a method for determining reply information in any one of the above.

In a fifth aspect, embodiments of the present application provide a computer program product, where instructions in the computer program product, when executed by a processor of an electronic device, enable the electronic device to perform a method for determining reply information according to any one of the above.

According to the method, the device, the equipment and the computer storage medium for determining reply information, problem information is acquired and input into a pre-trained generation model, and reply information of the problem information is determined through the pre-trained generation model, where the generation model is a model that generates corresponding reply information according to the input problem information and the reply information includes at least one piece. The reply information is input into a classification discriminator; in the case that the judging result output by the classification discriminator is first indication information, target problem information having a preset similarity with the problem information is searched for in a corpus, the similarity between the reply information and the target reply information corresponding to the target problem information is calculated, and in the case that the similarity is greater than a target value, the reply information with the highest similarity is output. In this way, the rationality of the reply information is judged by the classification discriminator; if the rationality is high, target problem information similar to the problem information is searched for in the corpus, the similarity between the reply information and the target reply information corresponding to the target problem information is calculated, and the reply information with the highest similarity is selected as the final reply information. The uncontrollability of the reply information can thereby be reduced, and the reliability of the reply information is further improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described, and it is possible for a person skilled in the art to obtain other drawings according to these drawings without inventive effort.

FIG. 1 is a flowchart illustrating a method for determining reply messages according to one embodiment of the present disclosure;

FIG. 2 is a flowchart illustrating a method for determining reply messages according to another embodiment of the present disclosure;

FIG. 3 is a schematic structural diagram of a reply information determining apparatus according to another embodiment of the present application;

FIG. 4 is a schematic structural diagram of an electronic device according to still another embodiment of the present application.

Detailed Description

Features and exemplary embodiments of various aspects of the present application are described in detail below to make the objects, technical solutions and advantages of the present application more apparent, and to further describe the present application in conjunction with the accompanying drawings and the detailed embodiments. It should be understood that the specific embodiments described herein are intended to be illustrative of the application and are not intended to be limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by showing examples of the present application.

It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.

In order to solve the problems in the prior art, embodiments of the present application provide a method, an apparatus, a device, and a computer storage medium for determining reply information. The following first describes a method for determining reply information provided in the embodiments of the present application.

Fig. 1 is a flow chart illustrating a method for determining reply information according to an embodiment of the present application.

As shown in fig. 1, the method for determining reply information provided in the embodiment of the present application includes steps S110 to S150.

S110, acquiring problem information.

In some embodiments, the problem information may be problem information input by a user.

In some embodiments, the problem information may be any of text, audio, and video.

S120, inputting the problem information into a pre-trained generation model, determining reply information of the problem information through the pre-trained generation model, wherein the generation model is a model for generating corresponding reply information according to the input problem information, and the reply information at least comprises one piece.

In some embodiments, the generation model may generate the reply information based on the problem information; for example, the generation model is a generative model such as GPT3, T5, or BART. Here, the reply information is the reply information corresponding to the problem information, and the reply information includes at least one piece.

S130, inputting the reply information into the classification discriminator, and searching target problem information with preset similarity with the problem information in the corpus under the condition that the judging result output by the classification discriminator is the first indication information.

In some embodiments, at least one piece of reply information corresponding to the problem information is input to the classification discriminator, which judges whether the reply information is reasonable. If the judging result output by the classification discriminator for the at least one piece of reply information is the first indication information, target problem information having the preset similarity with the problem information is searched for in the corpus. The first indication information is information indicating that the input reply information is reasonable reply information; for example, the first indication information is "reasonable".

In some embodiments, the reply information and the multi-round corpus preceding the reply information may be spliced together and input to the classification discriminator to judge the rationality of the reply information.
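The splicing step can be sketched as follows. This is a minimal illustration only, assuming the dialogue context and the candidate reply are plain strings; the function name `build_discriminator_input` and the `[SEP]` separator are hypothetical and are not specified in the application.

```python
from typing import List


def build_discriminator_input(context_turns: List[str], reply: str,
                              sep: str = " [SEP] ") -> str:
    """Splice the preceding multi-round corpus and a candidate reply
    into a single string for the classification discriminator.

    The separator token is an assumption; the source does not name one.
    """
    return sep.join([*context_turns, reply])


spliced = build_discriminator_input(
    ["How is the weather today?", "It is sunny."],
    "Shall we go for a walk?",
)
```

The resulting string would then be tokenized and passed to whatever binary classifier serves as the discriminator.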

In some embodiments, the preset similarity is set in advance by the user. In the case that at least one piece of reply information is reasonable, the problem information is converted into a sentence vector (for example, the sentence vector may be generated by the penultimate layer of the generation model and may be a 768-dimensional sentence vector), and target problem information having the preset similarity with the sentence vector is searched for in the corpus. The target problem information includes at least one piece, and the target reply information corresponding to the target problem information is obtained, each piece of target problem information corresponding to at least one piece of target reply information. The problem information in the corpus and the reply information corresponding to it are stored in the form of sentence vectors (for example, the sentence vectors stored in the corpus may also be generated by the penultimate layer of the generation model and be 768-dimensional).

As an example, the question information $q$ is compared with the question information in the corpus, and the target question information similar to $q$ is retrieved as the set $Q = \{q_1, q_2, \ldots, q_n\}$. The size of the set $Q$ is controlled by the preset similarity $\tau_1$. The similarity values between the question information $q$ and $q_1, q_2, \ldots, q_n$ are denoted $\mu_1, \mu_2, \ldots, \mu_n$, where each of $\mu_1, \mu_2, \ldots, \mu_n$ is greater than the preset similarity $\tau_1$.
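A minimal sketch of this thresholded retrieval is given below. It assumes cosine similarity as the similarity measure and a corpus held as a list of dictionaries with precomputed sentence vectors; the application fixes neither, and the names `cosine`, `retrieve_targets`, `question_vec` and `reply_vecs` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity (an assumed measure; the source names none)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_targets(question_vec: Sequence[float],
                     corpus: List[Dict[str, list]],
                     tau1: float) -> List[Tuple[int, float]]:
    """Return (corpus index, similarity) for every stored question whose
    similarity to the input question is greater than the preset value."""
    hits = []
    for i, entry in enumerate(corpus):
        mu = cosine(question_vec, entry["question_vec"])
        if mu > tau1:
            hits.append((i, mu))
    return hits


# Toy 2-dimensional vectors stand in for 768-dimensional sentence vectors.
corpus = [
    {"question_vec": [1.0, 0.0], "reply_vecs": [[0.9, 0.1]]},
    {"question_vec": [0.0, 1.0], "reply_vecs": [[0.1, 0.9]]},
    {"question_vec": [0.8, 0.6], "reply_vecs": [[0.7, 0.7], [1.0, 0.2]]},
]
hits = retrieve_targets([1.0, 0.0], corpus, tau1=0.5)
```

Raising the threshold shrinks the retrieved set, which is how the preset similarity controls the size of the set of target questions. A production system would more likely use an approximate nearest-neighbour index than a linear scan.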

In some embodiments, if no target question information having the preset similarity with the question information is retrieved, the predetermined reply information is output.

S140, calculating the similarity of the reply information and the target reply information corresponding to the target problem information.

In some embodiments, the reply information is converted into a sentence vector, and a similarity between the reply information and the target reply information corresponding to the target question information in the corpus is calculated, wherein the manner of converting the reply information into the sentence vector is the same as the manner of converting the question information into the sentence vector.

As an example, the target reply information corresponding to the target problem information is $A$, where $A = \{a_{11}, a_{12}, \ldots, a_{1k}, \ldots, a_{n1}, a_{n2}, \ldots, a_{nm}\}$. The reply information is converted into a sentence vector, and the similarity between the reply information and each piece of target reply information is calculated, giving the similarities $\mu_{11}, \mu_{12}, \ldots, \mu_{1k}, \ldots, \mu_{n1}, \mu_{n2}, \ldots, \mu_{nm}$.

And S150, outputting the reply information with highest similarity under the condition that the similarity is larger than the target value.

In some embodiments, the similarity is ranked, and if the similarity is greater than the target value, the reply message with the highest similarity is output.

In some embodiments, the similarity between the problem information and the target problem information and the similarity between the reply information and the target reply information are multiplied, similarity products are determined, the similarity products are ordered, and the reply information with the highest similarity product is output under the condition that the similarity products are larger than the target value.

As an example, the similarities $\mu_{11}, \mu_{12}, \ldots, \mu_{1k}, \ldots, \mu_{n1}, \mu_{n2}, \ldots, \mu_{nm}$ are sorted, and in the case that at least one similarity is greater than the target value $\tau_2$, the reply information with the highest similarity is output.

In another example, the similarity products between $\mu_1, \mu_2, \ldots, \mu_n$ and $\mu_{11}, \mu_{12}, \ldots, \mu_{1k}, \ldots, \mu_{n1}, \mu_{n2}, \ldots, \mu_{nm}$ are calculated respectively, and in the case that at least one similarity product is greater than the target value $\tau_2$, the reply information with the highest similarity product is output.
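The similarity-product variant can be sketched as follows. The sketch assumes that each question similarity is paired with the reply similarities of the same target question (the index pairing suggested by the notation), that cosine similarity is used, and that a fixed fallback string stands in for the predetermined reply; `pick_reply` and its parameters are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity (assumed; the source does not name a measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def pick_reply(generated: List[Tuple[str, Vector]],
               targets: List[Tuple[float, List[Vector]]],
               tau2: float,
               fallback: str = "I don't know") -> str:
    """generated: (reply text, reply vector) for each generated reply.
    targets: (question similarity, target reply vectors) for each
    retrieved target question.
    Scores each generated reply by question similarity times reply
    similarity and returns the best one if its score exceeds tau2."""
    best_text, best_score = fallback, float("-inf")
    for text, r_vec in generated:
        for mu_q, reply_vecs in targets:
            for a_vec in reply_vecs:
                score = mu_q * cosine(r_vec, a_vec)
                if score > best_score:
                    best_text, best_score = text, score
    return best_text if best_score > tau2 else fallback


generated = [("reply A", [1.0, 0.0]), ("reply B", [0.0, 1.0])]
targets = [(0.9, [[1.0, 0.0]]), (0.6, [[0.0, 1.0]])]
chosen = pick_reply(generated, targets, tau2=0.5)
rejected = pick_reply(generated, targets, tau2=0.95)
```

Multiplying the two similarities favours generated replies that match a stored reply of a question that is itself close to the user's question. Dropping the `mu_q` factor gives the plain-similarity ranking of the first example.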

In this way, the rationality of the reply information is judged by the classification discriminator; if the rationality is high, target problem information similar to the problem information is searched for in the corpus, the similarity between the reply information and the target reply information corresponding to the target problem information is calculated, and the reply information with the highest similarity is selected as the final reply information. The uncontrollability of the reply information can thereby be reduced, and the reliability of the reply information is further improved.

Based on this, in some embodiments, the method may further include:

outputting the predetermined reply information in the case that the similarity is not greater than the target value.

In some embodiments, the predetermined reply information is reply information customized by the user, and may include, but is not limited to, any one of "error", "unrecognizable", "I do not understand what you mean", and "I don't know".

In some embodiments, the similarities are ordered, and in the event that the similarities are not greater than the target value, a predetermined reply message is output.

In some embodiments, the similarity of the question information and the target question information, and the similarity of the reply information and the target reply information are multiplied, similarity products are determined, the similarity products are ordered, and the predetermined reply information is output when the similarity products are not greater than the target value.

Thus, for reply information with lower similarity, namely under the condition of inaccurate reply information, the reply information is not output any more, and misguidance to users is reduced.

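Based on this, in some embodiments, the method may further include:

outputting the predetermined reply information in the case that the judging result output by the classification discriminator is the second indication information.

In some embodiments, the second indication information is information indicating that the input reply information is unreasonable reply information; for example, the second indication information is "unreasonable".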

Thus, for unreasonable reply information, namely, under the condition that the reply information is inaccurate, the reply information is not output any more, and misguidance to a user is reduced.

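Based on this, in some embodiments, before S120 described above, the method may further include:

acquiring multi-round corpus information, wherein the corpus information comprises a question information sample and a reply information sample;

inputting the multi-round corpus information into a generation model, and obtaining predicted reply information through the generation model;

and obtaining a trained generation model in the case that the error between the predicted reply information and the real reply information is within a preset range.

The corpus information may be any one of text, audio and video.

In some embodiments, the generation model may include, but is not limited to, any of a GPT3 model, a T5 model, and a BART model.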

In some embodiments, multi-round corpus information is acquired and input into the generation model, and predicted reply information is obtained through the generation model. Specifically, the last piece of reply information is used as the label, the corpus information of the preceding rounds is input into the generation model to predict the last reply information, and the predicted reply information is output and compared with the real reply information. The cross-entropy loss function $L = -[y \log y' + (1 - y) \log(1 - y')]$ is used as the loss function, and the final training effect is evaluated through it. Here, $y$ is the label value of the real reply information (1 for the positive class, 0 for the negative class), and $y'$ is the probability value of the predicted reply information, with $y' \in (0, 1)$. The cross-entropy loss function represents the difference between the label of the real reply information and the probability value of the predicted reply information.
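The loss described above can be evaluated numerically as in the following sketch. It implements the binary cross-entropy form for a single label and probability; the clipping constant `eps` is an added assumption to keep the logarithm finite and is not part of the application.

```python
import math


def binary_cross_entropy(y: float, y_pred: float, eps: float = 1e-12) -> float:
    """Cross-entropy between a label y in {0, 1} and a predicted
    probability y_pred in (0, 1).

    y_pred is clipped to [eps, 1 - eps] (an assumption added here) so
    that log() never receives 0.
    """
    y_pred = min(max(y_pred, eps), 1.0 - eps)
    return -(y * math.log(y_pred) + (1.0 - y) * math.log(1.0 - y_pred))


confident_right = binary_cross_entropy(1, 0.9)
confident_wrong = binary_cross_entropy(1, 0.1)
```

The loss is small when the predicted probability agrees with the label and grows as they diverge, which is the sense in which it measures the difference between the label and the prediction.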

As one example, the multi-round corpus information is denoted $S$, and $S$ may include at least two sets of corpus information $S_i$, $i \geq 2$. Taking $S_1$ as an example, for the same set of multi-round corpus information $S_1$, the corpus information of the preceding rounds $S_{11}, S_{12}, \ldots, S_{1q}$ is input into the generation model, the label is $S_{1(q+1)}$, the reply information output by the generation model is $S'_{1(q+1)}$, and $S'_{1(q+1)}$ is compared with $S_{1(q+1)}$.
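The split of one dialogue into model input and label can be sketched as below, assuming a dialogue is represented as an ordered list of strings; the function name `make_generation_example` is hypothetical.

```python
from typing import List, Tuple


def make_generation_example(turns: List[str]) -> Tuple[List[str], str]:
    """Use every round except the last as the model input and the last
    round as the label that the generation model should predict."""
    if len(turns) < 2:
        raise ValueError("need at least one context round and one label round")
    return list(turns[:-1]), turns[-1]


context, label = make_generation_example(["q1", "a1", "q2", "a2"])
```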

Thus, the generated model is trained through a large number of samples, and the accuracy of the reply information is improved.

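Based on this, in some embodiments, the multi-round corpus information is at least two sets of multi-round corpus information, and before S130 above, the method may further include:

acquiring the number of rounds of the at least two sets of multi-round corpus information, wherein the corpus information of one round comprises question information and reply information;

in the case that the number of rounds is greater than a predetermined threshold, constructing positive samples according to a first predetermined rule by using the same set of multi-round corpus information, constructing negative samples according to a second predetermined rule by using at least two sets of multi-round corpus information, and training an initial classification discriminator to obtain the classification discriminator.

As an example, taking $S_1$ as an example, the same set of multi-round corpus information $S_1$ comprises the rounds $S_{11}, S_{12}, \ldots, S_{1q}, S_{1(q+1)}$, so the number of rounds of $S_1$ is $q+1$. If $q+1$ is greater than a predetermined threshold $k$, positive samples are constructed according to the first predetermined rule, the positive samples being $S_{11}S_{12}\ldots S_{1k}S_{1(k+1)}$, $S_{11}S_{12}\ldots S_{1(k+1)}S_{1(k+2)}$, $\ldots$, $S_{11}S_{12}\ldots S_{1q}S_{1(q+1)}$. Negative samples are constructed according to the second predetermined rule by using at least two sets of multi-round corpus information; for example, the negative samples are $S_{11}S_{12}\ldots S_{1k}S_{2(k+1)}$, $\ldots$, $S_{11}S_{12}\ldots S_{1u}S_{5n}$. The initial classification discriminator is trained with the positive samples and the negative samples to obtain the classification discriminator.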
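One way to build such samples is sketched below. It assumes each dialogue is an ordered list of strings, that a positive sample is a prefix of one dialogue followed by its true next round, and that a negative sample is the same prefix followed by a round drawn at random from a different dialogue. The random choice and the name `build_samples` are assumptions; the application only requires that a negative sample mix rounds from at least two sets.

```python
import random
from typing import List, Optional, Tuple

Sample = Tuple[List[str], int]  # (sequence of rounds, label)


def build_samples(dialogues: List[List[str]], k: int,
                  rng: Optional[random.Random] = None) -> List[Sample]:
    """Construct positive (label 1) and negative (label 0) samples for
    training the classification discriminator."""
    if rng is None:
        rng = random.Random(0)
    samples: List[Sample] = []
    for d_idx, turns in enumerate(dialogues):
        if len(turns) <= k:
            continue  # only dialogues with more than k rounds are used
        # Rounds from all other dialogues serve as mismatched continuations.
        others = [t for j, d in enumerate(dialogues) if j != d_idx for t in d]
        for j in range(k, len(turns)):
            context = turns[:j]
            samples.append((context + [turns[j]], 1))
            if others:
                samples.append((context + [rng.choice(others)], 0))
    return samples


dialogues = [["q1", "a1", "q2", "a2"], ["x1", "y1"]]
samples = build_samples(dialogues, k=2)
positives = [seq for seq, label in samples if label == 1]
negatives = [seq for seq, label in samples if label == 0]
```

With `k=2`, only the first dialogue is long enough to contribute, yielding two positive samples and two negative samples.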

Therefore, the classification discriminator is trained on a large number of samples, the trained classification discriminator is used to judge the rationality of the reply information, and the reliability of the reply information is improved.

Based on this, in some embodiments, the method may further comprise:

converting the multi-round corpus information into a plurality of question-answer pairs;

and storing the question-answer pairs into a corpus, wherein the corpus comprises a plurality of question-answer pairs, and one question-answer pair comprises one piece of question information and at least one piece of reply information corresponding to that question information.

In some embodiments, the acquired multi-round corpus information is converted into a plurality of question-answer pairs, and the multi-round corpus information is stored in the corpus in the form of question-answer pairs.

As an example, taking $S_1$ as an example, if $s_{1q}$ is question information, then $s_{1q}$ and $s_{1(q+1)}$ are saved in the corpus as a question-answer pair. If $s_{1q}$ corresponds to a plurality of pieces of reply information $a_{11}, a_{12}, \ldots, a_{1j}$, these are saved in the corpus together with $s_{1q}$ as a question-answer pair. The question-answer pairs are stored in the corpus in the form of vectors.
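The conversion into question-answer pairs can be sketched as follows. The sketch assumes each round is tagged with a role (`"q"` for question, `"a"` for reply) and keeps the pairs as text; embedding the text into sentence vectors before storage, as the application describes, is left out. The name `to_qa_pairs` and the role tags are hypothetical.

```python
from typing import Dict, List, Tuple


def to_qa_pairs(turns: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each question to the list of replies that directly follow it.

    turns: ordered (role, text) tuples, where role is "q" or "a".
    A question that occurs several times collects all of its replies,
    so one pair holds one question and at least one reply.
    """
    pairs: Dict[str, List[str]] = {}
    for (role, text), (next_role, next_text) in zip(turns, turns[1:]):
        if role == "q" and next_role == "a":
            pairs.setdefault(text, []).append(next_text)
    return pairs


turns = [
    ("q", "hi"), ("a", "hello"),
    ("q", "weather?"), ("a", "sunny"),
    ("q", "hi"), ("a", "hey"),
]
qa_pairs = to_qa_pairs(turns)
```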

In this way, the problem information and the reply information corresponding to at least one problem information are stored in the corpus in the form of vectors, so that the retrieval can be conveniently carried out directly according to the problem information.

In the embodiment provided by the application, as shown in FIG. 2, multi-round corpus information is obtained, and at least two sets of multi-round corpus information are used to train a classification discriminator, which is subsequently used to judge whether generated reply information is a reasonable reply to the problem information. The collected multi-round corpus information is converted into question-answer pairs and stored in a corpus for subsequent retrieval. A generation model such as GPT3 is trained with the multi-round corpus information so that it can generate reply information. The generated reply information is then input into the classification discriminator to judge its rationality. If the reply information is judged unreasonable, the configured fixed reply information is returned directly. If it is judged reasonable, the corpus is searched: target problem information whose similarity to the problem information input by the user is higher than the preset similarity is recalled from the corpus, and the similarity between the reply information and each piece of target reply information corresponding to the recalled target problem information is compared. If the target value is exceeded, the reply information is returned; otherwise, the configured fixed reply information is returned. The configured fixed reply information is the predetermined reply information. In this embodiment, retrieval in the corpus, which combines the problem information with its corresponding reply information, is added on top of the generation model, so that the accuracy of the reply information is improved while the diversity of the reply information is preserved.
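The inference-time control flow of this embodiment can be sketched as below. The generation model, the discriminator, the sentence embedder and the similarity measure are passed in as callables, since the application does not fix their implementations. All names here are hypothetical, and filtering the generated replies one by one through the discriminator is an assumption about how the judgment is applied.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def determine_reply(
    question: str,
    generate: Callable[[str], List[str]],         # generation model
    is_reasonable: Callable[[str, str], bool],    # classification discriminator
    embed: Callable[[str], Vector],               # sentence-vector encoder
    similarity: Callable[[Vector, Vector], float],
    corpus: List[Tuple[Vector, List[Vector]]],    # (question vec, reply vecs)
    tau1: float,                                  # preset similarity
    tau2: float,                                  # target value
    fallback: str = "I don't know",               # predetermined reply
) -> str:
    # Generate candidates and keep those judged reasonable.
    replies = [r for r in generate(question) if is_reasonable(question, r)]
    if not replies:
        return fallback
    # Recall stored questions that are similar enough to the input.
    q_vec = embed(question)
    target_replies = [rv for qv, rvs in corpus
                      if similarity(q_vec, qv) > tau1 for rv in rvs]
    if not target_replies:
        return fallback
    # Rank generated replies by their best match with a target reply.
    best_reply, best_score = fallback, float("-inf")
    for reply in replies:
        r_vec = embed(reply)
        for a_vec in target_replies:
            score = similarity(r_vec, a_vec)
            if score > best_score:
                best_reply, best_score = reply, score
    return best_reply if best_score > tau2 else fallback
```

Every branch that cannot confirm the generated reply (an unreasonable judgment, no similar stored question, or a best score not above the target value) returns the predetermined reply, so an unverified generated reply is not output.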

Based on the method for determining the reply information provided by the embodiment, correspondingly, the application also provides a specific implementation mode of the device for determining the reply information. Please refer to the following examples.

Referring first to fig. 3, a reply message determining apparatus 300 provided in an embodiment of the present application includes:

an obtaining module 310, configured to obtain problem information;

the generating module 320 is configured to input the problem information into a pre-trained generating model, determine reply information of the problem information through the pre-trained generating model, where the generating model is a model that generates corresponding reply information according to the input problem information, and the reply information includes at least one piece of reply information;

the judging module 330 is configured to input the reply information to the classification discriminator, and search the corpus for target problem information with a preset similarity to the problem information when the judging result output by the classification discriminator is the first indication information;

the calculating module 340 is configured to calculate a similarity of the reply information and the target reply information corresponding to the target problem information;

the determining module 350 is configured to output the reply message with the highest similarity when the similarity is greater than the target value.

Based on this, in some embodiments, the apparatus 300 may further include:

the determining module 350 is further configured to output the predetermined reply message if the similarity is not greater than the target value.

Based on this, in some embodiments, the apparatus 300 may further include:

the judging module 330 is further configured to output predetermined reply information when the judging result output by the classification discriminator is the second indication information.

Based on this, in some embodiments, the apparatus may further include:

the obtaining module 310 is further configured to obtain multiple rounds of corpus information before inputting the problem information to the pre-trained generation model, where the corpus information includes a problem information sample and a reply information sample;

the generating module 320 is further configured to input the multiple-round corpus information into a generating model, and obtain prediction reply information through the generating model;

the determining module 350 is further configured to obtain a trained generation model when an error between the predicted reply message and the real reply message is within a preset range.

Based on this, in some embodiments, the multiple-round corpus information is at least two sets of multiple-round corpus information, and the apparatus 300 may further include:

the obtaining module 310 is further configured to obtain the number of rounds of the at least two sets of multi-round corpus information before the reply information is input to the classification discriminator, where the corpus information of one round includes question information and reply information;

the determining module 350 is further configured to, when the number of rounds is greater than a predetermined threshold, construct positive samples according to a first predetermined rule by using the same set of multi-round corpus information, construct negative samples according to a second predetermined rule by using at least two sets of multi-round corpus information, and train the initial classification discriminator to obtain the classification discriminator, where the first predetermined rule is that, for each sample, reply information is added after a piece of question information or question information is added after a piece of reply information, and the second predetermined rule is that each sample includes question information and/or reply information from the at least two sets of multi-round corpus information.

Based on this, in some embodiments, the apparatus 300 may further include:

the conversion module is configured to convert the multi-round corpus information into a plurality of question-answer pairs;

the storage module is configured to store the question-answer pairs into a corpus, where the corpus includes a plurality of question-answer pairs, and one question-answer pair includes one piece of question information and at least one piece of reply information corresponding to the question information.
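
A minimal sketch of the conversion and storage steps is given below, assuming the corpus is held as an in-memory dictionary that maps each piece of question information to its list of reply information. The names `to_question_answer_pairs` and `store_pairs` and the dictionary layout are hypothetical; an actual corpus could equally be a database or a search index.

```python
from typing import Dict, List, Tuple

Turn = Tuple[str, str]   # (question information, reply information)
Dialogue = List[Turn]    # one set of multi-round corpus information


def to_question_answer_pairs(dialogues: List[Dialogue]) -> List[Turn]:
    """Flatten multi-round corpus information into question-answer pairs."""
    return [(question, reply) for dialogue in dialogues for question, reply in dialogue]


def store_pairs(corpus: Dict[str, List[str]], pairs: List[Turn]) -> None:
    """Store pairs so that one question maps to at least one reply."""
    for question, reply in pairs:
        replies = corpus.setdefault(question, [])
        if reply not in replies:  # skip verbatim duplicate replies
            replies.append(reply)
```

For example, storing the pairs of two dialogues that both contain the question "hi" yields a single corpus entry for "hi" holding every distinct reply seen for it.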

The modules of the reply information determining device provided in the embodiment of the present application may implement the functions of each step of the reply information determining method provided in fig. 1 and fig. 2, and may achieve the corresponding technical effects, which are not described herein for brevity.
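
The cooperation of the modules can be illustrated with the following sketch of the overall flow: generate candidate replies, filter them with the classification discriminator, retrieve target questions from the corpus, compare the candidates with the target replies, and output either the best candidate or a predetermined reply. All names here are hypothetical. In particular, `generate` and `is_acceptable` stand in for the pre-trained generation model and the classification discriminator, `difflib.SequenceMatcher` is used only as a stand-in similarity measure, the threshold values are arbitrary, and returning the predetermined reply when no target question is found is an assumption of the sketch.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List

# Hypothetical predetermined reply used when no candidate is reliable enough.
FALLBACK_REPLY = "Sorry, I cannot answer that yet."


def similarity(a: str, b: str) -> float:
    """Stand-in text similarity in [0, 1]; not the application's measure."""
    return SequenceMatcher(None, a, b).ratio()


def determine_reply(
    question: str,
    generate: Callable[[str], List[str]],
    is_acceptable: Callable[[str, str], bool],
    corpus: Dict[str, List[str]],
    preset_similarity: float = 0.6,
    target_value: float = 0.5,
) -> str:
    # Keep only replies for which the discriminator gives the first indication.
    candidates = [r for r in generate(question) if is_acceptable(question, r)]
    if not candidates:
        return FALLBACK_REPLY

    # Retrieve replies of corpus questions similar enough to the input question.
    target_replies = [
        reply
        for target_question, replies in corpus.items()
        if similarity(question, target_question) >= preset_similarity
        for reply in replies
    ]
    if not target_replies:
        return FALLBACK_REPLY  # assumption: no target question means fallback

    # Pick the candidate with the highest similarity to any target reply.
    best_score, best_reply = max(
        (similarity(candidate, target), candidate)
        for candidate in candidates
        for target in target_replies
    )
    return best_reply if best_score > target_value else FALLBACK_REPLY
```

Passing the model and the discriminator as callables keeps the sketch independent of any particular model library.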

Based on the same inventive concept, the embodiment of the application also provides electronic equipment.

Fig. 4 shows a schematic hardware structure of an electronic device according to an embodiment of the present application.

The electronic device may include a processor 401 and a memory 402 storing computer program instructions.

Specifically, the processor 401 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.

Memory 402 may include mass storage for data or instructions. By way of example, and not limitation, memory 402 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 402 may include removable or non-removable (or fixed) media, where appropriate. Memory 402 may be internal or external to the electronic device, where appropriate. In a particular embodiment, memory 402 is a non-volatile solid-state memory.

The memory may include read-only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, or electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the methods according to aspects of the present disclosure.

The processor 401 reads and executes the computer program instructions stored in the memory 402 to implement any one of the reply information determination methods in the above embodiments.

In one example, the electronic device may also include a communication interface 403 and a bus 404. As shown in fig. 4, the processor 401, the memory 402, and the communication interface 403 are connected to each other by a bus 404 and perform communication with each other.

The communication interface 403 is mainly used to implement communication between each module, device, unit and/or apparatus in the embodiments of the present application.

Bus 404 includes hardware, software, or both, coupling the components of the electronic device to one another. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 404 may include one or more buses, where appropriate. Although embodiments of the present application describe and illustrate a particular bus, the present application contemplates any suitable bus or interconnect. The electronic device may execute the method for determining reply information in the embodiments of the present application, thereby implementing the method for determining reply information described with reference to fig. 1 and fig. 2.

In addition, in combination with the method for determining reply information in the above embodiment, the embodiment of the application may be implemented by providing a computer storage medium. The computer storage medium has stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement a method of determining reply information in any of the above embodiments.

The present application also provides a computer program product, where instructions in the computer program product, when executed by a processor of an electronic device, cause the electronic device to perform various processes of implementing any one of the above embodiments of the method for determining reply information.

It should be clear that the present application is not limited to the particular arrangements and processes described above and illustrated in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions, or change the order between steps, after appreciating the spirit of the present application.

The functional blocks shown in the above-described structural block diagrams may be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, electronic circuits, Application Specific Integrated Circuits (ASICs), suitable firmware, plug-ins, function cards, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium, or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, read-only memory (ROM), flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, radio frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the Internet or an intranet.

It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be different from the order in the embodiments, or several steps may be performed simultaneously.

Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to being, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware which performs the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In the foregoing, only the specific embodiments of the present application are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, which are intended to be included in the scope of the present application.

Claims

1. A method for determining reply information, comprising:

acquiring problem information;

inputting the problem information into a pre-trained generation model, and determining reply information of the problem information through the pre-trained generation model, wherein the generation model is a model for generating corresponding reply information according to the input problem information, and the reply information comprises at least one piece of reply information;

inputting the reply information into a classification discriminator, and searching target problem information with preset similarity with the problem information in a corpus under the condition that the judging result output by the classification discriminator is first indication information;

calculating the similarity between the reply information and target reply information corresponding to the target problem information;

and outputting the reply information with the highest similarity under the condition that the similarity is larger than the target value.

2. The method for determining reply information according to claim 1, further comprising:

and outputting predetermined reply information under the condition that the similarity is not greater than a target value.

3. The method for determining reply information according to claim 1, further comprising:

and outputting predetermined reply information when the judging result output by the classification discriminator is the second indicating information.

4. The method of claim 1, wherein prior to inputting the problem information into a pre-trained generative model, the method further comprises:

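acquiring multi-round corpus information, wherein the corpus information comprises a question information sample and a reply information sample;

inputting the multi-round corpus information into the generation model, and obtaining predicted reply information through the generation model;

and obtaining the trained generation model under the condition that an error between the predicted reply information and real reply information is within a preset range.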

5. The method for determining reply information according to claim 4, wherein the multiple-round corpus information is at least two sets of multiple-round corpus information, and the method further comprises, before inputting the reply information to a classification discriminator:

acquiring the number of turns of the at least two sets of multi-turn corpus information, wherein the corpus information of one turn comprises one piece of question information and one piece of reply information;

under the condition that the number of turns is greater than a preset threshold, constructing positive samples from the same set of multi-turn corpus information according to a first preset rule, constructing negative samples from at least two sets of multi-turn corpus information according to a second preset rule, and training an initial classification discriminator to obtain the classification discriminator, wherein the first preset rule is that each sample consists of a piece of reply information added after a piece of question information, or a piece of question information added after a piece of reply information, and the second preset rule is that each sample includes question information and/or reply information from the at least two sets of multi-turn corpus information.

6. The method for determining reply information according to claim 1, further comprising:

and storing the question-answer pairs into the corpus, wherein the corpus comprises a plurality of question-answer pairs, and one question-answer pair comprises one question information and at least one reply information corresponding to the question information.

7. A reply information determining apparatus, the apparatus comprising:

the acquisition module is used for acquiring the problem information;

the generation module is used for inputting the problem information into a pre-trained generation model, and determining reply information of the problem information through the pre-trained generation model, wherein the generation model is a model for generating corresponding reply information according to the input problem information, and the reply information comprises at least one piece of reply information;

the judging module is used for inputting the reply information into the classification discriminator, and searching target problem information with preset similarity with the problem information in the corpus under the condition that the judging result output by the classification discriminator is first indication information;

and the determining module is used for outputting the reply information with the highest similarity under the condition that the similarity is larger than the target value.

8. An electronic device, the device comprising: a processor and a memory storing computer program instructions;

the processor, when executing the computer program instructions, implements the method for determining reply information according to any one of claims 1-6.

9. A computer readable storage medium, wherein computer program instructions are stored on the computer readable storage medium, which when executed by a processor, implement the method of determining reply information according to any one of claims 1-6.

10. A computer program product, characterized in that instructions in the computer program product, when executed by a processor of an electronic device, enable the electronic device to perform the method of determining reply information according to any one of claims 1-6.