WO2020151690A1 - Sentence generation method, apparatus, device, and storage medium - Google Patents

Sentence generation method, apparatus, device, and storage medium

Info

Publication number
WO2020151690A1
WO2020151690A1 PCT/CN2020/073407 CN2020073407W
Authority
WO
WIPO (PCT)
Prior art keywords
sentence
sequence
sequences
predetermined number
candidate
Prior art date
Application number
PCT/CN2020/073407
Other languages
English (en)
French (fr)
Inventor
谭翊章
丁佳晨
缪畅宇
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Priority to JP2021540365A priority Critical patent/JP7290730B2/ja
Publication of WO2020151690A1 publication Critical patent/WO2020151690A1/zh
Priority to US17/230,985 priority patent/US20210232751A1/en

Classifications

    • G06F40/12 Use of codes for handling textual entities
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G06F16/332 Query formulation
    • G06F40/253 Grammatical analysis; Style critique
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06N20/00 Machine learning
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06F40/205 Parsing
    • G06F40/40 Processing or translation of natural language
    • G06F40/56 Natural language generation
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H03M7/3059 Digital compression and data reduction techniques where the original information is represented by a subset or similar information, e.g. lossy compression

Definitions

  • the embodiments of the present application relate to the field of artificial intelligence, and in particular to a sentence generation method, apparatus, device, and storage medium.
  • the sentence generation method can be used in any functional dialogue system, machine translation system, question answering system, automatic writing system, or reading comprehension system, and is especially suitable for dialogue systems that require highly informative and diverse replies.
  • sentence generation based on deep learning is the current direction of development. After obtaining the sentence sequence input by the user, generating the output sequence includes: encoding the input sentence sequence into a vector, and decoding that vector to obtain the output sequence.
  • a sentence generation method executed by an electronic device, the method including: encoding an input sequence to obtain a sentence feature vector, the sentence feature vector being a representation of the input sequence; decoding the sentence feature vector to obtain a first predetermined number of candidate sentence sequences; and clustering the first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets;
  • a second predetermined number of candidate sentence sequences are selected from the set of at least two types of sentence sequences, where the second predetermined number of candidate sentence sequences includes at least two sentence feature types and the second predetermined number is less than the first predetermined number;
  • the output sequence corresponding to the input sequence is determined according to the second predetermined number of candidate sentence sequences.
  • a sentence generating device includes:
  • the encoding module is used to encode the input sequence to obtain a sentence feature vector, and the sentence feature vector is a representation of the input sequence;
  • a decoding module for decoding the sentence feature vector to obtain a first predetermined number of candidate sentence sequences;
  • a clustering module configured to cluster the first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets;
  • a screening module configured to filter a second predetermined number of candidate sentence sequences from a set of at least two types of sentence sequences, the second predetermined number of candidate sentence sequences including at least two sentence feature types, and the second predetermined number is less than the first predetermined number;
  • the determining module is used to determine the output sequence corresponding to the input sequence according to the second predetermined number of candidate sentence sequences.
  • An electronic device that includes one or more processors and a memory.
  • the memory stores at least one computer-readable instruction, program, code set, or computer-readable instruction set, which is loaded and executed by the one or more processors to implement the sentence generation method of the first aspect described above.
  • One or more computer-readable storage media store at least one computer-readable instruction, program, code set, or computer-readable instruction set, which is loaded and executed by one or more processors to implement the sentence generation method of the first aspect described above.
  • FIG. 1 is a schematic structural diagram of an application scenario provided by an exemplary embodiment of the present application
  • FIG. 2 is a schematic diagram of the hardware structure of an electronic device provided by an exemplary embodiment of the present application.
  • FIG. 3 is a flowchart of a sentence generation method provided by an exemplary embodiment of the present application.
  • FIG. 4 is a flowchart of a sentence generation method provided by another exemplary embodiment of the present application.
  • FIG. 5 is a schematic diagram of principles involved in a sentence generation method provided by an exemplary embodiment of the present application.
  • FIG. 6 is a flowchart of a sentence generation method provided by another exemplary embodiment of the present application.
  • FIG. 7 is a flowchart of a sentence generation method provided by another exemplary embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a sentence generating apparatus provided by an exemplary embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a terminal provided by an exemplary embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a server provided by an exemplary embodiment of the present application.
  • Decoding: in natural language processing, the process of generating a sentence word by word based on input data.
  • Clustering: a process of using a clustering algorithm to group multiple data items into at least two sets of different categories.
  • the clustering algorithm includes at least one of the K-means clustering algorithm, mean shift clustering algorithm, density-based clustering algorithm, maximum expectation clustering algorithm using a Gaussian mixture model, and agglomerative hierarchical clustering algorithm.
  • Sentence scoring model: a mathematical model that determines a sentence score for an input sentence sequence.
  • the sentence scoring model is used to measure whether a sentence sequence is natural language.
  • sentence scoring models include, but are not limited to, at least one of: the Deep Neural Network (DNN) model, the Recurrent Neural Network (RNN) model, the embedding model, the Gradient Boosting Decision Tree (GBDT) model, and the Logistic Regression (LR) model.
  • the DNN model is a deep learning framework.
  • the DNN model includes an input layer, at least one hidden layer (or intermediate layer) and an output layer.
  • the input layer, at least one hidden layer (or intermediate layer), and the output layer all include at least one neuron, and the neuron is used to process the received data.
  • the number of neurons between different layers can be the same; or, can also be different.
  • the RNN model is a neural network with a feedback structure.
  • the output of a neuron can directly affect itself at the next time step; that is, the input of the i-th layer of neurons at time m includes, in addition to the output of the (i-1)-th layer of neurons at time m, the i-th layer's own output at time (m-1).
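The feedback described above can be sketched as a single recurrent step; the weight matrices, layer sizes, and input sequence below are arbitrary stand-ins for illustration, not part of the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(4, 3))   # weights applied to the lower layer's output
W_rec = rng.normal(size=(4, 4))  # weights applied to the layer's own previous output

def rnn_step(x_m, h_prev):
    """h at time m depends on the current input x_m and the layer's own
    previous output h_prev (the feedback structure)."""
    return np.tanh(W_in @ x_m + W_rec @ h_prev)

h = np.zeros(4)
for x in rng.normal(size=(5, 3)):  # a sequence of 5 input vectors
    h = rnn_step(x, h)             # h is fed back into the next step
print(h.shape)  # (4,)
```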
  • the embedding model is based on the distributed vector representation of entities and relationships, and regards the relationship in each triple instance as a translation from the entity head to the entity tail.
  • the triple instance includes subject, relationship, and object.
  • the triple instance can be expressed as (subject, relationship, object); the subject is the entity head and the object is the entity tail. For example, if Xiao Zhang’s father is Da Zhang, it is expressed as (Xiao Zhang, Dad, Da Zhang) through a triple instance.
  • the GBDT model is an iterative decision tree algorithm, which consists of multiple decision trees, and the results of all trees are added together as the final result.
  • Each node of the decision tree yields a predicted value. Taking age as an example, the predicted value is the average age of all people assigned to that node.
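As a toy illustration of the summation described above, where the final GBDT result is the sum of all trees' outputs (the trees here are hypothetical constant or threshold functions, not a trained model):

```python
# Each "tree" maps an input to a contribution; the ensemble sums them.
trees = [
    lambda x: 20.0,                       # base tree
    lambda x: 3.0 if x > 1 else -3.0,     # correction tree
    lambda x: 0.5,                        # further correction
]

def predict(x):
    """GBDT-style prediction: add together the results of all trees."""
    return sum(tree(x) for tree in trees)

print(predict(2))  # 23.5
print(predict(0))  # 17.5
```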
  • LR model refers to a model established by applying a logistic function on the basis of linear regression.
  • Beam search: a heuristic graph search algorithm. In natural language decoding, beam search is the process of searching the currently obtained set of sentence sequences (also called the sentence beam) to obtain the final generated output sequence.
  • Beam size (BS): the maximum number of sentence sequences retained in the beam search algorithm.
  • current decoding technology is based on beam search and does not reflect differences in sentence content, so that after multiple decoding steps all candidate sentence sequences tend to fall into the same category. These are usually safe output sequences, i.e., sequences that are fluent but carry little information, such as "hehe" or "right".
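A minimal beam-search sketch over a toy next-token distribution (the vocabulary and probabilities are invented for illustration): at each step every kept sequence is expanded, and only the BS highest-scoring partial sequences survive.

```python
import math

def beam_search(next_probs, beam_size, steps):
    """next_probs(seq) -> {token: prob}; returns the kept (sequence, log-prob)
    pairs after the given number of decoding steps."""
    beams = [((), 0.0)]  # (partial sequence, cumulative log-probability)
    for _ in range(steps):
        expanded = []
        for seq, score in beams:
            for tok, p in next_probs(seq).items():
                expanded.append((seq + (tok,), score + math.log(p)))
        # prune: keep only the beam_size best partial sequences
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams

probs = lambda seq: {"a": 0.6, "b": 0.3, "c": 0.1}  # toy distribution
top = beam_search(probs, beam_size=2, steps=3)
print(top[0][0])  # ('a', 'a', 'a')
```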
  • the embodiments of the present application provide a sentence generation method, device, equipment, and storage medium.
  • the sentence feature vector is obtained by encoding the input sequence, and the sentence feature vector is decoded to obtain a first predetermined number of candidate sentence sequences.
  • the first predetermined number of candidate sentence sequences are clustered and filtered to obtain a second predetermined number of candidate sentence sequences, so that the generated candidate sentence sequences include at least two sentence feature types. The output sequence generated based on the second predetermined number of candidate sentence sequences therefore has greater diversity, which avoids the safe output sequences produced by dialogue systems in the related technology, effectively meets users' needs, and improves the accuracy of sentence generation.
  • FIG. 1 is a schematic structural diagram of an application scenario provided by an exemplary embodiment of the present application.
  • This application scenario includes an input object 100 and an electronic device 200 based on deep learning (hereinafter referred to as electronic device).
  • the electronic device 200 is used to perform the following sentence generation process: obtain the input sequence from the input object 100, generate an output sequence in response to the input sequence, and present the output sequence to the input object 100.
  • the input sequence is an input sentence sequence to be processed
  • the output sequence is an output sentence sequence that has been processed
  • the sentence generation method is applied to a dialogue system, a machine translation system, a question and answer system, an automatic writing system or a reading comprehension system.
  • the dialogue system obtains the reply sentence corresponding to the sentence to be answered input by the user from the Internet or a local database.
  • the machine translation system obtains the translated sentence corresponding to the sentence to be translated input by the user from the Internet or a local database.
  • the question answering system obtains the answer sentence corresponding to the question sentence input by the user from the Internet or a local database.
  • the automatic writing system obtains the content sentences corresponding to the topic sentences input by the user to describe the topic from the Internet or a local database.
  • the reading comprehension system searches the reading materials provided by the user to obtain the answer sentence corresponding to the question sentence input by the user.
  • the input sequence is the sentence to be replied
  • the output sequence is the reply sentence.
  • the input sequence is the sentence of the first language type to be translated
  • the output sequence is the sentence of the second language type after translation, where the first language type is different from the second language type.
  • the first language type is English
  • the second language type is Chinese.
  • the input sequence is a question sentence
  • the output sequence is an answer sentence
  • the input sequence is the subject sentence and the output sequence is the content sentence.
  • the input sequence is the question sentence and the output sequence is the answer sentence.
  • the input object 100 may be a human
  • the electronic device 200 may be a terminal such as a mobile phone or a computer, and the foregoing sentence generation process is implemented between the human and the terminal.
  • a first application program is installed in the electronic device 200, and the first application program is an application program with a sentence generation function.
  • the first application program is an application program with functions such as question and answer, automatic information reply, and machine translation.
  • the user asks a question (input sequence) to the first application program through text or voice input, and the first application program generates an answer (output sequence) according to the user's question and displays it.
  • the input object 100 may be a client
  • the electronic device 200 is a server
  • the foregoing sentence generation process is implemented between the client and the server.
  • the client includes but is not limited to mobile phones, computers, etc.
  • the server can be a server capable of providing various services.
  • the server includes but is not limited to weather query, business consulting, smart customer service (for air ticket service or restaurant service, etc.).
  • FIG. 2 is a schematic diagram of the hardware structure of an electronic device provided by an exemplary embodiment of this application.
  • the electronic device includes one or more processors 10, a memory 20 and a communication interface 30.
  • the structure shown in FIG. 2 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
  • the one or more processors 10 are the control center of the electronic device. They use various interfaces and lines to connect the parts of the entire electronic device, run or execute the software programs and/or modules stored in the memory 20, and call the data stored in the memory 20 to perform the various functions of the electronic device and process data, so as to control the electronic device as a whole.
  • the one or more processors 10 may be implemented by a CPU, or by one or more Graphics Processing Units (GPUs).
  • the memory 20 can be used to store software programs and modules.
  • One or more processors 10 execute various functional applications and data processing by running software programs and modules stored in the memory 20.
  • the memory 20 may mainly include a storage program area and a storage data area.
  • the storage program area may store an operating system 21, an acquisition module 22, an encoding module 23, a decoding module 24, a clustering module 25, a screening module 26, a determination module 27, and at least one application 28 required for a function (such as neural network training); the data storage area may store data created according to the use of the electronic device.
  • the memory 20 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
  • the memory 20 may further include a memory controller to provide the one or more processors 10 with access to the memory 20.
  • the one or more processors 10 perform the following functions by running the stored modules: by running the acquisition module 22, obtain the input sequence; by running the encoding module 23, encode the input sequence to obtain a sentence feature vector, where the sentence feature vector is a representation of the input sequence; by running the decoding module 24, decode the sentence feature vector to obtain a first predetermined number of candidate sentence sequences; by running the clustering module 25, cluster the first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets; by running the screening module 26, select a second predetermined number of candidate sentence sequences from the set of at least two types of sentence sequences, where the second predetermined number of candidate sentence sequences includes at least two sentence feature types and the second predetermined number is less than the first predetermined number; by running the determination module 27, determine the output sequence corresponding to the input sequence according to the second predetermined number of candidate sentence sequences.
  • Fig. 3 is a flowchart of a sentence generation method provided by an exemplary embodiment of the present application. The method can be implemented by using the electronic device in the aforementioned application scenario. Referring to Fig. 3, the sentence generation method includes the following steps:
  • Step 301 Obtain the input sequence.
  • the input sequence is input text data, or text data recognized based on input voice data or picture data.
  • the electronic device acquiring the input sequence may include: the electronic device receives text data (characters, words, or sentences) and determines the text data as the input sequence.
  • the electronic device receives voice data, performs voice recognition on the voice data to obtain text data, and determines the text data obtained through voice recognition as the input sequence.
  • the electronic device receives the picture data, performs optical character recognition on the picture data to obtain text data, and determines the recognized text data as the input sequence.
  • Step 302 Perform encoding processing on the input sequence to obtain a sentence feature vector, where the sentence feature vector is a representation of the input sequence.
  • the sentence feature vector is a vector sequence or a single vector.
  • the electronic device encoding the input sequence to obtain the sentence feature vector includes: the electronic device encoding the input sequence into a vector sequence, and the vector sequence includes at least one vector.
  • when the electronic device encodes the input sequence into a vector sequence, it first performs word segmentation on the input sequence to obtain at least one word; each word obtained by segmentation is then encoded into a vector to form the vector sequence.
  • the electronic device encodes the input sequence into a single vector.
  • An electronic device can use an encoder to encode the input sequence into a vector.
  • the vector encoded by the encoder contains information about all aspects of the input sequence, such as intent (confirmation, inquiry, etc.) and specific named entities (such as location, time, etc.).
  • the subsequent processing of the input sequence is converted to the processing of the vector.
  • the complexity of the subsequent processing can be greatly reduced.
  • a vector to represent the input sequence can improve semantic integrity.
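One simple way to realize this encoding step, sketched here with a toy vocabulary and mean-pooled word vectors (the patent does not prescribe this particular encoder; the embeddings are invented stand-ins):

```python
import numpy as np

# Toy word-embedding table standing in for a trained encoder's vocabulary.
EMB = {"how": [0.1, 0.2], "are": [0.3, 0.1], "you": [0.2, 0.4]}

def encode(words):
    """Encode a segmented input sequence into a single sentence feature
    vector by averaging the word vectors (mean pooling)."""
    vecs = np.array([EMB[w] for w in words])
    return vecs.mean(axis=0)  # one vector representing the whole input

v = encode(["how", "are", "you"])
print(v)
```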
  • Step 303 Decoding the sentence feature vector to obtain a first predetermined number of candidate sentence sequences.
  • the electronic device decodes the sentence feature vector to obtain a first predetermined number of candidate sentence sequences.
  • the candidate sentence sequence includes at least one decoded word.
  • the first predetermined number is a preset numerical value.
  • the first predetermined numerical value is a user-defined setting or a default setting of the terminal.
  • the first predetermined number is 16 or 24.
  • the generation process of the output sequence includes multiple decoding processes, and each decoding process includes decoding, clustering, and screening.
  • decoding is also called recombination expansion; that is, the decoding process expands decoded words based on a second predetermined number of candidate sentence sequences and recombines the expanded decoded words with the second predetermined number of candidate sentence sequences to obtain a first predetermined number of candidate sentence sequences, where the first predetermined number is greater than the second predetermined number.
  • the clustering includes a process of clustering the first predetermined number of candidate sentence sequences obtained after decoding to obtain at least two types of sentence sequence sets.
  • the screening includes a process of selecting a second predetermined number of candidate sentence sequences from the set of at least two types of sentence sequences obtained by clustering.
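One decoding round as described above (decode, then cluster, then screen) might be sketched as follows; the `expand` and `cluster` functions are toy stand-ins for the patent's recombination expansion and clustering, and the round-robin pick is just one possible screening rule:

```python
def decoding_round(kept, expand, cluster, second_num):
    """Expand each kept sequence, cluster the expansions, then keep
    second_num candidates drawn from different clusters."""
    expanded = [s for seq in kept for s in expand(seq)]   # first predetermined number
    groups = cluster(expanded)                            # >= 2 sentence sequence sets
    picked = []
    while len(picked) < second_num:                       # take from each set in turn
        for g in groups:
            if g and len(picked) < second_num:
                picked.append(g.pop(0))
    return picked

expand = lambda seq: [seq + [t] for t in ("a", "b")]      # toy word expansion
cluster = lambda seqs: [seqs[::2], seqs[1::2]]            # toy 2-way clustering
result = decoding_round([["x"], ["y"]], expand, cluster, 2)
print(result)  # [['x', 'a'], ['x', 'b']]
```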
  • Step 304 Perform clustering on a first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets.
  • the electronic device clusters the first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets.
  • the sentence sequence set includes at least one candidate sentence sequence.
  • the sentence feature types corresponding to the at least two types of sentence sequence sets are different.
  • the sentence feature type is used to indicate the sentence fluency of the candidate sentence sequence and/or the degree of association between the candidate sentence sequence and the input sequence.
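Step 304 could be realized, for example, with K-means over feature vectors of the candidate sentence sequences; the vectors below are toy 2-D points rather than the patent's sentence features, and the implementation is a minimal sketch:

```python
import numpy as np

def kmeans(points, k, iters=10, seed=0):
    """Minimal K-means: assign each point to its nearest center, then move
    each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        centers = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return labels

# Four candidate-sequence feature vectors forming two obvious groups.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = kmeans(feats, 2)
print(labels)  # two well-separated sentence sequence sets
```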
  • Step 305 Filter a second predetermined number of candidate sentence sequences from the set of at least two types of sentence sequences.
  • the second predetermined number of candidate sentence sequences include at least two sentence feature types, and the second predetermined number is less than the first predetermined number.
  • the electronic device screens out a second predetermined number of candidate sentence sequences from at least two types of sentence sequence sets.
  • the electronic device filters at least one candidate sentence sequence from the sentence sequence set to form a second predetermined number of candidate sentence sequences.
  • Step 306 Determine an output sequence corresponding to the input sequence according to the second predetermined number of candidate sentence sequences.
  • the electronic device selects a candidate sentence sequence from the second predetermined number of candidate sentence sequences as the output sequence corresponding to the input sequence.
  • the electronic device selects a candidate sentence sequence from the second predetermined number of candidate sentence sequences according to a preset selection strategy, or randomly selects a candidate sentence sequence as the output sequence corresponding to the input sequence. This embodiment does not limit this.
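A hedged sketch of step 305: keep the best-scoring candidate from each cluster first, so the selected set spans at least two sentence feature types; the scoring function and example sentences are toy stand-ins, not the patent's sentence scoring model:

```python
def screen(clusters, score, second_num):
    """clusters: list of lists of candidate sequences; score: seq -> float.
    Take the top candidate of each cluster first, then fill the remaining
    slots with the best of the rest."""
    kept = [max(c, key=score) for c in clusters if c]  # one per cluster first
    rest = sorted((s for c in clusters for s in c if s not in kept),
                  key=score, reverse=True)
    return (kept + rest)[:second_num]

clusters = [["hehe", "right"], ["it was sunny in Beijing today"]]
toy_score = lambda s: len(s)  # toy score: longer = more informative
out = screen(clusters, toy_score, 2)
print(out)
```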
  • the embodiment of the application obtains a sentence feature vector by encoding the input sequence, decodes the sentence feature vector to obtain a first predetermined number of candidate sentence sequences, and performs clustering and screening on the first predetermined number of candidate sentence sequences to obtain a second predetermined number of candidate sentence sequences. Because the second predetermined number of candidate sentence sequences obtained through clustering and screening includes at least two sentence feature types, the output sequence determined from them has greater diversity, which can effectively meet the needs of users and improve the effect of sentence generation.
  • FIG. 4 shows a flowchart of a sentence generation method provided by another exemplary embodiment of the present application. This method can be implemented by using the electronic device in the aforementioned application scenario.
  • the sentence generation method includes:
  • Step 401 Obtain an input sequence.
  • the electronic device obtains the input sentence through the first application, and generates an input sequence according to the input sentence.
  • Step 402 Perform encoding processing on the input sequence to obtain a sentence feature vector, where the sentence feature vector is a representation of the input sequence.
  • Step 403 Perform the i-th decoding on the sentence feature vector to obtain a first predetermined number of candidate sentence sequences.
  • the candidate sentence sequence includes i decoded words, and the initial value of i is 1.
  • the electronic device decodes the sentence feature vector for the first time to obtain a second predetermined number of candidate sentence sequences.
  • Each candidate sentence sequence includes 1 decoded word.
  • the electronic device performing the i-th decoding on the sentence feature vector to obtain the first predetermined number of candidate sentence sequences includes: in the i-th decoding, recombining and expanding, according to the sentence feature vector, the second predetermined number of candidate sentence sequences obtained in the (i-1)-th decoding, to obtain the first predetermined number of candidate sentence sequences, where the first predetermined number is greater than the second predetermined number.
  • in the i-th decoding, for each candidate sentence sequence among the second predetermined number of candidate sentence sequences obtained from the (i-1)-th decoding, the electronic device recombines and expands the candidate sentence sequence to obtain multiple expanded candidate sentence sequences.
  • the first predetermined number is a preset value greater than the second predetermined number.
  • the first predetermined number is m times the second predetermined number, and m is a positive integer greater than 1.
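A quick numeric sketch of this relationship, with an assumed second predetermined number (beam size) of 4 and m = 4: each kept sequence is expanded into m continuations, giving a first predetermined number of 16 candidates.

```python
second_num, m = 4, 4                       # assumed values for illustration
kept = [f"seq{i}" for i in range(second_num)]
# recombination expansion: each kept sequence gains m candidate next words
expanded = [f"{s}+w{j}" for s in kept for j in range(m)]
first_num = len(expanded)
print(first_num)  # 16 = m * second predetermined number
```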
  • Step 404 Perform clustering on a first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets.
  • before the electronic device clusters the first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets, the method may further include: performing deduplication processing on the first predetermined number of candidate sentence sequences to remove repeated words in the candidate sentence sequences.
  • the electronic device clustering the first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets includes: clustering the first predetermined number of candidate sentence sequences using a designated clustering algorithm to obtain the at least two types of sentence sequence sets.
  • the designated clustering algorithm includes at least one of K-means clustering algorithm, mean shift clustering algorithm, density-based clustering algorithm, maximum expectation clustering algorithm using Gaussian mixture model, and agglomerative hierarchical clustering algorithm.
  • this embodiment does not limit the type of the designated clustering algorithm used by the terminal, and the following only takes the designated clustering algorithm as the K-means clustering algorithm as an example for description.
  • At least two types of sentence sequence sets correspond to different sentence feature types.
  • the sentence feature type is used to indicate the sentence fluency of the candidate sentence sequence and/or the degree of association between the candidate sentence sequence and the input sequence.
  • the sentence feature type includes at least one of a first sentence feature type, a second sentence feature type, and a third sentence feature type.
  • the first sentence feature type is used to indicate that the candidate sentence sequence is a safe output sequence, and the safe output sequence is also called a smooth and safe output sequence. That is, the sentence fluency of the candidate sentence sequence is higher than the fluency threshold, and the correlation degree between the candidate sentence sequence and the input sequence is lower than or equal to the correlation threshold.
  • the second sentence feature type is used to indicate that the candidate sentence sequence is an uncomfortable output sequence, that is, the sentence smoothness of the candidate sentence sequence is lower than or equal to the smoothness threshold.
  • the third sentence feature type is used to indicate that the candidate sentence sequence is a smooth and targeted output sequence, that is, the sentence smoothness of the candidate sentence sequence is higher than the smoothness threshold, and the degree of association between the candidate sentence sequence and the input sequence is higher than Correlation threshold.
  • the compliance threshold or the associated threshold is a user-defined setting, or a terminal default setting. This embodiment does not limit this.
  • sentence feature types used by the electronic device during clustering and the number of sentence sequence sets obtained by clustering can be adjusted, which is not limited in this embodiment.
  • At least two types of sentence sequence sets include three types of sentence sequence sets, the first type of sentence sequence set includes multiple candidate sentence sequences of the first sentence feature type, and the first sentence feature type is used to indicate that the candidate sentence sequence is a safe sentence sequence ;
  • the second type of sentence sequence set includes a plurality of candidate sentence sequences of the second sentence feature type, the second sentence feature type is used to indicate that the candidate sentence sequence is an uncomfortable sentence sequence;
  • the third type of sentence sequence set includes multiple third sentences The candidate sentence sequence of the characteristic type, and the third sentence characteristic type is used to indicate that the candidate sentence sequence is a fluent and targeted sentence sequence.
  • Step 405 Filter a second predetermined number of candidate sentence sequences from the set of at least two types of sentence sequences.
  • the second predetermined number of candidate sentence sequences include at least two sentence feature types, and the second predetermined number is less than the first predetermined number.
  • the electronic device pair selects the second predetermined number of candidate sentence sequences from at least two types of sentence sequence sets, including: for each type of sentence sequence set in the at least two types of sentence sequence sets, grouping the sentence sequence sets Sort multiple candidate sentence sequences in the sentence sequence set; obtain the first N candidate sentence sequences in the sentence sequence set after sorting, where N is a positive integer.
  • the electronic device sorts a plurality of candidate sentence sequences in the sentence sequence set according to a preset index.
  • the preset index includes information entropy.
  • the electronic device after the electronic device obtains the K-type sentence sequence set by clustering, it obtains the top N candidate sentence sequences after sorting from each type of sentence sequence set in the K-type sentence sequence set to obtain K*N sentence sequences.
  • candidate sentence sequence where K*N is the second predetermined number.
  • Step 406 When the decoded word obtained by the i-th decoding does not include the predicted termination word, i is increased by 1, and the step of decoding the sentence feature vector for the i-th time to obtain the first predetermined number of candidate sentence sequences is continued.
  • the predicted termination word is a keyword set to terminate decoding.
  • the termination word is "end”.
  • the electronic device When the decoded word obtained by the i-th decoding does not include the predicted termination word, the electronic device will obtain the second predetermined number of candidate sentences obtained at the i-th time (that is, the second A predetermined number of candidate sentence sequences) are used as input for the next decoding (ie, the next time of the current time), and the current i plus 1 is used as the new i-th time, and the above steps 403 to 405 are continued to be executed.
  • the second predetermined number of candidate sentences obtained at the i-th time that is, the second A predetermined number of candidate sentence sequences
  • Step 407 When the decoded word obtained by the i-th decoding includes the predicted termination word, obtain a second predetermined number of candidate sentence sequences after the i-th decoding, clustering and screening.
  • the electronic device obtains the second predetermined number of candidate sentence sequences after the i-th decoding, clustering and screening, and executes step 408.
  • Step 408 Determine an output sequence according to the acquired second predetermined number of candidate sentence sequences.
  • the second predetermined number of candidate sentence sequences in step 408 are the first predetermined number of candidate sentence sequences obtained from the last decoding, and are obtained after performing steps 404 and 405.
  • the electronic device determines the output sequence according to the acquired second predetermined number of candidate sentence sequences, including: acquiring a sentence scoring model, where the sentence scoring model is used to represent the sentence evaluation rule obtained by training based on the sample sentence sequence; For each candidate sentence sequence in the second predetermined number of candidate sentence sequences, input the sentence scoring model to obtain a sentence score. The sentence score is used to indicate the sentence quality of the candidate sentence sequence; according to the sentence score corresponding to the second predetermined number of candidate sentence sequences. To determine the output sequence.
  • the sentence scoring model is a model obtained by training a neural network based on a sample sentence sequence.
  • the sentence scoring model is used to measure the sentence quality of a sentence sequence.
  • sentence quality includes sentence fluency.
  • the sentence scoring model is used to measure whether a sentence sequence is natural language.
  • the sentence scoring model can be pre-trained by the terminal and stored by itself, or it can be pre-trained by the server and sent to the terminal.
  • the sentence scoring model is pre-trained by the server and stored in the server. This embodiment does not limit this. The following only takes the server training sentence scoring model as an example to introduce the model training process.
  • the process of training the sentence scoring model by the server includes: obtaining a training sample set, which includes at least one set of sample data sets; and training the at least one set of sample data sets using an error back propagation algorithm to obtain a sentence scoring model.
  • each sample data group includes: sample sentence sequence and pre-calibrated correct sentence score.
  • the server uses error backpropagation algorithm to train at least one set of sample data sets to obtain sentence scoring model, including but not limited to the following steps:
  • the original parameter model is established based on the neural network model.
  • the original parameter model includes but is not limited to at least one of CNN model, DNN model, RNN model, embedded model, GBDT model, and LR model.
  • the server creates an input-output pair corresponding to the sample data group, the input parameter of the input-output pair is the sample sentence sequence in the sample data group, and the output parameter is the sample data group. Scoring the correct sentence in, the server inputs the input parameters into the original parameter model to obtain the training result.
  • the sample data group includes a sample sentence sequence A and the correct sentence score "sentence score 1".
  • the input and output pairs created by the terminal are: (sample sentence sequence A) -> (sentence score 1); where (sample sentence sequence A) Is the input parameter, and (sentence score 1) is the output parameter.
  • the input and output pairs are represented by feature vectors.
  • the calculation loss is represented by cross-entropy (English: cross-entropy).
  • the terminal calculates the calculated loss H(p, q) through the following formula:
  • p(x) and q(x) are discrete distribution vectors of equal length, p(x) represents the training result; q(x) represents the output parameter; x is the training result or a vector of the output parameters.
  • a sentence scoring model is obtained by using error back propagation algorithm training.
  • the terminal determines the gradient direction of the sentence scoring model according to the calculation loss through a backpropagation algorithm, and updates the model parameters in the sentence scoring model layer by layer from the output layer of the sentence scoring model forward.
  • the electronic device inputs the candidate sentence sequence into the sentence scoring model to calculate the sentence score.
  • the sentence scoring model is trained based on at least one set of sample data sets, and each set of sample data sets includes: a sequence of sample sentences and pre-marked correct sentence scores.
  • the sentence score is used to indicate the sentence quality of the candidate sentence sequence.
  • sentence quality includes sentence fluency.
  • the sentence score has a negative correlation with the sentence quality of the candidate sentence sequence, that is, if the sentence score is lower, the sentence quality of the candidate sentence sequence is higher, and the sentence fluency is higher; if the sentence score is higher, the sentence quality is higher.
  • the sentence score of the candidate sentence sequence when the sentence score of the candidate sentence sequence is lower than the score threshold, it is used to indicate that the candidate sentence sequence is a natural sentence.
  • the scoring threshold is a user-defined setting or a terminal default setting, which is not limited in this embodiment.
  • the electronic device determines the lowest sentence score among the sentence scores corresponding to each of the second predetermined number of candidate sentence sequences;
  • the candidate sentence sequence is determined as the output sequence.
  • the input sequence and the corresponding output sequence are displayed on the electronic device.
  • BS is the second predetermined number
  • C includes the sentence feature vector corresponding to the input sequence
  • rsp is used to represent the output sequence
  • socre lm (hyp) is the sentence score
  • lm th is the score threshold
  • hyp is used to represent the candidate sentence sequence
  • R is used to represent the set of candidate sentence sequences
  • K-means is used to represent the K-means clustering algorithm.
  • the electronic device obtains the input sequence a, encodes the input sequence a to obtain the sentence feature vector A, and the electronic device decodes the sentence feature vector A for the first time to obtain 8 candidate sentence sequences, as shown in Figure 5
  • Figure 5 shows the second decoding process of the electronic device, in which the white circle represents the candidate sentence sequence of the first sentence feature type (such as a safe candidate sentence sequence), and the black circle represents the candidate sentence of the second sentence feature type Sequence (such as a fluent and targeted candidate sentence sequence). 1.
  • the electronic device reorganizes and expands to obtain 16 candidate sentence sequences according to the sentence feature vector and the 8 candidate sentence sequences obtained from the first decoding. 2.
  • the electronic device clusters the 16 candidate sentence sequences to obtain two kinds of sentence sequence sets, namely the first kind of sentence sequence set and the second kind of sentence sequence set.
  • the first kind of sentence sequence set includes 8 safe candidate sentence sequences.
  • the second type of sentence sequence set includes 8 fluent and targeted candidate sentence sequences. 3.
  • the electronic device screens out 4 safe candidate sentence sequences from the first-type sentence sequence set, and selects 4 fluent and targeted candidate sentence sequences from the second-type sentence sequence set to obtain 8 candidate sentences sequence. 4.
  • the electronic device performs the next decoding according to the obtained 8 candidate sentence sequences until the specified end condition is received. Among them, the next decoding can refer to the above-mentioned second decoding process for analogy.
  • the embodiment of the present application also obtains a sentence score model through an electronic device. For each candidate sentence sequence in the second predetermined number of candidate sentence sequences, input the sentence score model to obtain a sentence score, based on the respective corresponding to the multiple candidate sentence sequences Sentence scoring generates an output sequence; because the sentence scoring model is used to express the sentence evaluation rules obtained by training based on the sample sentence sequence, the determined sentence score can accurately reflect the sentence quality of the candidate sentence sequence, thereby ensuring the generated output The sentence quality of the sequence.
  • the sentence generation method includes:
  • Step 601 The electronic device obtains a sentence to be answered input through the dialogue application.
  • the sentence to be answered in the form of voice or text is received.
  • the dialogue application is an application with a human-machine dialogue function installed in an electronic device.
  • the dialog application is used to reply to the input sentence to be answered.
  • Step 602 The electronic device generates an input sequence according to the sentence to be answered.
  • the sentence to be replied is input in text form
  • the sentence to be replied is determined as the input sequence.
  • the voice recognition algorithm is used to convert the sentence to be replied into text data, and the converted text data is determined as the input sequence.
  • Step 603 The electronic device encodes the input sequence to obtain a sentence feature vector.
  • Step 604 The electronic device decodes the sentence feature vector to obtain a first predetermined number of candidate sentence sequences.
  • Step 605 The electronic device clusters the first predetermined number of candidate sentence sequences to obtain at least two types of sentence sequence sets.
  • Step 606 The electronic device screens out a second predetermined number of candidate sentence sequences from the set of at least two types of sentence sequences.
  • the second predetermined number of candidate sentence sequences includes at least two sentence feature types, and the second predetermined number is less than the first predetermined number.
  • Step 607 The electronic device determines the output sequence corresponding to the input sequence according to the second predetermined number of candidate sentence sequences.
  • step 608 the electronic device generates a reply sentence according to the output sequence, and displays the reply sentence through the dialog application.
  • the electronic device determines the output sequence as a reply sentence, and displays the reply sentence in text or voice on the dialog interface of the dialog application.
  • Step 701 The electronic device obtains the sentence to be translated input through the translation application.
  • the sentence to be translated input in the form of voice or text is received.
  • the translation application is an application with translation function installed in an electronic device.
  • the translation application is used to translate the input sentence to be translated.
  • the sentence to be translated is the sentence of the first language type to be translated.
  • Step 702 The electronic device generates an input sequence according to the sentence to be translated.
  • the sentence to be translated is input in text form
  • the sentence to be translated is determined as the input sequence.
  • a speech recognition algorithm is used to convert the sentence to be translated into text data, and the converted text data is determined as the input sequence.
  • step 608 can be replaced and implemented as the following steps:
  • step 708 the electronic device generates a translated sentence according to the output sequence, and displays the translated sentence through a dialogue application.
  • the translated sentence is a sentence of the second language type after translation corresponding to the sentence of the first language type to be translated, wherein the first language type is different from the second language type.
  • the first language type is English
  • the second language type is Chinese.
  • the electronic device determines the output sequence as a translated sentence, and displays the translated sentence in text or voice on the translation interface of the translation application.
  • FIG. 8 shows a schematic structural diagram of a sentence generating apparatus provided by an exemplary embodiment of the present application.
  • the sentence generation device can be realized by a dedicated hardware circuit, or a combination of software and hardware, to become all or part of the electronic device in FIG. 1 or FIG. 2.
  • the sentence generation device includes: an acquisition module 810, an encoding module 820, a decoding module 830, The clustering module 840, the screening module 850, and the determining module 860.
  • the obtaining module 810 is configured to execute the above step 301 or 401.
  • the encoding module 820 is configured to execute the above step 302 or 402.
  • the decoding module 830 is configured to perform step 303 above.
  • the clustering module 840 is configured to execute the above step 304 or 404.
  • the screening module 850 is configured to perform the above step 305 or 405.
  • the determining module 860 is configured to execute the above step 306.
  • the decoding module 830 is further configured to perform step 403 above.
  • the determining module 860 is also used to perform one of the above steps 406 and 407, and step 408.
  • the clustering module 840 is further configured to perform clustering for the first predetermined number of candidate sentence sequences by using a specified clustering algorithm to obtain at least two types of sentence sequence sets, and sentences corresponding to the at least two types of sentence sequence sets.
  • the feature types are different;
  • the designated clustering algorithm includes at least one of K-means clustering algorithm, mean shift clustering algorithm, density-based clustering algorithm, maximum expectation clustering algorithm using Gaussian mixture model, and agglomerative hierarchical clustering algorithm.
  • the sentence feature type includes at least one of a first sentence feature type, a second sentence feature type, and a third sentence feature type;
  • the first sentence feature type is used to indicate that the candidate sentence sequence is a safe sentence sequence
  • the second sentence feature type is used to indicate that the candidate sentence sequence is an inconsistent sentence sequence
  • the third sentence feature type is used to indicate that the candidate sentence sequence is a fluent and targeted sentence sequence.
  • the determining module 860 is also used to obtain a sentence scoring model.
  • the sentence scoring model is used to represent the sentence evaluation rule obtained by training based on the sample sentence sequence; for each candidate in the second predetermined number of candidate sentence sequences Sentence sequence, the sentence score is obtained by inputting the sentence scoring model, and the sentence score is used to indicate the sentence quality of the candidate sentence sequence; the output sequence is determined according to the sentence score corresponding to each of the second predetermined number of candidate sentence sequences.
  • the sentence score has a negative correlation with the sentence quality of the candidate sentence sequence.
  • the determining module 860 is also used to determine the lowest sentence score among the sentence scores corresponding to each of the second predetermined number of candidate sentence sequences; The candidate sentence sequence corresponding to the score is determined as the output sequence.
  • the determining module 860 is further configured to obtain a training sample set.
  • the training sample set includes at least one set of sample data sets, and each set of sample data sets includes: a sample sentence sequence and a pre-labeled correct sentence score; Set the sample data set, and use the error back propagation algorithm to train the original parameter model to obtain the sentence scoring model.
  • the screening module 850 is further configured to sort multiple candidate sentence sequences in the sentence sequence set for each type of sentence sequence set in the sentence sequence set of at least two types;
  • the device further includes: a deduplication module.
  • the deduplication module is used to perform deduplication processing on a first predetermined number of candidate sentence sequences, and the deduplication processing is used to remove duplicate words in the candidate sentence sequences.
  • the input sequence is the sentence to be responded to, and the output sequence is the reply sentence;
  • the input sequence is the sentence of the first language type to be translated
  • the output sequence is the sentence of the second language type after translation, where the first language type is different from the second language type
  • the input sequence is a question sentence
  • the output sequence is an answer sentence
  • the input sequence is the subject sentence and the output sequence is the content sentence;
  • the input sequence is the question sentence and the output sequence is the answer sentence.
  • the obtaining module 810 is also used to obtain the sentence to be answered input through the dialogue application; generate an input sequence according to the sentence to be responded;
  • the device also includes: a display module, which is used to generate a reply sentence according to the output sequence; and display the reply sentence through a dialogue application.
  • the obtaining module 810 is also used to implement any other implicit or public functions related to the obtaining step in the above method embodiment;
  • the encoding module 820 is also used to implement any other implicit or public and coding steps in the above method embodiment Related functions;
  • the decoding module 830 is also used to implement any other implicit or disclosed functions related to the decoding step in the foregoing method embodiments;
  • the clustering module 840 is also used to implement any other implicit or disclosed functions in the foregoing method embodiments Functions related to the clustering step;
  • the screening module 850 is also used to implement any other implicit or public functions related to the screening step in the above method embodiment;
  • the determining module 860 is also used to implement any other implicit in the above method embodiment Or disclosed functions related to the determination step.
  • the device provided in the above embodiment when implementing its functions, only uses the division of the above functional modules for illustration. In practical applications, the above functions can be allocated by different functional modules as required, namely The internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the apparatus and method embodiments provided in the above embodiments belong to the same concept, and the specific implementation process is detailed in the method embodiments, which will not be repeated here.
  • FIG. 9 shows a structural block diagram of a terminal 900 provided by an exemplary embodiment of the present application.
  • the terminal 900 can be: a smartphone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III, moving picture expert compression standard audio layer 3), MP4 (Moving Picture Experts Group Audio Layer IV, moving picture expert compressing standard audio Level 4) Player, laptop or desktop computer.
  • the terminal 900 may also be called user equipment, portable terminal, laptop terminal, desktop terminal and other names.
  • the terminal 900 includes: one or more processors 901 and a memory 902.
  • the one or more processors 901 may include one or more processing cores, such as a 4-core one or more processors, an 8-core one or more processors, and so on.
  • One or more processors 901 can adopt at least one of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array, Programmable Logic Array). A kind of hardware form to realize.
  • the one or more processors 901 may also include one or more main processors and one or more sub-processors. The main one or more processors are one or more for processing data in the awake state.
  • the processor is also called a CPU (Central Processing Unit, central one or more processors); the one or more processors are one or more processors with low power consumption for processing data in a standby state.
  • one or more processors 901 may be integrated with a GPU (Graphics Processing Unit, one or more image processors), and the GPU is responsible for rendering and drawing content to be displayed on the display screen.
  • the one or more processors 901 may further include one or more AI (Artificial Intelligence) processors, and the one or more AI processors are used to process computing operations related to machine learning.
  • AI Artificial Intelligence
  • the memory 902 may include one or more computer-readable storage media, which may be non-transitory.
  • the memory 902 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 902 is used to store at least one computer-readable instruction, and the at least one computer-readable instruction is used to be executed by one or more processors 901 to implement The sentence generation method provided in the method embodiment of this application.
  • the terminal 900 may optionally further include: a peripheral device interface 903 and at least one peripheral device.
  • a peripheral device interface 903 and at least one peripheral device.
  • One or more processors 901, memory 902, and peripheral device interface 903 may be connected through a bus or signal line.
  • Each peripheral device can be connected to the peripheral device interface 903 through a bus, a signal line, or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 904, a touch display screen 905, a camera 906, an audio circuit 907, a positioning component 908, and a power supply 909.
  • the peripheral device interface 903 can be used to connect at least one peripheral device related to I/O (Input/Output) to one or more processors 901 and memory 902.
  • processors 901, memory 902, and peripheral device interface 903 are integrated on the same chip or circuit board; in some other embodiments, one or more processors 901, memory 902, and peripherals Any one or two of the device interfaces 903 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 904 is used for receiving and transmitting RF (Radio Frequency, radio frequency) signals, also called electromagnetic signals.
  • RF Radio Frequency, radio frequency
  • the display screen 905 is used to display a UI (User Interface, user interface).
  • the UI can include graphics, text, icons, videos, and any combination thereof.
  • the display screen 905 also has the ability to collect touch signals on or above the surface of the display screen 905.
  • the touch signal may be input as a control signal to one or more processors 901 for processing.
  • the display screen 905 may also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • the camera assembly 906 is used to capture images or videos.
  • the camera assembly 906 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal, and the rear camera is set on the back of the terminal.
  • the audio circuit 907 may include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to one or more processors 901 for processing, or input to the radio frequency circuit 904 to implement voice communication.
  • the positioning component 908 is used to locate the current geographic location of the terminal 900 to implement navigation or LBS (Location Based Service, location-based service).
  • LBS Location Based Service, location-based service
  • the power supply 909 is used to supply power to various components in the terminal 900.
  • the power source 909 may be alternating current, direct current, disposable batteries, or rechargeable batteries.
  • the terminal 900 further includes one or more sensors 910.
  • the one or more sensors 910 include, but are not limited to: an acceleration sensor 911, a gyroscope sensor 912, a pressure sensor 913, a fingerprint sensor 914, an optical sensor 915, and a proximity sensor 916.
  • the acceleration sensor 911 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established by the terminal 900.
  • the gyroscope sensor 912 can detect the body direction and rotation angle of the terminal 900, and the gyroscope sensor 912 can cooperate with the acceleration sensor 911 to collect the user's 3D actions on the terminal 900.
  • the pressure sensor 913 may be disposed on the side frame of the terminal 900 and/or the lower layer of the touch screen 905.
  • When the pressure sensor 913 is arranged on the side frame of the terminal 900, the user's holding signal on the terminal 900 can be detected, and the one or more processors 901 perform left/right-hand recognition or shortcut operations according to the holding signal collected by the pressure sensor 913.
  • When the pressure sensor 913 is disposed at the lower layer of the touch display screen 905, the one or more processors 901 control the operable controls on the UI according to the user's pressure operation on the touch display screen 905.
  • the fingerprint sensor 914 is used to collect the user's fingerprint.
  • the one or more processors 901 can identify the user's identity according to the fingerprint collected by the fingerprint sensor 914, or the fingerprint sensor 914 itself can identify the user's identity from the collected fingerprint.
  • the optical sensor 915 is used to collect the ambient light intensity.
  • the one or more processors 901 may control the display brightness of the touch screen 905 according to the ambient light intensity collected by the optical sensor 915.
  • the proximity sensor 916, also called a distance sensor, is usually provided on the front panel of the terminal 900.
  • the proximity sensor 916 is used to collect the distance between the user and the front of the terminal 900.
  • the structure shown in FIG. 9 does not constitute a limitation on the terminal 900; the terminal may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
  • FIG. 10 shows a schematic structural diagram of a server 1000 according to an exemplary embodiment of the present application.
  • the server 1000 includes a central processing unit (CPU) 1001, a system memory 1004 including a random access memory (RAM) 1002 and a read-only memory (ROM) 1003, and a system bus 1005 connecting the system memory 1004 and the central processing unit 1001.
  • the server 1000 also includes a basic input/output system (I/O system) 1006 that helps various devices in the computer transfer information, and a mass storage device 1007 for storing the operating system 1013, application programs 1014, and other program modules 1015.
  • the basic input/output system 1006 includes a display 1008 for displaying information and an input device 1009 such as a mouse and a keyboard for the user to input information.
  • the display 1008 and the input device 1009 are both connected to the central processing unit 1001 through the input and output controller 1010 connected to the system bus 1005.
  • the basic input/output system 1006 may also include an input and output controller 1010 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus.
  • the input and output controller 1010 also provides output to a display screen, a printer, or other types of output devices.
  • the mass storage device 1007 is connected to the central processing unit 1001 through a mass storage controller (not shown) connected to the system bus 1005.
  • the mass storage device 1007 and its associated computer-readable media provide non-volatile storage for the server 1000. That is, the mass storage device 1007 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • Computer-readable media may include computer storage media and communication media.
  • Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, disk storage or other magnetic storage devices.
  • the server 1000 may also run by connecting, through a network such as the Internet, to a remote computer on the network. That is, the server 1000 can be connected to the network 1012 through the network interface unit 1011 connected to the system bus 1005, or the network interface unit 1011 can be used to connect to other types of networks or remote computer systems (not shown).
  • the memory stores at least one computer-readable instruction, at least one program, a code set, or a computer-readable instruction set, which is loaded and executed by one or more processors to implement the sentence generation method provided by the foregoing method embodiments.
  • An embodiment of the present application also provides an electronic device.
  • the electronic device may be the terminal 900 provided in FIG. 9 above, or the server 1000 provided in FIG. 10 described above.
  • the present application also provides a computer-readable storage medium that stores at least one computer-readable instruction, and the at least one computer-readable instruction is executed by one or more processors to implement the sentence generation method provided by the foregoing method embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

一种语句生成方法、装置、设备及存储介质,包括:获取输入序列;对输入序列进行编码处理得到语句特征向量;对语句特征向量进行解码得到第一预定数量的候选语句序列;对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合;从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,第二预定数量的候选语句序列包括至少两种语句特征类型;根据第二预定数量的候选语句序列,确定输入序列对应的输出序列。

Description

语句生成方法、装置、设备及存储介质
本申请要求于2019年01月24日提交中国专利局,申请号为2019100689873,申请名称为“语句生成方法、装置、设备及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及人工智能领域,特别涉及一种语句生成方法、装置、设备及存储介质。
背景技术
语句生成方法可以用于任何功能的对话系统、机器翻译系统、问答系统、自动写作系统、阅读理解系统中,尤其适用于需要大信息量以及多样性的对话系统中。
基于深度学习的语句生成方法是当前发展的方向,在获取到用户输入的语句序列后,其生成输出序列的方法包括:将输入的语句序列编码成向量;对向量进行解码得到输出序列。
上述方法在生成输出序列的过程中,还不能有效的处理输入的语句序列,导致生成的语句不够准确。
发明内容
根据本申请提供的各种实施例,提供了一种语句生成方法、装置、设备及存储介质。具体技术方案如下:
一种语句生成方法,由电子设备执行,方法包括:
获取输入序列;
对输入序列进行编码处理得到语句特征向量,语句特征向量为输入序列的表示;
对语句特征向量进行解码得到第一预定数量的候选语句序列;
对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合;
从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,第二预定数量的候选语句序列包括至少两种语句特征类型,第二预定数量小于第一预定数量;及
根据第二预定数量的候选语句序列,确定输入序列对应的输出序列。
一种语句生成装置,装置包括:
获取模块,用于获取输入序列;
编码模块,用于对输入序列进行编码处理得到语句特征向量,语句特征向量为输入序列的表示;
解码模块,用于对语句特征向量进行解码得到第一预定数量的候选语句序列;
聚类模块,用于对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合;
筛选模块,用于从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,第二预定数量的候选语句序列包括至少两种语句特征类型,第二预定数量小于第一预定数量;及
确定模块,用于根据第二预定数量的候选语句序列,确定输入序列对应的输出序列。
一种电子设备,电子设备包括一个或多个处理器和存储器,存储器中存储有至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集,至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集由一个或多个处理器加载并执行以实现如上述第一方面的语句生成方法。
一个或多个计算机可读存储介质,计算机可读存储介质中存储有至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集,至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集由一个或多个处理器加载并执行以实现如上述第一方面的语句生成方法。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。基于本申请的说明书、附图以及权利要求书,本申请的其它特征、目的和优点将变得更加明显。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请一个示意性实施例提供的应用场景的结构示意图;
图2是本申请一个示意性实施例提供的电子设备的硬件结构示意图;
图3是本申请一个示意性实施例提供的语句生成方法的流程图;
图4是本申请另一个示意性实施例提供的语句生成方法的流程图;
图5是本申请一个示意性实施例提供的语句生成方法涉及的原理示意图;
图6是本申请另一个示意性实施例提供的语句生成方法的流程图;
图7是本申请另一个示意性实施例提供的语句生成方法的流程图;
图8是本申请一个示意性实施例提供的语句生成装置的结构示意图;
图9是本申请一个示意性实施例提供的终端的结构示意图;及
图10是本申请一个示意性实施例提供的服务器的结构示意图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
首先,对本申请实施例涉及到的一些名词进行解释:
解码:在自然语言处理中是根据输入数据逐字生成语句的处理过程。
聚类:是采用聚类算法将多个数据聚合成至少两个不同类别的集合的处理过程。
在一个实施例中,聚类算法包括K均值聚类算法、均值漂移聚类算法、基于密度的聚类算法、用高斯混合模型的最大期望聚类算法、凝聚层次聚类算法中的至少一种。
语句评分模型:是一种用于根据输入的语句序列确定该语句序列的语句评分的数学模型。
在一个实施例中,该语句评分模型用于衡量一个语句序列是否是自然语言。
在一个实施例中,语句评分模型包括但不限于:深度神经网络(Deep Neural Network,DNN)模型、循环神经网络(Recurrent Neural Networks,RNN)模型、嵌入(embedding)模型、梯度提升决策树(Gradient Boosting Decision Tree,GBDT)模型、逻辑回归(Logistic Regression,LR)模型中的至少一种。
DNN模型是一种深度学习框架。DNN模型包括输入层、至少一层隐层(或称,中间层)和输出层。可选地,输入层、至少一层隐层(或称,中间层)和输出层均包括至少一个神经元,神经元用于对接收到的数据进行处理。可选地,不同层之间的神经元的数量可以相同;或者,也可以不同。
RNN模型是一种具有反馈结构的神经网络。在RNN模型中,神经元的输出可以在下一个时间戳直接作用到自身,即,第i层神经元在m时刻的输入,除了(i-1)层神经元在该时刻的输出外,还包括其自身在(m-1)时刻的输出。
embedding模型是基于实体和关系分布式向量表示,将每个三元组实例中的关系看作从实体头到实体尾的翻译。其中,三元组实例包括主体、关系、客体,三元组实例可以表示成(主体,关系,客体);主体为实体头,客体为实体尾。比如:小张的爸爸是大张,则通过三元组实例表示为(小张,爸爸,大张)。
GBDT模型是一种迭代的决策树算法,该算法由多棵决策树组成,所有树的结果累加起来作为最终结果。决策树的每个节点都会得到一个预测值,以年龄为例,预测值为属于年龄对应的节点的所有人年龄的平均值。
LR模型是指在线性回归的基础上,套用一个逻辑函数建立的模型。
集束搜索(英文:beam search):是一种启发式图搜索算法。在自然语言解码过程中,集束搜索为搜索当前已得到的语句序列集合(也称:语句束),来得到最终生成的输出序列的过程。
集束大小(beam size,BS):beam search算法中对语句束个数的限制。
目前的解码技术都是基于beam search进行的,并没有体现语句内容的差异性,从而在多次解码后往往会使得所有的候选语句序列趋向于同一类别,通常都是安全的输出序列,即语句通顺但信息量缺乏的输出序列,比如“呵呵”、“说得对”等等输出序列。
而本申请实施例提供了一种语句生成方法、装置、设备及存储介质,通过对输入序列进行编码处理得到语句特征向量,对语句特征向量进行解码处理得到第一预定数量的候选语句序列,对第一预定数量的候选语句序列进行聚类和筛选得到第二预定数量的候选语句序列,使得产生的多个候选语句序列包括至少两种语句特征类型,从而使得基于第二预定数量的候选语句序列生成的输出序列存在较大的多样性,避免了相关技术中对话系统输出的输出序列均为安全的输出序列的情况,能够有效地满足用户需求,提高语句生成的准确性。
为便于对本申请实施例提供的技术方案的理解,首先结合图1介绍一下本申请一个示意性实施例提供的应用场景的结构示意图。
该应用场景包括输入对象100和基于深度学习的电子设备200(后文简称电子设备),其中电子设备200用于执行下述语句生成过程:获取输入对象100的输入序列,然后对该输入序列进行响应,生成输出序列,并将输出序列呈现给该输入对象100。
在一个实施例中,输入序列为输入的待处理的语句序列,输出序列为输出的处理完成的语句序列。
在一个实施例中,该语句生成方法应用于对话系统、机器翻译系统、问答系统、自动写作系统或者阅读理解系统中。对话系统是从互联网或本地数据库中获取与用户输入的待回复语句所对应的回复语句。机器翻译系统是从互联网或本地数据库中获取与用户输入的待翻译语句所对应的翻译语句。问答系统是从互联网或本地数据库中获取与用户输入的问题语句所对应的答案语句。自动写作系统是在互联网或本地数据库中获取与用户输入的用于描述主题的主题语句所对应的内容语句。阅读理解系统是在用户提供的阅读材料中进行查询以获取与用户输入的题目语句所对应的答案语句。
当语句生成方法应用于对话系统中时,输入序列为待回复语句,输出序列为回复语句。
当语句生成方法应用于机器翻译系统中时,输入序列为待翻译的第一语言类型的语句,输出序列为翻译后的第二语言类型的语句,其中第一语言类型不同于第二语言类型。示意性的,第一语言类型为英文,第二语言类型为中文。
当语句生成方法应用于问答系统中时,输入序列为问题语句,输出序列为答案语句。
当语句生成方法应用于自动写作系统中时,输入序列为主题语句,输出序列为内容语句。
当语句生成方法应用于阅读理解系统中时,输入序列为题目语句,输出序列为答案语句。
在一种实现方式中,输入对象100可以是人,电子设备200可以是手机、电脑等终端,人与终端之间实现上述语句生成过程。
在一个实施例中,电子设备200中安装有第一应用程序,第一应用程序是具有语句生成功能的应用程序。示意性的,第一应用程序为具有问答、信息自动回复、机器翻译等功能的应用程序。
比如,用户通过文字或者语音输入向第一应用程序提问(输入序列),第一应用程序根据用户的问题生成答案(输出序列)并显示出来。
在另一种实现方式中,输入对象100可以是客户端,电子设备200是服务器,客户端和服务器之间实现上述语句生成过程。其中,客户端包括但不限于手机、电脑等,服务器可以是能够提供各种不同服务的服务器,服务器包括但不限于天气查询、业务咨询、智能客服(用于机票服务或餐馆服务等)等。
图2为本申请一个示意性实施例提供的电子设备的硬件结构示意图。如图2所示,电子设备包括一个或多个处理器10、存储器20以及通信接口30。本领域技术人员可以理解,图2中示出的结构并不构成对该电子设备的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。其中:
一个或多个处理器10是电子设备的控制中心,利用各种接口和线路连接整个电子设备的各个部分,通过运行或执行存储在存储器20内的软件程序和/或模块,以及调用存储在存储器20内的数据,执行电子设备的各种功能和处理数据,从而对电子设备进行整体控制。一个或多个处理器10可以由CPU实现,也可以由图形处理器(Graphics Processing Unit,GPU)实现。
存储器20可用于存储软件程序以及模块。一个或多个处理器10通过运行存储在存储器20的软件程序以及模块,从而执行各种功能应用以及数据处理。存储器20可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统21、获取模块22、编码模块23、解码模块24、聚类模块25、筛选模块26、确定模块27和至少一个功能所需的应用程序28(比如神经网络训练等)等;存储数据区可存储根据电子设备的使用所创建的数据等。存储器20可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(Static Random Access Memory,简称SRAM),电可擦除可编程只读存储器(Electrically Erasable Programmable Read-Only Memory,简称EEPROM),可擦除可编程只读存储器(Erasable Programmable Read Only Memory,简称EPROM),可编程只读存储器(Programmable Read-Only Memory,简称PROM),只读存储器(Read Only Memory,简称ROM),磁存储器,快闪存储器,磁盘或光盘。相应地,存储器20还可以包括存储器控制器,以提供一个或多个处理器10对存储器20的访问。
其中,一个或多个处理器10通过运行获取模块22执行以下功能:获取输入序列;一个或多个处理器10通过运行编码模块23执行以下功能:对输入序列进行编码处理得到语句特征向量,语句特征向量为输入序列的表示;一个或多个处理器10通过运行解码模块24执行以下功能:对语句特征向量进行解码得到第一预定数量的候选语句序列;一个或多个处理器10通过运行聚类模块25执行以下功能:对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合;一个或多个处理器10通过运行筛选模块26执行以下功能:从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,第二预定数量的候选语句序列包括至少两种语句特征类型,第二预定数量小于第一预定数量;一个或多个处理器10通过运行确定模块27执行以下功能:根据第二预定数量的候选语句序列,确定输入序列对应的输出序列。
图3是本申请一个示意性实施例提供的语句生成方法的流程图,该方法可以采用前述应用场景中的电子设备实现,参见图3,该语句生成方法包括以下步骤:
步骤301,获取输入序列。
在一个实施例中,输入序列为输入的文本数据,或者是根据输入的语音数据或者图片数据识别得到的文本数据。
电子设备获取输入序列可以包括:电子设备接收文本数据(字、词语或句子),并将文本数据确定为输入序列。或者,电子设备接收语音数据,对语音数据进行语音识别得到文本数据,并将经语音识别得到的文本数据确定为输入序列。或者,电子设备接收图片数据,对图片数据进行光学字符识别得到文本数据,并将经识别的文本数据确定为输入序列。
步骤302,对输入序列进行编码处理得到语句特征向量,语句特征向量为输入序列的表示。
在一个实施例中,语句特征向量为向量序列或者单个向量。
在一种实施例中,电子设备对输入序列进行编码处理得到语句特征向量包括:电子设备将输入序列编码成向量序列,该向量序列包括至少一个向量。
示意性的,电子设备在编码成向量序列时,先对输入序列进行分词处理,得到至少一个词;然后将分词处理得到的每个词分别编码为一个向量,组成向量序列。
在另一种实施例中,电子设备将输入序列编码成单个向量。
电子设备可以采用编码器将输入序列编码成向量,编码器编码得到的向量包含了输入序列各个方面的信息,比如意图(是确认、询问等等)和具体的命名实体(如地点时间等等)。
电子设备将输入序列编码成单个向量时,在后续对输入序列的处理即转化为对该向量的处理,相比于对一个向量序列进行处理而言,可以大大降低后续处理的复杂程度,同时采用一个向量来表示输入序列能够提高语意的完整性。
需要说明的是,电子设备在采用向量表示输入序列时,为了能够表达输入序列的意思,需要使用一个维数较高的向量,例如5000维;而采用一个向量序列表示输入序列时,向量序列中的每个向量只用表示一个词语,因此每个向量可以使用一个低维数的向量。
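作为对上述两种表示方式的直观示意,下面给出一段Python草图。其中的词向量由哈希值生成,属于示例性假设,并非真实训练得到的词向量;单个向量表示用平均池化近似编码器的压缩过程:

```python
import hashlib

def embed_word(word, dim=8):
    # 玩具词向量:用MD5哈希生成确定性的伪随机向量(仅作示意,非真实词向量)
    h = hashlib.md5(word.encode("utf-8")).digest()
    return [(h[i % len(h)] - 128) / 128.0 for i in range(dim)]

def encode_as_sequence(words, dim=8):
    # 表示方式一:向量序列表示,每个词对应一个低维向量
    return [embed_word(w, dim) for w in words]

def encode_as_single_vector(words, dim=8):
    # 表示方式二:单个向量表示,此处用平均池化近似编码器对整句的压缩
    seq = encode_as_sequence(words, dim)
    return [sum(v[i] for v in seq) / len(seq) for i in range(dim)]
```

对同一输入序列,前者得到"词数×维数"的向量序列,后者得到一个固定维数的向量,后续处理只需针对单个向量进行,复杂度更低。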
步骤303,对语句特征向量进行解码得到第一预定数量的候选语句序列。
在一个实施例中,电子设备对语句特征向量进行解码得到第一预定数量的候选语句序列。其中,候选语句序列包括至少一个解码词。
第一预定数量为预先设置的数值,在一个实施例中,第一预定数值为用户自定义设置的,或者是终端默认设置的。比如,第一预定数量为16或24。
需要说明的是,由于输出序列是逐词生成的,因此输出序列的生成过程包括多次解码处理,每次解码处理包括解码、聚类和筛选。
在本申请实施例中,解码也称重组扩展,即解码过程为基于第二预定数量的候选语句序列扩展解码词,将扩展出的解码词与第二预定数量的候选语句序列进行重组得到第一预定数量的候选语句序列的处理过程,第一预定数量大于第二预定数量。
聚类包括将解码后得到的第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合的处理过程。
筛选包括从聚类得到的至少两类语句序列集合中选择第二预定数量的候选语句序列的处理过程。
步骤304,对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合。
在一个实施例中,电子设备对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合。其中,语句序列集合包括至少一个候选语句序列。
在一个实施例中,至少两类语句序列集合各自对应的语句特征类型是不同的。
在一个实施例中,语句特征类型用于指示候选语句序列的语句通顺度和/或候选语句序列与输入序列之间的关联度。
步骤305,从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,第二预定数量的候选语句序列包括至少两种语句特征类型,第二预定数量小于第一预定数量。
在一个实施例中,电子设备从至少两类语句序列集合中筛选出第二预定数量的候选语句序列。
在一个实施例中,对于至少两类语句序列集合的每类语句序列集合,电子设备从该语句序列集合中筛选出至少一个候选语句序列,从而组成第二预定数量的候选语句序列。
步骤306,根据第二预定数量的候选语句序列,确定输入序列对应的输出序列。
在一个实施例中,电子设备从第二预定数量的候选语句序列中选择一个候选语句序列作为输入序列对应的输出序列。
在一个实施例中,电子设备从第二预定数量的候选语句序列中按照预设选择策略,或者随机选择一个候选语句序列作为输入序列对应的输出序列。本实施例对此不加以限定。
综上,本申请实施例通过对输入序列进行编码处理得到语句特征向量,对语句特征向量进行解码得到第一预定数量的候选语句序列,对第一预定数量的候选语句序列进行聚类和筛选得到第二预定数量的候选语句序列,由于经过聚类和筛选得到的第二预定数量的候选语句序列包括至少两种语句特征类型,使得根据第二预定数量的候选语句序列确定出的输出序列存在较大的多样性,能够有效地满足用户需求,提高了语句生成效果。
请参考图4,其示出了本申请另一个示意性实施例提供的语句生成方法的流程图。该方法可以采用前述应用场景中的电子设备实现,参见图4,该语句生成方法包括:
步骤401,获取输入序列。
在一个实施例中,电子设备通过第一应用程序获取输入的语句,根据输入的语句生成输入序列。
步骤402,对输入序列进行编码处理得到语句特征向量,语句特征向量为输入序列的表示。
电子设备对输入序列进行编码处理得到语句特征向量的过程可参考上述实施例中的相关细节,在此不再赘述。
步骤403,对语句特征向量进行第i次解码得到第一预定数量的候选语句序列,候选语句序列包括i个解码词,i的初始值为1。
在一个实施例中,电子设备对语句特征向量进行第1次解码得到第二预定数量的候选语句序列。每个候选语句序列包括1个解码词。
在一个实施例中,当i大于1时,电子设备对语句特征向量进行第i次解码得到第一预定数量的候选语句序列,包括:在第i次解码时,根据语句特征向量和第i-1次解码得到的第二预定数量的候选语句序列,进行重组扩展得到第一预定数量的候选语句序列,第一预定数量大于第二预定数量。
在一个实施例中,在第i次解码时,对于第i-1次解码得到的第二预定数量的候选语句序列中的至少一个候选语句序列,电子设备将该候选语句序列进行重组扩展得到扩展后的多个候选语句序列。
在一个实施例中,第一预定数量为预设的大于第二预定数量的数值,示意性的,第一预定数量为第二预定数量的m倍,m为大于1的正整数。
步骤404,对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合。
电子设备对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合之前,还可以包括:对第一预定数量的候选语句序列进行去重处理,去重处理用于去除候选语句序列中重复的字词。
在一个实施例中,电子设备对第一预定数量的候选语句序列进行聚类得到至少两类序列集,包括:对于第一预定数量的候选语句序列,采用指定聚类算法进行聚类得到至少两类语句序列集合。
其中,指定聚类算法包括K均值聚类算法、均值漂移聚类算法、基于密度的聚类算法、用高斯混合模型的最大期望聚类算法、凝聚层次聚类算法中的至少一种。
需要说明的是,本实施例对终端采用的指定聚类算法的类型不加以限定,下面仅以指定聚类算法为K均值聚类算法为例进行说明。
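以K均值聚类算法为例,下面给出一个纯Python的简化实现草图(仅作示意,真实系统可直接使用现成的聚类库;待聚类的点此处抽象为数值向量,例如候选语句序列对应的特征向量):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    # 简化的K均值聚类:随机选取k个初始中心,反复执行"分配-更新中心"
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # 将每个点分配到欧氏距离最近的中心所在的类
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:  # 空类保留原中心
                centers[j] = tuple(sum(xs) / len(cl) for xs in zip(*cl))
    return clusters, centers
```

对第一预定数量的候选语句序列提取特征后执行上述聚类,即可得到至少两类语句序列集合。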
至少两类语句序列集合各自对应的语句特征类型是不同的。
在一个实施例中,语句特征类型用于指示候选语句序列的语句通顺度和/或候选语句序列与输入序列之间的关联度。
在一个实施例中,语句特征类型包括第一语句特征类型、第二语句特征类型和第三语句特征类型中的至少一种。
第一语句特征类型用于指示候选语句序列为安全的输出序列,安全的输出序列也称通顺且安全的输出序列。即该候选语句序列的语句通顺度高于通顺阈值,而该候选语句序列与输入序列之间的关联度低于或者等于关联阈值。
第二语句特征类型用于指示候选语句序列为不通顺的输出序列,即该候选语句序列的语句通顺度低于或者等于通顺阈值。
第三语句特征类型用于指示候选语句序列为通顺且具有针对性的输出序列,即该候选语句序列的语句通顺度高于通顺阈值,且该候选语句序列与输入序列之间的关联度高于关联阈值。
在一个实施例中,通顺阈值或者关联阈值为用户自定义设置的,或者是终端默认设置的。本实施例对此不加以限定。
需要说明的是,电子设备在聚类时使用的语句特征类型,聚类得到的语句序列集合的数量均可以调整,本实施例对此不加以限定。
比如,至少两类语句序列集合包括三类语句序列集合,第一类语句序列集合包括多个第一语句特征类型的候选语句序列,第一语句特征类型用于指示候选语句序列为安全的语句序列;第二类语句序列集合包括多个第二语句特征类型的候选语句序列,第二语句特征类型用于指示候选语句序列为不通顺的语句序列;第三类语句序列集合包括多个第三语句特征类型的候选语句序列,第三语句特征类型用于指示候选语句序列为通顺且具有针对性的语句序列。
步骤405,从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,第二预定数量的候选语句序列包括至少两种语句特征类型,第二预定数量小于第一预定数量。
在一个实施例中,电子设备从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,包括:对于至少两类语句序列集合中的每类语句序列集合,将语句序列集合中的多个候选语句序列进行排序;获取语句序列集合中排序后位于前N个的候选语句序列,N为正整数。
在一个实施例中,对于至少两类语句序列集合中的每类语句序列集合,电子设备按照预设指标对该语句序列集合中的多个候选语句序列进行排序。示意性的,预设指标包括信息熵。
在一个实施例中,电子设备在聚类得到K类语句序列集合之后,从K类语句序列集合中的每类语句序列集合中获取排序后位于前N个的候选语句序列,得到K*N个候选语句序列,其中,K*N为第二预定数量。
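对"每类语句序列集合取排序后前N个候选"这一筛选步骤,可用如下Python草图示意。其中每个候选表示为(语句, 分值)二元组,按分值降序排序仅为示例性假设,实际可按信息熵等预设指标排序:

```python
def select_diverse_beam(clusters, n):
    # 从K类语句序列集合中各取排序后前n个候选,得到K*n个候选语句序列
    selected = []
    for cluster in clusters:
        ranked = sorted(cluster, key=lambda item: item[1], reverse=True)
        selected.extend(ranked[:n])
    return selected
```

由于每一类都至少保留了若干候选,筛选结果天然包含至少两种语句特征类型。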
步骤406,当第i次解码得到的解码词未包括预测的终止词时将i加1,继续执行对语句特征向量进行第i次解码得到第一预定数量的候选语句序列的步骤。
在一个实施例中,预测的终止词为设置的用于终止解码的关键词。示意性的,终止词为“end”。
当第i次解码得到的解码词未包括预测的终止词时,电子设备将在第i次得到的第二预定数量的候选语句(即,经第i次解码、聚类和筛选出的第二预定数量的候选语句序列)作为下一次(即当前次的下一次)解码的输入,并将当前的i加1作为新的第i次,继续执行上述步骤403至步骤405。
步骤407,当第i次解码得到的解码词包括预测的终止词时,获取第i次解码、聚类和筛选后的第二预定数量的候选语句序列。
当第i次解码得到的解码词包括预测的终止词时,电子设备获取第i次解码、聚类和筛选后的第二预定数量的候选语句序列,并执行步骤408。
步骤408,根据获取到的第二预定数量的候选语句序列,确定输出序列。
可以理解,步骤408中的第二预定数量的候选语句序列,即为对最后一次解码得到的第一预定数量的候选语句序列,执行步骤404和步骤405后得到。
在一个实施例中,电子设备根据获取到的第二预定数量的候选语句序列,确定输出序列包括:获取语句评分模型,语句评分模型用于表示基于样本语句序列进行训练得到的语句评价规律;对于第二预定数量的候选语句序列中的每个候选语句序列,输入语句评分模型得到语句评分,语句评分用于指示候选语句序列的语句质量;根据第二预定数量的候选语句序列各自对应的语句评分,确定输出序列。
在一个实施例中,语句评分模型为基于样本语句序列对神经网络进行训练得到的模型。该语句评分模型用于衡量一个语句序列的语句质量。示意性的,语句质量包括语句通顺度。
在一个实施例中,该语句评分模型用于衡量一个语句序列是否是自然语言。
当电子设备为终端时,语句评分模型可以是终端预先训练好并自身存储的,也可以是服务器预先训练好后发送至终端的。
当电子设备为服务器时,语句评分模型为服务器预先训练好并存储在服务器中的。本实施例对此不加以限定。下面仅以服务器训练语句评分模型为例介绍模型训练过程。
服务器训练语句评分模型的过程包括:获取训练样本集,训练样本集包括至少一组样本数据组;对至少一组样本数据组采用误差反向传播算法进行训练,得到语句评分模型。其中,每组样本数据组包括:样本语句序列和预先标定的正确语句评分。
服务器对至少一组样本数据组采用误差反向传播算法进行训练,得到语句评分模型,包括但不限于以下几个步骤:
1、对于至少一组样本数据组中的每组样本数据组,将样本语句序列输入原始参数模型,得到训练结果。
在一个实施例中,原始参数模型是根据神经网络模型建立的,比如:原始参数模型包括但不限于:CNN模型、DNN模型、RNN模型、嵌入模型、GBDT模型、LR模型中的至少一种。
示意性的,对于每组样本数据组,服务器创建该组样本数据组对应的输入输出对,输入输出对的输入参数为该组样本数据组中的样本语句序列,输出参数为该组样本数据组中的正确语句评分;服务器将输入参数输入原始参数模型,得到训练结果。
比如,样本数据组包括样本语句序列A和正确语句评分“语句评分1”,终端创建的输入输出对为:(样本语句序列A)->(语句评分1);其中,(样本语句序列A)为输入参数,(语句评分1)为输出参数。
在一个实施例中,输入输出对通过特征向量表示。
2、对于每组样本数据组,将训练结果与正确语句评分进行比较,得到计算损失,计算损失用于指示训练结果与正确语句评分之间的误差。
在一个实施例中,计算损失通过交叉熵(英文:cross-entropy)来表示。
在一个实施例中,终端通过下述公式计算得到计算损失H(p,q):
H(p,q) = -∑_x p(x)·log q(x)
其中,p(x)和q(x)是长度相等的离散分布向量,p(x)表示训练结果;q(x)表示输出参数;x为训练结果或输出参数中的一个向量。
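作为示意,下面用Python按交叉熵的定义式直接计算两个离散分布之间的计算损失(纯标准库实现;eps为避免对0取对数而加入的平滑项,属于示例性假设):

```python
import math

def cross_entropy(p, q, eps=1e-12):
    # 计算离散分布p与q之间的交叉熵 H(p, q) = -sum(p(x) * log q(x))
    assert len(p) == len(q)
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))
```

当训练结果q与正确标注p越接近时,交叉熵越小,对应的计算损失越小。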
3、根据至少一组样本数据组各自对应的计算损失,采用误差反向传播算法训练得到语句评分模型。
在一个实施例中,终端通过反向传播算法根据计算损失确定语句评分模型的梯度方向,从语句评分模型的输出层逐层向前更新语句评分模型中的模型参数。
在一个实施例中,对于第二预定数量的候选语句序列中的每个候选语句序列,电子设备将候选语句序列输入至语句评分模型中计算得到语句评分。
其中,语句评分模型是根据至少一组样本数据组训练得到的,每组样本数据组包括:样本语句序列和预先标注的正确语句评分。
在一个实施例中,语句评分用于指示候选语句序列的语句质量。示意性的,语句质量包括语句流畅度。
在一个实施例中,语句评分与候选语句序列的语句质量呈负相关关系,即语句评分越低,该候选语句序列的语句质量越高,语句流畅度越高;语句评分越高,该候选语句序列的语句质量越低,语句流畅度越低。
在一个实施例中,当候选语句序列的语句评分低于评分阈值时,用于指示该候选语句序列为自然语句。
评分阈值为用户自定义设置的,或者是终端默认设置的,本实施例对此不加以限定。
在一个实施例中,当语句评分与候选语句序列的语句质量呈负相关关系时,电子设备确定第二预定数量的候选语句序列各自对应的语句评分中的最低语句评分;将最低语句评分对应的候选语句序列确定为输出序列。
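当语句评分与语句质量呈负相关时,"将最低语句评分对应的候选语句序列确定为输出序列"可示意如下(候选同样表示为(语句, 语句评分)二元组):

```python
def choose_output(candidates_with_scores):
    # 语句评分与语句质量负相关:评分最低的候选语句序列被确定为输出序列
    return min(candidates_with_scores, key=lambda item: item[1])[0]
```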
在一个实施例中,电子设备生成输出序列之后,在电子设备上显示输入序列和对应的输出序列。
示意性的,上述实施例提供的语句生成方法对应的算法如下:
(算法伪代码以图片形式给出。)
其中,BS为第二预定数量,C包括输入序列对应的语句特征向量,rsp用于表示输出序列,score_lm(hyp)为语句评分,lm_th为评分阈值,hyp用于表示候选语句序列,K为语句序列集合的数量,R用于表示候选语句序列的集合,K-means用于表示K均值聚类算法。
在一个示意性的例子中,电子设备获取输入序列a,对输入序列a进行编码处理得到语句特征向量A,电子设备对语句特征向量A进行第1次解码得到8个候选语句序列,如图5所示,其示出了电子设备第2次解码的过程,其中,白色圆圈代表第一语句特征类型的候选语句序列(比如安全的候选语句序列),黑色圆圈代表第二语句特征类型的候选语句序列(比如通顺且具有针对性的候选语句序列)。1、电子设备根据语句特征向量和第1次解码得到的8个候选语句序列,进行重组扩展得到16个候选语句序列。2、电子设备对16个候选语句序列进行聚类得到两类语句序列集合,即第一类语句序列集合和第二类语句序列集合,第一类语句序列集合包括8个安全的候选语句序列,第二类语句序列集合包括8个通顺且具有针对性的候选语句序列。3、电子设备从第一类语句序列集合中筛选出4个安全的候选语句序列,并从第二类语句序列集合中筛选出4个通顺且具有针对性的候选语句序列,得到8个候选语句序列。4、电子设备根据得到的8个候选语句序列,进行下一次解码,直到接收到指定结束条件。其中,下一次解码可类比参考上述第2次解码的过程。
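将上述"重组扩展—去重—聚类—筛选—终止判断"的完整解码循环串起来,可以写成如下Python草图。其中expand、cluster、rank均为注入的假设函数(分别对应解码词扩展、聚类、排序指标),toy_expand与toy_cluster只是便于演示的玩具实现,并非实施例中的真实模型:

```python
def clustered_beam_search(expand, cluster, rank, start,
                          beam_size=8, k=2, max_steps=10, end="<end>"):
    # 聚类集束搜索整体流程示意:重组扩展 -> 去重 -> 聚类 -> 每类取前n个 -> 终止判断
    beams = [start]
    for _ in range(max_steps):
        expanded = []
        for seq in beams:  # 重组扩展:基于现有候选扩展解码词
            expanded.extend(seq + [w] for w in expand(seq))
        expanded = [list(t) for t in {tuple(s) for s in expanded}]  # 去重处理
        groups = cluster(expanded, k)  # 聚类得到k类语句序列集合
        n = max(1, beam_size // k)
        beams = []
        for g in groups:  # 筛选:每类取排序后前n个
            beams.extend(sorted(g, key=rank)[:n])
        if any(seq[-1] == end for seq in beams):  # 扩展出终止词则停止
            break
    return beams

def toy_expand(seq):
    # 玩具扩展函数:长度达到3后给出终止词,否则扩展出两个解码词
    return ["<end>"] if len(seq) >= 3 else ["a", "b"]

def toy_cluster(seqs, k):
    # 玩具聚类函数:按末尾解码词是否为"a"分成两类(真实系统用K均值等聚类算法)
    return [[s for s in seqs if s[-1] == "a"],
            [s for s in seqs if s[-1] != "a"]]
```

该循环每一步都保证各类语句序列集合中均有候选存活,从而避免所有候选趋向同一类别。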
综上,本申请实施例还通过电子设备获取语句评分模型,对于第二预定数量的候选语句序列中的每个候选语句序列,输入语句评分模型得到语句评分,基于多个候选语句序列各自对应的语句评分生成输出序列;由于语句评分模型用于表示基于样本语句序列进行训练得到的语句评价规律,使得确定出的语句评分能够准确地反映出该候选语句序列的语句质量,进而保证了生成的输出序列的语句质量。
当上述的语句生成方法应用于对话系统中时,参见图6,该语句生成方法包括:
步骤601,电子设备获取通过对话应用程序输入的待回复语句。
在一个实施例中,当对话应用程序处于前台运行时,接收以语音形式或者文本形式输入的待回复语句。
其中,对话应用程序是安装在电子设备中的具有人机对话功能的应用程序。在一个实施例中,对话应用程序用于对输入的待回复语句进行回复。
步骤602,电子设备根据待回复语句生成输入序列。
在一种实施例中,当待回复语句是以文本形式输入的时,将待回复语句确定为输入序列。
在另一种实施例中,当待回复语句是以语音形式输入的时,采用语音识别算法将待回复语句转化为文本数据,将转化后的文本数据确定为输入序列。
步骤603,电子设备对输入序列进行编码处理得到语句特征向量。
步骤604,电子设备对语句特征向量进行解码得到第一预定数量的候选语句序列。
步骤605,电子设备对第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合。
步骤606,电子设备从至少两类语句序列集合中筛选出第二预定数量的候选语句序列,第二预定数量的候选语句序列包括至少两种语句特征类型,第二预定数量小于第一预定数量。
步骤607,电子设备根据第二预定数量的候选语句序列,确定输入序列对应的输出序列。
需要说明的是,电子设备执行步骤603至步骤607的过程可参考上述实施例中的相关细节,在此不再赘述。
步骤608,电子设备根据输出序列生成回复语句,通过对话应用程序展示回复语句。
在一个实施例中,电子设备将输出序列确定为回复语句,在对话应用程序的对话界面上以文本形式或者语音形式展示该回复语句。
当上述的语句生成方法应用于机器翻译系统中时,参见图7,上述步骤601和602可以被替换实现成为如下几个步骤:
步骤701,电子设备获取通过翻译应用程序输入的待翻译语句。
在一个实施例中,当翻译应用程序处于前台运行时,接收以语音形式或者文本形式输入的待翻译语句。
其中,翻译应用程序是安装在电子设备中的具有翻译功能的应用程序。在一个实施例中,翻译应用程序用于对输入的待翻译语句进行翻译。
其中,待翻译语句为待翻译的第一语言类型的语句。
步骤702,电子设备根据待翻译语句生成输入序列。
在一种实施例中,当待翻译语句是以文本形式输入的时,将待翻译语句确定为输入序列。
在另一种实施例中,当待翻译语句是以语音形式输入的时,采用语音识别算法将待翻译语句转化为文本数据,将转化后的文本数据确定为输入序列。
对应的,上述步骤608可以被替换实现成为如下步骤:
步骤708,电子设备根据输出序列生成翻译语句,通过翻译应用程序展示翻译语句。
其中,翻译语句为待翻译的第一语言类型的语句所对应的翻译后的第二语言类型的语句,其中第一语言类型不同于第二语言类型。示意性的,第一语言类型为英文,第二语言类型为中文。
在一个实施例中,电子设备将输出序列确定为翻译语句,在翻译应用程序的翻译界面上以文本形式或者语音形式展示该翻译语句。
需要说明的是,当语句生成方法应用于问答系统、自动写作系统或者阅读理解系统中时,本领域技术人员可类比参考上述当语句生成方法应用于对话系统或机器翻译系统中时的相关步骤,在此不再赘述。
用于实现本申请各实施例中的语句生成方法的系统,在DSTC7(7th Dialog System Technology Challenge,第七届对话系统技术挑战赛)中获得了第一名。具体数据如表1和表2所示。其中,表1是自动化评估结果。表2是人工评估结果。
(表1:自动化评估结果,具体数值以图片形式给出。)
表1中,一共有2208个测试样本。DSTC7的组织者提供了三个基线(对照组):(1)恒定:始终回答:“我不知道你的意思。”;(2)随机:从训练数据中随机选择一个答案;(3)seq2seq(序列到序列):用Vanilla Keras序列到序列模型训练。团队C/E和团队G,是此次竞赛的其他两组队伍所使用的系统。为了进行正式评估,我们提交了两个系统,一个系统以K均值波束搜索为主要系统,另一个系统是不使用K均值波束搜索的辅助系统。此外,还加了人(Human)的响应进行对比。所有响应输出均使用以下指标进行评分,这些指标分别是NIST(Doddington于2002年提出的机器翻译评价指标)、BLEU(Papineni等,于2002年提出)、Meteor(Denkowski和Lavie,于2014年提出)、DIV-1、DIV-2(也称为distinct-1和distinct-2)(由Li等人于2016年提出)和Entropy1-4(Zhang等人于2018年提出)。
如表1所示,我们的系统在NIST-4、BLEU-4和Meteor这些主要指标上均取得了最佳结果。此外,使用K均值波束搜索可以有效地提高几乎所有主要指标和所有多样性指标上的性能。就平均响应长度而言,我们的系统产生的响应比seq2seq基线更长;与不使用K均值波束搜索相比,使用K均值波束搜索的系统生成的响应更长。平均而言,人的响应比我们的系统的响应更长,而G团队平均使用22个令牌,生成的响应甚至更长。就输出前100k词汇表未涵盖的OOV(集外词)的能力而言,我们的系统分别使用K均值波束搜索和传统波束搜索在提交的测试响应中生成了97个和57个唯一的OOV(集外词)。与传统的波束搜索相比,K均值波束搜索可以复现更多的OOV(集外词)。
(表2:人工评估结果,具体数值以图片形式给出。)
表2中的结果,是由DSTC7组织者精心选择1k个测试样本进行比赛测试,进而由人工对结果进行评估得到的。如表2所示,人工评估会从“相关性和适当性”和“兴趣和信息性”这两个类别进行评估。与seq2seq的基线相比,我们的系统在95%的置信区间水平下明显超过了基线。此外,与第二名的团队相比,我们的系统在“兴趣和信息量”类别中以95%的置信区间获得了最佳结果。总体而言,我们的系统在竞争中排名第一。
下述为本申请装置实施例,可以用于执行本申请方法实施例。对于本申请装置实施例中未披露的细节,请参照本申请方法实施例。
请参考图8,其示出了本申请一个示意性实施例提供的语句生成装置的结构示意图。该语句生成装置可以通过专用硬件电路,或者,软硬件的结合实现成为图1或图2中的电子设备的全部或一部分,该语句生成装置包括:获取模块810、编码模块820、解码模块830、聚类模块840、筛选模块850和确定模块860。
获取模块810,用于执行上述步骤301或401。
编码模块820,用于执行上述步骤302或402。
解码模块830,用于执行上述步骤303。
聚类模块840,用于执行上述步骤304或404。
筛选模块850,用于执行上述步骤305或405。
确定模块860,用于执行上述步骤306。
在一个实施例中,解码模块830,还用于执行上述步骤403。
确定模块860,还用于执行上述步骤406和步骤407中的一个,以及步骤408。
在一个实施例中,聚类模块840,还用于对于第一预定数量的候选语句序列,采用指定聚类算法进行聚类得到至少两类语句序列集合,至少两类语句序列集合各自对应的语句特征类型是不同的;
其中,指定聚类算法包括K均值聚类算法、均值漂移聚类算法、基于密度的聚类算法、用高斯混合模型的最大期望聚类算法、凝聚层次聚类算法中的至少一种。
在一个实施例中,语句特征类型包括第一语句特征类型、第二语句特征类型和第三语句特征类型中的至少一种;
第一语句特征类型用于指示候选语句序列为安全的语句序列;
第二语句特征类型用于指示候选语句序列为不通顺的语句序列;
第三语句特征类型用于指示候选语句序列为通顺且具有针对性的语句序列。
在一个实施例中,确定模块860,还用于获取语句评分模型,语句评分模型用于表示基于样本语句序列进行训练得到的语句评价规律;对于第二预定数量的候选语句序列中的每个候选语句序列,输入语句评分模型得到语句评分,语句评分用于指示候选语句序列的语句质量;根据第二预定数量的候选语句序列各自对应的语句评分,确定输出序列。
在一个实施例中,语句评分与候选语句序列的语句质量呈负相关关系,确定模块860,还用于确定第二预定数量的候选语句序列各自对应的语句评分中的最低语句评分;将最低语句评分对应的候选语句序列确定为输出序列。
在一个实施例中,确定模块860,还用于获取训练样本集,训练样本集包括至少一组样本数据组,每组样本数据组包括:样本语句序列和预先标注的正确语句评分;根据至少一组样本数据组,采用误差反向传播算法对原始参数模型进行训练,得到语句评分模型。
在一个实施例中,筛选模块850,还用于对于至少两类语句序列集合中的每类语句序列集合,将语句序列集合中的多个候选语句序列进行排序;
获取语句序列集合中排序后位于前N个的候选语句序列,N为正整数。
在一个实施例中,该装置还包括:去重模块。该去重模块,用于对第一预定数量的候选语句序列进行去重处理,去重处理用于去除候选语句序列中重复的字词。
在一个实施例中,当语句生成方法应用于对话系统中时,输入序列为待回复语句,输出序列为回复语句;
当语句生成方法应用于机器翻译系统中时,输入序列为待翻译的第一语言类型的语句,输出序列为翻译后的第二语言类型的语句,其中第一语言类型不同于第二语言类型;
当语句生成方法应用于问答系统中时,输入序列为问题语句,输出序列为答案语句;
当语句生成方法应用于自动写作系统中时,输入序列为主题语句,输出序列为内容语句;
当语句生成方法应用于阅读理解系统中时,输入序列为题目语句,输出序列为答案语句。
在一个实施例中,获取模块810,还用于获取通过对话应用程序输入的待回复语句;根据待回复语句生成输入序列;
该装置还包括:展示模块,展示模块用于根据输出序列生成回复语句;通过对话应用程序展示回复语句。
相关细节可结合参考图3至图7所示的方法实施例。其中,获取模块810还用于实现上述方法实施例中其他任意隐含或公开的与获取步骤相关的功能;编码模块820还用于实现上述方法实施例中其他任意隐含或公开的与编码步骤相关的功能;解码模块830还用于实现上述方法实施例中其他任意隐含或公开的与解码步骤相关的功能;聚类模块840还用于实现上述方法实施例中其他任意隐含或公开的与聚类步骤相关的功能;筛选模块850还用于实现上述方法实施例中其他任意隐含或公开的与筛选步骤相关的功能;确定模块860还用于实现上述方法实施例中其他任意隐含或公开的与确定步骤相关的功能。
需要说明的是,上述实施例提供的装置,在实现其功能时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的装置与方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图9示出了本申请一个示意性实施例提供的终端900的结构框图。该终端900可以是:智能手机、平板电脑、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、笔记本电脑或台式电脑。终端900还可能被称为用户设备、便携式终端、膝上型终端、台式终端等其他名称。
通常,终端900包括有:一个或多个处理器901和存储器902。
一个或多个处理器901可以包括一个或多个处理核心,比如4核心处理器、8核心处理器等。一个或多个处理器901可以采用DSP(Digital Signal Processing,数字信号处理)、FPGA(Field-Programmable Gate Array,现场可编程门阵列)、PLA(Programmable Logic Array,可编程逻辑阵列)中的至少一种硬件形式来实现。一个或多个处理器901也可以包括主处理器和协处理器,主处理器是用于对在唤醒状态下的数据进行处理的处理器,也称CPU(Central Processing Unit,中央处理器);协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中,一个或多个处理器901可以集成有GPU(Graphics Processing Unit,图形处理器),GPU用于负责显示屏所需要显示的内容的渲染和绘制。一些实施例中,一个或多个处理器901还可以包括AI(Artificial Intelligence,人工智能)处理器,该AI处理器用于处理有关机器学习的计算操作。
存储器902可以包括一个或多个计算机可读存储介质,该计算机可读存储介质可以是非暂态的。存储器902还可包括高速随机存取存储器,以及非易失性存储器,比如一个或多个磁盘存储设备、闪存存储设备。在一些实施例中,存储器902中的非暂态的计算机可读存储介质用于存储至少一个计算机可读指令,该至少一个计算机可读指令用于被一个或多个处理器901所执行以实现本申请中方法实施例提供的语句生成方法。
在一些实施例中,终端900还可选包括有:外围设备接口903和至少一个外围设备。一个或多个处理器901、存储器902和外围设备接口903之间可以通过总线或信号线相连。各个外围设备可以通过总线、信号线或电路板与外围设备接口903相连。具体地,外围设备包括:射频电路904、触摸显示屏905、摄像头906、音频电路907、定位组件908和电源909中的至少一种。
外围设备接口903可被用于将I/O(Input/Output,输入/输出)相关的至少一个外围设备连接到一个或多个处理器901和存储器902。在一些实施例中,一个或多个处理器901、存储器902和外围设备接口903被集成在同一芯片或电路板上;在一些其他实施例中,一个或多个处理器901、存储器902和外围设备接口903中的任意一个或两个可以在单独的芯片或电路板上实现,本实施例对此不加以限定。
射频电路904用于接收和发射RF(Radio Frequency,射频)信号,也称电磁信号。
显示屏905用于显示UI(User Interface,用户界面)。该UI可以包括图形、文本、图标、视频及其它们的任意组合。当显示屏905是触摸显示屏时,显示屏905还具有采集在显示屏905的表面或表面上方的触摸信号的能力。该触摸信号可以作为控制信号输入至一个或多个处理器901进行处理。此时,显示屏905还可以用于提供虚拟按钮和/或虚拟键盘,也称软按钮和/或软键盘。
摄像头组件906用于采集图像或视频。可选地,摄像头组件906包括前置摄像头和后置摄像头。通常,前置摄像头设置在终端的前面板,后置摄像头设置在终端的背面。
音频电路907可以包括麦克风和扬声器。麦克风用于采集用户及环境的声波,并将声波转换为电信号输入至一个或多个处理器901进行处理,或者输入至射频电路904以实现语音通信。
定位组件908用于定位终端900的当前地理位置,以实现导航或LBS(Location Based Service,基于位置的服务)。
电源909用于为终端900中的各个组件进行供电。电源909可以是交流电、直流电、一次性电池或可充电电池。
在一些实施例中,终端900还包括有一个或多个传感器910。该一个或多个传感器910包括但不限于:加速度传感器911、陀螺仪传感器912、压力传感器913、指纹传感器914、光学传感器915以及接近传感器916。
加速度传感器911可以检测以终端900建立的坐标系的三个坐标轴上的加速度大小。
陀螺仪传感器912可以检测终端900的机体方向及转动角度,陀螺仪传感器912可以与加速度传感器911协同采集用户对终端900的3D动作。
压力传感器913可以设置在终端900的侧边框和/或触摸显示屏905的下层。当压力传感器913设置在终端900的侧边框时,可以检测用户对终端900的握持信号,由一个或多个处理器901根据压力传感器913采集的握持信号进行左右手识别或快捷操作。当压力传感器913设置在触摸显示屏905的下层时,由一个或多个处理器901根据用户对触摸显示屏905的压力操作,实现对UI界面上的可操作性控件进行控制。
指纹传感器914用于采集用户的指纹,由一个或多个处理器901根据指纹传感器914采集到的指纹识别用户的身份,或者,由指纹传感器914根据采集到的指纹识别用户的身份。
光学传感器915用于采集环境光强度。在一个实施例中,一个或多个处理器901可以根据光学传感器915采集的环境光强度,控制触摸显示屏905的显示亮度。
接近传感器916,也称距离传感器,通常设置在终端900的前面板。接近传感器916用于采集用户与终端900的正面之间的距离。
本领域技术人员可以理解,图9中示出的结构并不构成对终端900的限定,可以包括比图示更多或更少的组件,或者组合某些组件,或者采用不同的组件布置。
请参考图10,其示出了本申请一个示意性实施例提供的服务器1000的结构示意图。具体来讲:服务器1000包括中央处理单元(CPU)1001、包括随机存取存储器(RAM)1002和只读存储器(ROM)1003的系统存储器1004,以及连接系统存储器1004和中央处理单元1001的系统总线1005。服务器1000还包括帮助计算机内的各个器件之间传输信息的基本输入/输出系统(I/O系统)1006,和用于存储操作系统1013、应用程序1014和其他程序模块1015的大容量存储设备1007。
基本输入/输出系统1006包括有用于显示信息的显示器1008和用于用户输入信息的诸如鼠标、键盘之类的输入设备1009。其中显示器1008和输入设备1009都通过连接到系统总线1005的输入输出控制器1010连接到中央处理单元1001。基本输入/输出系统1006还可以包括输入输出控制器1010以用于接收和处理来自键盘、鼠标、或电子触控笔等多个其他设备的输入。类似地,输入输出控制器1010还提供输出到显示屏、打印机或其他类型的输出设备。
大容量存储设备1007通过连接到系统总线1005的大容量存储控制器(未示出)连接到中央处理单元1001。大容量存储设备1007及其相关联的计算机可读介质为服务器1000提供非易失性存储。也就是说,大容量存储设备1007可以包括诸如硬盘或者CD-ROM驱动器之类的计算机可读介质(未示出)。
不失一般性,计算机可读介质可以包括计算机存储介质和通信介质。计算机存储介质包括以用于存储诸如计算机可读指令、数据结构、程序模块或其他数据等信息的任何方法或技术实现的易失性和非易失性、可移动和不可移动介质。计算机存储介质包括RAM、ROM、EPROM、EEPROM、闪存或其他固态存储技术,CD-ROM、DVD或其他光学存储、磁带盒、磁带、磁盘存储或其他磁性存储设备。当然,本领域技术人员可知计算机存储介质不局限于上述几种。上述的系统存储器1004和大容量存储设备1007可以统称为存储器。
根据本申请的各种实施例,服务器1000还可以通过诸如因特网等网络连接到网络上的远程计算机运行。也即服务器1000可以通过连接在系统总线1005上的网络接口单元1011连接到网络1012,或者说,也可以使用网络接口单元1011来连接到其他类型的网络或远程计算机系统(未示出)。
在一个实施例中,该存储器中存储有至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集,至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集由一个或多个处理器加载并执行以实现上述各个方法实施例提供的语句生成方法。
本申请实施例还提供一种电子设备,该电子设备可以是上述图9提供的终端900,也可以是上述图10提供的服务器1000。
本申请还提供一种计算机可读存储介质,该计算机可读存储介质存储有至少一条计算机可读指令,至少一条计算机可读指令用于被一个或多个处理器执行以实现上述各个方法实施例提供的语句生成方法。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
本领域普通技术人员可以理解,实现上述实施例的语句生成方法中的全部或部分步骤可以通过硬件来完成,也可以通过计算机可读指令指示相关的硬件完成,所述计算机可读指令可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器、磁盘或光盘等。
以上仅为本申请的较佳实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (16)

  1. 一种语句生成方法,由电子设备执行,包括:
    获取输入序列;
    对所述输入序列进行编码处理得到语句特征向量;
    对所述语句特征向量进行解码得到第一预定数量的候选语句序列;
    对所述第一预定数量的候选语句序列进行聚类,得到至少两类语句序列集合;
    从所述至少两类语句序列集合中筛选出第二预定数量的候选语句序列,所述第二预定数量的候选语句序列包括至少两种语句特征类型;及
    根据所述第二预定数量的候选语句序列,确定所述输入序列对应的输出序列。
  2. 根据权利要求1所述的方法,其特征在于,所述对所述语句特征向量进行解码得到第一预定数量的候选语句序列,包括:
    对所述语句特征向量进行第i次解码得到所述第一预定数量的候选语句序列,所述候选语句序列包括i个解码词,所述i的初始值为1;
    在所述从所述至少两类语句序列集合中筛选出第二预定数量的候选语句序列之后,所述方法还包括:
    当所述第i次解码得到的解码词未包括预测的终止词时,将所述第二预定数量的候选语句序列作为第i+1次解码的输入,并将所述第i+1次作为第i次,以继续执行对所述语句特征向量进行第i次解码得到所述第一预定数量的候选语句序列的步骤;
    当所述第i次解码得到的解码词包括所述预测的终止词时,则执行所述根据所述第二预定数量的候选语句序列,确定所述输入序列对应的输出序列的步骤。
  3. 根据权利要求2所述的方法,其特征在于,所述将所述第二预定数量的候选语句序列作为第i+1次解码的输入,并将所述第i+1次作为第i次,以继续执行对所述语句特征向量进行第i次解码得到所述第一预定数量的候选语句序列包括:在第i次解码中,根据所述语句特征向量和第i次的上一次解码得到的所述第二预定数量的候选语句序列,进行重组扩展得到所述第一预定数量的候选语句序列。
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述语句特征向量和第i次的上一次解码得到的所述第二预定数量的候选语句序列,进行重组扩展得到所述第一预定数量的候选语句序列包括:
    基于第i次的上一次解码得到的所述第二预定数量的候选语句序列,扩展解码词;
    将扩展出的解码词与所述第二预定数量的候选语句序列进行重组,得到第一预定数量的候选语句序列。
  5. 根据权利要求1所述的方法,其特征在于,所述对所述第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合,包括:
    对于所述第一预定数量的候选语句序列,进行聚类得到所述至少两类语句序列集合,所述至少两类语句序列集合各自对应的语句特征类型不同。
  6. 根据权利要求1至5中任一项所述的方法,其特征在于,所述语句特征类型包括第一语句特征类型、第二语句特征类型和第三语句特征类型中的至少一种;
    所述第一语句特征类型用于指示所述候选语句序列为安全的语句序列;
    所述第二语句特征类型用于指示所述候选语句序列为不通顺的语句序列;
    所述第三语句特征类型用于指示所述候选语句序列为通顺且具有针对性的语句序列。
  7. 根据权利要求1所述的方法,其特征在于,所述根据所述第二预定数量的候选语句序列,确定所述输入序列对应的输出序列,包括:
    获取语句评分模型,所述语句评分模型用于表示基于样本语句序列进行训练得到的语句评价规律;
    将所述第二预定数量的候选语句序列中的每个所述候选语句序列,输入所述语句评分模型得到语句评分,所述语句评分用于指示所述候选语句序列的语句质量;
    根据所述第二预定数量的候选语句序列各自对应的语句评分,确定所述输出序列。
  8. 根据权利要求7所述的方法,其特征在于,所述语句评分与所述候选语句序列的语句质量呈负相关关系;所述根据所述第二预定数量的候选语句序列各自对应的语句评分,确定所述输出序列,包括:
    确定所述第二预定数量的候选语句序列各自对应的语句评分中的最低语句评分;
    将所述最低语句评分对应的候选语句序列确定为所述输出序列。
  9. 根据权利要求7所述的方法,其特征在于,所述获取语句评分模型,包括:
    获取训练样本集,所述训练样本集包括至少一组样本数据组,每组所述样本数据组包括:样本语句序列和预先标注的正确语句评分;
    根据所述至少一组样本数据组,采用误差反向传播算法对原始参数模型进行训练,得到所述语句评分模型。
  10. 根据权利要求1所述的方法,其特征在于,所述从所述至少两类语句序列集合中筛选出第二预定数量的候选语句序列,包括:
    对于所述至少两类语句序列集合中的每类所述语句序列集合,将所述语句序列集合中的多个候选语句序列进行排序;
    获取所述语句序列集合中排序后位于前预设数量的候选语句序列。
  11. 根据权利要求1所述的方法,其特征在于,在所述对所述第一预定数量的候选语句序列进行聚类得到至少两类语句序列集合之前,还包括:
    对所述第一预定数量的候选语句序列进行去重处理,所述去重处理用于去除所述候选语句序列中重复的字词。
  12. 根据权利要求1所述的方法,其特征在于,
    当所述语句生成方法应用于对话系统中时,所述输入序列为待回复语句,所述输出序列为回复语句;
    当所述语句生成方法应用于机器翻译系统中时,所述输入序列为待翻译的第一语言类型的语句,所述输出序列为翻译后的第二语言类型的语句,其中第一语言类型不同于第二语言类型;
    当所述语句生成方法应用于问答系统中时,所述输入序列为问题语句,所述输出序列为答案语句;
    当所述语句生成方法应用于自动写作系统中时,所述输入序列为主题语句,所述输出序列为内容语句;
    当所述语句生成方法应用于阅读理解系统中时,所述输入序列为题目语句,所述输出序列为答案语句。
  13. 根据权利要求1所述的方法,其特征在于,所述获取输入序列,包括:
    获取通过对话应用程序输入的待回复语句;
    根据所述待回复语句生成所述输入序列;
    所述方法还包括:
    根据所述输出序列生成回复语句;
    通过所述对话应用程序展示所述回复语句。
  14. 一种语句生成装置,设置于电子设备中,其特征在于,所述装置包括:
    获取模块,用于获取输入序列;
    编码模块,用于对所述输入序列进行编码处理得到语句特征向量;
    解码模块,用于对所述语句特征向量进行解码得到第一预定数量的候选语句序列;
    聚类模块,用于对所述第一预定数量的候选语句序列进行聚类,得到至少两类语句序列集合;
    筛选模块,用于从所述至少两类语句序列集合中筛选出第二预定数量的候选语句序列,所述第二预定数量的候选语句序列包括至少两种语句特征类型;及
    确定模块,用于根据所述第二预定数量的候选语句序列,确定所述输入序列对应的输出序列。
  15. 一种电子设备,其特征在于,所述电子设备包括一个或多个处理器和存储器,所述存储器中存储有至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集,所述至少一条计算机可读指令、所述至少一段程序、所述代码集或计算机可读指令集由所述一个或多个处理器加载并执行以实现如权利要求1至13任一所述的语句生成方法。
  16. 一个或多个计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有至少一条计算机可读指令、至少一段程序、代码集或计算机可读指令集,所述至少一条计算机可读指令、所述至少一段程序、所述代码集或计算机可读指令集由一个或多个处理器加载并执行以实现如权利要求1至13任一所述的语句生成方法。
PCT/CN2020/073407 2019-01-24 2020-01-21 语句生成方法、装置、设备及存储介质 WO2020151690A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021540365A JP7290730B2 (ja) 2019-01-24 2020-01-21 文生成方法と装置、電子機器及びプログラム
US17/230,985 US20210232751A1 (en) 2019-01-24 2021-04-14 Sentence generation method and apparatus, device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910068987.3 2019-01-24
CN201910068987.3A CN110162604B (zh) 2019-01-24 2019-01-24 语句生成方法、装置、设备及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/230,985 Continuation US20210232751A1 (en) 2019-01-24 2021-04-14 Sentence generation method and apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2020151690A1 true WO2020151690A1 (zh) 2020-07-30

Family

ID=67644826

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/073407 WO2020151690A1 (zh) 2019-01-24 2020-01-21 Sentence generation method and apparatus, device, and storage medium

Country Status (4)

Country Link
US (1) US20210232751A1 (zh)
JP (1) JP7290730B2 (zh)
CN (1) CN110162604B (zh)
WO (1) WO2020151690A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308313A (zh) * 2020-10-29 2021-02-02 China Academy of Urban Planning and Design School continuous point site selection method and apparatus, medium, and computer device

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN110162604B (zh) * 2019-01-24 2023-09-12 Tencent Technology (Shenzhen) Company Limited Sentence generation method and apparatus, device, and storage medium
CN110827085A (zh) * 2019-11-06 2020-02-21 Beijing ByteDance Network Technology Co., Ltd. Text processing method, apparatus, and device
CN110990697A (zh) * 2019-11-28 2020-04-10 Tencent Technology (Shenzhen) Company Limited Content recommendation method and apparatus, device, and storage medium
CN113807074A (zh) * 2021-03-12 2021-12-17 JD Technology Holding Co., Ltd. Similar sentence generation method and apparatus based on a pre-trained language model

Citations (7)

Publication number Priority date Publication date Assignee Title
CN1790332A (zh) * 2005-12-28 2006-06-21 Liu Wenyin Method and system for reading, browsing, and displaying question answers
US20110246465A1 (en) * 2010-03-31 2011-10-06 Salesforce.Com, Inc. Methods and systems for performing real-time recommendation processing
CN104778256A (zh) * 2015-04-20 2015-07-15 Jiangsu University of Science and Technology Fast incremental clustering method for domain question answering system consultation
US20170323204A1 (en) * 2016-05-03 2017-11-09 International Business Machines Corporation Text Simplification for a Question and Answer System
CN107368547A (zh) * 2017-06-28 2017-11-21 Xi'an Jiaotong University Intelligent medical automatic question answering method based on deep learning
CN109145099A (zh) * 2018-08-17 2019-01-04 Baidu Online Network Technology (Beijing) Co., Ltd. Artificial intelligence-based question answering method and apparatus
CN110162604A (zh) * 2019-01-24 2019-08-23 Tencent Technology (Shenzhen) Company Limited Sentence generation method and apparatus, device, and storage medium

Family Cites Families (18)

Publication number Priority date Publication date Assignee Title
US7225120B2 (en) * 2001-05-30 2007-05-29 Hewlett-Packard Development Company, L.P. Method of extracting important terms, phrases, and sentences
KR100911621B1 (ko) * 2007-12-18 2009-08-12 Electronics and Telecommunications Research Institute Korean-English automatic translation method and apparatus
JP5431532B2 (ja) * 2012-06-08 2014-03-05 Nippon Telegraph and Telephone Corporation Question answering device, model learning device, method, and program
JP6414956B2 (ja) * 2014-08-21 2018-10-31 National Institute of Information and Communications Technology Question sentence generation device and computer program
US9886501B2 (en) * 2016-06-20 2018-02-06 International Business Machines Corporation Contextual content graph for automatic, unsupervised summarization of content
US9881082B2 (en) * 2016-06-20 2018-01-30 International Business Machines Corporation System and method for automatic, unsupervised contextualized content summarization of single and multiple documents
KR102565274B1 (ko) * 2016-07-07 2023-08-09 Samsung Electronics Co., Ltd. Automatic interpretation method and apparatus, and machine translation method and apparatus
KR102565275B1 (ko) * 2016-08-10 2023-08-09 Samsung Electronics Co., Ltd. Translation method and apparatus based on parallel processing
US10275515B2 (en) * 2017-02-21 2019-04-30 International Business Machines Corporation Question-answer pair generation
US10579725B2 (en) * 2017-03-15 2020-03-03 International Business Machines Corporation Automated document authoring assistant through cognitive computing
JP6709748B2 (ja) * 2017-04-13 2020-06-17 Nippon Telegraph and Telephone Corporation Clustering device, answer candidate generation device, method, and program
US11409749B2 (en) * 2017-11-09 2022-08-09 Microsoft Technology Licensing, Llc Machine reading comprehension system for answering queries related to a document
CN108021705B (zh) * 2017-12-27 2020-10-23 Dingfu Intelligent Technology Co., Ltd. Answer generation method and apparatus
US10497366B2 (en) * 2018-03-23 2019-12-03 Servicenow, Inc. Hybrid learning system for natural language understanding
US11042713B1 (en) * 2018-06-28 2021-06-22 Narrative Science Inc. Applied artificial intelligence technology for using natural language processing to train a natural language generation system
CN108897872B (zh) * 2018-06-29 2022-09-27 Beijing Baidu Netcom Science and Technology Co., Ltd. Dialogue processing method and apparatus, computer device, and storage medium
US11036941B2 (en) * 2019-03-25 2021-06-15 International Business Machines Corporation Generating a plurality of document plans to generate questions from source text
CN110619123B (zh) * 2019-09-19 2021-01-26 University of Electronic Science and Technology of China Machine reading comprehension method

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN112308313A (zh) * 2020-10-29 2021-02-02 China Academy of Urban Planning and Design School continuous point site selection method and apparatus, medium, and computer device
CN112308313B (zh) * 2020-10-29 2023-06-16 China Academy of Urban Planning and Design School continuous point site selection method and apparatus, medium, and computer device

Also Published As

Publication number Publication date
US20210232751A1 (en) 2021-07-29
JP2022500808A (ja) 2022-01-04
JP7290730B2 (ja) 2023-06-13
CN110162604B (zh) 2023-09-12
CN110162604A (zh) 2019-08-23

Similar Documents

Publication Publication Date Title
WO2020151690A1 (zh) Sentence generation method and apparatus, device, and storage medium
US11537884B2 (en) Machine learning model training method and device, and expression image classification method and device
US20220180882A1 (en) Training method and device for audio separation network, audio separation method and device, and medium
EP3549069B1 (en) Neural network data entry system
US10885900B2 (en) Domain adaptation in speech recognition via teacher-student learning
CN111444329B (zh) Intelligent dialogue method and apparatus, and electronic device
CN108170749B (zh) Artificial intelligence-based dialogue method and apparatus, and computer-readable medium
US20210256390A1 (en) Computationally efficient neural network architecture search
WO2020147428A1 (zh) Interactive content generation method and apparatus, computer device, and storage medium
CN111339246B (zh) Query statement template generation method, apparatus, device, and medium
US10095684B2 (en) Trained data input system
US10831796B2 (en) Tone optimization for digital content
CN110148416A (zh) Speech recognition method, apparatus, device, and storage medium
CN111400470A (zh) Question processing method and apparatus, computer device, and storage medium
CN112069309B (zh) Information acquisition method and apparatus, computer device, and storage medium
CN110795542A (zh) Dialogue method and related apparatus and device
US20180150143A1 (en) Data input system with online learning
CN108345612A (zh) Question processing method and apparatus, and apparatus for question processing
WO2023040516A1 (zh) Event integration method and apparatus, electronic device, computer-readable storage medium, and computer program product
CN113407850A (zh) Virtual image determination and acquisition method and apparatus, and electronic device
CN111444321B (zh) Question answering method and apparatus, electronic device, and storage medium
WO2021129411A1 (zh) Text processing method and apparatus
CN116821324A (zh) Model training method and apparatus, electronic device, and storage medium
WO2021082570A1 (zh) Artificial intelligence-based semantic recognition method and apparatus, and semantic recognition device
CN116579350B (zh) Robustness analysis method and apparatus for a dialogue understanding model, and computer device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20745589

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021540365

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20745589

Country of ref document: EP

Kind code of ref document: A1