WO2020119069A1 - 基于自编码神经网络的文本生成方法、装置、终端及介质 - Google Patents

基于自编码神经网络的文本生成方法、装置、终端及介质 Download PDF

Info

Publication number
WO2020119069A1
Authority
WO
WIPO (PCT)
Prior art keywords
hidden
neural network
self
layer
network model
Prior art date
Application number
PCT/CN2019/092957
Other languages
English (en)
French (fr)
Inventor
金戈
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Priority to SG11202001765TA priority Critical patent/SG11202001765TA/en
Priority to US16/637,274 priority patent/US11487952B2/en
Publication of WO2020119069A1 publication Critical patent/WO2020119069A1/zh

Links

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06F: ELECTRIC DIGITAL DATA PROCESSING
          • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
            • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
              • G06F 16/33: Querying
              • G06F 16/35: Clustering; Classification
          • G06F 40/00: Handling natural language data
            • G06F 40/20: Natural language analysis
            • G06F 40/40: Processing or translation of natural language
              • G06F 40/55: Rule-based translation
                • G06F 40/56: Natural language generation
        • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00: Computing arrangements based on biological models
            • G06N 3/02: Neural networks
              • G06N 3/04: Architecture, e.g. interconnection topology
              • G06N 3/08: Learning methods
                • G06N 3/084: Backpropagation, e.g. using gradient descent
                • G06N 3/088: Non-supervised learning, e.g. competitive learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
          • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the technical field of natural language understanding, in particular to a text generation method, device, terminal and medium based on a self-coding neural network.
  • Automatic text generation is an important research direction in the field of natural language processing, and the realization of automatic text generation is also an important sign of the maturity of artificial intelligence.
  • The applications of text generation can be divided into supervised and unsupervised text generation.
  • Supervised text generation includes machine translation, intelligent question answering systems, dialogue systems, and text summarization.
  • Unsupervised text generation learns the original distribution of the data and then generates samples similar to the original data, such as poetry composition and music composition.
  • Text generation enables more intelligent and natural human-computer interaction; for example, automatic news writing and publishing can be realized by replacing a human editor with an automatic text generation system.
  • The main purpose of this application is to provide a text generation method, device, terminal and medium based on a self-coding neural network, aiming to solve the technical problem that prior-art text generation models require large data annotation resources or consume substantial modeling resources.
  • To achieve the above objective, the present application provides a text generation method based on a self-coding neural network, including the following steps:
  • obtaining the text word vector of the sentence to be input and the classification requirement;
  • inputting the text word vector reversely into a trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer of the self-coding neural network model;
  • modifying the hidden feature according to a preset classification scale and the classification requirement;
  • using the modified hidden feature as the intermediate hidden layer of the self-coding neural network model, and reversely generating, from the intermediate hidden layer, the word vector corresponding to the input layer of the self-coding neural network model;
  • generating the corresponding text according to the generated word vector.
  • To achieve the above objective, the present application also provides a text generation device based on a self-coding neural network, including:
  • an acquisition module, used to acquire the text word vector of the sentence to be input and the classification requirement;
  • an input module, used to reversely input the text word vector into a trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer of the self-coding neural network model;
  • a modification module, configured to modify the hidden feature according to a preset classification scale and the classification requirement;
  • a decoding module, configured to use the modified hidden feature as the intermediate hidden layer of the self-coding neural network model and to reversely generate, from the intermediate hidden layer, the word vector corresponding to the input layer of the self-coding neural network model;
  • a generation module, used to generate the corresponding text according to the generated word vector.
  • To achieve the above objective, the present application also provides a terminal, including: a memory, a processor, and computer-readable instructions for text generation based on a self-coding neural network that are stored on the memory and executable on the processor, the computer-readable instructions being configured to implement the steps of the text generation method based on a self-coding neural network described above.
  • To achieve the above objective, the present application also provides a storage medium on which computer-readable instructions for text generation based on a self-coding neural network are stored; when the computer-readable instructions are executed by a processor, the steps of the text generation method based on a self-coding neural network described above are implemented.
  • In this application, the text word vector of the sentence to be input and the classification requirement are obtained; the text word vector is input reversely into a trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer of the model; the hidden feature is modified according to a preset classification scale and the classification requirement; the modified hidden feature is used as the intermediate hidden layer of the model, from which the word vector corresponding to the input layer is generated reversely; and the corresponding text is generated from the generated word vectors. The text generation style is adjusted through the preset classification scale and the classification requirement, without consuming large data annotation resources or large modeling resources.
  • FIG. 1 is a schematic structural diagram of a terminal in a hardware operating environment involved in an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a first embodiment of a text generation method based on a self-encoding neural network of the present application
  • FIG. 3 is a schematic structural diagram of an embodiment of a self-coding neural network learning model of the present application.
  • FIG. 4 is a schematic flowchart of a second embodiment of a text generation method based on a self-encoding neural network of the present application
  • FIG. 5 is a schematic flowchart of a third embodiment of a text generation method based on a self-coding neural network of the present application
  • FIG. 6 is a schematic flowchart of a fourth embodiment of a text generation method based on a self-coding neural network according to this application;
  • FIG. 7 is a schematic flowchart of a fifth embodiment of a text generation method based on a self-coding neural network according to this application.
  • FIG. 8 is a structural block diagram of a first embodiment of a text generation device based on a self-encoding neural network of the present application.
  • FIG. 1 is a schematic diagram of a terminal structure of a hardware operating environment involved in a solution according to an embodiment of the present application.
  • The terminal may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to implement connection communication between these components.
  • The user interface 1003 may include a display screen (Display) and an input module such as a keyboard (Keyboard), and optionally may further include a standard wired interface and a wireless interface.
  • The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a wireless fidelity (Wi-Fi) interface).
  • The memory 1005 may be a high-speed random access memory (RAM), or a stable non-volatile memory (NVM) such as disk storage.
  • The memory 1005 may optionally also be a storage device independent of the foregoing processor 1001.
  • Those skilled in the art can understand that the structure shown in FIG. 1 does not constitute a limitation on the terminal, which may include more or fewer components than those illustrated, combine certain components, or use a different arrangement of components.
  • As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and computer-readable instructions for text generation based on a self-coding neural network.
  • In the terminal shown in FIG. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 may be provided in the terminal;
  • the terminal calls, through the processor 1001, the computer-readable instructions for text generation based on a self-coding neural network stored in the memory 1005, and executes the text generation method based on a self-coding neural network provided by the embodiments of the present application.
  • FIG. 2 is a schematic flowchart of a first embodiment of a text generation method based on a self-coding neural network in the present application.
  • the self-coding neural network-based text generation method includes the following steps:
  • Step S10: Obtain the text word vector of the sentence to be input and the classification requirement;
  • It should be understood that the execution subject of the method of this embodiment is a terminal. The classification requirement usually refers to the desired classification category.
  • Taking evaluation text as an example, evaluations can be divided into positive evaluations and negative evaluations, and the classification requirement may be to output negative evaluations or to output positive evaluations. As further examples, emotional text can be divided into positive emotions and negative emotions, and friendliness text can be divided into very friendly, relatively friendly, generally friendly, and unfriendly.
  • The classification requirement may be obtained from user-defined input or may be preset; no specific limitation is made here.
  • In a specific implementation, obtaining the text word vector of the sentence to be input specifically includes: acquiring the input sentence and preprocessing it; and acquiring the text word vector of the preprocessed input sentence.
  • Preprocessing the input sentence usually includes removing stop words, that is, words that appear frequently in the text but contribute little to it, such as "的", "地" and "得" in Chinese, as well as HTML tags, scripting language and the like in web data sets.
  • For example, if the input text is doc, the corresponding text word vectors are {ω1, ω2, ..., ωi}, where ωi is the word vector of the i-th word in the sentence.
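  • A minimal illustrative sketch of this preprocessing step follows. The patent does not specify a tokenizer or a word-embedding model, so the stop-word list and the randomly initialized embedding table below are assumptions standing in for a real pre-trained word-vector library.

```python
import numpy as np

# Assumptions: a toy stop-word list and a random embedding table stand in for
# the unspecified tokenizer and pre-trained word-vector library.
STOP_WORDS = {"的", "地", "得", "the", "a", "of"}
EMBED_DIM = 8
rng = np.random.default_rng(0)
VOCAB = ["service", "was", "slow", "and", "rude", "friendly", "fast"]
WORD_VECTORS = {w: rng.normal(size=EMBED_DIM) for w in VOCAB}

def preprocess(sentence):
    """Remove stop words (a real system would also strip HTML tags, scripts, etc.)."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]

def text_word_vectors(sentence):
    """Return the text word vectors {w1, w2, ..., wi} of the input sentence."""
    tokens = [w for w in preprocess(sentence) if w in WORD_VECTORS]
    return np.stack([WORD_VECTORS[w] for w in tokens])

vecs = text_word_vectors("the service was slow and rude")
print(vecs.shape)  # (number of retained words, EMBED_DIM)
```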
  • Step S20: Reversely input the text word vector into the trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer of the self-coding neural network model;
  • It should be understood that reversely inputting the text word vector into the trained self-coding neural network model means using the text word vector as the output of the trained model and obtaining the hidden feature of the intermediate hidden layer in the reverse direction (see FIG. 3, which shows the case of a single intermediate hidden layer: the text word vector is fed in at the output layer, and the hidden feature of the hidden layer located between the input layer and the output layer is obtained).
  • When there are multiple intermediate hidden layers, the hidden feature obtained at the middlemost hidden layer is taken as the hidden feature of the intermediate hidden layer. For example, with three intermediate hidden layers, the hidden feature of the middle (second) layer is used; with two intermediate hidden layers, the average of the two layers' hidden features is used. In general, when the number of intermediate hidden layers is odd, the middlemost layer's hidden feature is used; when the number is even, the average of the two middlemost layers' hidden features is used.
  • In a specific implementation, reversely inputting the text word vector into the trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer includes:
  • inputting the text word vector at the output layer of the trained self-coding neural network model and reversely generating, from the output layer, the hidden features of the intermediate hidden layers, where
  • when the number of intermediate hidden layers is odd, the hidden feature corresponding to the middlemost intermediate hidden layer is taken as the hidden feature of the intermediate hidden layer of the self-coding neural network model; and
  • when the number of intermediate hidden layers is even, the average of the hidden features corresponding to the two middlemost intermediate hidden layers is taken as the hidden feature of the intermediate hidden layer of the self-coding neural network model.
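  • A minimal sketch of this step follows. It assumes a tied-weight linear decoder whose single stage is inverted with a pseudo-inverse; this is only one possible reading of the reverse pass, and the middlemost/average selection rule is shown separately for the multi-layer case.

```python
import numpy as np

def reverse_hidden_feature(x, W, c):
    """Single-hidden-layer case of FIG. 3: the decoder computes x_hat = h @ W.T + c,
    so feeding the word vector x in at the output layer and inverting that stage
    with a pseudo-inverse recovers the hidden feature h.  The pseudo-inverse is an
    illustrative assumption about how the reverse pass is realized."""
    return (x - c) @ np.linalg.pinv(W.T)

def intermediate_hidden_feature(hiddens):
    """Multi-layer rule: middlemost hidden feature when the layer count is odd,
    average of the two middlemost hidden features when it is even."""
    n = len(hiddens)
    if n % 2 == 1:
        return hiddens[n // 2]
    return (hiddens[n // 2 - 1] + hiddens[n // 2]) / 2.0

rng = np.random.default_rng(1)
W, c = rng.normal(size=(10, 4)), np.zeros(10)   # tied encoder weight and decoder bias
x = rng.normal(size=10)                         # word vector fed in at the output layer
h = reverse_hidden_feature(x, W, c)
print(h.shape)                                  # (4,)
print(intermediate_hidden_feature([h, h + 0.2, h - 0.2]))  # odd count: middle layer
```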
  • The training process of the self-coding neural network model includes:
  • Pre-training: using training samples without class labels, the first hidden layer of the self-coding neural network model is trained in the forward direction to obtain the parameters of the first layer (W1, b1). When there are multiple hidden layers, the first hidden layer of the network converts the original input into a vector composed of hidden-unit activation values; this vector is then used as the input of the second hidden layer, and training continues to obtain the parameters of the second layer (W2, b2). This is repeated, using the output of the previous layer as the input of the next layer and training the layers in sequence; while the parameters of one layer are being trained, the parameters of the other layers remain unchanged. Alternatively, after pre-training is completed, the parameters of all layers may be adjusted simultaneously through the back-propagation algorithm to refine the results.
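  • A minimal numpy sketch of this greedy layer-wise pre-training follows. It assumes tied-weight sigmoid autoencoder layers trained by plain gradient descent on the squared reconstruction error; the patent does not fix the activation, loss, or optimizer, and the optional whole-network back-propagation fine-tuning step is omitted here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_layer(X, hidden_dim, lr=0.1, epochs=200, seed=0):
    """Train one tied-weight sigmoid autoencoder layer to reconstruct X.
    Returns the layer parameters (W, b) and the hidden activations that
    serve as the input of the next layer."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, hidden_dim))
    b = np.zeros(hidden_dim)        # encoder bias (the b_k of the patent)
    c = np.zeros(d)                 # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b)      # encode
        Xr = H @ W.T + c            # decode with tied weights
        err = Xr - X                # reconstruction error
        dZ = (err @ W) * H * (1.0 - H)
        W -= lr * (X.T @ dZ + err.T @ H) / n
        b -= lr * dZ.mean(axis=0)
        c -= lr * err.mean(axis=0)
    return W, b, sigmoid(X @ W + b)

# Greedy layer-wise pre-training: each trained layer's output feeds the next
# layer, and only the current layer's parameters are updated at each stage.
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 20))       # unlabeled training word vectors (toy data)
params, layer_input = [], X
for hidden_dim in (12, 6, 12):      # assumed layer sizes
    W, b, layer_input = train_layer(layer_input, hidden_dim, seed=len(params))
    params.append((W, b))
print([W.shape for W, _ in params]) # [(20, 12), (12, 6), (6, 12)]
```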
  • Step S30: Modify the hidden feature according to the preset classification scale and the classification requirement;
  • It should be understood that the classification scale usually refers to the scale between the classes. For example, text classification may involve two categories, positive evaluation and negative evaluation, and the scale from positive evaluation to negative evaluation is the classification scale.
  • The classification scale may be pre-defined or may be calculated from samples.
  • Taking evaluation text as an example, evaluations can be divided into positive evaluations and negative evaluations. The classification scale of the i-th feature dimension is expressed as Li = |h1i - h2i|, where h1i is the average hidden feature of the positive-evaluation samples in the i-th dimension and h2i is the average hidden feature of the negative-evaluation samples in the i-th dimension.
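  • The per-dimension scale Li = |h1i - h2i| can be computed directly from the averaged hidden features of the labeled positive and negative samples. A minimal sketch follows, with randomly generated hidden features standing in for the outputs of step S102.

```python
import numpy as np

def classification_scale(pos_hidden, neg_hidden):
    """Li = |h1i - h2i|: the absolute difference between the average hidden
    feature of positive samples and that of negative samples, per dimension i."""
    h1 = pos_hidden.mean(axis=0)
    h2 = neg_hidden.mean(axis=0)
    return np.abs(h1 - h2)

rng = np.random.default_rng(2)
pos = rng.normal(loc=0.8, scale=0.1, size=(100, 6))  # hidden features of positive-evaluation samples
neg = rng.normal(loc=0.2, scale=0.1, size=(100, 6))  # hidden features of negative-evaluation samples
print(classification_scale(pos, neg).round(2))        # roughly 0.6 in every dimension
```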
  • Step S40: Use the modified hidden feature as the intermediate hidden layer of the self-coding neural network model, and reversely generate, from the intermediate hidden layer, the word vector corresponding to the input layer of the self-coding neural network model;
  • It should be understood that using the modified hidden feature as the intermediate hidden layer and reversely generating the word vector corresponding to the input layer means decoding the modified hidden feature into the input of the self-coding neural network model (as shown in FIG. 3, taking a single intermediate hidden layer as an example, the input layer is obtained by decoding from the hidden layer, yielding the corresponding word vectors).
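  • A minimal sketch of this reverse generation follows, under the same single-hidden-layer, tied-weight assumption used above: the modified hidden feature is pushed through the decoder to reconstruct the input layer, and the flat layout of the input layer (one concatenated word vector per position) is an assumption made for illustration.

```python
import numpy as np

def decode_hidden_feature(h_mod, W, c, seq_len, embed_dim):
    """Decode the modified hidden feature back to the input layer of a
    tied-weight single-hidden-layer autoencoder (x_hat = h @ W.T + c) and
    reshape the flat reconstruction into one vector per word."""
    x_hat = h_mod @ W.T + c
    return x_hat.reshape(seq_len, embed_dim)

rng = np.random.default_rng(3)
seq_len, embed_dim, hidden_dim = 3, 4, 5
W = rng.normal(size=(seq_len * embed_dim, hidden_dim))  # tied encoder weight
c = np.zeros(seq_len * embed_dim)                       # decoder bias
h_mod = rng.normal(size=hidden_dim)                     # modified hidden feature from step S30
generated = decode_hidden_feature(h_mod, W, c, seq_len, embed_dim)
print(generated.shape)  # (3, 4): one generated word vector per position
```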
  • Step S50: Generate the corresponding text according to the generated word vectors.
  • It should be understood that generating the corresponding text according to the generated word vectors means forming the words corresponding to the generated word vectors into text.
  • The text may be formed by directly connecting the words together, or by composing the words into text according to certain rules.
  • In a specific implementation, the step of generating the corresponding text according to the generated word vectors includes:
  • Step S51: Match the generated word vectors against a pre-trained word-vector library to obtain the word corresponding to each word vector;
  • It should be understood that the pre-trained word-vector library is a correspondence library between words and word vectors established in advance according to certain rules.
  • Step S52: Connect the generated words together to generate the corresponding text.
  • The text may be generated by directly connecting the words together, or by composing the words into text according to certain rules (a minimal sketch of this matching step follows).
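  • The sketch below illustrates steps S51 and S52, assuming the pre-trained word-vector library is a plain in-memory table and using cosine similarity as the matching rule (the patent does not specify the similarity measure); the matched words are simply joined with spaces to form the text.

```python
import numpy as np

def match_and_join(generated_vecs, vocab_words, vocab_vecs):
    """Match each generated word vector to its nearest entry in the
    pre-trained word-vector library (cosine similarity) and connect the
    matched words into the output text."""
    vn = vocab_vecs / np.linalg.norm(vocab_vecs, axis=1, keepdims=True)
    gn = generated_vecs / np.linalg.norm(generated_vecs, axis=1, keepdims=True)
    nearest = (gn @ vn.T).argmax(axis=1)
    return " ".join(vocab_words[i] for i in nearest)

rng = np.random.default_rng(4)
vocab_words = ["service", "friendly", "slow", "fast", "rude"]
vocab_vecs = rng.normal(size=(len(vocab_words), 4))   # toy pre-trained library
generated = rng.normal(size=(3, 4))                   # word vectors from step S40
print(match_and_join(generated, vocab_words, vocab_vecs))
```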
  • In this application, the text word vector of the sentence to be input and the classification requirement are obtained; the text word vector is input reversely into a trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer; the hidden feature is modified according to a preset classification scale and the classification requirement; the modified hidden feature is used as the intermediate hidden layer, from which the word vector corresponding to the input layer is generated reversely; and the corresponding text is generated from the generated word vectors. The text generation style is thereby adjusted through the preset classification scale and the classification requirement, for example allowing a user to adjust the style scales of a customer-service robot's dialogue, including scales such as positive/negative emotion and friendliness, without consuming large data annotation resources or large modeling resources.
  • FIG. 4 is a schematic flowchart of a second embodiment of a text generation method based on a self-encoding neural network of the present application.
  • Based on the first embodiment, in this embodiment, before step S10 the method includes the following steps:
  • Step S101: Obtain labeled multi-class training samples and generate corresponding classification word vectors;
  • It should be understood that multi-class training samples are training samples divided into multiple categories. Taking evaluation text as an example, evaluation samples are divided into two categories: positive evaluation text and negative evaluation text.
  • Labeled multi-class training samples are multi-class training samples that each carry a label (for example, a positive-evaluation or negative-evaluation label).
  • Step S102: Forward-input the classification word vectors into the trained self-coding neural network model to obtain the hidden features of the multi-class samples;
  • It should be understood that forward-inputting the classification word vectors into the trained self-coding neural network model means using the classification word vectors as the input of the trained model and obtaining, in the forward direction, the hidden feature of the intermediate hidden layer as the hidden feature of the multi-class samples (see FIG. 3 for the single-intermediate-hidden-layer case, where the classification word vector is fed in at the input layer and the hidden feature of the hidden layer between the input layer and the output layer is obtained).
  • When there are multiple intermediate hidden layers, the hidden feature obtained at the middlemost hidden layer is taken as the hidden feature of the multi-class samples; for example, with three intermediate hidden layers the hidden feature of the middle (second) layer is used, and with two intermediate hidden layers the average of the two layers' hidden features is used.
  • Step S103: Calculate the vector difference of the hidden features of the multi-class samples and use it as the classification scale of the multi-class samples.
  • It should be understood that the vector difference of the hidden features of the multi-class samples is calculated and used as the classification scale of the multi-class samples. Taking evaluation text as an example, evaluations can be divided into positive evaluations and negative evaluations, and the classification scale of the i-th feature dimension is Li = |h1i - h2i|, where h1i and h2i are the average hidden features of the positive-evaluation and negative-evaluation samples in the i-th dimension, respectively.
  • FIG. 5 is a schematic flowchart of a third embodiment of a text generation method based on a self-encoding neural network of the present application.
  • Based on the second embodiment, in this embodiment, step S30 specifically includes:
  • Step S31: Determine the adjustment vector corresponding to the hidden feature according to the preset classification scale and the classification requirement;
  • In a specific implementation, assuming the classification scale is L and the classification requirement is to output negative evaluation text, the adjustment vector b can be determined according to the desired degree of negative evaluation.
  • Step S32: Modify the hidden feature according to the adjustment vector.
  • It should be understood that modifying the hidden feature according to the determined adjustment vector may mean taking the vector difference between the hidden feature and the adjustment vector, or the adjustment vector may be applied as a weight, so that the modified hidden feature, after decoding, produces output that meets the classification requirement.
  • FIG. 6 is a schematic flowchart of a fourth embodiment of a text generation method based on a self-encoding neural network of the present application.
  • Based on the third embodiment, in this embodiment, step S32 specifically includes:
  • Step S321: Take the vector difference between the hidden feature and the adjustment vector as the modified hidden feature.
  • In a specific implementation, if the hidden feature before modification is h_before and the adjustment vector is b, then the hidden feature after modification is h_after = h_before - b.
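  • A minimal sketch of this correction follows. How the adjustment vector b is built from the classification scale L and the classification requirement is left open by the patent, so scaling L by a user-chosen strength (positive to push toward one class, negative toward the other) is an assumption made here for illustration.

```python
import numpy as np

def modify_hidden_feature(h_before, scale, strength):
    """h_after = h_before - b, with the adjustment vector b assumed to be the
    per-dimension classification scale weighted by a user-chosen strength."""
    b = strength * scale
    return h_before - b

h_before = np.array([0.9, 0.7, 0.4])   # hidden feature obtained in step S20
scale = np.array([0.6, 0.5, 0.1])      # classification scale Li from the labeled samples
print(modify_hidden_feature(h_before, scale, strength=1.0))  # pushed toward the negative class
```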
  • FIG. 7 is a schematic flowchart of a fifth embodiment of the text generation method based on a self-coding neural network of the present application.
  • Based on the first embodiment, in this embodiment, before step S10 the method further includes:
  • Step S104: Establish a self-coding neural network model;
  • It should be understood that the self-coding neural network model is an unsupervised-learning neural network that reconstructs its input signal as faithfully as possible. The model may have multiple intermediate hidden layers or a single intermediate hidden layer (see FIG. 3).
  • Step S105: Obtain training samples without category labels and generate corresponding word vectors;
  • It should be understood that a training sample without a category label is a training sample whose category has not been annotated.
  • Step S106: Forward-input the word vectors to train the self-coding neural network model.
  • In a specific implementation, the first hidden layer of the self-coding neural network model is trained in the forward direction using the training samples without class labels to obtain the parameters of the first layer (W1, b1). When there are multiple hidden layers, the first hidden layer of the network converts the original input into a vector composed of hidden-unit activation values; this vector is then used as the input of the second hidden layer, and training continues to obtain the parameters of the second layer (W2, b2). This is repeated, using the output of the previous layer as the input of the next layer and training the layers in sequence; while the parameters of one layer are being trained, the parameters of the other layers remain unchanged. Alternatively, after pre-training is completed, the parameters of all layers may be adjusted simultaneously through the back-propagation algorithm to refine the results.
  • the embodiments of the present application also provide a computer-readable storage medium, and the computer-readable storage medium may be a non-volatile readable storage medium.
  • Computer-readable instructions are stored on the computer-readable storage medium. When the computer-readable instructions are executed by the processor, the steps of the self-coding neural network-based text generation method as described above are implemented.
  • FIG. 8 is a structural block diagram of a first embodiment of a text generation device based on a self-encoding neural network of the present application.
  • the text generation device based on the self-coding neural network proposed in the embodiment of the present application includes:
  • the obtaining module 801 is used to obtain the text word vector of the sentence to be input and the classification requirements;
  • It should be understood that the classification requirement usually refers to the desired classification category. For example, in the case of evaluation text, evaluations can be divided into positive and negative categories, and the classification requirement may be to output negative evaluations or to output positive evaluations.
  • The classification requirement may be obtained from user-defined input or may be preset; no specific limitation is made here.
  • In a specific implementation, obtaining the text word vector of the sentence to be input specifically includes: acquiring the input sentence and preprocessing it; and acquiring the text word vector of the preprocessed input sentence.
  • Preprocessing the input sentence usually includes removing stop words, that is, words that appear frequently in the text but contribute little to it, such as "的", "地" and "得" in Chinese, as well as HTML tags, scripting language and the like in web data sets.
  • For example, if the input text is doc, the corresponding text word vectors are {ω1, ω2, ..., ωi}, where ωi is the word vector of the i-th word in the sentence.
  • the input module 802 is used to reversely input the text word vector into the trained self-coding neural network model to obtain the hidden features of the middle hidden layer of the self-coding neural network model;
  • It should be understood that reversely inputting the text word vector into the trained self-coding neural network model means using the text word vector as the output of the trained model and obtaining the hidden feature of the intermediate hidden layer in the reverse direction (see FIG. 3, which shows the case of a single intermediate hidden layer: the text word vector is fed in at the output layer, and the hidden feature of the hidden layer located between the input layer and the output layer is obtained).
  • When there are multiple intermediate hidden layers, the hidden feature obtained at the middlemost hidden layer is taken as the hidden feature of the intermediate hidden layer. For example, with three intermediate hidden layers, the hidden feature of the middle (second) layer is used; with two intermediate hidden layers, the average of the two layers' hidden features is used.
  • the training process of the self-coding neural network model includes:
  • Pre-training: using training samples without class labels, the first hidden layer of the self-coding neural network model is trained in the forward direction to obtain the parameters of the first layer (W1, b1). When there are multiple hidden layers, the first hidden layer of the network converts the original input into a vector composed of hidden-unit activation values; this vector is then used as the input of the second hidden layer, and training continues to obtain the parameters of the second layer (W2, b2). This is repeated, using the output of the previous layer as the input of the next layer and training the layers in sequence; while the parameters of one layer are being trained, the parameters of the other layers remain unchanged. Alternatively, after pre-training is completed, the parameters of all layers may be adjusted simultaneously through the back-propagation algorithm to refine the results.
  • the modification module 803 is configured to modify the hidden feature according to a preset classification scale and the classification requirement
  • It should be understood that the classification scale usually refers to the scale between the classes. For example, text classification may involve two categories, positive evaluation and negative evaluation, and the scale from positive evaluation to negative evaluation is the classification scale.
  • The classification scale may be pre-defined or may be calculated from samples.
  • Taking evaluation text as an example, evaluations can be divided into positive evaluations and negative evaluations. The classification scale of the i-th feature dimension is expressed as Li = |h1i - h2i|, where h1i is the average hidden feature of the positive-evaluation samples in the i-th dimension and h2i is the average hidden feature of the negative-evaluation samples in the i-th dimension.
  • the decoding module 804 is configured to use the modified hidden feature as an intermediate hidden layer of the self-coding neural network model, and reversely generate a word vector corresponding to the input layer of the self-coding neural network model from the intermediate hidden layer;
  • the modified hidden feature is used as the intermediate hidden layer of the self-encoding neural network model, and the word vector corresponding to the input layer of the self-encoding neural network model is generated inversely from the intermediate hidden layer.
  • It should be understood that this means decoding the modified hidden feature into the input of the self-coding neural network model (as shown in FIG. 3, taking a single intermediate hidden layer as an example, the input layer is obtained by decoding from the hidden layer, yielding the corresponding word vectors).
  • the generating module 805 is configured to generate corresponding text according to the generated word vector.
  • the step of generating corresponding text according to the generated word vector is to form words corresponding to the generated word vector into text.
  • the way to form the text can be to directly connect the words together to form the text, or to form the text according to certain rules.
  • In this application, the text word vector of the sentence to be input and the classification requirement are obtained; the text word vector is input reversely into a trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer; the hidden feature is modified according to a preset classification scale and the classification requirement; the modified hidden feature is used as the intermediate hidden layer, from which the word vector corresponding to the input layer is generated reversely; and the corresponding text is generated from the generated word vectors. The text generation style is thereby adjusted through the preset classification scale and the classification requirement, for example allowing a user to adjust the style scales of a customer-service robot's dialogue, including scales such as positive/negative emotion and friendliness, without consuming large data annotation resources or large modeling resources.
  • Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • Based on this understanding, the technical solution of the present application, or the part of it that contributes to the prior art, can essentially be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a read-only memory/random access memory, a magnetic disk, or an optical disc) and includes several instructions to enable a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A text generation method, device, terminal and medium based on a self-coding neural network, belonging to the technical field of natural language processing. The method obtains the text word vector of the sentence to be input and the classification requirement (S10); inputs the text word vector reversely into a trained self-coding neural network model to obtain the hidden feature of the intermediate hidden layer of the model (S20); modifies the hidden feature according to a preset classification scale and the classification requirement (S30); uses the modified hidden feature as the intermediate hidden layer of the model and reversely generates, from the intermediate hidden layer, the word vector corresponding to the input layer of the model (S40); and generates the corresponding text according to the generated word vector (S50). The text generation style is adjusted through the preset classification scale and the classification requirement, without consuming large data annotation resources or large modeling resources.

Description

基于自编码神经网络的文本生成方法、装置、终端及介质
本申请要求于2018年12月13日提交中国专利局、申请号为201811526185.4、发明名称为“基于自编码神经网络的文本生成方法、装置、终端及介质”的中国专利申请的优先权,其全部内容通过引用结合在申请中。
技术领域
本申请涉及自然语言理解技术领域,尤其涉及一种基于自编码神经网络的文本生成方法、装置、终端及介质。
背景技术
文本自动生成是自然语言处理领域的一个重要研究方向,实现文本自动生成也是人工智能走向成熟的一个重要标志。文本生成的应用可以分为监督式和无监督式的文本生成。对于监督的文本生成,例如机器翻译、智能问答系统、对话系统以及文本摘要。对于无监督的文本生成,通过学习到数据的原本分布,然后可以生成与原本数据类似的样本,例如诗歌创作、音乐创作等。利用文本生成,可以实现更加智能和自然的人机交互,通过文本自动生成系统替代编辑实现新闻的自动撰写与发布。
然而,现有的文本生成模型,例如对抗生成模型,一方面需要较大的数据标注资源,另一方面会耗费较大的资源建模。
发明内容
本申请的主要目的在于提供了一种基于自编码神经网络的文本生成方法、装置、终端及介质,旨在解决现有技术文本生成模型需要较大的数据标注资源或耗费较大的资源建模的技术问题。
为实现上述目的,本申请提供了一种基于自编码神经网络的文本生成方法,包括如下步骤:
获取待输入语句的文本词向量以及分类需求;
将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
根据预设分类尺度以及所述分类需求,修正所述隐含特征;
将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
根据生成的词向量,生成对应的文本。
为了实现上述目的,本申请还提供一种基于自编码神经网络的文本生成装置,包括:
获取模块,用于获取待输入语句的文本词向量以及分类需求;
输入模块,用于将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
修正模块,用于根据预设分类尺度以及所述分类需求,修正所述隐含特征;
解码模块,用于将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
生成模块,用于根据生成的词向量,生成对应的文本。
为了实现上述目的,本申请还提供一种终端,所述终端包括:存储器、处理器及存储在所述存储器上并可在所述处理器上运行的基于自编码神经网络的文本生成计算机可读指令,所述基于自编码神经网络的文本生成计算机可读指令配置为实现上述的基于自编码神经网络的文本生成方法的步骤。
为了实现上述目的,本申请还提供一种存储介质,所述存储介质上存储有基于自编码神经网络的文本生成计算机可读指令,所述基于自编码神经网络的文本生成计算机可读指令被处理器执行时实现上述的基于自编码神经网络的文本生成方法的步骤。
本申请通过获取待输入语句的文本词向量以及分类需求,将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征,根据预设分类尺度以及所述分类需求,修正所述隐含特征,将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量,根据生成的词向量,生成对应的文本,通过预设分类尺度以及所述分类需求调节文本生成风格,而不要耗费大量的数据标注资源,不需要耗费大量的资源建模。
附图说明
图1是本申请实施例方案涉及的硬件运行环境的终端的结构示意图;
图2为本申请基于自编码神经网络的文本生成方法第一实施例的流程示意图;
图3为本申请自编码神经网络学习模型一实施例的结构示意图;
图4为本申请基于自编码神经网络的文本生成方法第二实施例的流程示意图;
图5为本申请基于自编码神经网络的文本生成方法第三实施例的流程示意图;
图6为本申请基于自编码神经网络的文本生成方法第四实施例的流程示意图;
图7为本申请基于自编码神经网络的文本生成方法第五实施例的流程示意图;
图8为本申请基于自编码神经网络的文本生成装置第一实施例的结构框图。
本申请目的的实现、功能特点及优点将结合实施例,参照附图做进一步说明。
具体实施方式
应当理解,此处所描述的具体实施例仅用以解释本申请,并不用于限定本申请。
参照图1,图1为本申请实施例方案涉及的硬件运行环境的终端结构示意图。
如图1所示,该终端可以包括:处理器1001,例如中央处理器(Central Processing Unit,CPU),通信总线1002、用户接口1003,网络接口1004,存储器1005。其中,通信总线1002用于实现这些组件之间的连接通信。用户接口1003可以包括显示屏(Display)、输入模块比如键盘(Keyboard),可选用户接口1003还可以包括标准的有线接口、无线接口。网络接口1004可选的可以包括标准的有线接口、无线接口(如无线保真(WIreless-FIdelity,WI-FI)接口)。存储器1005可以是高速的随机存取存储器(Random Access Memory,RAM)存储器,也可以是稳定的非易失性存储器(Non-Volatile Memory,NVM),例如磁盘存储器。存储器1005可选的还可以是独立于前述处理器1001的存储装置。
本领域技术人员可以理解,图1中示出的结构并不构成对终端的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
如图1所示,作为一种存储介质的存储器1005中可以包括操作系统、数据存储模块、网络通信模块、用户接口模块以及基于自编码神经网络的文本生成计算机可读指令。
在图1所示的终端中,网络接口1004主要用于与网络服务器进行数据通信;用户接口1003主要用于与用户进行数据交互;本申请终端中的处理器1001、存储器1005可以设置在终端中,所述终端通过处理器1001调用存储器1005中存储的基于自编码神经网络的文本生成计算机可读指令,并执行本申请实施例提供的基于自编码神经网络的文本生成方法。
本申请实施例提供了一种基于自编码神经网络的文本生成方法,参照图2,图2为本申请基于自编码神经网络的文本生成方法第一实施例的流程示意图。
本实施例中,所述基于自编码神经网络的文本生成方法包括如下步骤:
步骤S10:获取待输入语句的文本词向量以及分类需求;
应该理解的是,本实施例方法的执行主体为终端,分类需求通常指的是期望的分类类别,例如以评价文本为例,评价可分为正面评价和负面评价多类,而分类需求可以是期望输出负面评价,也可以是期望输出正面评价;再例如以情绪文本为例,可以分为正面情绪、负面情绪;又例如以友好程度文本为例,可以分为非常友好、较友好、一般友好、不友好。所述分类需求可以是用户自定义输入得到的,也可以是预先设定的,在此不做具体限制。
具体实现时,所述待输入语句的文本词向量的步骤具体包括:获取输入语句,并对输入语句进行预处理;获取预处理后的输入语句的文本词向量。
对输入语句进行预处理通常包括:去除停用词,即文本中大量出现的对文本没有太大作用的词,例如汉语中“的”、“地”、“得”等,也可以是网页数据集中html标签,脚本语言等。
例如输入的文本为doc,则对应的文本词向量为{ω1、ω2、…ωi},ωi为语句中第i个词的词向量。
步骤S20:将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
应该理解的是,将所述文本词向量逆向输入已训练的自编码神经网络模型指的是,将所述文本词向量作为已训练的自编码神经网络模型的输出,逆向得到中间隐层的隐含特征(请参阅图3,图3为中间隐层为1层的情况,将文本词向量从输出层输入,得到位于输入层和输出层中间的隐层的隐含特征)。当中间隐层数为多层时,取最中间的隐层得到的隐含特征作为所述中间隐层的隐含特征。例如中间隐层数为3层时,则取中间的第二层得到的隐含特征作为所述中间隐层的隐含特征,又例如中间隐层数为2层时,则取两个中间隐层的隐含特征的平均值作为中间隐层的隐含特征,以此类推,当中间隐层数为奇数层时,取最中间的隐层得到的隐含特征作为所述中间隐层的隐含特征,当中间隐层数为偶数层时,则取最中间的两个中间隐层的隐含特征的平均值作为中间隐层的隐含特征。
具体实现时,所述将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征包括:
将所述文本词向量从已训练的自编码神经网络模型的输出层输入,自所述输出层逆向生成所述自编码神经网络模型的中间隐层的的隐含特征,作为所述自编码神经网络模型的中间隐层的隐含特征,其中,
当所述中间隐层为奇数层时,取最中间的中间隐层对应的隐含特征作为所述自编码神经网络模型的中间隐层的隐含特征;
当所述中间隐层为偶数层时,取最中间的两个中间隐层对应的隐含特征的平均值作为所述自编码神经网络模型的中间隐层的隐含特征。
自编码神经网路模型的训练过程包括:
预训练,利用不带类别标签的训练样本,正向训练自编码神经网络模型的第一隐层,得到第一层的参数(W1,b1),在隐层为多层时,网络第一隐层将原始输入转化成由隐藏单元激活值组成的向量,接着把该向量作为第二隐层的输入,继续训练得到第二层的参数(W2,b2),重复执行将前一层的输出作为下一层输入依次训练,在训练每一层参数的时候,其他各层的参数保持不变。也可以是在预训练完成后,通过反向传播算法同时调整所有层的参数,以完善结果。
步骤S30:根据预设分类尺度以及所述分类需求,修正所述隐含特征;
应该理解的是,分类尺度通常指的是各分类之间尺度,例如文本分类可以分为两类,分别为正面评价、负面评价,而由正面评价到负面评价之间的尺度即分类尺度。分类尺度可以是预先定义的,也可以是根据样本计算得到的。
以评价文本为例,评价可分为正面评价和负面评价,第i维特征分类尺度表示为Li=|h1i-h2i|,其中, h1i为第i维特征正面评价样本的隐含特征平均值,h2i为第i维特征负面评价样本的隐含特征平均值。
步骤S40:将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
应该理解的是,所述将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量是对修正后的隐含特征,解码为自编码神经网络模型的输入(如图3中,以中间隐层为单层为例,自隐层解码得到输入层,得到对应的词向量)。
步骤S50:根据生成的词向量,生成对应的文本。
应该理解的,所述根据生成的词向量,生成对应的文本的步骤是将生成的词向量对应的词语,形成文本。形成文本的方式可以是直接将各词语连接在一起,形成文本,也可以是按照一定的规则将各词语组成文本。
具体实现时,所述根据生成的词向量,生成对应的文本的步骤包括:
步骤S51:将生成的词向量与预训练的词向量库匹配,生成每一个词向量对应的词语;
应该理解的是,预训练的词向量库,是预先按照一定规则建立的词语与词向量之间的对应关系库。
步骤S52:将生成的词语连接在一起,生成对应的文本。
应该理解的是,生成文本的方式可以是直接将各词语连接在一起,形成文本,也可以是按照一定的规则将各词语组成文本。
本申请通过获取待输入语句的文本词向量以及分类需求,将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征,根据预设分类尺度以及所述分类需求,修正所述隐含特征,将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量,根据生成的词向量,生成对应的文本,通过预设分类尺度以及所述分类需求调节文本生成风格,如供用户调节客服机器人的对话方面的风格尺度,包括正负面情绪、友好程度等尺度,而不要耗费大量的数据标注资源,不需要耗费大量的资源建模。
参考图4,图4为本申请基于自编码神经网络的文本生成方法第二实施例的流程示意图。
基于上述第一实施例,在本实施例中,所述步骤S10之前,所述方法包括如下步骤:
步骤S101:获取带标签的多类训练样本,并生成对应的分类词向量;
应该理解的是,多类训练样本指的是训练样本分为多种类别,以评价文本为例,评价样本分为正面评价文本和负面评价文本两种类别,带标签的多类训练样本指的是多类训练样本分别带有标签(例如带有正面评价或负面评价的标签)。
步骤S102:将所述分类词向量正向输入所述已训练的自编码神经网络模型,得到多类样本的隐含特征;
应该理解的是,将所述分类词向量正向输入所述已训练的自编码神经网络模型指的是,将所述分类词向量作为已训练的自编码神经网络模型的输入,正向得到中间隐层的的隐含特征,作为多类样本的隐含特征(请参阅图3,图3为中间隐层为1层的情况,将分类词向量从输出层输入,得到位于输入层和输出层中间的隐层的隐含特征),当中间隐层数为多层时,取最中间的隐层得到的隐含特征作为多类样本的隐含特征。例如中间隐层数为3层时,则取中间的第二层得到的隐含特征作为多类样本的隐含特征,又例如中间隐层数为2层时,则取两个中间隐层的隐含特征的平均值作为多类样本的隐含特征。
步骤S103:计算多类样本的所述隐含特征的向量差,并作为多类样本的所述分类尺度。
应该理解的是,所述计算多类样本的所述隐含特征的向量差,并作为多类样本的所述分类尺度,以评价文本为例,评价可分为正面评价和负面评价,第i维特征分类尺度表示为Li=|h1i-h2i|,其中, h1i为第i维特征正面评价样本的隐含特征平均值,h2i为第i维特征负面评价样本的隐含特征平均值。
参考图5,图5为本申请基于自编码神经网络的文本生成方法第三实施例的流程示意图。
基于上述第二实施例,在本实施例中,所述步骤S30,具体包括:
步骤S31:根据所述预设分类尺度以及分类需求,确定所述隐含特征对应的调节向量;
具体实现时,假设分类尺度为L,分类需求为输出负面评价文本,则可以根据负面评价的程度确定调节向量b,通常调节向量b。
步骤S32:根据所述调节向量,修正所述隐含特征。
应该理解的是,根据确定的调节向量,修正所述隐含特征,可以是取隐含特征与调节向量的向量差,也可以是作为权值,以使修正后的隐含特征在解码后按分类需求输出。
参考图6,图6为本申请基于自编码神经网络的文本生成方法第四实施例的流程示意图。
基于上述第三实施例,在本实施例中,所述步骤S32,具体包括:
步骤S321:将所述隐含特征与所述调节向量的向量差,作为修正后的隐含特征。
具体实现时,修正前的隐含特征为h前,调节向量为b,则修正后的隐含特征h=h-b。
参考图7,图7为本申请基于自编码神经网络的文本生成方法第四实施例的流程示意图。
基于上述第一实施例,在本实施例中,所述步骤S10之前,所述方法还包括:
步骤S104:建立自编码神经网络模型;
应该理解的是,自编码神经网络模型是一种尽可能重构输入信号的无监督学习神经网络,自编码神经网络模型可以是多中间隐层,也可以单中间隐层的自编码网络模型(参见图3)。
步骤S105:获取不带类别标签的训练样本,并生成对应的词向量;
应该理解的是,不带类别标签的训练样本即该训练样本并未标记其类别。
步骤S106:将所述词向量正向输入,训练所述自编码神经网络模型。
具体实现时,利用不带类别标签的训练样本,正向训练自编码神经网络模型的第一隐层,得到第一层的参数(W1,b1),在隐层为多层时,网络第一隐层将原始输入转化成由隐藏单元激活值组成的向量,接着把该向量作为第二隐层的输入,继续训练得到第二层的参数(W2,b2),重复执行将前一层的输出作为下一层输入依次训练,在训练每一层参数的时候,其他各层的参数保持不变。也可以是在预训练完成后,通过反向传播算法同时调整所有层的参数,以完善结果。
此外,本申请实施例还提出一种计算机可读存储介质,所述计算机可读存储介质可以为非易失性可读存储介质。
所述计算机可读存储介质上存储有计算机可读指令,所述计算机可读指令被处理器执行时实现如上文所述的基于自编码神经网络的文本生成方法的步骤。
参照图8,图8为本申请基于自编码神经网络的文本生成装置第一实施例的结构框图。
如图8所示,本申请实施例提出的基于自编码神经网络的文本生成装置包括:
获取模块801,用于获取待输入语句的文本词向量以及分类需求;
应该理解的是,分类需求通常指的期望的分类类别,例如以评价文本为例,评价可分为正面和负面两类,而分类需求可以是期望输出负面评价,也可以是期望输出正面评价。所述分类需求可以是用户自定义输入得到的,也可以是预先设定的,在此不做具体限制。
具体实现时,所述待输入语句的文本词向量的步骤具体包括:获取输入语句,并对输入语句进行预处理;获取预处理后的输入语句的文本词向量。
对输入语句进行预处理通常包括:去除停用词,即文本中大量出现的对文本没有太大作用的词,例如汉语中“的”、“地”、“得”等,也可以是网页数据集中html标签,脚本语言等。
例如输入的文本为doc,则对应的文本词向量为{ω1、ω2、…ωi},ωi为语句中第i个词的词向量。
输入模块802,用于将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
应该理解的是,将所述文本词向量逆向输入已训练的自编码神经网络模型指的是,将所述文本词向量作为已训练的自编码神经网络模型的输出,逆向得到中间隐层的隐含特征(请参阅图3,图3为中间隐层为1层的情况,将文本词向量从输出层输入,得到位于输入层和输出层中间的隐层的隐含特征)。当中间隐层数为多层时,取最中间的隐层得到的隐含特征作为所述中间隐层的隐含特征。例如中间隐层数为3层时,则取中间的第二层得到的隐含特征作为所述中间隐层的隐含特征,又例如中间隐层数为2层时,则取两个中间隐层的隐含特征的平均值作为中间隐层的隐含特征。
具体实现时,自编码神经网路模型的训练过程包括:
预训练,利用不带类别标签的训练样本,正向训练自编码神经网络模型的第一隐层,得到第一层的参数(W1,b1),在隐层为多层时,网络第一隐层将原始输入转化成由隐藏单元激活值组成的向量,接着把该向量作为第二隐层的输入,继续训练得到第二层的参数(W2,b2),重复执行将前一层的输出作为下一层输入依次训练,在训练每一层参数的时候,其他各层的参数保持不变。也可以是在预训练完成后,通过反向传播算法同时调整所有层的参数,以完善结果。
修正模块803,用于根据预设分类尺度以及所述分类需求,修正所述隐含特征;
应该理解的是,分类尺度通常指的是各分类之间尺度,例如文本分类可以分为两类,分别为正面评价、负面评价,而由正面评价到负面评价之间的尺度即分类尺度。分类尺度可以是预先定义的,也可以是根据样本计算得到的。
以评价文本为例,评价可分为正面评价和负面评价,第i维特征分类尺度表示为Li=|h1i-h2i|,其中, h1i为第i维特征正面评价样本的隐含特征的平均值,h2i为第i维特征负面评价样本的隐含特征的平均值。
解码模块804,用于将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
应该理解的是,所述将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量是根据修正后的所述隐含特征,解码为自编码神经网络模型的输入(如图3中,以中间隐层为单层为例,自隐层解码得到输入层,得到对应的词向量)。
生成模块805,用于根据生成的词向量,生成对应的文本。
应该理解的,所述根据生成的词向量,生成对应的文本的步骤是将生成的词向量对应的词语,形成文本。形成文本的方式可以是直接将各词语连接在一起,形成文本,也可以是按照一定的规则将各词语组成文本。
本申请通过获取待输入语句的文本词向量以及分类需求,将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征,根据预设分类尺度以及所述分类需求,修正所述隐含特征,将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量,根据生成的词向量,生成对应的文本,通过预设分类尺度以及所述分类需求调节文本生成风格,如供用户调节客服机器人的对话方面的风格尺度,包括正负面情绪、友好程度等尺度,而不要耗费大量的数据标注资源,不需要耗费大量的资源建模。
本申请基于自编码神经网络的文本生成装置的其他实施例或具体实现方式可参照上述各方法实施例,此处不再赘述。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者系统不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者系统所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者系统中还存在另外的相同要素。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述 实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通 过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如只读存储器/随机存取存储器、磁碟、光 盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本申请各个实施例所述的方法。
以上仅为本申请的优选实施例,并非因此限制本申请的专利范围,凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换,或直接或间接运用在其他相关的技术领域,均同理包括在本申请的专利保护范围内。

Claims (20)

  1. 一种基于自编码神经网络的文本生成方法,其特征在于,包括如下步骤:
    获取待输入语句的文本词向量以及分类需求;
    将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
    根据预设分类尺度以及所述分类需求,修正所述隐含特征;
    将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
    根据生成的词向量,生成对应的文本。
  2. 如权利要求1所述的基于自编码神经网络的文本生成方法,其特征在于,所述获取待输入语句的文本词向量以及分类需求的步骤之前,包括如下步骤:
    获取带标签的多类训练样本,并生成对应的分类词向量;
    将所述分类词向量正向输入所述已训练的自编码神经网络模型,得到多类样本的隐含特征;
    计算多类样本的所述隐含特征的向量差,并作为多类样本的所述分类尺度。
  3. 如权利要求2所述的基于自编码神经网络的文本生成方法,其特征在于,所述根据预设分类尺度以及所述分类需求,修正所述隐含特征的步骤,包括:
    根据所述预设分类尺度以及分类需求,确定所述隐含特征对应的调节向量;
    根据所述调节向量,修正所述隐含特征。
  4. 如权利要求3所述的基于自编码神经网络的文本生成方法,其特征在于,所述根据所述调节向量,修正所述隐含特征的步骤,包括:
    将所述隐含特征与所述调节向量的向量差,作为修正后的隐含特征,其中修正前的隐含特征为h,调节向量为b,则修正后的隐含特征h=h-b。
  5. 如权利要求1所述的基于自编码神经网络的文本生成方法,其特征在于,所述自编码神经网络模型的中间隐层为多层时;
    相应地,所述将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征的步骤,包括:
    将所述文本词向量从已训练的自编码神经网络模型的输出层输入,自所述输出层逆向生成所述自编码神经网络模型的中间隐层的的隐含特征,作为所述自编码神经网络模型的中间隐层的隐含特征,其中,
    当所述中间隐层为奇数层时,取最中间的中间隐层对应的隐含特征作为所述自编码神经网络模型的中间隐层的隐含特征;
    当所述中间隐层为偶数层时,取最中间的两个中间隐层对应的隐含特征的平均值作为所述自编码神经网络模型的中间隐层的隐含特征。
  6. 如权利要求1所述的基于自编码神经网络的文本生成方法,其特征在于,所述获取待输入语句的文本词向量以及分类需求的步骤之前,所述方法还包括如下步骤:
    建立自编码神经网络模型;
    获取不带类别标签的训练样本,并生成对应的词向量;
    将所述词向量正向输入,训练所述自编码神经网络模型,其中,训练过程为:
    将所述词向量正向输入,正向训练所述自编码神经网络模型的第一隐层,在隐层为多层时,将第一隐层由原始输入转化成由隐藏单元激活值组成的向量,将该向量作为第二隐层的输入,继续训练得到第二层的参数,重复执行将前一层的输出作为下一层输入依次训练,在训练每一层参数的时候,其他各层的参数保持不变。
  7. 如权利要求1所述的基于自编码神经网络的文本生成方法,其特征在于,所述根据生成的词向量,生成对应的文本的步骤,包括:
    将生成的词向量与预训练的词向量库匹配,生成每一个词向量对应的词语;
    将生成的词语连接在一起,生成对应的文本。
  8. 一种基于自编码神经网络的文本生成装置,其特征在于,包括:
    获取模块,用于获取待输入语句的文本词向量以及分类需求;
    输入模块,用于将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
    修正模块,用于根据预设分类尺度以及所述分类需求,修正所述隐含特征;
    解码模块,用于将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
    生成模块,用于根据生成的词向量,生成对应的文本。
  9. 一种终端,其特征在于,所述终端包括:存储器、处理器及存储在所述存储器上并可在所述处理器上运行的基于自编码神经网络的文本生成计算机可读指令,所述基于自编码神经网络的文本生成计算机可读指令配置为实现以下步骤:
    获取待输入语句的文本词向量以及分类需求;
    将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
    根据预设分类尺度以及所述分类需求,修正所述隐含特征;
    将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
    根据生成的词向量,生成对应的文本。
  10. 如权利要求9所述的终端,其特征在于,所述获取待输入语句的文本词向量以及分类需求的步骤之前,包括如下步骤:
    获取带标签的多类训练样本,并生成对应的分类词向量;
    将所述分类词向量正向输入所述已训练的自编码神经网络模型,得到多类样本的隐含特征;
    计算多类样本的所述隐含特征的向量差,并作为多类样本的所述分类尺度。
  11. 如权利要求10所述的终端,其特征在于,所述根据预设分类尺度以及所述分类需求,修正所述隐含特征的步骤,包括:
    根据所述预设分类尺度以及分类需求,确定所述隐含特征对应的调节向量;
    根据所述调节向量,修正所述隐含特征。
  12. 如权利要求11所述的终端,其特征在于,所述根据所述调节向量,修正所述隐含特征的步骤,包括:
    将所述隐含特征与所述调节向量的向量差,作为修正后的隐含特征,其中修正前的隐含特征为h,调节向量为b,则修正后的隐含特征h=h-b。
  13. 如权利要求9所述的终端,其特征在于,所述自编码神经网络模型的中间隐层为多层时;
    相应地,所述将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征的步骤,包括:
    将所述文本词向量从已训练的自编码神经网络模型的输出层输入,自所述输出层逆向生成所述自编码神经网络模型的中间隐层的的隐含特征,作为所述自编码神经网络模型的中间隐层的隐含特征,其中,
    当所述中间隐层为奇数层时,取最中间的中间隐层对应的隐含特征作为所述自编码神经网络模型的中间隐层的隐含特征;
    当所述中间隐层为偶数层时,取最中间的两个中间隐层对应的隐含特征的平均值作为所述自编码神经网络模型的中间隐层的隐含特征。
  14. 如权利要求9所述的终端,其特征在于,所述获取待输入语句的文本词向量以及分类需求的步骤之前,所述方法还包括如下步骤:
    建立自编码神经网络模型;
    获取不带类别标签的训练样本,并生成对应的词向量;
    将所述词向量正向输入,训练所述自编码神经网络模型,其中,训练过程为:
    将所述词向量正向输入,正向训练所述自编码神经网络模型的第一隐层,在隐层为多层时,将第一隐层由原始输入转化成由隐藏单元激活值组成的向量,将该向量作为第二隐层的输入,继续训练得到第二层的参数,重复执行将前一层的输出作为下一层输入依次训练,在训练每一层参数的时候,其他各层的参数保持不变。
  15. 如权利要求9所述的终端,其特征在于,所述根据生成的词向量,生成对应的文本的步骤,包括:
    将生成的词向量与预训练的词向量库匹配,生成每一个词向量对应的词语;
    将生成的词语连接在一起,生成对应的文本。
  16. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有基于自编码神经网络的文本生成计算机可读指令,所述基于自编码神经网络的文本生成计算机可读指令被处理器执行时实现以下步骤:
    获取待输入语句的文本词向量以及分类需求;
    将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征;
    根据预设分类尺度以及所述分类需求,修正所述隐含特征;
    将修正后的隐含特征作为所述自编码神经网络模型的中间隐层,自所述中间隐层逆向生成所述自编码神经网络模型的输入层对应的词向量;
    根据生成的词向量,生成对应的文本。
  17. 如权利要求16所述的计算机可读存储介质,其特征在于,所述获取待输入语句的文本词向量以及分类需求的步骤之前,包括如下步骤:
    获取带标签的多类训练样本,并生成对应的分类词向量;
    将所述分类词向量正向输入所述已训练的自编码神经网络模型,得到多类样本的隐含特征;
    计算多类样本的所述隐含特征的向量差,并作为多类样本的所述分类尺度。
  18. 如权利要求17所述的计算机可读存储介质,其特征在于,所述根据预设分类尺度以及所述分类需求,修正所述隐含特征的步骤,包括:
    根据所述预设分类尺度以及分类需求,确定所述隐含特征对应的调节向量;
    根据所述调节向量,修正所述隐含特征。
  19. 如权利要求18所述的计算机可读存储介质,其特征在于,所述根据所述调节向量,修正所述隐含特征的步骤,包括:
    将所述隐含特征与所述调节向量的向量差,作为修正后的隐含特征,其中修正前的隐含特征为h,调节向量为b,则修正后的隐含特征h=h-b。
  20. 如权利要求16所述的计算机可读存储介质,其特征在于,所述自编码神经网络模型的中间隐层为多层时;
    相应地,所述将所述文本词向量逆向输入已训练的自编码神经网络模型,得到所述自编码神经网络模型的中间隐层的隐含特征的步骤,包括:
    将所述文本词向量从已训练的自编码神经网络模型的输出层输入,自所述输出层逆向生成所述自编码神经网络模型的中间隐层的的隐含特征,作为所述自编码神经网络模型的中间隐层的隐含特征,其中,
    当所述中间隐层为奇数层时,取最中间的中间隐层对应的隐含特征作为所述自编码神经网络模型的中间隐层的隐含特征;
    当所述中间隐层为偶数层时,取最中间的两个中间隐层对应的隐含特征的平均值作为所述自编码神经网络模型的中间隐层的隐含特征。
PCT/CN2019/092957 2018-12-13 2019-06-26 基于自编码神经网络的文本生成方法、装置、终端及介质 WO2020119069A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
SG11202001765TA SG11202001765TA (en) 2018-12-13 2019-06-26 Method, device, and terminal for generating a text based on self-encoding neural network, and medium
US16/637,274 US11487952B2 (en) 2018-12-13 2019-06-26 Method and terminal for generating a text based on self-encoding neural network, and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811526185.4A CN109783603B (zh) 2018-12-13 2018-12-13 基于自编码神经网络的文本生成方法、装置、终端及介质
CN201811526185.4 2018-12-13

Publications (1)

Publication Number Publication Date
WO2020119069A1 true WO2020119069A1 (zh) 2020-06-18

Family

ID=66496922

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/092957 WO2020119069A1 (zh) 2018-12-13 2019-06-26 基于自编码神经网络的文本生成方法、装置、终端及介质

Country Status (4)

Country Link
US (1) US11487952B2 (zh)
CN (1) CN109783603B (zh)
SG (1) SG11202001765TA (zh)
WO (1) WO2020119069A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800750A (zh) * 2021-01-26 2021-05-14 浙江香侬慧语科技有限责任公司 一种无监督的非自回归古诗生成方法、装置及存储介质
CN112966112A (zh) * 2021-03-25 2021-06-15 支付宝(杭州)信息技术有限公司 基于对抗学习的文本分类模型训练和文本分类方法及装置
CN113553052A (zh) * 2021-06-09 2021-10-26 麒麟软件有限公司 使用Attention编码表示自动识别与安全相关的代码提交的方法
CN116010669A (zh) * 2023-01-18 2023-04-25 深存科技(无锡)有限公司 向量库重训练的触发方法、装置、检索服务器及存储介质
CN112800750B (zh) * 2021-01-26 2024-06-07 浙江香侬慧语科技有限责任公司 一种无监督的非自回归古诗生成方法、装置及存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783603B (zh) * 2018-12-13 2023-05-26 平安科技(深圳)有限公司 基于自编码神经网络的文本生成方法、装置、终端及介质
CN110321929A (zh) * 2019-06-04 2019-10-11 平安科技(深圳)有限公司 一种提取文本特征的方法、装置及存储介质
CN111414733B (zh) * 2020-03-18 2022-08-19 联想(北京)有限公司 一种数据处理方法、装置及电子设备
CN112035628A (zh) * 2020-08-03 2020-12-04 北京小米松果电子有限公司 对话数据清洗方法、装置及存储介质
CN113780450B (zh) * 2021-09-16 2023-07-28 郑州云智信安安全技术有限公司 基于自编码神经网络的分布式存储方法及系统

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930314A (zh) * 2016-04-14 2016-09-07 清华大学 基于编码-解码深度神经网络的文本摘要生成系统及方法
CN107562718A (zh) * 2017-07-24 2018-01-09 科大讯飞股份有限公司 文本规整方法及装置、存储介质、电子设备
CN107844469A (zh) * 2017-10-26 2018-03-27 北京大学 基于词向量查询模型的文本简化方法
CN108334497A (zh) * 2018-02-06 2018-07-27 北京航空航天大学 自动生成文本的方法和装置
CN108763191A (zh) * 2018-04-16 2018-11-06 华南师范大学 一种文本摘要生成方法及系统
WO2018213840A1 (en) * 2017-05-19 2018-11-22 Google Llc Depthwise separable convolutions for neural machine translation
CN109783603A (zh) * 2018-12-13 2019-05-21 平安科技(深圳)有限公司 基于自编码神经网络的文本生成方法、装置、终端及介质

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5461699A (en) * 1993-10-25 1995-10-24 International Business Machines Corporation Forecasting using a neural network and a statistical forecast
CN101930561A (zh) * 2010-05-21 2010-12-29 电子科技大学 一种基于N-Gram分词模型的反向神经网络垃圾邮件过滤装置
CN106815194A (zh) * 2015-11-27 2017-06-09 北京国双科技有限公司 模型训练方法及装置和关键词识别方法及装置
US20180075014A1 (en) * 2016-09-11 2018-03-15 Xiaojiang Duan Conversational artificial intelligence system and method using advanced language elements
CN108628868B (zh) * 2017-03-16 2021-08-10 北京京东尚科信息技术有限公司 文本分类方法和装置
US11243944B2 (en) * 2017-06-29 2022-02-08 Futurewei Technologies, Inc. Dynamic semantic networks for language understanding and question answering

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930314A (zh) * 2016-04-14 2016-09-07 清华大学 基于编码-解码深度神经网络的文本摘要生成系统及方法
WO2018213840A1 (en) * 2017-05-19 2018-11-22 Google Llc Depthwise separable convolutions for neural machine translation
CN107562718A (zh) * 2017-07-24 2018-01-09 科大讯飞股份有限公司 文本规整方法及装置、存储介质、电子设备
CN107844469A (zh) * 2017-10-26 2018-03-27 北京大学 基于词向量查询模型的文本简化方法
CN108334497A (zh) * 2018-02-06 2018-07-27 北京航空航天大学 自动生成文本的方法和装置
CN108763191A (zh) * 2018-04-16 2018-11-06 华南师范大学 一种文本摘要生成方法及系统
CN109783603A (zh) * 2018-12-13 2019-05-21 平安科技(深圳)有限公司 基于自编码神经网络的文本生成方法、装置、终端及介质

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800750A (zh) * 2021-01-26 2021-05-14 浙江香侬慧语科技有限责任公司 一种无监督的非自回归古诗生成方法、装置及存储介质
CN112800750B (zh) * 2021-01-26 2024-06-07 浙江香侬慧语科技有限责任公司 一种无监督的非自回归古诗生成方法、装置及存储介质
CN112966112A (zh) * 2021-03-25 2021-06-15 支付宝(杭州)信息技术有限公司 基于对抗学习的文本分类模型训练和文本分类方法及装置
CN112966112B (zh) * 2021-03-25 2023-08-08 支付宝(杭州)信息技术有限公司 基于对抗学习的文本分类模型训练和文本分类方法及装置
CN113553052A (zh) * 2021-06-09 2021-10-26 麒麟软件有限公司 使用Attention编码表示自动识别与安全相关的代码提交的方法
CN116010669A (zh) * 2023-01-18 2023-04-25 深存科技(无锡)有限公司 向量库重训练的触发方法、装置、检索服务器及存储介质
CN116010669B (zh) * 2023-01-18 2023-12-08 深存科技(无锡)有限公司 向量库重训练的触发方法、装置、检索服务器及存储介质

Also Published As

Publication number Publication date
US11487952B2 (en) 2022-11-01
SG11202001765TA (en) 2020-07-29
CN109783603B (zh) 2023-05-26
US20210165970A1 (en) 2021-06-03
CN109783603A (zh) 2019-05-21

Similar Documents

Publication Publication Date Title
WO2020119069A1 (zh) 基于自编码神经网络的文本生成方法、装置、终端及介质
WO2020107761A1 (zh) 广告文案处理方法、装置、设备及计算机可读存储介质
WO2020107762A1 (zh) Ctr预估方法、装置及计算机可读存储介质
WO2020164267A1 (zh) 文本分类模型构建方法、装置、终端及存储介质
WO2016112558A1 (zh) 智能交互系统中的问题匹配方法和系统
WO2021132927A1 (en) Computing device and method of classifying category of data
WO2020015067A1 (zh) 数据采集方法、装置、设备及存储介质
WO2020233077A1 (zh) 系统服务的监控方法、装置、设备及存储介质
WO2020143322A1 (zh) 用户请求的检测方法、装置、计算机设备及存储介质
WO2018164378A1 (en) Electronic apparatus for compressing language model, electronic apparatus for providing recommendation word and operation methods thereof
WO2020253115A1 (zh) 基于语音识别的产品推荐方法、装置、设备和存储介质
WO2021141419A1 (en) Method and apparatus for generating customized content based on user intent
WO2020107591A1 (zh) 重复投保限制方法、装置、设备及可读存储介质
WO2019209040A1 (en) Multi-models that understand natural language phrases
WO2020233089A1 (zh) 测试用例生成方法、装置、终端及计算机可读存储介质
WO2020191934A1 (zh) 终端喇叭的控制方法、设备及计算机可读存储介质
WO2021010744A1 (ko) 음성 인식 기반의 세일즈 대화 분석 방법 및 장치
EP3756145A1 (en) Electronic apparatus and control method thereof
WO2020087981A1 (zh) 风控审核模型生成方法、装置、设备及可读存储介质
WO2019019350A1 (zh) 开户页面的生成方法、装置、设备及计算机可读存储介质
WO2022085958A1 (ko) 전자 장치 및 그 동작방법
EP3577571A1 (en) Electronic apparatus for compressing language model, electronic apparatus for providing recommendation word and operation methods thereof
WO2019112117A1 (ko) 텍스트 콘텐츠 작성자의 메타정보를 추론하는 방법 및 컴퓨터 프로그램
WO2020114184A1 (zh) 联合建模方法、装置、设备以及计算机可读存储介质
WO2020199599A1 (zh) 工作队列的信息展示方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07.10.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19897059

Country of ref document: EP

Kind code of ref document: A1