WO2023207690A1 - Text generation method, apparatus, electronic device, and medium - Google Patents

Text generation method, apparatus, electronic device, and medium

Publication number
WO2023207690A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
screening
candidate
evaluation
candidate set
Prior art date
Application number
PCT/CN2023/089101
Inventor
王涛
赵程绮
王明轩
Original Assignee
北京有竹居网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2023207690A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/51 Translation evaluation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks


Definitions

  • The embodiments of the present disclosure relate to the field of data processing technology, for example to a text generation method, apparatus, electronic device, and medium.
  • Text generation is an important part of Natural Language Processing (NLP), which aims to use machines to generate natural language text.
  • the text generation model refers to the system model used for text generation, which can include an encoder and a decoder.
  • the encoder encodes the input into a vector representation, and the decoder relies on the encoder's vector representation to generate the required text word by word according to the specified decoding algorithm.
  • Embodiments of the present disclosure provide a text generation method, apparatus, electronic device, and medium to mitigate the problem of poor results generated by a given text generation model.
  • embodiments of the present disclosure provide a text generation method, including:
  • obtaining a text candidate set, where the text candidate set includes a plurality of candidate texts, and each candidate text is a text obtained after processing the text to be processed;
  • referring to the screening strategy of minimum Bayesian risk decoding, processing the text candidate set with at least two screening methods to obtain the target text;
  • where the at least two screening methods are screening methods with different functions, the output of each screening is used as the input of the next screening method, the result output by each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation score of a screened candidate text is determined based on the evaluation index of the screening method used.
  • embodiments of the present disclosure also provide a text generation device, including:
  • the acquisition module is configured to acquire a text candidate set, where the text candidate set includes a plurality of candidate texts, and each candidate text is a text obtained after processing the text to be processed;
  • the processing module is configured to refer to the screening strategy of minimum Bayesian risk decoding and process the text candidate set with at least two screening methods to obtain the target text;
  • where the at least two screening methods are screening methods with different functions, the output of each screening is used as the input of the next screening method, the result output by each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation score of a screened candidate text is determined based on the evaluation index of the screening method used.
  • embodiments of the present disclosure also provide an electronic device, including:
  • one or more processing devices;
  • a storage device for storing one or more programs;
  • when the one or more programs are executed by the one or more processing devices, the one or more processing devices implement the text generation method provided by the embodiments of the present disclosure.
  • embodiments of the present disclosure also provide a computer-readable medium.
  • a computer program is stored on the computer-readable medium; when the program is executed by a processing device, the text generation method provided by the embodiments of the present disclosure is implemented.
  • Figure 1 is a schematic flowchart of a text generation method provided by Embodiment 1 of the present disclosure
  • Figure 2 is a schematic flowchart of a text generation method provided in Embodiment 2 of the present disclosure
  • Figure 3 is a schematic diagram of the implementation of a text generation method provided by Embodiment 2 of the present disclosure
  • Figure 4 is a schematic structural diagram of a text generation device provided in Embodiment 3 of the present disclosure.
  • FIG. 5 is a schematic structural diagram of an electronic device provided in Embodiment 4 of the present disclosure.
  • the term “include” and its variations are open-ended, i.e., “including but not limited to.”
  • the term “based on” means “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; and the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • Text generation is an important part of NLP, which aims to use machines to generate natural language text. It underlies many NLP tasks, including: (1) Machine Translation (MT), which refers to translating one language into another language; (2) dialogue generation, which refers to generating a corresponding reply based on the utterance input by the user; (3) summary generation, which refers to generating a brief summary based on a long text.
  • Figure 1 is a schematic flowchart of a text generation method provided by Embodiment 1 of the present disclosure. This method is applicable to generating target text and can be executed by a text generation device, where the device can be implemented in software and/or hardware and is generally integrated on an electronic device.
  • electronic devices include: computers, laptops, smart phones, tablets and other devices.
  • a text generation method provided by Embodiment 1 of the present disclosure includes the following steps:
  • the text candidate set can be understood as a set of multiple results (the results may include candidate texts) obtained by inputting the text to be processed into the corresponding text generation model.
  • the text candidate set may include multiple candidate texts, and each candidate text may be understood as a text obtained after processing the text to be processed.
  • the text to be processed can be understood as the text content waiting to be processed.
  • in a machine translation scenario, the text to be processed can be the text waiting to be translated, such as an English sentence or a Chinese sentence;
  • in a dialogue generation scenario, the text to be processed can be a passage of the dialogue;
  • in a summary generation scenario, the text to be processed can be the text waiting to be summarized, such as a paragraph of text content.
  • the text generation model may adopt an end-to-end structure based on a deep neural network.
  • the end-to-end neural network model (i.e., the text generation model) can include an encoder and a decoder.
  • the encoder encodes the input into a vector representation.
  • the decoder relies on the encoder's vector representation to generate the required text word by word according to the specified decoding algorithm.
  • the decoding algorithm can refer to the end-to-end model's strategy for selecting words during the decoding process. Decoding algorithms can be mainly divided into Greedy Decoding method, Beam Search method and Sampling method.
  • the greedy decoding method can refer to selecting the word with the highest probability as the result at each step.
  • the beam search method can refer to keeping, at each step, the few paths with the highest probability, to avoid missing high-probability words that may appear in the next few steps.
  • the sampling decoding method can refer to randomly selecting from the candidate words according to the probability distribution at each step, so the result is random; how the candidate words are chosen can vary with the strategy, but they are often the top n words (n being smaller than the vocabulary size).
  • the text generation model can process the text to be processed based on the greedy decoding method, the beam search method, or the sampling decoding method to obtain the corresponding text candidate set.
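  • The three decoding strategies described above can be contrasted with a minimal sketch. The toy `next_token_probs` function below is a hypothetical stand-in for a real decoder (it is not part of the disclosure), and the token names and probabilities are made up for illustration.

```python
import math
import random

# Hypothetical toy "decoder": maps a prefix (tuple of tokens) to
# next-token probabilities. A real model would replace this function.
def next_token_probs(prefix):
    if len(prefix) >= 3:
        return {"<eos>": 1.0}
    return {"a": 0.5, "b": 0.3, "c": 0.2}

def greedy_decode(max_len=5):
    """Greedy decoding: take the most probable token at every step."""
    out = []
    while len(out) < max_len:
        probs = next_token_probs(tuple(out))
        token = max(probs, key=probs.get)
        if token == "<eos>":
            break
        out.append(token)
    return out

def beam_search(beam_size=2, max_len=5):
    """Beam search: keep the beam_size most probable partial paths."""
    beams = [((), 0.0, False)]  # (tokens, log-probability, finished)
    for _ in range(max_len):
        expanded = []
        for tokens, logp, done in beams:
            if done:
                expanded.append((tokens, logp, True))
                continue
            for tok, p in next_token_probs(tokens).items():
                if tok == "<eos>":
                    expanded.append((tokens, logp + math.log(p), True))
                else:
                    expanded.append((tokens + (tok,), logp + math.log(p), False))
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_size]
        if all(done for _, _, done in beams):
            break
    return [list(tokens) for tokens, _, _ in beams]

def sample_decode(rng, top_n=2, max_len=5):
    """Sampling: draw each token at random from the top_n candidates."""
    out = []
    while len(out) < max_len:
        probs = next_token_probs(tuple(out))
        top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
        tokens, weights = zip(*top)
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if token == "<eos>":
            break
        out.append(token)
    return out
```

  • Calling `sample_decode` repeatedly with the same random generator yields different outputs from call to call, which is what makes sampling a natural way to build a diverse text candidate set; greedy decoding and beam search are deterministic for a fixed model.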
  • S120 Referring to the screening strategy of minimum Bayesian risk decoding, process the text candidate set with at least two screening methods to obtain the target text.
  • minimum Bayesian risk decoding (commonly called minimum Bayes risk, or MBR, decoding) is also a decoding algorithm, one that can be used in speech recognition and statistical machine translation.
  • Minimum Bayesian risk decoding can be understood as evaluating and scoring each candidate text; the resulting score can be called the evaluation score.
  • the screening strategy of minimum Bayesian risk decoding can be understood as a processing strategy that screens the candidate texts in the text candidate set so that the evaluation score of the final result meets the requirements. For example, the required number of target candidate texts are selected from the text candidate set in descending order of evaluation score.
  • the screening method can be understood as a method of screening texts with reference to the screening strategy of minimum Bayesian risk decoding.
  • the target text can be understood as the text result obtained after at least two screening methods are performed on the text candidate set.
  • each screening method for processing the text candidate set can be a screening method with different functions, so as to realize different screening functions.
  • Optionally, the screening methods include one or more of the following: quality screening, style screening and keyword filtering. When the screening method used is quality screening, the evaluation index used to determine the evaluation score is the quality index; when the screening method used is keyword filtering, the evaluation index used to determine the evaluation score is the keyword index; when the screening method used is style screening, the evaluation index used to determine the evaluation score is the style index.
  • Quality screening can be understood as a method of screening the quality of each text in the text candidate set.
  • Quality indicators can be understood as evaluation indicators corresponding to quality screening.
  • the quality indicators corresponding to quality screening may include at least the Bilingual Evaluation Understudy (BLEU) indicator and the Bilingual Evaluation Understudy with Representations from Transformers (BLEURT) indicator.
  • Quality can be understood as a quantity used to characterize the similarity between a text in the text candidate set and the reference text in aspects such as semantics or word form.
  • the evaluation index can be understood as an index that automatically evaluates the score of each candidate text.
  • Style filtering can be understood as a method of filtering the style of each text in the text candidate set.
  • Style indicators can be understood as evaluation indicators corresponding to style screening.
  • Style can be understood as a quantity used to characterize properties of a text such as how long or short it is.
  • for example, style screening includes length screening, and style indicators include length indicators.
  • the length indicator is the evaluation indicator corresponding to length screening, that is, the indicator used to analyze and process the length of the text.
  • Keyword filtering can be understood as a method of filtering ambiguous or sensitive words in each text of the text candidate set.
  • the ambiguous or sensitive words can be set as keywords to be filtered in advance.
  • Keyword indicators can be understood as evaluation indicators corresponding to keyword filtering.
  • when a screening method other than the last one is executed, the set number can be more than one: for example, referring to the screening strategy of minimum Bayesian risk decoding, multiple candidate texts with top evaluation scores are selected from the corresponding text candidate set as the output of this screening. When the last screening method is executed, the set number can be one: referring to the screening strategy of minimum Bayesian risk decoding, the candidate text with the highest evaluation score is selected from the corresponding text candidate set as the output target text.
  • the output of each screening can be used as the input of the next screening method, and the result of each screening output can be determined based on the evaluation score of the screened candidate text.
  • the screened candidate text can be understood as the candidate text to be processed by the current screening method.
  • for example, the screened candidate texts can be the text candidate set obtained after screening by the previous screening method; that is to say, each screening method can correspond to one text candidate set, which serves as the candidate texts to be screened. The text candidate set corresponding to each screening method can be different, because it is the text candidate set output by the previous screening method.
  • the top one or more candidate texts may be selected as the result of each filtering output in order of the evaluation scores of the filtered candidate texts from high to low.
  • for example, the first screening method is used to process the candidate texts in the text candidate set to obtain the evaluation score of each candidate text. In this process, all candidate texts in the text candidate set can be processed, or, to reduce the amount of calculation, only some of them can be processed (for example, 1000 candidate texts are selected for processing; how to select them, and how many, can be determined according to the actual situation). On this basis, the text candidate set can be screened based on the obtained evaluation scores.
  • for example, the set number of candidate texts with the top evaluation scores (such as the top 100) can be selected as the output of the first screening method.
  • the input of the second screening method is the output of the first screening (such as the top 100 selected above).
  • when two screening methods are used, the second screening method processes the corresponding text candidate set (that is, the output of the first screening) to obtain the evaluation score of each candidate text, and the candidate text with the highest evaluation score is selected as the target text.
  • when more than two screening methods are used, the output of the second screening can in turn be processed as the input of the subsequent screening method (i.e., the third screening method), and so on, until the target text is obtained.
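  • The chaining of screening methods described above can be sketched as a small cascade in which each stage scores its input pool and passes the top candidates to the next stage. This is a minimal sketch: the function names, the toy word-overlap quality score, and the example sentences are all assumptions for illustration, not the metrics named in the disclosure.

```python
def cascade_screen(candidates, stages):
    """Apply screening stages in order; each stage's output feeds the next.

    stages: list of (score_fn, keep) pairs, where score_fn(text, pool)
    returns a number (higher is better) and keep is the set number of
    candidates to retain. The last stage normally uses keep=1.
    """
    pool = list(candidates)
    for score_fn, keep in stages:
        ranked = sorted(pool, key=lambda t: score_fn(t, pool), reverse=True)
        pool = ranked[:keep]
    return pool

def consensus(text, pool):
    """Toy quality score: mean word-set overlap with every text in the pool."""
    words = set(text.split())
    total = 0.0
    for other in pool:
        other_words = set(other.split())
        union = words | other_words
        total += len(words & other_words) / len(union) if union else 0.0
    return total / len(pool)

def brevity(text, pool):
    """Toy length score: fewer words scores higher (pool is unused)."""
    return -len(text.split())

candidates = [
    "the cat sat on the mat",
    "a cat sat on mat",
    "the cat sat",
    "dog",
]
# Stage 1 keeps the two most "agreed-upon" texts; stage 2 picks the shorter.
target = cascade_screen(candidates, [(consensus, 2), (brevity, 1)])
```

  • Because every stage shrinks the pool, the more expensive scoring function can be placed where the pool is already small, which is one way to read the remark about reducing the amount of calculation.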
  • the evaluation score of the filtered candidate text may be determined based on the evaluation metrics of the used screening method. Screening methods can be divided into two types, one is a method that requires reference for screening, and the other is a method that does not require reference for screening. When determining the evaluation score of the filtered candidate text based on the evaluation metric of the used screening method, the evaluation scores of the two types of screening methods may be scored differently.
  • when the screening method used requires a reference, the evaluation score of the target candidate text is the average of multiple reference evaluation scores of the target candidate text. Each reference evaluation score is a score determined for the target candidate text, based on the evaluation index, with a corresponding reference candidate text as the reference; the reference candidate text and the target candidate text are located in the same text candidate set.
  • the target candidate text can be understood as the candidate text currently to be evaluated and scored in the text candidate set, where the text candidate set is the text candidate set corresponding to the screening method.
  • the reference candidate text can be understood as a candidate text, in the text candidate set that contains the target candidate text, that serves as a reference for the target candidate text. For example, the reference candidate texts can be all candidate texts in that text candidate set, all candidate texts in it except the target candidate text, or a subset of it.
  • for example, if the text candidate set contains 100 candidate texts, the reference candidate texts may be these 100 candidate texts, the 99 candidate texts other than the target candidate text, or a subset of the 100 candidate texts, such as 50 of them.
  • if the reference candidate texts of the target candidate text are the 99 other candidate texts, then the target candidate text receives one reference evaluation score, based on the evaluation index, for each reference candidate text taken as the reference, giving 99 reference evaluation scores; on this basis, the average of these 99 reference evaluation scores can be used as the evaluation score of the target candidate text.
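  • The averaging of reference evaluation scores described above can be sketched as follows. The `unigram_f1` function is a toy stand-in for a reference-based metric such as BLEU and is an assumption made for illustration; the function and parameter names are likewise hypothetical.

```python
def unigram_f1(hyp, ref):
    """Toy reference-based metric (assumption): unigram F1 between two texts."""
    h, r = hyp.split(), ref.split()
    common = sum(min(h.count(w), r.count(w)) for w in set(h))
    if common == 0:
        return 0.0
    precision, recall = common / len(h), common / len(r)
    return 2 * precision * recall / (precision + recall)

def evaluation_score(target_index, candidates, metric, reference_indices=None):
    """Mean of the reference evaluation scores of one target candidate.

    reference_indices selects which candidates act as references; by
    default every other candidate in the same set is used.
    """
    if reference_indices is None:
        reference_indices = [i for i in range(len(candidates)) if i != target_index]
    target = candidates[target_index]
    scores = [metric(target, candidates[i]) for i in reference_indices]
    if not scores:
        return 0.0  # no references available
    return sum(scores) / len(scores)

candidates = ["a b c", "a b d", "x y z"]
score_first = evaluation_score(0, candidates, unigram_f1)
score_last = evaluation_score(2, candidates, unigram_f1)
```

  • Passing an explicit `reference_indices` list corresponds to using only a subset of the text candidate set as references, which trades some accuracy for fewer metric evaluations.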
  • when the screening method used does not require a reference, the evaluation score of the target candidate text is a score given directly by the evaluation index.
  • Each screening method can correspond to an evaluation index.
  • the result of the filtering output is a required number of target candidate texts selected from the filtered candidate texts according to the evaluation score of each target candidate text.
  • a required number of target candidate texts can be selected from the filtered candidate texts in order from high to low according to the evaluation scores of each target candidate text as the result of the screening output.
  • the required quantity can be one or more, which can be flexibly set according to actual needs.
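  • Reference-free scoring and the selection of the required number of candidates can be sketched as below. The two scoring functions are hypothetical examples of a length indicator and a keyword indicator; the disclosure does not specify how these indicators are computed.

```python
def length_score(text):
    """Hypothetical length indicator: shorter texts score higher."""
    return -len(text.split())

def keyword_score(text, blocked):
    """Hypothetical keyword indicator: penalize each blocked word present."""
    words = set(text.lower().split())
    return -len(words & blocked)

def select_top(candidates, score_fn, required):
    """Return the `required` candidates with the highest scores, best first."""
    return sorted(candidates, key=score_fn, reverse=True)[:required]

shortest_two = select_top(["one two three", "one", "one two"], length_score, 2)
cleanest = select_top(
    ["this is fine", "this is darn bad"],
    lambda t: keyword_score(t, {"darn"}),
    1,
)
```

  • Since these scores depend only on the candidate itself, each costs one evaluation per candidate, whereas a reference-based score needs one evaluation per candidate-reference pair.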
  • the text generation method provided in Embodiment 1 of the present disclosure obtains a text candidate set, where the text candidate set includes multiple candidate texts and each candidate text is a text obtained after processing the text to be processed; then, referring to the screening strategy of minimum Bayesian risk decoding, it processes the text candidate set with at least two screening methods to obtain the target text. The screening methods have different functions, the output of each screening is used as the input of the next screening method, the result output by each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation score of a screened candidate text is determined based on the evaluation index of the screening method used. By subjecting the obtained text candidate set to at least two screening methods, this method can mitigate the problem of poor results generated by a given text generation model.
  • the screening methods used for text generation can be determined based on the text generation needs input by the user.
  • the method further includes: obtaining demand information on a human-computer interaction interface; and determining a corresponding screening method based on the demand information.
  • the human-computer interaction interface can be understood as an interface that allows human-computer interaction, that is, the user can input or select relevant instructions through the interface to interact with the electronic device for text generation.
  • Requirement information can be understood as information about text generation requirements input by the user on the human-computer interaction interface.
  • the requirement information may include text quality requirements and/or text length requirements, etc.
  • for example, when the requirement information asks for high-quality and condensed text, the corresponding screening methods can be determined as a screening method for generating high-quality text and a screening method for generating condensed text; that is to say, the obtained text candidate set is processed by these two screening methods to obtain the corresponding target text.
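  • One simple way to picture the step from requirement information to screening methods is a lookup table. The requirement keys and screening method names below are hypothetical; the disclosure does not define a concrete format for the requirement information.

```python
# Hypothetical mapping from requirement keys (as they might arrive from a
# human-computer interaction interface) to screening method names.
REQUIREMENT_TO_SCREENING = {
    "high_quality": "quality",
    "condensed": "length",
    "no_sensitive_words": "keyword",
}

def screening_methods_for(requirements):
    """Return the screening methods for the given requirements, in order.

    Requirements with no known screening method are ignored.
    """
    return [
        REQUIREMENT_TO_SCREENING[r]
        for r in requirements
        if r in REQUIREMENT_TO_SCREENING
    ]

methods = screening_methods_for(["high_quality", "condensed"])
```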
  • FIG. 2 is a schematic flowchart of a text generation method provided in Embodiment 2 of the present disclosure. This second embodiment refines the above embodiments: it describes in detail the process of obtaining a text candidate set and processing the text candidate set with at least two screening methods to obtain the target text. For details not covered in this embodiment, please refer to Embodiment 1.
  • Embodiment 2 of the present disclosure provides a text generation method, which includes the following steps:
  • the processing model may refer to a model that can encode and decode the text to be processed; for example, the processing model may be an end-to-end model based on a deep neural network and may include an encoder and a decoder. The encoder may be used to encode the text to be processed into a vector representation, and the decoder can then rely on the encoder's vector representation to generate the corresponding text candidate set according to the specified decoding algorithm.
  • the text to be processed can be input into the processing model, and the processing model can process the text to be processed based on the sampling decoding method (i.e., the Sampling method) to obtain the corresponding text candidate set.
  • This embodiment uses a processing model based on the sampling decoding method to process the text to be processed, so that the text candidate set obtained after processing can have diversified characteristics.
  • processing model here can be understood as a text generation model.
  • S220 Process the text candidate set with the first screening method to obtain evaluation scores for a first set number of target candidate texts in the text candidate set.
  • the first screening method can be understood as the screening method selected during the first screening; the screening method may include quality screening, style screening and keyword screening.
  • the first set number can be understood as the number of target candidate texts processed when the first screening method is used to process the text candidate set. For example, when using the first screening method to process the text candidate set, some target candidate texts in the text candidate set may be selected for processing, or all target candidate texts in the text candidate set may be processed. The number of some target candidate texts or all target candidate texts may be a first set number. The first set number of target candidate texts may be regarded as the filtered candidate texts.
  • the text candidate set is processed by a first screening method to obtain evaluation scores corresponding to the first set number of target candidate texts in the text candidate set.
  • S230 Referring to the screening strategy of minimum Bayesian risk decoding, select a second set number of target candidate texts from the text candidate set in order of evaluation scores from high to low to form a selected text candidate set, which is the result output by this screening.
  • the second set number may refer to a preset number of target candidate texts with top evaluation scores. Referring to the screening strategy of minimum Bayesian risk decoding, a second set number of target candidate texts can be selected from the text candidate set in order of the obtained evaluation scores from high to low to form the selected text candidate set, and the selected text candidate set is used as the result output by this screening.
  • the second set number may be smaller than the first set number.
  • the second screening method can be understood as the last screening method executed; if a total of 3 screening methods are used, then the last one executed, that is, the third screening method, is regarded as the second screening method here.
  • for example, the output of the first screening can be processed as the input of the next (that is, the second) screening method, the output of that screening can in turn be processed as the input of the next (that is, the third) screening method, and so on, until the target text is obtained from the selected text candidate set through the second screening method.
  • the number of subsequent screening methods can be one or more.
  • the corresponding screening methods may be determined based on the demand information obtained on the human-computer interaction interface, and may include at least two screening methods for processing the text candidate set to obtain the target text.
  • Embodiment 2 provides a text generation method. This method processes the text to be processed through a sampling decoding method, which can yield a diverse text candidate set. It also processes the text candidate set with at least two screening methods, whose different screening functions can mitigate the problem of poor results generated by a given text generation model.
  • when the screening methods are quality screening, style screening and/or keyword filtering, the method can improve the diversity of text generation results while also improving the quality of text generation and/or controlling its style.
  • Table 1 shows the correspondence between two decoding methods and the corresponding translations.
  • Table 1 gives exemplary results of English-to-Chinese translation using the beam search method and the sampling decoding method respectively. It can be seen that the results based on the beam search method are of higher quality than those based on sampling, but their wording and sentence patterns are relatively similar.
  • although the translation results based on the sampling decoding method are relatively diverse, some sentences do not conform to the semantics (for example, "boss" in the last translation result does not conform to the semantics).
  • Embodiments of the present disclosure provide a text generation method.
  • for example, controlling the length of the translation through length screening can optimize the display of bilingual subtitles, and filtering dirty words out of the translation through keyword filtering can reduce the occurrence of indecent translations. While this method obtains high-quality results through quality screening, generating the text candidate set with the sampling decoding method maintains the diversity of the candidate texts, and style screening controls the style of the output to a certain extent.
  • input x can represent the text to be processed, and H can represent the translation candidate set (i.e., the text candidate set).
  • writing c(h) for the evaluation score of a translation h, the strategy of minimum Bayesian risk decoding is to select the most appropriate result h from the translation candidate set H so that c(h) is highest, thereby minimizing the risk 1-c(h).
  • Evaluation indicators can be divided into two categories: those that require a reference to score, and those that can score without a reference. Generally speaking, indicators for evaluating quality need a reference translation: the translation (i.e., the target candidate text) is compared with the reference translation (i.e., the reference candidate text) and a score, that is, the evaluation score, is given.
  • the corresponding formula can be expressed as:
  • E can represent the expectation; r can represent the reference translation; c(h, r) can represent the evaluation score of any translation h in the translation candidate set with r as the reference; arg max can represent finding the argument that attains the maximum value. That is, the most appropriate result h is the one whose expected evaluation score c(h, r) is highest.
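  • The formula itself is not reproduced in this text. A reconstruction consistent with the symbol definitions given here, offered as an assumption rather than as the original notation, is:

```latex
h_{\mathrm{best}} = \arg\max_{h \in H} \; \mathbb{E}_{r}\left[ c(h, r) \right]
```

  • Maximizing the expected score in this way is equivalent to minimizing the expected risk, taken as 1 - c(h, r).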
  • the top N candidate translations can be selected as output based on the evaluation scores obtained above, which can be expressed as:
  • top(N) can represent selecting the N candidate translations with top evaluation scores in order from high to low; y can represent each translation in the translation candidate set, that is, the reference candidate text; c(h, y) can represent the reference evaluation score of any translation h in the translation candidate set with y as the reference.
  • the formula also uses the number of reference translations and the sum of all reference evaluation scores of any translation h with y as the reference. On this basis, the mean of all reference evaluation scores of any translation h (that is, the ratio of the sum of all reference evaluation scores to the number of reference translations) can be used as the evaluation score of h.
  • the translation with the highest evaluation score can be selected as the output, which can be expressed as:
  • h best can be expressed as the translation with the highest evaluation score.
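The expressions referred to above can be written out as follows. This is a reconstruction from the symbol definitions in the text, not the original notation: R (the set of reference translations, drawn from the candidate set H), the averaged score \bar{c}(h), and the name H_N for the selected subset are assumptions introduced here.

```latex
\bar{c}(h) = \frac{1}{|R|} \sum_{y \in R} c(h, y), \qquad
H_N = \operatorname{top}(N)_{h \in H} \, \bar{c}(h), \qquad
h_{\text{best}} = \arg\max_{h \in H} \, \bar{c}(h)
```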
  • when the evaluation indicator needs no reference, the candidate can be scored directly (that is, with the score based on the evaluation indicator) without other operations; in this case too, at least two screening methods are performed on the translation candidate set.
  • the candidate translations can be sorted from high to low based on the evaluation scores obtained above, and then the top N candidate translations are selected as the output, which can be expressed as:
  • the translation with the highest evaluation score can be selected as the output, which can be expressed as:
  • The controllable, diverse text generation method based on a hierarchical scoring system proposed in this disclosure first uses the sampling decoding strategy to generate a sufficiently large text candidate set (generally hundreds to thousands of candidates), and then, with reference to the screening strategy of minimum Bayesian risk decoding, applies multiple screenings (that is, performs at least two screening methods).
  • The first decoding pass can usually use indicators that measure quality for the initial scoring, sorting the scores from high to low to filter out results whose scores are too low;
  • the second decoding pass can use style indicators for style screening, again sorting by score. Each screening reduces the number of candidates passed to the next one, which speeds up decoding.
  • Figure 3 is a schematic diagram of an implementation of a text generation method provided in Embodiment 2 of the present disclosure. As shown in Figure 3, the BLEU indicator (the most commonly used indicator for judging translation quality in machine translation) is first used for initial scoring and screening, and the length indicator is then used for secondary scoring and screening. The result is a translation of relatively high quality that is either concise (short) or detailed (long); the two sentences convey basically similar semantics, but their lengths differ by about 37%.
  • the evaluation scores of all previous screening methods are retained in the evaluation scores of subsequent screening methods.
  • for example, the first evaluation score of the second (style) screening is [36.7, 19], where 36.7 is the evaluation score obtained from quality screening and 19 is the evaluation score obtained from style screening.
  • the final output target text is determined based on the value of the last evaluation score.
  • the candidate text corresponding to the maximum value of the evaluation score of the last screening method is determined as the target text.
  • the candidate text corresponding to the maximum value of 19 in the evaluation score of style screening can be used as the target text.
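One way to keep the earlier scores alongside the later ones, as in the [36.7, 19] example, is to carry a per-candidate score list through the screenings. The sketch below is illustrative only: `ScoredCandidate`, `add_score`, and `pick_target` are hypothetical names, and the candidate texts and the second candidate's scores are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScoredCandidate:
    """A candidate text plus the scores from every screening so far (hypothetical type)."""
    text: str
    scores: List[float] = field(default_factory=list)


def add_score(candidate: ScoredCandidate, score: float) -> ScoredCandidate:
    """Return a copy that keeps all earlier scores and appends the new one."""
    return ScoredCandidate(candidate.text, candidate.scores + [score])


def pick_target(candidates: List[ScoredCandidate]) -> ScoredCandidate:
    """Choose the candidate whose last (most recent) evaluation score is largest."""
    return max(candidates, key=lambda c: c.scores[-1])


# 36.7 and 19 come from the text; everything else is made up for illustration.
a = add_score(ScoredCandidate("candidate A", [36.7]), 19)
b = add_score(ScoredCandidate("candidate B", [35.2]), 12)
target = pick_target([a, b])
```

Here the target is chosen by the last entry of each score list, while the quality score from the earlier screening stays available for inspection.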
  • the number and content of subsequent screening in Figure 3 can be set according to actual needs. It can be one screening or multiple screenings. It can be any type of screening method such as style screening, quality screening, keyword screening, etc. After style filtering, you can also directly obtain the target text without subsequent filtering.
  • a controllable, diverse text generation method using a hierarchical scoring system.
  • this disclosure can also adopt some other possible solutions, including:
  • BLEU+BLEURT: as a quality evaluation index, BLEU only measures the correlation between the translation and the reference translation at the morphological level, while BLEURT scores at the semantic level and is closer to manual scoring. Using both together yields a higher-quality translation.
  • the execution order of the various screening methods of the present disclosure can be adjusted according to actual needs, and the specific order is determined based on what is included. For example, the screenings can be performed in the order of keyword screening, quality screening and style screening.
  • This disclosure proposes a text generation and decoding strategy for a neural network end-to-end model.
  • through a hierarchical scoring system, it can mitigate the problem that translations produced by traditional decoding strategies are too similar or uncontrollable, and can obtain controllable and diverse translations.
  • the scoring system in Figure 3 includes quality screening, style screening and subsequent screening
  • different styles can be controlled, where styles include length, etc. For example, combining BLEU+length can make the overall length of the translation controllable; combining BLEU+BLEURT can improve manual scoring by about 3% on the evaluated test set.
  • Figure 4 is a schematic structural diagram of a text generation device provided in Embodiment 3 of the present disclosure, where the device can be implemented by software and/or hardware, and is generally integrated on an electronic device.
  • the device includes: an acquisition module 310 and a processing module 320;
  • the acquisition module 310 is configured to acquire a text candidate set, the text candidate set includes a plurality of candidate texts, and the candidate texts are texts obtained after processing the text to be processed;
  • the processing module 320 is configured to refer to the screening strategy of minimum Bayesian risk decoding, and perform at least two screening methods on the text candidate set to obtain the target text;
  • the at least two screening methods are screening methods with different functions
  • the output of each screening is used as the input of the next screening method
  • the result of each screening output is determined based on the evaluation score of the screened candidate text
  • the evaluation score of the screened candidate text is determined based on the evaluation index of the screening method used.
  • the device obtains a text candidate set through the acquisition module 310, where the text candidate set includes multiple candidate texts and each candidate text is a text obtained after processing the text to be processed; through the processing module 320, with reference to the screening strategy of minimum Bayesian risk decoding, the target text is obtained after processing the text candidate set with at least two screening methods; the at least two screening methods are screening methods with different functions, the output of each screening is used as the input of the next screening method, the result of each screening output is determined based on the evaluation score of the screened candidate text, and the evaluation score of the screened candidate text is determined based on the evaluation index of the screening method used.
  • the device can improve the problem of poor results generated by a given text generation model by subjecting the obtained text candidate set to at least two screening methods.
  • the acquisition module 310 is configured to acquire the text candidate set in the following manner:
  • the text to be processed is input into a processing model, and the processing model processes the text to be processed based on a sampling decoding method to obtain a text candidate set.
  • the screening method includes one or more of the following: quality screening, style screening and keyword filtering;
  • when the screening method used is quality screening, the evaluation index used to determine the evaluation score is the quality index;
  • when the screening method used is keyword filtering, the evaluation index used to determine the evaluation score is the keyword index;
  • when the screening method used is style screening, the evaluation index used to determine the evaluation score is the style index.
  • style filtering includes length filtering
  • style indicators include length indicators
  • Optionally, the processing module 320 includes:
  • a processing unit configured to perform a first screening method on the text candidate set to obtain evaluation scores for a first set number of target candidate texts in the text candidate set, where the first set number of target candidate texts are the screened candidate texts;
  • a selection unit configured to, with reference to the screening strategy of minimum Bayesian risk decoding, select a second set number of target candidate texts from the text candidate set in order of evaluation score from high to low, forming the selected text candidate set as the result of this screening output;
  • the screening unit is configured to continue to process the selected text candidate set with subsequent screening methods with reference to minimum Bayesian risk decoding until the target text is screened from the selected text candidate set through the second screening method in the subsequent screening methods.
  • the second screening method is the last screening method performed among the at least two screening methods.
  • when the screening method is a reference-based screening method, the evaluation score of the target candidate text is the mean of multiple reference evaluation scores of the target candidate text, and each reference evaluation score is a score determined for the target candidate text based on the evaluation index with the corresponding reference candidate text as the reference, where the reference candidate text is located in the same text candidate set as the target candidate text.
  • when the screening method is a reference-free screening method, the evaluation score of the target candidate text is a score based on the evaluation index.
  • the result of the filtering output is a required number of target candidate texts selected from the filtered candidate texts according to the evaluation scores of the target candidate texts.
  • the device also includes:
  • the information acquisition module is configured to obtain demand information on the human-computer interaction interface
  • a method determination module is configured to determine a corresponding screening method based on the demand information.
  • the above text generation device can execute the text generation method provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects of the execution method.
  • FIG. 5 is a schematic structural diagram of an electronic device provided in Embodiment 4 of the present disclosure.
  • FIG. 5 shows a schematic structural diagram of an electronic device 400 suitable for implementing embodiments of the present disclosure.
  • the electronic device 400 in the embodiment of the present disclosure may include mobile electronic devices such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (Personal Digital Assistant, PDA), a tablet computer (Portable Android Device, PAD), a portable multimedia player (Portable Media Player, PMP), and vehicle-mounted electronic devices (such as vehicle navigation devices), as well as fixed electronic devices such as digital televisions (television, TV) and desktop computers.
  • the electronic device 400 shown in FIG. 5 is only an example.
  • the electronic device 400 may include one or more processing devices (such as a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a read-only memory (Read-Only Memory, ROM) 402 or a program loaded from the storage device 408 into a random access memory (Random Access Memory, RAM) 403.
  • One or more processing devices 401 implement methods as provided by this disclosure.
  • In the RAM 403, various programs and data required for the operation of the electronic device 400 are also stored.
  • the processing device 401, ROM 402 and RAM 403 are connected to each other via a bus 404.
  • An input/output (I/O) interface 405 is also connected to bus 404.
  • the following devices can be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 407 including, for example, a liquid crystal display (Liquid Crystal Display, LCD), a speaker, a vibrator, etc.; a storage device 408 including, for example, a magnetic tape, a hard disk, etc., where the storage device 408 is used to store one or more programs; and a communication device 409.
  • the communication device 409 may allow the electronic device 400 to communicate wirelessly or wiredly with other devices to exchange data.
  • Although FIG. 5 illustrates the electronic device 400 with various means, it should be understood that not all of the illustrated means need to be implemented or provided. More or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program including program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via communication device 409, or from storage device 408, or from ROM 402.
  • When the computer program is executed by the processing device 401, the above-mentioned functions defined in the method of the embodiment of the present disclosure are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination thereof.
  • the computer-readable storage medium may include an electrical connection having one or more wires, a portable computer disk, a hard disk, RAM, ROM, erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM, or flash memory), an optical fiber, a portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code contained on a computer-readable medium can be transmitted using any appropriate medium, including: wires, optical cables, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
  • the client and server can communicate using any currently known or future-developed network protocol, such as HyperText Transfer Protocol (HTTP), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the computer-readable medium may be included in the electronic device 400; it may also exist independently without being assembled into the electronic device 400.
  • the above-mentioned computer-readable medium stores one or more computer programs.
  • the above-mentioned one or more programs are executed by the processing device, the following method is implemented: the above-mentioned computer-readable medium carries one or more programs.
  • the electronic device 400 When executed by the electronic device, the electronic device 400:
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block of the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • the modules involved in the embodiments of the present disclosure can be implemented in software or hardware.
  • FPGA Field Programmable Gate Array
  • ASIC Application Specific Integrated Circuit
  • ASSP Application Specific Standard Parts
  • SOC System on Chip
  • CPLD Complex Programmable Logic Device
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • Machine-readable storage media may include electrical connections based on one or more wires, portable computer disks, hard drives, RAM, ROM, erasable programmable read-only memory (EPROM or flash memory), fiber optics, CD-ROM, optical storage device, magnetic storage device, or any suitable combination of the foregoing.
  • Example 1 provides a text generation method, including:
  • the text candidate set includes a plurality of candidate texts, and the candidate texts are texts obtained after processing the text to be processed;
  • the text candidate set is processed by at least two screening methods to obtain the target text;
  • the at least two screening methods are screening methods with different functions
  • the output of each screening is used as the input of the next screening method
  • the result of each screening output is determined based on the evaluation score of the screened candidate text
  • the evaluation score of the screened candidate text is determined based on the evaluation index of the screening method used.
  • Example 2 is a method according to Example 1,
  • the obtaining text candidate set includes:
  • the text to be processed is input into a processing model, and the processing model processes the text to be processed based on a sampling decoding method to obtain a text candidate set.
  • Example 3 is according to the method of Example 1,
  • the screening method includes one or more of the following: quality screening, style screening and keyword filtering;
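  • when the screening method used is quality screening, the evaluation index used to determine the evaluation score is the quality index;
  • when the screening method used is keyword filtering, the evaluation index used to determine the evaluation score is the keyword index;
  • when the screening method used is style screening, the evaluation index used to determine the evaluation score is the style index.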
  • Example 4 is according to the method of Example 3,
  • Style screening includes length screening, and style indicators include length indicators.
  • Example 5 is according to the method of Example 1,
  • the text candidate set is processed by at least two screening methods to obtain the target text, including:
  • the second screening method is the screening method performed last among the at least two screening methods.
  • Example 6 is a method according to Example 1,
  • each reference evaluation score is a score determined for the target candidate text based on the evaluation index with the corresponding reference candidate text as the reference, and the reference candidate text and the target candidate text are located in the same text candidate set.
  • Example 7 is a method according to Example 1,
  • the evaluation score of the target candidate text is a score based on the evaluation index.
  • Example 8 is according to the method of Example 6 or 7,
  • the result of the filtering output is a required number of target candidate texts selected from the filtered candidate texts according to the evaluation scores of the target candidate texts.
  • Example 9 is according to the method of Example 1,
  • the corresponding screening method is determined based on the demand information.
  • Example 10 provides a text generation device, including:
  • the acquisition module is configured to acquire a text candidate set, the text candidate set includes a plurality of candidate texts, and the candidate text is a text obtained after processing the text to be processed;
  • the processing module is configured to refer to the screening strategy of minimum Bayesian risk decoding, and after processing the text candidate set at least twice with the screening method, the target text is obtained;
  • the at least two screening methods are screening methods with different functions
  • the output of each screening is used as the input of the next screening method
  • the result of each screening output is determined based on the evaluation score of the screened candidate text
  • the evaluation score of the screened candidate text is determined based on the evaluation index of the screening method used.
  • Example 11 provides an electronic device, including:
  • a storage device for storing one or more programs
  • When the one or more programs are executed by the one or more processing devices, the one or more processing devices implement the method as described in any one of Examples 1-9.
  • Example 12 provides a computer-readable medium having a computer program stored thereon, which when executed by a processing device implements the method described in any one of Examples 1-9.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

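The present disclosure discloses a text generation method and apparatus, an electronic device, and a medium. The method includes: acquiring a text candidate set, where the text candidate set includes a plurality of candidate texts, and the candidate texts are texts obtained by processing a text to be processed; and, with reference to the screening strategy of minimum Bayesian risk decoding, processing the text candidate set with at least two screening methods to obtain a target text; where the at least two screening methods are screening methods with different functions, the output of each screening serves as the input of the next screening method, the result output by each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation index of the screening method used.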

Description

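Text generation method and apparatus, electronic device, and medium
The present disclosure claims priority to Chinese patent application No. 202210434655.4, filed with the China Patent Office on April 24, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the technical field of data processing, for example, to a text generation method and apparatus, an electronic device, and a medium.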
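Background
Text generation is an important part of natural language processing (NLP) and aims to use machines to generate natural language text.
At present, a text generation model is usually given, and the text to be processed is input into the text generation model to generate a corresponding text result. The text generation model is a system model used for text generation and may include an encoder and a decoder: the encoder encodes the input into a vector representation, and the decoder, relying on the encoder's vector representation, generates the required text word by word according to a specified decoding algorithm.
However, when a given text generation model is used for text generation, the results obtained are poor and cannot meet the expected requirements.
Summary
Embodiments of the present disclosure provide a text generation method and apparatus, an electronic device, and a medium, to mitigate the problem that the results generated by a given text generation model are poor.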
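In a first aspect, an embodiment of the present disclosure provides a text generation method, including:
acquiring a text candidate set, where the text candidate set includes a plurality of candidate texts, and the candidate texts are texts obtained by processing a text to be processed; and
with reference to the screening strategy of minimum Bayesian risk decoding, processing the text candidate set with at least two screening methods to obtain a target text;
where the at least two screening methods are screening methods with different functions, the output of each screening serves as the input of the next screening method, the result output by each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation index of the screening method used.
In a second aspect, an embodiment of the present disclosure further provides a text generation apparatus, including:
an acquisition module, configured to acquire a text candidate set, where the text candidate set includes a plurality of candidate texts, and the candidate texts are texts obtained by processing a text to be processed; and
a processing module, configured to, with reference to the screening strategy of minimum Bayesian risk decoding, process the text candidate set with at least two screening methods to obtain a target text;
where the at least two screening methods are screening methods with different functions, the output of each screening serves as the input of the next screening method, the result output by each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation index of the screening method used.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including:
one or more processing devices; and
a storage device, configured to store one or more programs;
where the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the processing method provided by the embodiments of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable medium storing a computer program, where the computer program, when executed by a processing device, implements the processing method provided by the embodiments of the present disclosure.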
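Brief Description of the Drawings
The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
Figure 1 is a schematic flowchart of a text generation method provided in Embodiment 1 of the present disclosure;
Figure 2 is a schematic flowchart of a text generation method provided in Embodiment 2 of the present disclosure;
Figure 3 is a schematic diagram of an implementation of a text generation method provided in Embodiment 2 of the present disclosure;
Figure 4 is a schematic structural diagram of a text generation apparatus provided in Embodiment 3 of the present disclosure;
Figure 5 is a schematic structural diagram of an electronic device provided in Embodiment 4 of the present disclosure.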
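Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms; these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only.
It should be understood that the steps recited in the method implementations of the present disclosure may be performed in a different order and/or in parallel. In addition, the method implementations may include additional steps and/or omit the steps shown.
The term "include" and its variants as used herein are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one other embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms are given in the description below.
Note that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units.
Note that the modifiers "one" and "a plurality of" mentioned in the present disclosure are illustrative; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
The names of messages or information exchanged between multiple apparatuses in the implementations of the present disclosure are for illustrative purposes only.
In the following embodiments, each embodiment provides both optional features and examples. The features recited in the embodiments may be combined to form multiple alternative solutions, and each numbered embodiment should not be regarded as only one technical solution. In addition, the embodiments of the present disclosure and the features in the embodiments may be combined with one another where no conflict arises.
Text generation is an important part of NLP and aims to use machines to generate natural language text. Text generation is an important link in many NLP tasks, including: (1) machine translation (MT), which translates one language into another; (2) dialogue generation, which generates a response to the utterance input by a user; and (3) summary generation, which generates a brief summary from a long piece of text.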
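Embodiment 1
Figure 1 is a schematic flowchart of a text generation method provided in Embodiment 1 of the present disclosure. The method is applicable to generating a target text and may be performed by a text generation apparatus, where the apparatus may be implemented in software and/or hardware and is generally integrated on an electronic device. In this embodiment, the electronic device includes devices such as a computer, a notebook computer, a smartphone, or a tablet computer.
As shown in Figure 1, the text generation method provided in Embodiment 1 of the present disclosure includes the following steps:
S110. Acquire a text candidate set.
In this embodiment, the text candidate set can be understood as a set formed by the multiple results (which may include candidate texts) obtained by inputting the text to be processed into a corresponding text generation model. The text candidate set may include a plurality of candidate texts, and each candidate text can be understood as a text obtained by processing the text to be processed. The text to be processed can be understood as text content awaiting processing. For example, in a machine translation scenario, the text to be processed may be text awaiting translation, such as an English or Chinese sentence; in a dialogue generation scenario, it may be a segment of input speech or an input text sentence in the dialogue; in a summary generation scenario, it may be text from which a summary is to be extracted, such as a passage of text.
In this embodiment, the text generation model may adopt an end-to-end structure based on a deep neural network. The neural end-to-end model (i.e., the text generation model) may consist of an encoder and a decoder: the encoder encodes the input into a vector representation, and on this basis the decoder, relying on the encoder's vector representation, generates the required text word by word according to a specified decoding algorithm. The decoding algorithm may refer to the strategy by which the end-to-end model selects words during decoding. Decoding algorithms can mainly be divided into the greedy decoding method, the beam search method, and the sampling decoding (sampling) method.
The greedy decoding method selects the most probable word as the result at each step. The beam search method records the several most probable paths at each step, to avoid missing high-probability words that may appear in later steps. The sampling decoding method selects randomly from the candidate words according to the probability distribution at each step, so the result is random; how the candidate words are chosen varies by strategy, but they are usually also the top n words (n being smaller than the vocabulary size).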
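The contrast between greedy and sampling decoding can be sketched for a single decoding step as follows. This is a minimal illustration, assuming the model's next-word distribution is already available as a dictionary; `greedy_step`, `sampling_step`, and the example distribution are hypothetical, and beam search is omitted for brevity.

```python
import random
from typing import Dict


def greedy_step(probs: Dict[str, float]) -> str:
    """Greedy decoding: take the single most probable word at this step."""
    return max(probs, key=probs.get)


def sampling_step(probs: Dict[str, float], n: int, rng: random.Random) -> str:
    """Sampling decoding: draw one word at random from the n most probable
    words, weighted by their probabilities."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:n]
    words = [w for w, _ in top]
    weights = [p for _, p in top]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical next-word distribution for one decoding step.
step_probs = {"boss": 0.05, "owner": 0.5, "manager": 0.3, "chief": 0.15}
rng = random.Random(0)
greedy_word = greedy_step(step_probs)
sampled_words = [sampling_step(step_probs, n=3, rng=rng) for _ in range(5)]
```

Greedy decoding always returns the same word for the same distribution, whereas repeated sampling can return any of the top n words, which is where the diversity of a sampled candidate set comes from.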
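For example, the text generation model may process the text to be processed based on the greedy decoding method to obtain the corresponding text candidate set; it may also do so based on the beam search method; or it may do so based on the sampling decoding method.
S120. With reference to the screening strategy of minimum Bayesian risk decoding, process the text candidate set with at least two screening methods to obtain the target text.
In this embodiment, minimum Bayesian risk decoding is also a decoding algorithm and can be used in speech recognition and statistical machine translation. Minimum Bayesian risk decoding can be understood as evaluating and scoring each candidate text; the resulting score may be called the evaluation score. The screening strategy of minimum Bayesian risk decoding can be understood as a processing strategy that screens the candidate texts in the text candidate set to obtain a final result whose evaluation score meets the requirements. For example, candidate texts are selected from the text candidate set such that the selected candidate texts are the required number of target candidate texts chosen according to their evaluation scores.
A screening method can be understood as a method that, with reference to the screening strategy of minimum Bayesian risk decoding, screens the candidate texts in the text candidate set. The target text can be understood as the text result obtained after the text candidate set has been processed with at least two screening methods.
There may be two or more screening methods. In this embodiment, the screening methods applied to the text candidate set may be screening methods with different functions, so as to implement different screening functions.
Optionally, the screening methods include one or more of the following: quality screening, style screening, and keyword filtering. When the screening method used is quality screening, the evaluation index for determining the evaluation score is a quality index; when the screening method used is keyword filtering, the evaluation index for determining the evaluation score is a keyword index; when the screening method used is style screening, the evaluation index for determining the evaluation score is a style index.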
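Quality screening can be understood as a method of screening the texts in the text candidate set by quality. The quality index can be understood as the evaluation index corresponding to quality screening. For example, the quality indices corresponding to quality screening may include at least the Bilingual Evaluation Understudy (BLEU) index and the Bilingual Evaluation Understudy with Representations from Transformers (BLEURT) index. Quality can be understood as a quantity that characterizes the similarity, in terms of semantics or word form, between a text in the text candidate set and a reference text.
An evaluation index can be understood as an index for automatically scoring each candidate text. Style screening can be understood as a method of screening the texts in the text candidate set by style. The style index can be understood as the evaluation index corresponding to style screening. Style can be understood as a quantity that characterizes the length of a text's content.
Optionally, style screening includes length screening, and the style index includes a length index.
The length index is regarded as the evaluation index corresponding to length screening and refers to an index used to analyze and process the length of a text.
Keyword filtering can be understood as a method of filtering out ambiguous or sensitive words in the texts of the text candidate set; for example, the ambiguous or sensitive words may be set in advance as the keywords to be filtered. The keyword index can be understood as the evaluation index corresponding to keyword filtering.
It can be understood that a higher evaluation score indicates that the corresponding candidate text is better in quality or style; the evaluation scores of the candidate texts may be the same or different.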
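When the text candidate set is processed with at least two screening methods, the candidate texts in the text candidate set may be screened with reference to the screening strategy of minimum Bayesian risk decoding. For example, in each screening, a set number of candidate texts whose evaluation scores meet the requirements (the set number differs from one screening to the next) may be selected as the screening result, until all screening methods have been performed and the final target text is obtained.
When a screening method before the last one is performed, the set number may be more than one; for example, with reference to the screening strategy of minimum Bayesian risk decoding, multiple candidate texts with the highest evaluation scores may be selected from the corresponding text candidate set as the output of that screening. When the last screening method is performed, the set number may be one; that is, with reference to the screening strategy of minimum Bayesian risk decoding, the candidate text with the highest evaluation score is selected from the corresponding text candidate set as the output target text.
In the course of processing the text candidate set with at least two screening methods, the output of each screening may serve as the input of the next screening method, and the result output by each screening may be determined based on the evaluation scores of the screened candidate texts. The screened candidate texts can be understood as the candidate texts that the current screening method is about to process; for example, they may be the text candidate set obtained from the previous screening. In other words, each screening method may correspond to one text candidate set, which serves as the screened candidate texts, and the text candidate set corresponding to each screening method may differ, being the text candidate set obtained from the previous screening.
In one embodiment, the one or more top-ranked candidate texts, in order of the evaluation scores of the screened candidate texts from high to low, may be selected as the result output by each screening.
For example, suppose two screening methods are used to process the text candidate set. After the text candidate set is acquired (suppose it contains 2000 candidate texts), the first screening method processes the candidate texts in the set to obtain an evaluation score for each candidate text. In this process, all candidate texts in the set may be processed, or, to reduce computation, only some of them (for example, 1000 candidate texts may be selected for processing; how the subset is chosen and how many candidates it contains can be determined according to the actual situation). On this basis, the text candidate set may be screened based on the evaluation scores obtained; for example, a set number of top-scoring candidate texts (such as the top 100) may be selected as the output of the first screening method. The input of the second screening method is the output of the first screening (such as the top 100 selected above). The second screening method processes the corresponding text candidate set (i.e., the output of the first screening), obtains an evaluation score for each candidate text, and selects the single candidate text with the highest evaluation score as the target text.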
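The two-screening example above can be sketched on a much smaller pool. The code is illustrative only: `screen` and `two_stage` are hypothetical helpers, and the two scoring functions are toy stand-ins for a quality index and a length index, not BLEU or any index defined in this disclosure.

```python
from typing import Callable, List


def screen(candidates: List[str], score: Callable[[str], float], keep: int) -> List[str]:
    """Score every candidate and keep the `keep` highest-scoring ones, best first."""
    return sorted(candidates, key=score, reverse=True)[:keep]


def two_stage(candidates: List[str], first_score: Callable[[str], float],
              second_score: Callable[[str], float], keep_first: int) -> str:
    """The first screening keeps several candidates; the second keeps exactly one."""
    shortlisted = screen(candidates, first_score, keep_first)
    return screen(shortlisted, second_score, 1)[0]


EXPECTED_WORDS = {"the", "meeting", "starts", "at", "noon"}


def toy_quality(text: str) -> float:
    """Toy quality score: how many expected words the text contains."""
    return float(len(EXPECTED_WORDS & set(text.split())))


def toy_brevity(text: str) -> float:
    """Toy style score: shorter texts score higher."""
    return -float(len(text))


pool = [
    "the meeting starts at noon today as planned",
    "meeting starts at noon",
    "lunch is at noon",
    "noon",
]
shortlist = screen(pool, toy_quality, 2)
target = two_stage(pool, toy_quality, toy_brevity, keep_first=2)
```

The shortest text overall ("noon") is not selected, because the brevity score is only applied to the candidates that survived the quality screening.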
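It can be understood that, if there are more than two screening methods, then after the second screening, the output of the second screening may in turn serve as the input of the subsequent screening method (i.e., the third screening method), and so on, until the target text is obtained.
The evaluation scores of the screened candidate texts may be determined based on the evaluation index of the screening method used. Screening methods can be divided into two types: those that require a reference and those that do not. When the evaluation scores of the screened candidate texts are determined based on the evaluation index of the screening method used, the two types may score differently.
Optionally, when the screening method is a reference-based screening method, the evaluation score of a target candidate text is the mean of multiple reference evaluation scores of the target candidate text. Each reference evaluation score is a score determined for the target candidate text, based on the evaluation index, with a corresponding reference candidate text as the reference; the reference candidate text and the target candidate text are in the same text candidate set.
The target candidate text can be understood as the candidate text in the text candidate set that is currently to be evaluated and scored, where the text candidate set is the one corresponding to the screening method. A reference candidate text can be understood as a candidate text, in the text candidate set corresponding to the target candidate text, that serves as a reference for the target candidate text. For example, the reference candidate texts may be all candidate texts in that set, all candidate texts other than the target candidate text, or a subset of the set. For instance, if the text candidate set corresponding to the target candidate text contains 100 candidate texts, the reference candidate texts may be these 100 candidate texts, the other 99 candidate texts excluding the target candidate text, or a subset of the 100, such as 50 of them.
Building on the above example, if the reference candidate texts of the target candidate text are the other 99 candidate texts, then for each of these 99 reference candidate texts a corresponding reference evaluation score can be determined for the target candidate text based on the evaluation index, giving 99 reference evaluation scores; the mean of these 99 reference evaluation scores can then be used as the evaluation score of the target candidate text.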
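The mean-of-reference-scores computation can be sketched as follows, using every other candidate in the set as a reference. This is a sketch under stated assumptions: `jaccard` is a toy word-overlap metric standing in for a real quality index such as BLEU, and both function names are hypothetical.

```python
from typing import Callable, Sequence


def jaccard(h: str, r: str) -> float:
    """Toy reference-based metric: word-set overlap between h and r (not BLEU)."""
    hs, rs = set(h.split()), set(r.split())
    union = hs | rs
    return len(hs & rs) / len(union) if union else 0.0


def mean_reference_score(index: int, candidates: Sequence[str],
                         metric: Callable[[str, str], float]) -> float:
    """Evaluation score of candidates[index]: the mean of its scores against
    every other candidate in the same set, each used as a reference."""
    target = candidates[index]
    refs = [c for i, c in enumerate(candidates) if i != index]
    if not refs:
        return 0.0
    return sum(metric(target, r) for r in refs) / len(refs)


candidates = ["the cat sat", "the cat sat down", "a dog ran"]
scores = [mean_reference_score(i, candidates, jaccard) for i in range(len(candidates))]
```

A candidate that agrees with many other candidates receives a high mean score, while an outlier such as the third text scores low, which is the intuition behind using the candidate set itself as the source of references.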
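Optionally, when the screening method is a reference-free screening method, the evaluation score of the target candidate text is a score based on the evaluation index.
Each screening method may correspond to one evaluation index; when the screening method is a reference-free screening method, the evaluation score of the target candidate text may be a score based on that evaluation index.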
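Reference-free indices score a candidate on its own. The two functions below are hypothetical examples of what a length index and a keyword index might look like; the disclosure does not specify their exact form, so the scoring rules here are assumptions.

```python
from typing import Iterable


def length_score(text: str, prefer_short: bool = True) -> float:
    """Hypothetical length index: higher is better; fewer words win when
    prefer_short is True, more words win otherwise."""
    n = len(text.split())
    return float(-n if prefer_short else n)


def keyword_score(text: str, blocked: Iterable[str]) -> float:
    """Hypothetical keyword index: 0.0 if any blocked word appears, else 1.0."""
    words = set(text.lower().split())
    return 0.0 if any(b.lower() in words for b in blocked) else 1.0


short_pref = length_score("a b c")
long_pref = length_score("a b c", prefer_short=False)
flagged = keyword_score("this is Darn bad", ["darn"])
clean = keyword_score("this is fine", ["darn"])
```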
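Optionally, the result output by a screening is the required number of target candidate texts selected from the screened candidate texts according to the evaluation scores of the target candidate texts.
The required number of target candidate texts may be selected from the screened candidate texts in order of evaluation score from high to low as the result output by the screening. The required number may be one or more and can be set flexibly according to actual needs.
In the text generation method provided in Embodiment 1 of the present disclosure, a text candidate set is acquired, where the text candidate set includes a plurality of candidate texts and each candidate text is a text obtained by processing the text to be processed; with reference to the screening strategy of minimum Bayesian risk decoding, the text candidate set is processed with at least two screening methods to obtain the target text; the screening methods are screening methods with different functions, the output of each screening serves as the input of the next screening method, the result output by each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation index of the screening method used. By processing the acquired text candidate set with at least two screening methods, the method can mitigate the problem that the results generated by a given text generation model are poor.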
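On the basis of the above embodiment, the screening methods used for text generation may be determined according to the text generation requirements input by the user.
Optionally, the method further includes: acquiring requirement information on a human-computer interaction interface; and determining the corresponding screening methods based on the requirement information.
The human-computer interaction interface can be understood as an interface through which human-computer interaction can take place, that is, the user can input or select relevant instructions through the interface to interact with the electronic device used for text generation. The requirement information can be understood as information about text generation requirements that the user inputs on the human-computer interaction interface; for example, the requirement information may include text quality requirements and/or text length requirements.
Once the requirement information is acquired on the human-computer interaction interface, the screening methods it calls for can be determined. For example, if the user inputs requirement information calling for high quality and concise text, the corresponding screening methods may be determined as a screening method for generating high-quality text and a screening method for generating concise text; that is, the acquired text candidate set is processed with these two screening methods to obtain the corresponding target text.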
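Determining screening methods from requirement information can be as simple as a lookup table. The sketch below assumes the interface delivers requirements as a list of keys; the keys, the method names, and `screening_methods_for` are all hypothetical and not taken from the disclosure.

```python
from typing import Dict, List

# Hypothetical mapping from requirement keys to screening method names.
REQUIREMENT_TO_SCREENING: Dict[str, str] = {
    "high_quality": "quality_screening",
    "concise": "length_screening",
    "detailed": "length_screening",
    "no_sensitive_words": "keyword_filtering",
}


def screening_methods_for(requirements: List[str]) -> List[str]:
    """Map the requested requirements to screening methods, in request order,
    skipping unknown requirements and duplicate methods."""
    methods: List[str] = []
    for req in requirements:
        method = REQUIREMENT_TO_SCREENING.get(req)
        if method is not None and method not in methods:
            methods.append(method)
    return methods


selected = screening_methods_for(["high_quality", "concise"])
```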
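Embodiment 2
Fig. 2 is a schematic flowchart of a text generation method provided in Embodiment 2 of the present disclosure. Embodiment 2 refines the above embodiments. It describes in detail how the text candidate set is obtained and how at least two screening methods are applied to the text candidate set to obtain the target text. For details not covered in this embodiment, refer to Embodiment 1.
As shown in Fig. 2, the text generation method provided in Embodiment 2 of the present disclosure includes the following steps:
S210. Input the text to be processed into a processing model; the processing model processes the text to be processed using a sampling decoding method to obtain the text candidate set.
In this embodiment, the processing model may be a model that encodes and decodes the text to be processed. For example, it may be an end-to-end model based on a deep neural network that includes an encoder and a decoder: the encoder encodes the text to be processed into a vector representation, and the decoder relies on that vector representation to generate the corresponding text candidate set according to a specified decoding algorithm. Optionally, the text to be processed is input into the processing model, and the processing model processes it using a sampling decoding method (the Sampling method) to obtain the corresponding text candidate set. Because the processing model uses sampling decoding, the resulting text candidate set is diverse.
Note that the processing model here can be understood as the text generation model.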
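To make step S210 concrete, the sketch below draws a candidate set by ancestral sampling from a tiny hand-written next-token table. The table `TOY_MODEL` is a hypothetical stand-in for the decoder of a real encoder-decoder model, and the names and probabilities are assumptions for illustration only.

```python
import random
from typing import Dict, List

# Hypothetical toy "model": next-token probabilities given the previous token.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<s>": {"good": 0.6, "nice": 0.4},
    "good": {"morning": 0.7, "day": 0.3},
    "nice": {"morning": 0.5, "day": 0.5},
    "morning": {"</s>": 1.0},
    "day": {"</s>": 1.0},
}


def sample_one(rng: random.Random, max_len: int = 10) -> str:
    """Sample one sequence token by token until the end marker is drawn."""
    token, output = "<s>", []
    for _ in range(max_len):
        dist = TOY_MODEL[token]
        token = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
        if token == "</s>":
            break
        output.append(token)
    return " ".join(output)


def sample_candidates(num_samples: int, seed: int = 0) -> List[str]:
    """Build a text candidate set by repeated independent sampling."""
    rng = random.Random(seed)
    return [sample_one(rng) for _ in range(num_samples)]


candidate_set = sample_candidates(200)
```

Unlike beam search, repeated sampling can return different word choices for the same input, which is the source of the diversity the paragraph above refers to.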
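S220. Apply a first screening method to the text candidate set to obtain the evaluation scores of a first set number of target candidate texts in the text candidate set.
In this embodiment, the first screening method can be understood as the screening method chosen for the first screening; the screening methods may include quality screening, style screening, and keyword screening. The first set number can be understood as the number of target candidate texts processed when the first screening method is applied to the text candidate set. When the first screening method is applied, either some or all of the target candidate texts in the text candidate set may be processed, and the number of those texts is the first set number. The first set number of target candidate texts can be regarded as the screened candidate texts.
Optionally, applying the first screening method to the text candidate set yields the evaluation scores of the first set number of target candidate texts in the set.
S230. Following the screening strategy of minimum Bayes risk decoding, select a second set number of target candidate texts from the text candidate set in descending order of evaluation score to form a selected text candidate set, which is the output of this screening.
In this embodiment, the second set number may be a preset number of top-scoring target candidate texts. Following the screening strategy of minimum Bayes risk decoding and based on the evaluation scores obtained, the second set number of target candidate texts are selected from the text candidate set in descending order of score to form the selected text candidate set, which is taken as the output of this screening. The second set number may be smaller than the first set number.
S240. Continue, following minimum Bayes risk decoding, to apply subsequent screening methods to the selected text candidate set until a second screening method among the subsequent screening methods yields the target text from the selected text candidate set.
In this embodiment, the second screening method can be understood as the screening method executed last. For example, if three screening methods are used in total, the third one, being the last executed, is regarded as the second screening method. Following minimum Bayes risk decoding, the selected text candidate set is passed as input to the next screening method until the target text is obtained. Optionally, the output of the first screening is used as the input of the next (second) screening method, the output of the second screening is used as the input of the next (third) screening method, and so on, until the second screening method yields the target text from the selected text candidate set.
The subsequent screening methods may be applied once or multiple times.
The corresponding screening methods may be determined from the requirement information obtained on the human-computer interaction interface, and at least two screening methods are applied to the text candidate set to obtain the target text.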
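Steps S220 to S240 can be sketched as a cascade in which each stage scores the current pool, keeps the top candidates, and hands them to the next stage, with the last stage returning a single target text. The scorers below (`mean_overlap` as a reference-based quality proxy and `brevity` as a reference-free length metric) and the sample sentences are illustrative assumptions, not the metrics used in the disclosure.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[List[str]], List[float]]  # scores every candidate in a pool


def mean_overlap(pool: List[str]) -> List[float]:
    """Toy reference-based scorer: mean Jaccard overlap with the other candidates."""
    scores = []
    for i, h in enumerate(pool):
        h_set = set(h.split())
        sims = [len(h_set & set(y.split())) / len(h_set | set(y.split()))
                for j, y in enumerate(pool) if j != i]
        scores.append(sum(sims) / len(sims) if sims else 0.0)
    return scores


def brevity(pool: List[str]) -> List[float]:
    """Toy reference-free scorer: shorter texts score higher."""
    return [-float(len(h.split())) for h in pool]


def cascade_screen(candidates: List[str],
                   stages: Sequence[Tuple[Scorer, int]]) -> str:
    """Apply at least two screening stages; the last stage returns one text."""
    if len(stages) < 2:
        raise ValueError("at least two screening stages are required")
    if not candidates:
        raise ValueError("candidate set is empty")
    pool = list(candidates)
    for stage_index, (scorer, keep_n) in enumerate(stages):
        scores = scorer(pool)
        ranked = [c for c, _ in sorted(zip(pool, scores),
                                       key=lambda pair: pair[1], reverse=True)]
        is_last = stage_index == len(stages) - 1
        pool = ranked[:1] if is_last else ranked[:keep_n]
    return pool[0]


candidates = [
    "we will meet tomorrow",
    "we will meet up tomorrow morning",
    "we meet tomorrow",
    "the boss eats lunch",
]
target = cascade_screen(candidates, [(mean_overlap, 3), (brevity, 1)])
```

In this toy run the first stage drops the off-topic candidate and the second stage picks the shortest survivor. Because each stage shrinks the pool, later and possibly more expensive metrics run on fewer candidates.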
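Embodiment 2 provides a text generation method that obtains a diverse text candidate set by processing the text to be processed with a sampling decoding method. By applying at least two screening methods with different screening functions to the text candidate set, it can mitigate the problem of a given text generation model producing poor results. For example, when the screening methods are quality screening, style screening, and/or keyword filtering, the method can increase the diversity of the generated results while also improving the quality of the generated text and/or controlling its style.
The present disclosure is described below by way of example:
In current text generation methods, results obtained with beam search are limited and tend to be similar to one another. With sampling decoding, the quality of the results tends to be unstable and the style of the results cannot be controlled.
Table 1 shows the correspondence between two decoding methods and their translations: it gives example results of English-to-Chinese translation using beam search and sampling decoding, respectively. The beam search results are of higher quality than the sampling results, but their wording and sentence patterns are all rather similar. The sampling results are more diverse, but some sentences do not match the source meaning (for example, "老板" ("boss") in the last translation does not match the source meaning).
Table 1. Correspondence between the two decoding methods and their translations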

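The embodiments of the present disclosure provide a text generation method. Taking machine translation as an example, length screening controls the length of the translation, which can improve how bilingual subtitles are displayed; keyword filtering removes profanity from the translation, which can reduce indecent translations. Quality screening yields high-quality results, while the text candidate set produced by sampling decoding keeps the candidate texts diverse, and style screening controls the style of the output to some extent.
Note that the text generation method provided in the present disclosure can be extended to any other end-to-end system used for text generation.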
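Given an end-to-end translation model M (the text generation model) and an evaluation metric c, the input x (the text to be processed) is fed to M, which generates a translation candidate set H (the text candidate set). The minimum Bayes risk decoding strategy selects from H the most suitable result h, the one with the highest score c(h), so that the risk 1-c is minimized.
Evaluation metrics fall into two classes: those that need a reference to score and those that do not. Metrics that assess quality generally need a reference translation: the translation (the target candidate text) is compared with the reference translation (the reference candidate text) to give a score, namely the evaluation score. The corresponding formula can be expressed as:
$$h_{\text{best}} = \arg\max_{h \in H} \mathbb{E}_{r}\left[c(h, r)\right]$$
where E denotes the expectation; r denotes a reference translation; c(h, r) denotes the evaluation score of any translation h in the translation candidate set with r as the reference; and arg max denotes finding the argument with the maximum value, that is, choosing the most suitable result h based on the expectation of the evaluation score c(h, r) so that c(h, r) is highest.
In real decoding, however, a reference translation is usually unavailable. The translation (the target candidate text) can therefore be scored against every translation in the translation candidate set to obtain reference evaluation scores, and the mean of all these scores is taken as the quality score of the translation (its evaluation score). This works because the generated translations represent the training data distribution and can, to some extent, serve as an approximation of the reference translation.
When at least two screening methods are applied to the translation candidate set, a screening that is not the last one may select the top N candidate translations as output based on the evaluation scores obtained above, which can be expressed as:
$$h_{\text{top}(N)} = \operatorname{top}(N)_{h \in H} \frac{1}{|H|} \sum_{y \in H} c(h, y)$$
where top(N) denotes selecting the N candidate translations with the highest evaluation scores in descending order of score; y denotes each translation in the translation candidate set, that is, a reference candidate text; and c(h, y) denotes the reference evaluation score of any translation h in the translation candidate set with y as the reference. |H| denotes the number of reference translations. $\sum_{y \in H} c(h, y)$ denotes the sum of all reference evaluation scores of a translation h with each y as the reference; the mean of these scores (the ratio of the sum to the number of reference translations |H|) is taken as the evaluation score of h.
In the last screening, the final translation must be chosen, so the single translation with the highest evaluation score is selected as output, which can be expressed as:
$$h_{\text{best}} = \arg\max_{h \in H} \frac{1}{|H|} \sum_{y \in H} c(h, y)$$
where h_best denotes the translation with the highest evaluation score.
For a reference-free metric, the score is given directly by the evaluation metric and no further operation is needed. Likewise, when at least two screening methods are applied to the translation candidate set, a screening that is not the last one may sort the candidate translations in descending order of evaluation score and select the top N as output, which can be expressed as:
$$h_{\text{top}(N)} = \operatorname{top}(N)_{h \in H}\, c(h)$$
In the last screening, the final translation must be chosen, so the single translation with the highest evaluation score is selected as output, which can be expressed as:
$$h_{\text{best}} = \arg\max_{h \in H} c(h)$$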
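The controllable and diverse text generation method based on a hierarchical scoring system proposed in the present disclosure first uses a sampling-based decoding strategy to generate a sufficiently large text candidate set (typically several hundred to several thousand candidates) and then, following the screening strategy of minimum Bayes risk decoding, applies multiple screening methods (at least two). In general, the first pass uses a quality metric for initial scoring, sorts the candidates in descending order of score, and filters out results whose scores are too low; the second pass uses a style metric for style screening and likewise sorts by score. Each screening reduces the number of candidates passed to the next screening, which speeds up decoding.
Fig. 3 is a schematic diagram of an implementation of the text generation method provided in Embodiment 2 of the present disclosure. As shown in Fig. 3, the BLEU metric (the metric most commonly used to judge translation quality in machine translation) is used for the first scoring and screening, and a length metric is then used for the second scoring and screening. The result is a translation of relatively high quality that is either concise (short) or detailed (long). The two sentences convey essentially the same meaning, but their lengths differ by about 37%.
As Fig. 3 shows, the evaluation scores of a later screening method retain the evaluation scores of all earlier screening methods. For example, the first evaluation score in the second (style) screening is [36.7, 19], where 36.7 is the evaluation score from quality screening and 19 is the evaluation score from style screening. The final target text is decided by the values of the last evaluation score; for example, the candidate text with the highest evaluation score under the last screening method is taken as the target text. In Fig. 3, if there is no subsequent screening, the candidate text corresponding to the highest style screening score, 19, can be taken as the target text.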
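The score bookkeeping described for Fig. 3, where each candidate carries the scores of all earlier stages, can be sketched as below. The quality values are made-up placeholders (36.7 is borrowed from the figure description purely for illustration), and the token-count style metric is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ScoredCandidate:
    text: str
    scores: List[float] = field(default_factory=list)  # one entry per stage


def apply_stage(pool: List[ScoredCandidate],
                metric: Callable[[str], float],
                keep_n: int) -> List[ScoredCandidate]:
    """Append this stage's score to each candidate, then keep the top keep_n."""
    for candidate in pool:
        candidate.scores.append(metric(candidate.text))
    ranked = sorted(pool, key=lambda c: c.scores[-1], reverse=True)
    return ranked[:keep_n]


# Made-up quality scores standing in for a real quality metric.
QUALITY: Dict[str, float] = {
    "see you soon": 36.7,
    "see you again very soon": 35.2,
    "boss says hi": 10.0,
}

pool = [ScoredCandidate(text) for text in QUALITY]
pool = apply_stage(pool, QUALITY.__getitem__, keep_n=2)             # quality stage
pool = apply_stage(pool, lambda t: float(len(t.split())), keep_n=1)  # prefer longer
target = pool[0]
```

The target is chosen by the last stage's score only, while `target.scores` still records the earlier quality score alongside it.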
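The number and content of the subsequent screenings in Fig. 3 can be set according to actual needs: there may be one or several, and each may be any type of screening method, such as style screening, quality screening, or keyword screening. The target text may also be obtained directly after style screening, without any subsequent screening.
Besides length control, the controllable and diverse text generation method with a hierarchical scoring system can be used in other ways, including:
(1) Obtaining translations that are of higher quality and closer to human judgment. Multiple rounds of quality screening can be run using BLEU + BLEURT. As a quality metric, BLEU measures the relatedness of a translation to the reference translation only at the level of surface word forms, whereas BLEURT scores at the semantic level and is closer to human scoring. Using both can yield higher-quality translations.
(2) Disambiguating certain ambiguous words. Keyword filtering + BLEU can be used. For a polysemous word, some senses occur rarely in the training data, so ordinary decoding almost never produces them. With a large enough sample, however, a first round of screening can require that certain keywords appear in the translation, and quality screening can then select the higher-quality translations.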
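Keyword filtering as used in option (2), and for profanity removal, can be sketched as a reference-free pass/fail score. The whitespace tokenization and the sample sentences are simplifying assumptions; the surviving candidates would then go on to quality screening.

```python
from typing import Iterable, List


def keyword_score(text: str, required: Iterable[str] = (),
                  banned: Iterable[str] = ()) -> float:
    """1.0 if every required keyword appears and no banned word does, else 0.0."""
    tokens = set(text.lower().split())
    has_required = all(k.lower() in tokens for k in required)
    has_banned = any(k.lower() in tokens for k in banned)
    return 1.0 if has_required and not has_banned else 0.0


def keyword_filter(candidates: List[str], required: Iterable[str] = (),
                   banned: Iterable[str] = ()) -> List[str]:
    """Keep only the candidates that pass the keyword check."""
    required, banned = list(required), list(banned)
    return [c for c in candidates if keyword_score(c, required, banned) == 1.0]


# Force a rare sense to appear, then hand the survivors to quality screening.
sampled = ["he sat by the river bank", "he went to the bank", "he sat by the shore"]
disambiguated = keyword_filter(sampled, required=["river"])

# Filter out candidates containing a banned word.
clean = keyword_filter(["what a damn mess", "what a mess"], banned=["damn"])
```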
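Note that the order in which the screening methods of the present disclosure are executed can be adjusted according to actual needs and depends on which methods are included; for example, keyword screening, quality screening, and style screening may be executed in that order.
The present disclosure proposes a text generation decoding strategy for neural end-to-end models. By using a hierarchical scoring system, it addresses the tendency of conventional decoding strategies to produce translations that are too similar or that cannot be controlled, and it yields controllable and diverse translations. Different styles, including length, can be controlled by swapping in different scoring systems (different scoring systems use different screening methods; the scoring system in Fig. 3 includes quality screening, style screening, and subsequent screening). For example, combining BLEU with length makes the overall translation length controllable, and combining BLEU with BLEURT improves the human evaluation score on the evaluated test set by about 3%.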
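Embodiment 3
Fig. 4 is a schematic structural diagram of a text generation apparatus provided in Embodiment 3 of the present disclosure. The apparatus may be implemented in software and/or hardware and is generally integrated in an electronic device.
As shown in Fig. 4, the apparatus includes an obtaining module 310 and a processing module 320.
The obtaining module 310 is configured to obtain a text candidate set, the text candidate set including multiple candidate texts, each candidate text being obtained by processing a text to be processed.
The processing module 320 is configured to obtain a target text after applying at least two screening methods to the text candidate set, following the screening strategy of minimum Bayes risk decoding.
The at least two screening methods have different functions; the output of each screening serves as the input of the next screening method; the output of each screening is determined based on the evaluation scores of the screened candidate texts; and the evaluation scores of the screened candidate texts are determined based on the evaluation metric of the screening method used.
In this embodiment, the apparatus obtains, through the obtaining module 310, a text candidate set that includes multiple candidate texts, each obtained by processing a text to be processed, and obtains, through the processing module 320, a target text after applying at least two screening methods to the text candidate set following the screening strategy of minimum Bayes risk decoding, where the screening methods, their inputs and outputs, and their evaluation scores are as described above. By applying at least two screening methods to the obtained text candidate set, the apparatus can mitigate the problem of a given text generation model producing poor results.
Optionally, the obtaining module 310 is configured to obtain the text candidate set as follows:
inputting the text to be processed into a processing model, the processing model processing the text to be processed using a sampling decoding method to obtain the text candidate set.
Optionally, the screening methods include one or more of: quality screening, style screening, and keyword filtering;
when the screening method used is quality screening, the evaluation metric that determines the evaluation score is a quality metric;
when the screening method used is keyword filtering, the evaluation metric that determines the evaluation score is a keyword metric;
when the screening method used is style screening, the evaluation metric that determines the evaluation score is a style metric.
Optionally, style screening includes length screening, and the style metric includes a length metric.
Optionally, the processing module 320 includes:
a processing unit, configured to apply a first screening method to the text candidate set to obtain the evaluation scores of a first set number of target candidate texts in the text candidate set, the first set number of target candidate texts being the screened candidate texts;
a selection unit, configured to select, following the screening strategy of minimum Bayes risk decoding and in descending order of the evaluation scores, a second set number of target candidate texts from the text candidate set to form a selected text candidate set as the output of this screening;
a screening unit, configured to continue, following minimum Bayes risk decoding, to apply subsequent screening methods to the selected text candidate set until a second screening method among the subsequent screening methods yields the target text from the selected text candidate set, the second screening method being the screening method executed last among the at least two screening methods.
Optionally, when the screening method is a reference-based method, the evaluation score of a target candidate text is the mean of multiple reference evaluation scores of the target candidate text; each reference evaluation score is the score determined for the target candidate text, based on the evaluation metric, with a corresponding reference candidate text as the reference; and the reference candidate text and the target candidate text belong to the same text candidate set.
Optionally, when the screening method is a reference-free method, the evaluation score of the target candidate text is the score given by the evaluation metric.
Optionally, the output of a screening is the required number of target candidate texts selected from the screened candidate texts according to the evaluation scores of the target candidate texts.
Optionally, the apparatus further includes:
an information obtaining module, configured to obtain requirement information on a human-computer interaction interface;
a method determining module, configured to determine the corresponding screening methods based on the requirement information.
The above text generation apparatus can execute the text generation method provided in any embodiment of the present disclosure and has the functional modules and beneficial effects corresponding to the method.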
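Embodiment 4
Fig. 5 is a schematic structural diagram of an electronic device 400 provided in Embodiment 4 of the present disclosure, suitable for implementing the embodiments of the present disclosure. The electronic device 400 may include mobile electronic devices such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (PDA), a tablet computer (Portable Android Device, PAD), a portable multimedia player (PMP), and an in-vehicle electronic device (for example, an in-vehicle navigation device), as well as fixed electronic devices such as a digital television (TV) and a desktop computer. The electronic device 400 shown in Fig. 5 is only an example.
As shown in Fig. 5, the electronic device 400 may include one or more processing devices (for example, a central processing unit or a graphics processing unit) 401, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403. The one or more processing devices 401 implement the method provided in the present disclosure. The RAM 403 also stores the programs and data needed for the operation of the electronic device 400. The processing device 401, the ROM 402, and the RAM 403 are connected to one another through a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
In general, the following devices may be connected to the I/O interface 405: an input device 406, such as a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output device 407, such as a liquid crystal display (LCD), speaker, or vibrator; a storage device 408, such as a magnetic tape or hard disk, used to store one or more programs; and a communication device 409. The communication device 409 allows the electronic device 400 to communicate with other devices, wirelessly or by wire, to exchange data. Although Fig. 5 shows an electronic device 400 with various devices, it is not required to implement or include all of them; more or fewer devices may be implemented or included instead.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 409, installed from the storage device 408, or installed from the ROM 402. When the computer program is executed by the processing device 401, the functions of the method of the embodiments of the present disclosure are performed.
Note that the computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. A computer-readable storage medium may include: an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including an electromagnetic signal, an optical signal, or any suitable combination of these. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including electrical wire, optical cable, radio frequency (RF), or any suitable combination of these.
In some implementations, the client and the server may communicate using any currently known or future network protocol, such as the HyperText Transfer Protocol (HTTP), and may be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), an internetwork (for example, the Internet), a peer-to-peer network (for example, an ad hoc peer-to-peer network), and any currently known or future network.
The computer-readable medium may be included in the electronic device 400, or it may exist separately without being assembled into the electronic device 400.
The computer-readable medium stores one or more computer programs which, when executed by a processing device, implement the method of the present disclosure; equivalently, the computer-readable medium carries one or more programs which, when executed by the electronic device 400, cause the electronic device 400 to perform that method. Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination of them, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as C or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, over the Internet through an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. Each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. Each block in the block diagrams and/or flowcharts, and each combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented in software or in hardware.
The functions described above may be performed, at least in part, by one or more hardware logic components. Exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a System on Chip (SOC), a Complex Programmable Logic Device (CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of these. A machine-readable storage medium may include an electrical connection based on one or more wires, a portable computer disk, a hard disk, RAM, ROM, erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of these.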
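According to one or more embodiments of the present disclosure, Example 1 provides a text generation method, including:
obtaining a text candidate set, the text candidate set including multiple candidate texts, each candidate text being obtained by processing a text to be processed;
following the screening strategy of minimum Bayes risk decoding, obtaining a target text after applying at least two screening methods to the text candidate set;
wherein the at least two screening methods have different functions, the output of each screening serves as the input of the next screening method, the output of each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation metric of the screening method used.
According to one or more embodiments of the present disclosure, Example 2 is the method of Example 1, wherein
obtaining the text candidate set includes:
inputting the text to be processed into a processing model, the processing model processing the text to be processed using a sampling decoding method to obtain the text candidate set.
According to one or more embodiments of the present disclosure, Example 3 is the method of Example 1, wherein
the screening methods include one or more of: quality screening, style screening, and keyword filtering;
when the screening method used is quality screening, the evaluation metric that determines the evaluation score is a quality metric;
when the screening method used is keyword filtering, the evaluation metric that determines the evaluation score is a keyword metric;
when the screening method used is style screening, the evaluation metric that determines the evaluation score is a style metric.
According to one or more embodiments of the present disclosure, Example 4 is the method of Example 3, wherein
style screening includes length screening, and the style metric includes a length metric.
According to one or more embodiments of the present disclosure, Example 5 is the method of Example 1, wherein
following the screening strategy of minimum Bayes risk decoding, obtaining a target text after applying at least two screening methods to the text candidate set includes:
applying a first screening method to the text candidate set to obtain the evaluation scores of a first set number of target candidate texts in the text candidate set, the first set number of target candidate texts being the screened candidate texts;
following the screening strategy of minimum Bayes risk decoding, selecting, in descending order of the evaluation scores, a second set number of target candidate texts from the text candidate set to form a selected text candidate set as the output of this screening;
continuing, following minimum Bayes risk decoding, to apply subsequent screening methods to the selected text candidate set until a second screening method among the subsequent screening methods yields the target text from the selected text candidate set, the second screening method being the screening method executed last among the at least two screening methods.
According to one or more embodiments of the present disclosure, Example 6 is the method of Example 1, wherein
when the screening method is a reference-based method, the evaluation score of a target candidate text is the mean of multiple reference evaluation scores of the target candidate text, each reference evaluation score is the score determined for the target candidate text, based on the evaluation metric, with a corresponding reference candidate text as the reference, and the reference candidate text and the target candidate text belong to the same text candidate set.
According to one or more embodiments of the present disclosure, Example 7 is the method of Example 1, wherein
when the screening method is a reference-free method, the evaluation score of the target candidate text is the score given by the evaluation metric.
According to one or more embodiments of the present disclosure, Example 8 is the method of Example 6 or 7, wherein
the output of a screening is the required number of target candidate texts selected from the screened candidate texts according to the evaluation scores of the target candidate texts.
According to one or more embodiments of the present disclosure, Example 9 is the method of Example 1,
further including:
obtaining requirement information on a human-computer interaction interface;
determining the corresponding screening methods based on the requirement information.
According to one or more embodiments of the present disclosure, Example 10 provides a text generation apparatus, including:
an obtaining module, configured to obtain a text candidate set, the text candidate set including multiple candidate texts, each candidate text being obtained by processing a text to be processed;
a processing module, configured to obtain a target text after applying at least two screening methods to the text candidate set, following the screening strategy of minimum Bayes risk decoding;
wherein the at least two screening methods have different functions, the output of each screening serves as the input of the next screening method, the output of each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation metric of the screening method used.
According to one or more embodiments of the present disclosure, Example 11 provides an electronic device, including:
one or more processing devices;
a storage device, used to store one or more programs;
wherein the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the method of any one of Examples 1-9.
According to one or more embodiments of the present disclosure, Example 12 provides a computer-readable medium storing a computer program which, when executed by a processing device, implements the method of any one of Examples 1-9.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that they be performed in the particular order shown or in sequential order. In some circumstances, multitasking and parallel processing may be advantageous. Certain features described in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable sub-combination.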

Claims (12)

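  1. A text generation method, the method comprising:
    obtaining a text candidate set, the text candidate set comprising multiple candidate texts, each candidate text being obtained by processing a text to be processed;
    following the screening strategy of minimum Bayes risk decoding, obtaining a target text after applying at least two screening methods to the text candidate set;
    wherein the at least two screening methods have different functions, the output of each screening serves as the input of the next screening method, the output of each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation metric of the screening method used.
  2. The method according to claim 1, wherein obtaining the text candidate set comprises:
    inputting the text to be processed into a processing model, the processing model processing the text to be processed using a sampling decoding method to obtain the text candidate set.
  3. The method according to claim 1, wherein the screening methods comprise one or more of: quality screening, style screening, and keyword filtering;
    when the screening method used is quality screening, the evaluation metric that determines the evaluation score is a quality metric;
    when the screening method used is keyword filtering, the evaluation metric that determines the evaluation score is a keyword metric;
    when the screening method used is style screening, the evaluation metric that determines the evaluation score is a style metric.
  4. The method according to claim 3, wherein style screening comprises length screening, and the style metric comprises a length metric.
  5. The method according to claim 1, wherein, following the screening strategy of minimum Bayes risk decoding, obtaining a target text after applying at least two screening methods to the text candidate set comprises:
    applying a first screening method to the text candidate set to obtain the evaluation scores of a first set number of target candidate texts in the text candidate set, the first set number of target candidate texts being the screened candidate texts;
    following the screening strategy of minimum Bayes risk decoding, selecting, in descending order of the evaluation scores, a second set number of target candidate texts from the text candidate set to form a selected text candidate set as the output of this screening;
    continuing, following minimum Bayes risk decoding, to apply subsequent screening methods to the selected text candidate set until a second screening method among the subsequent screening methods yields the target text from the selected text candidate set, the second screening method being the screening method executed last among the at least two screening methods.
  6. The method according to claim 1, wherein
    when the screening method is a reference-based method, the evaluation score of a target candidate text is the mean of multiple reference evaluation scores of the target candidate text, each reference evaluation score is the score determined for the target candidate text, based on the evaluation metric, with a corresponding reference candidate text as the reference, and the reference candidate text and the target candidate text belong to the same text candidate set.
  7. The method according to claim 1, wherein
    when the screening method is a reference-free method, the evaluation score of the target candidate text is the score given by the evaluation metric.
  8. The method according to claim 6 or 7, wherein the output of a screening is the required number of target candidate texts selected from the screened candidate texts according to the evaluation scores of the target candidate texts.
  9. The method according to claim 1, further comprising:
    obtaining requirement information on a human-computer interaction interface;
    determining the corresponding screening methods based on the requirement information.
  10. A text generation apparatus, comprising:
    an obtaining module, configured to obtain a text candidate set, the text candidate set comprising multiple candidate texts, each candidate text being obtained by processing a text to be processed;
    a processing module, configured to obtain a target text after applying at least two screening methods to the text candidate set, following the screening strategy of minimum Bayes risk decoding;
    wherein the at least two screening methods have different functions, the output of each screening serves as the input of the next screening method, the output of each screening is determined based on the evaluation scores of the screened candidate texts, and the evaluation scores of the screened candidate texts are determined based on the evaluation metric of the screening method used.
  11. An electronic device, comprising:
    one or more processing devices;
    a storage device, used to store one or more programs;
    wherein the one or more programs, when executed by the one or more processing devices, cause the one or more processing devices to implement the method according to any one of claims 1-9.
  12. A computer-readable medium storing a computer program which, when executed by a processing device, implements the method according to any one of claims 1-9.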
PCT/CN2023/089101 2022-04-24 2023-04-19 一种文本生成方法、装置、电子设备及介质 WO2023207690A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210434655.4A CN114881008B (zh) 2022-04-24 2022-04-24 一种文本生成方法、装置、电子设备及介质
CN202210434655.4 2022-04-24

Publications (1)

Publication Number Publication Date
WO2023207690A1 true WO2023207690A1 (zh) 2023-11-02

Family

ID=82672190

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/089101 WO2023207690A1 (zh) 2022-04-24 2023-04-19 一种文本生成方法、装置、电子设备及介质

Country Status (2)

Country Link
CN (1) CN114881008B (zh)
WO (1) WO2023207690A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114881008B (zh) * 2022-04-24 2024-08-13 北京有竹居网络技术有限公司 一种文本生成方法、装置、电子设备及介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334188A (zh) * 2019-07-11 2019-10-15 中国传媒大学 一种多文档摘要生成方法和系统
JP2020027514A (ja) * 2018-08-15 2020-02-20 沖電気工業株式会社 情報処理装置、情報処理システム、情報処理方法およびプログラム
CN113221545A (zh) * 2021-05-10 2021-08-06 北京有竹居网络技术有限公司 一种文本处理方法、装置、设备及介质、程序产品
CN114881008A (zh) * 2022-04-24 2022-08-09 北京有竹居网络技术有限公司 一种文本生成方法、装置、电子设备及介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8060358B2 (en) * 2008-03-24 2011-11-15 Microsoft Corporation HMM alignment for combining translation systems
WO2010003117A2 (en) * 2008-07-03 2010-01-07 Google Inc. Optimizing parameters for machine translation
US9368106B2 (en) * 2013-07-30 2016-06-14 Verint Systems Ltd. System and method of automated evaluation of transcription quality
CN103530284B (zh) * 2013-09-22 2016-07-06 中国专利信息中心 短句切分装置、机器翻译系统及对应切分方法和翻译方法

Also Published As

Publication number Publication date
CN114881008A (zh) 2022-08-09
CN114881008B (zh) 2024-08-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23795123

Country of ref document: EP

Kind code of ref document: A1