CN115879469B - Text data processing method, model training method, device and medium


Info

Publication number: CN115879469B (granted publication of application CN202211737328.2A; published earlier as application publication CN115879469A)
Authority: CN (China)
Prior art keywords: text, target, sample, sequence, target style
Legal status: Active (granted)
Inventors: 高杨帆, 孙辉丰, 孙叔琦, 常月
Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Other languages: Chinese (zh)

Classifications

    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management (under Y02D, climate change mitigation technologies in information and communication technologies)

Abstract

The disclosure provides a text data processing method, a model training method, a device, and a medium, relating to the technical field of artificial intelligence and in particular to the fields of text data processing, deep learning, natural language processing, and dialogue systems. The implementation scheme is as follows: generating, based on a user's input text, an original text for replying to the input text; acquiring target style information; and generating a target text corresponding to a target style based on the original text and the target style information.

Description

Text data processing method, model training method, device and medium
Technical Field
The present disclosure relates to the field of artificial intelligence, more particularly to the fields of text data processing, deep learning, natural language processing, and dialog systems, and in particular to a text data processing method, a model training method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
Background
Artificial intelligence is the discipline that studies how to make a computer mimic certain human thought processes and intelligent behaviors (e.g., learning, reasoning, thinking, planning), and it spans both hardware-level and software-level techniques. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision technology, speech recognition technology, natural language processing technology, machine learning/deep learning technology, big data processing technology, knowledge graph technology, and the like.
In chitchat dialogue systems, the conversation robot needs to generate reasonable, content-rich replies according to the user's utterances, and users are strongly aware of the robot's reply style: replies conveying the same content in different styles (for example, a plain "I'm not a dumb bot, I'm a smart bot" versus the same sentence ending in a playful tone particle) can provide completely different user experiences, and bringing the robot's manner of expression close to that of a real person has always been one of the goals pursued by dialogue systems. Many conversation robots are now equipped with TTS (Text To Speech) voice packages in different styles, but the source of the TTS is still text; therefore, ensuring both the stylistic consistency between text and speech and the stylistic consistency across all of the robot's historical replies lets the user in front of the screen perceive the robot's style and expression characteristics throughout the conversation.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been recognized in any prior art unless otherwise indicated.
Disclosure of Invention
The present disclosure provides a text data processing method, a model training method, an apparatus, an electronic device, a computer readable storage medium, and a computer program product.
According to an aspect of the present disclosure, there is provided a text data processing method including: generating, based on a user's input text, an original text for replying to the input text; acquiring target style information, wherein the target style information includes at least one of a target style label and a target style dictionary, the target style label indicating a target style into which the original text is to be converted, and the target style dictionary including at least one corpus text corresponding to the target style; and generating a target text corresponding to the target style based on the original text and the target style information.
According to another aspect of the present disclosure, there is provided a model training method, the model being used for converting an original text into a text of a target style, including: obtaining a sample dataset, wherein the sample dataset comprises at least one sample text pair, each of the at least one sample text pair comprising an original sample text and a target sample text corresponding to the target style; for each of the at least one sample text pair, acquiring a labeling sequence corresponding to the sample text pair, wherein the labeling sequence comprises at least one operation label respectively corresponding to at least one character of the original sample text in the sample text pair, the at least one operation label comprises a retention label and a modification label, the retention label indicates characters that need to be retained compared with the target sample text in the sample text pair, and the modification label comprises an insertion label indicating characters that need to be inserted into the original sample text compared with the target sample text; determining the characters corresponding to the insertion labels in the labeling sequence of each of the at least one sample text pair as corpus texts, so as to construct a target style dictionary corresponding to the target style; and, for each sample text pair in the sample dataset, performing the following operations: inputting the corpus texts in the target style dictionary and the original sample text and the target sample text in the sample text pair into a model to obtain a labeling-sequence prediction result output by the model; and training the model based on the labeling-sequence prediction result and the labeling sequence corresponding to the sample text pair.
According to another aspect of the present disclosure, there is provided a model training method including: obtaining a sample data set, wherein the sample data set comprises a plurality of target style labels and, for each of the plurality of target style labels, at least one corresponding sample text pair, each sample text pair comprising an original sample text and a target sample text having the corresponding target style; and, for each sample text pair in the sample data set, performing the following operations: inputting the original sample text and the target sample text in the sample text pair, together with the corresponding target style label, into a model to obtain a target text prediction result output by the model; and training the model based on the target text prediction result and the target sample text in the sample text pair.
According to another aspect of the present disclosure, there is provided a text data processing apparatus including: a first generation unit configured to generate an original text for replying to the input text based on the input text of the user; the first acquisition unit is configured to acquire target style information, wherein the target style information comprises at least one of a target style label and a target style dictionary, the target style label is used for indicating a target style of the original text to be converted, and the target style dictionary comprises at least one corpus text corresponding to the target style; and a second generation unit configured to generate a target text corresponding to the target style based on the original text and the target style information.
According to another aspect of the present disclosure, there is provided a model training apparatus for converting an original text into a text of a target style, including: a second acquisition unit configured to acquire a sample data set, wherein the sample data set includes at least one sample text pair, each sample text pair of the at least one sample text pair including an original sample text and a target sample text corresponding to a target style; a third obtaining unit configured to obtain, for each of at least one sample text pair, a labeling sequence corresponding to the sample text pair, the labeling sequence including at least one operation tag corresponding to at least one character of an original sample text in the sample text pair, respectively, the at least one operation tag including a retention tag for indicating characters to be retained in comparison with a target sample text in the sample text pair and a modification tag including an insertion tag for indicating characters to be inserted in the original sample text compared with the target sample text; a determining unit configured to determine, as a corpus text, a character corresponding to an insertion label in a labeling sequence corresponding to each of at least one sample text pair, so as to construct a target style dictionary corresponding to a target style; and a first execution unit configured to perform operations of the following sub-units for each sample pair in the sample data set, the first execution unit comprising: the first input subunit is configured to input the corpus text in the target style dictionary, the original sample text in the sample text pair and the target sample text into a model so as to obtain a label sequence prediction result output by the model; and a first training subunit configured to train a model based on the annotation sequence prediction and the annotation sequence corresponding to the sample text pair.
According to another aspect of the present disclosure, there is provided a model training apparatus including: a fourth acquisition unit configured to acquire a sample data set including a plurality of target style tags, and at least one sample text pair corresponding to each of the plurality of target style tags, each sample text pair including an original sample text and a target sample text having a corresponding target style; and a second execution unit configured to perform operations of the following sub-units for each sample pair in the sample data set, the second execution unit including: the second input subunit is configured to input the original sample text, the target sample text and the corresponding target style label of the sample text in the sample text pair into a model so as to obtain a target text prediction result output by the model; and a second training subunit configured to train the model based on the target text prediction result and the target sample text in the sample text pair.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the text data processing method or the model training method.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the above-described text data processing method or model training method.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the above-described text data processing method or model training method.
According to one or more embodiments of the present disclosure, the complexity of model training can be reduced, and the model training efficiency can be improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for exemplary purposes only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a schematic diagram of an exemplary system in which various methods described herein may be implemented, in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates a flow chart of a text data processing method according to an embodiment of the present disclosure;
FIG. 3 illustrates an architectural diagram of a dialog system, according to an exemplary embodiment of the present disclosure;
FIG. 4 illustrates a flow chart of a method of generating target text according to an embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of a first model according to an exemplary embodiment of the present disclosure;
FIG. 6 illustrates an architectural diagram of a second model according to an exemplary embodiment of the present disclosure;
FIG. 7 illustrates a flow chart of a model training method according to an embodiment of the present disclosure;
FIG. 8 illustrates a flow chart of a model training method according to an embodiment of the present disclosure;
FIG. 9 shows a block diagram of a text data processing device according to an embodiment of the present disclosure;
FIG. 10 shows a block diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 11 shows a block diagram of a model training apparatus according to an embodiment of the present disclosure;
fig. 12 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the present disclosure, the use of the terms "first," "second," and the like to describe various elements is not intended to limit the positional relationship, timing relationship, or importance relationship of the elements, unless otherwise indicated, and such terms are merely used to distinguish one element from another element. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, they may also refer to different instances based on the description of the context.
The terminology used in the description of the various illustrated examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, the elements may be one or more if the number of the elements is not specifically limited. Furthermore, the term "and/or" as used in this disclosure encompasses any and all possible combinations of the listed items.
In the related art, a large dialogue model with a style conversion function is trained to directly generate style-converted reply texts; this relies on a large amount of style corpus and computing resources, so the training cost and complexity are high and the training efficiency is low.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates a schematic diagram of an exemplary system 100 in which various methods and apparatus described herein may be implemented, in accordance with an embodiment of the present disclosure. Referring to fig. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120. Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.
In embodiments of the present disclosure, the server 120 may run one or more services or software applications that enable execution of the text data processing methods, model training methods described above.
In some embodiments, server 120 may also provide other services or software applications, which may include non-virtual environments and virtual environments. In some embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.
In the configuration shown in fig. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof that are executable by one or more processors. A user operating client devices 101, 102, 103, 104, 105, and/or 106 may in turn utilize one or more client applications to interact with server 120 to utilize the services provided by these components. It should be appreciated that a variety of different system configurations are possible, which may differ from system 100. Accordingly, FIG. 1 is one example of a system for implementing the various methods described herein and is not intended to be limiting.
The user may enter text data using client devices 101, 102, 103, 104, 105, and/or 106. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although fig. 1 depicts only six client devices, those skilled in the art will appreciate that the present disclosure may support any number of client devices.
Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptop computers), workstation computers, wearable devices, smart screen devices, self-service terminal devices, service robots, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and the like. These computer devices may run various types and versions of software applications and operating systems, such as MICROSOFT Windows, APPLE iOS, UNIX-like operating systems, Linux, or Linux-like operating systems (e.g., GOOGLE Chrome OS); or include various mobile operating systems such as MICROSOFT Windows Mobile OS, iOS, Windows Phone, and Android. Portable handheld devices may include cellular telephones, smart phones, tablet computers, Personal Digital Assistants (PDAs), and the like. Wearable devices may include head-mounted displays (such as smart glasses) and other devices. The gaming system may include various handheld gaming devices, Internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various Internet-related applications, communication applications (e.g., email applications), and Short Message Service (SMS) applications, and may use a variety of communication protocols.
Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a number of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. For example only, the one or more networks 110 may be a Local Area Network (LAN), an ethernet-based network, a token ring, a Wide Area Network (WAN), the internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a blockchain network, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., bluetooth, WIFI), and/or any combination of these and/or other networks.
The server 120 may include one or more general purpose computers, special purpose server computers (e.g., PC (personal computer) servers, UNIX servers, mid-end servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture that involves virtualization (e.g., one or more flexible pools of logical storage devices that may be virtualized to maintain virtual storage devices of the server). In various embodiments, server 120 may run one or more services or software applications that provide the functionality described below.
The computing units in server 120 may run one or more operating systems including any of the operating systems described above as well as any commercially available server operating systems. Server 120 may also run any of a variety of additional server applications and/or middle tier applications, including HTTP servers, FTP servers, CGI servers, JAVA servers, database servers, etc.
In some implementations, server 120 may include one or more applications to analyze and consolidate data feeds and/or event updates received from users of client devices 101, 102, 103, 104, 105, and/or 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and/or 106.
In some implementations, the server 120 may be a server of a distributed system or a server that incorporates a blockchain. The server 120 may also be a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology. A cloud server is a host product in a cloud computing service system intended to overcome the drawbacks of high management difficulty and weak service scalability in traditional physical host and virtual private server (VPS, Virtual Private Server) services.
The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of databases 130 may be used to store information such as audio files and video files. Database 130 may reside in various locations. For example, the database used by the server 120 may be local to the server 120, or may be remote from the server 120 and may communicate with the server 120 via a network-based or dedicated connection. Database 130 may be of different types. In some embodiments, the database used by server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data in response to commands.
In some embodiments, one or more of databases 130 may also be used by applications to store application data. The databases used by the application may be different types of databases, such as key value stores, object stores, or conventional stores supported by the file system.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 2, there is provided a text data processing method, including: step S201, generating, based on a user's input text, an original text for replying to the input text; step S202, acquiring target style information, wherein the target style information includes at least one of a target style label and a target style dictionary, the target style label indicating a target style into which the original text is to be converted, and the target style dictionary including at least one corpus text corresponding to the target style; and step S203, generating a target text corresponding to the target style based on the original text and the target style information.
In this way, a reply text is first generated by the large dialogue model based on the text input by the user, and the reply text is then input into the text style conversion model to obtain the style-converted reply text. Retraining of the large dialogue model is thus avoided, and the style conversion model can be trained with only a small number of style-conversion samples, which reduces the complexity of model training and improves model training efficiency.
In some embodiments, the input text may be obtained by receiving text input or voice input from the user. After the target text corresponding to the target style is obtained, the target text may be displayed on a display device in text form, or TTS speech may be generated based on the target text and fed back to the user by voice broadcast. The style of the TTS voice packet may correspond to the target style described above.
Fig. 3 shows an architectural diagram of a dialog system according to an exemplary embodiment of the present disclosure.
In some exemplary embodiments, as shown in fig. 3, the user's input text may be acquired by the conversation robot 310 and transmitted to the conversation model 320; the conversation model 320 may generate an original reply text for replying to the input text based on the user's input and the historical context information, and input the original reply text into the style conversion model 330, which converts it into a target reply text of the target style and returns it to the conversation robot 310 to reply to the user.
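The flow above can be summarized in a short sketch. This is a minimal illustration of the FIG. 3 pipeline only; the class and method names (dialog_model.generate, style_model.convert, and so on) are assumptions for exposition, not interfaces defined by the disclosure.

```python
# Minimal sketch of the FIG. 3 pipeline; all class/method names are assumptions.
class ConversationRobot:
    def __init__(self, dialog_model, style_model, style_tag):
        self.dialog_model = dialog_model  # reply generator (e.g., a PLATO-style model)
        self.style_model = style_model    # edit-based or generative style converter
        self.style_tag = style_tag        # target style, e.g., "lively"
        self.history = []                 # multi-turn dialogue context

    def reply(self, user_input: str) -> str:
        # Generate the original reply from the input and the dialogue history.
        original = self.dialog_model.generate(user_input, context=self.history)
        # Convert the original reply into the target style and return it.
        target = self.style_model.convert(original, style=self.style_tag)
        self.history.append((user_input, target))
        return target
```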
In some embodiments, the dialog model may be a Transformer-based dialog model.
In some exemplary embodiments, the conversation model may employ the PLATO large dialogue model, which adopts a Unified-Transformer structure in its network architecture and can jointly model dialogue understanding and reply generation; by introducing persona-aware input representations, consistency across multi-turn conversations is improved.
In some embodiments, the style conversion model may be an edit-based sequence annotation model.
In some embodiments, as shown in fig. 4, generating the target text corresponding to the target style based on the original text and the target style information may include: step S401, determining, based on the target style label, a first model corresponding to the target style, wherein the first model is obtained by training based on at least one first sample text pair and the target style dictionary, and each of the at least one first sample text pair includes a first original text and a first target text corresponding to the target style; step S402, acquiring a character sequence of the original text, the character sequence including at least one character of the original text; step S403, performing, based on the target style dictionary, sequence labeling on the character sequence by using the first model to obtain a labeling sequence, wherein the labeling sequence includes at least one operation label respectively corresponding to the at least one character, the at least one operation label includes a retention label and an insertion label, the retention label indicates that its corresponding character is retained, and the insertion label corresponds to one of the at least one corpus text and indicates that the corresponding corpus text is inserted into the character sequence; and step S404, generating the target text based on the labeling sequence.
In this way, the model corresponding to the target style performs sequence labeling on the input text based on the corpus dictionary corresponding to the target style, obtaining an operation label for each character of the input text, and the output text is determined based on the operation labels. Parts of the text can thus be retained or newly added by the model, realizing a relatively simple form of style conversion with fewer computing resources and higher computational efficiency.
In some embodiments, the target style tag may be used to determine a first model corresponding to the target style to be converted and a target style dictionary, and then style conversion of the corresponding target style may be performed based on the corresponding first model and dictionary.
In some embodiments, the target style dictionary may include one or more corpus texts corresponding to the target style, where a corpus text may be selected by the first model and inserted at a corresponding position in the original text, so as to realize the style conversion of the original text. For example, for an original text "好" ("OK"), inserting a lively tone-particle corpus text such as "嘞" from the dictionary yields a lively-style target text "好嘞" ("Okay!").
Fig. 5 shows a schematic diagram of a first model according to an exemplary embodiment of the present disclosure.
In some exemplary embodiments, as shown in fig. 5, for the original text "好，我马上去" ("OK, I'll go right away"), the corresponding character sequence may be acquired first; the character sequence includes all the characters in the original text and may include a sequence start symbol [CLS] and a sequence end symbol [SEP], so the character sequence may be: [CLS], "好", "，", "我", "马", "上", "去", [SEP].
After being embedded, the character sequence can be input into the first model 500. In some embodiments, at least one corpus text in the target style dictionary may at the same time be formed into a corpus sequence and input into the first model 500 together; the first model 500 may then predict a corresponding labeling sequence such as "keep, keep-嘞, keep, keep, keep, keep". Here "keep" is a retention label, indicating that the character corresponding to the label is to be retained at the corresponding position in the target text. "keep-X" is an insertion label, indicating that the character corresponding to the label is retained and that the characters "X" are inserted in front of it, where "X" may be a corpus text selected by the first model 500 from the target style dictionary. It will be appreciated that the skilled artisan can set the form of the sequence tags as needed, without limitation.
Based on the labeling sequence, the corresponding target text "好嘞，我马上去" ("Okay! I'll go right away") can be generated, thereby realizing the conversion of the text style by inserting selected characters into the original text.
In some embodiments, the number of the at least one character may be plural, and the at least one operation tag may include a retention tag and at least one of an insertion tag and a deletion tag, the deletion tag indicating that the character corresponding to the deletion tag is deleted from the character sequence.
Therefore, the operation mode for the text is further enriched, and the flexibility of text style conversion is improved.
In some exemplary embodiments, for the original text "我不上班，我在家里玩" ("I'm not going to work, I'm playing at home"), the corresponding character sequence may be acquired first: [CLS], "我", "不", "上", "班", "，", "我", "在", "家", "里", "玩", [SEP].
After being embedded, the character sequence can be input into the first model 500. In some embodiments, at least one corpus text in the target style dictionary may at the same time be formed into a corpus sequence and input into the first model 500 together; the first model 500 may then predict a corresponding labeling sequence containing "keep" labels, "delete" labels, and insertion labels (e.g., "keep-呢"). Here "delete" is a deletion label, indicating that the character corresponding to the label is to be deleted at the corresponding position in the target text; the character corresponding to the deletion label may be a corpus text selected by the first model 500 from the target style dictionary. It will be appreciated that the skilled artisan can set the form of the sequence tags as needed, without limitation.
Based on the labeling sequence, the corresponding target text "我不上班，我在家里玩呢" ("I'm not going to work, I'm just playing at home") can be generated, thereby realizing the conversion of the text style by inserting and/or deleting selected characters in the original text.
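As a concrete illustration of these edit semantics, the following is a minimal sketch of how a predicted labeling sequence could be applied to a character sequence to produce the target text. The tag spellings ("keep", "keep-X", "delete") follow the examples above; the function name and the convention that insertions attach to the following character are illustrative assumptions.

```python
def apply_labels(chars: list[str], labels: list[str]) -> str:
    """Apply per-character operation labels to reconstruct the target text.

    Label semantics (following the examples above; exact spellings assumed):
      "keep"    - copy the character unchanged
      "keep-X"  - insert the corpus text X before the character, then copy it
      "delete"  - drop the character
    """
    out = []
    for ch, label in zip(chars, labels):
        if label == "delete":
            continue  # the character is removed from the target text
        if label.startswith("keep-"):
            out.append(label[len("keep-"):])  # insert the selected corpus text
        out.append(ch)
    return "".join(out)

# Reconstructed example from FIG. 5 (the exact Chinese strings are assumptions):
# apply_labels(["好", "，", "我", "马", "上", "去"],
#              ["keep", "keep-嘞", "keep", "keep", "keep", "keep"])
# -> "好嘞，我马上去"
```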
In some embodiments, the target style dictionary may further include reference information, and the reference information may include at least one of a first operation tag and a use probability for each of the at least one corpus text, where the first operation tag indicates whether the operation corresponding to the corpus text is insertion or deletion, and the use probability is determined based on the frequency with which the corpus text occurred during construction of the target style dictionary.
In some embodiments, sequence labeling the character sequence with the first model based on the target style dictionary to obtain a labeled sequence may include: and based on at least one corpus text and reference information in the target style dictionary, performing sequence labeling on the character sequence by using the first model to obtain a labeling sequence.
Therefore, multidimensional reference information in the corpus dictionary can be introduced into model prediction, and the effects of model prediction and text style conversion are further improved.
When a target style dictionary of a corresponding style is constructed, the longest common subsequence of the original text sequence and the target text sequence can be obtained by dynamic programming over sample text pairs prepared in advance; the texts to be inserted and deleted are then obtained by subtracting the longest common subsequence from the target text sequence and from the original text sequence, respectively. Each such text is marked with a corresponding modification tag (an insertion tag or a deletion tag) and added, together with the tag, to the target style dictionary as one corpus text.
In some embodiments, word frequency statistics may further be performed on the corpus texts in the target style dictionary obtained by the above method, so as to obtain the occurrence probability of each corpus text; this information is also recorded in the dictionary as part of the reference information.
In some embodiments, when the labeling sequence is generated, the modification tags and/or the occurrence probabilities can be input into the first model at the same time, so that multidimensional reference information from the corpus dictionary is introduced into model prediction, further improving the effects of model prediction and text style conversion.
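The dictionary construction described above can be sketched as follows. This is a simplified illustration under explicit assumptions: corpus texts are treated as single characters, the use probability is a frequency normalized over all collected edits, and all function and variable names are illustrative rather than taken from the disclosure.

```python
from collections import Counter

def lcs(a: str, b: str) -> str:
    """Longest common subsequence of two strings via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a)):
        for j in range(len(b)):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if a[i] == b[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    out, i, j = [], len(a), len(b)
    while i and j:  # backtrack to recover one LCS
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

def subtract(seq: str, sub: str) -> list[str]:
    """Remove the characters of `sub` (an in-order subsequence) from `seq`."""
    out, k = [], 0
    for ch in seq:
        if k < len(sub) and ch == sub[k]:
            k += 1
        else:
            out.append(ch)
    return out

def build_style_dictionary(sample_pairs):
    """Collect edited characters with modification tags and use probabilities."""
    counts = Counter()
    for original, target in sample_pairs:
        common = lcs(original, target)
        counts.update((ch, "insert") for ch in subtract(target, common))    # target-only chars
        counts.update((ch, "delete") for ch in subtract(original, common))  # original-only chars
    total = sum(counts.values()) or 1
    # Each entry: (corpus text, modification tag) -> use probability.
    return {key: n / total for key, n in counts.items()}
```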
In some embodiments, the first model may be a Transformer-based language model or sequence labeling model. It will be appreciated that those skilled in the relevant art can determine the model to be applied by themselves, without limitation.
In some embodiments, the style conversion model may be an end-to-end generation model.
In some embodiments, generating the target text corresponding to the target style based on the original text and the target style information may include: inputting the target style tag and the original text into a second model to obtain the target text output by the second model, wherein the second model is obtained by training based on a plurality of style tags and, for each of the plurality of style tags, at least one corresponding second sample text pair, each of the at least one second sample text pair including a second original text and a second target text corresponding to the respective style tag.
Fig. 6 shows an architectural diagram of a second model according to an exemplary embodiment of the present disclosure.
In some embodiments, the second model 600 may be a generative model capable of conversion into multiple styles. To specify the target style to be converted into, the target style tag may be spliced with the original text sequence, and the spliced sequence may be embedded and then input into the second model 600 to obtain the target text sequence of the corresponding target style. For example, where the original text means "Thanks for the compliment, you are quite smart too" and the target style label is "archaic", the target text generated by the second model 600 may be a more classical, literary rendering of the same reply.
In this way, by introducing style labels as guide information in the model prediction process, conversion into multiple styles can be realized with a single model; at the same time, the fluency of the generated text can be further optimized, realizing more complex style conversions.
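A minimal sketch of this input construction is shown below. The special-token layout ([CLS]/[SEP]) and the function name are assumptions; the disclosure states only that the target style tag is spliced with the original text sequence before embedding.

```python
def build_second_model_input(style_tag: str, original: str,
                             cls: str = "[CLS]", sep: str = "[SEP]") -> list[str]:
    """Splice the target style tag with the character sequence of the original text."""
    return [cls, style_tag, sep, *original, sep]

# For example, build_second_model_input("archaic", "thanks for the praise") yields
# ["[CLS]", "archaic", "[SEP]", "t", "h", ..., "e", "[SEP]"], which is embedded and
# fed to the second model to generate the target text sequence of the target style.
```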
In some embodiments, the second model may be a Transformer-based language model. In some embodiments, the second model may be obtained by training based on the UniLM model. It will be appreciated that those skilled in the relevant art can determine the model to be applied by themselves, without limitation.
In some embodiments, as shown in fig. 7, a model training method is provided, the model being used for converting an original text into a text of a target style, and the method includes: step S701, acquiring a sample data set, wherein the sample data set includes at least one sample text pair, and each of the at least one sample text pair includes an original sample text and a target sample text corresponding to the target style; step S702, for each of the at least one sample text pair, acquiring a labeling sequence corresponding to the sample text pair, where the labeling sequence includes at least one operation label respectively corresponding to at least one character of the original sample text in the sample text pair, the at least one operation label includes a retention label and a modification label, the retention label indicates characters that need to be retained compared with the target sample text in the sample text pair, and the modification label includes an insertion label indicating characters that need to be inserted into the original sample text compared with the target sample text; step S703, determining the characters corresponding to the insertion labels in the labeling sequence of each of the at least one sample text pair as corpus texts, so as to construct a target style dictionary corresponding to the target style; and, for each sample text pair in the sample data set, performing the following operations: step S704, inputting the corpus texts in the target style dictionary and the original sample text and the target sample text in the sample text pair into a model to obtain a labeling-sequence prediction result output by the model; and step S705, training the model based on the labeling-sequence prediction result and the labeling sequence corresponding to the sample text pair.
In this way, the model obtained by training performs sequence labeling on the input text based on the corpus dictionary corresponding to the target style, obtaining an operation label for each character of the input text, and the output text is determined based on the operation labels. Parts of the text can thus be retained or newly added by the model, realizing a relatively simple form of style conversion with fewer computing resources and higher computational efficiency.
In some embodiments, for the training of the first model, the target style into which the first model is to convert text may be determined first, and corresponding sample text pairs may then be obtained based on the target style.
In some embodiments, each sample text pair may first be processed by dynamic programming to obtain the longest common subsequence of the original text sequence and the target text sequence, and the texts to be retained, inserted, and deleted may be obtained by subtracting the longest common subsequence, so that the labeling sequence of the sample text pair is obtained correspondingly. Meanwhile, each text marked with a modification tag (an insertion tag or a deletion tag) can be added, together with its modification tag, to the target style dictionary as one of the corpus texts, so that the target style dictionary corresponding to the target style is constructed based on the sample data.
In some embodiments, the number of the at least one character may be plural, and the modification tag may include at least one of an insertion tag and a deletion tag, the deletion tag indicating characters that need to be deleted from the original sample text compared with the target sample text. This further enriches the operations on the text and improves the flexibility of text style conversion.
In some embodiments, the target style dictionary may also include a corresponding modification tag for each corpus text. Therefore, multidimensional reference information (such as a corresponding modification label of each corpus text) in the corpus dictionary is introduced into model training, so that the effects of model prediction and text style conversion are further improved.
In some embodiments, constructing the target style dictionary corresponding to the target style may further include: counting the occurrence frequency of the characters corresponding to the insertion labels in the labeling sequence of each of the at least one sample text pair to obtain at least one first character ordered by occurrence frequency; constructing the target style dictionary based on a preset number of first characters with the highest occurrence frequency, and discarding the remaining first characters; deleting the sample text pairs corresponding to the discarded first characters so as to update the sample data set; and training the model based on the updated sample data set.
In this way, when the corpus dictionary is constructed, the corpora are ordered by word frequency so that corpora with low occurrence frequency can be discarded, further saving computing resources; correspondingly, the sample pairs corresponding to the discarded corpora are deleted so that samples which the dictionary can no longer cover do not degrade the model training effect.
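A short sketch of this pruning step follows, reusing the lcs and subtract helpers from the dictionary-construction sketch above; the function name and the cutoff parameter k are illustrative assumptions.

```python
from collections import Counter

def prune_dictionary(insert_counts: Counter, sample_pairs, k: int):
    """Keep the k most frequently inserted characters and drop the sample pairs
    whose target text requires an insertion the pruned dictionary cannot cover."""
    kept = {ch for ch, _ in insert_counts.most_common(k)}
    updated_pairs = [
        (original, target) for original, target in sample_pairs
        # insertions required by this pair = target minus the longest common subsequence
        if all(ch in kept for ch in subtract(target, lcs(original, target)))
    ]
    return kept, updated_pairs
```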
Then, for each sample text pair in the updated sample data set, the corresponding original text sequence and target text sequence may be embedded and input into the model (i.e., the first model described above). In some embodiments, at least one corpus text in the target style dictionary can at the same time be formed into a corpus sequence and input into the model together, and the model predicts the labeling-sequence prediction result; a loss function is then calculated based on the labeling sequence corresponding to the sample text pair and the prediction result, so as to train the model based on the loss function.
In some embodiments, the above loss function may be a cross-entropy loss function. It will be appreciated that those skilled in the relevant art may also determine the loss function by themselves based on actual needs, without limitation.
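For concreteness, a minimal PyTorch-flavored sketch of one training step for the edit-based first model is given below. The model interface (taking the character sequence and the corpus sequence and returning per-character tag logits) and all tensor shapes are assumptions, not details fixed by the disclosure.

```python
import torch.nn as nn

tag_loss_fn = nn.CrossEntropyLoss()  # cross-entropy over the operation-tag set

def first_model_train_step(model, optimizer, char_ids, corpus_ids, gold_tag_ids):
    # char_ids:     (batch, seq_len)  ids of the original sample text characters
    # corpus_ids:   (batch, dict_len) ids of the target-style corpus sequence
    # gold_tag_ids: (batch, seq_len)  gold operation-tag ids (keep/insert/delete)
    optimizer.zero_grad()
    tag_logits = model(char_ids, corpus_ids)  # (batch, seq_len, num_tags); interface assumed
    loss = tag_loss_fn(tag_logits.reshape(-1, tag_logits.size(-1)),
                       gold_tag_ids.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()
```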
In some embodiments, as shown in fig. 8, a model training method is provided, including: step S801, acquiring a sample data set, wherein the sample data set includes a plurality of target style labels and, for each of the plurality of target style labels, at least one corresponding sample text pair, each sample text pair including an original sample text and a target sample text having the corresponding target style; and, for each sample text pair in the sample data set, performing the following operations: step S802, inputting the original sample text and the target sample text in the sample text pair, together with the corresponding target style label, into a model to obtain a target text prediction result output by the model; and step S803, training the model based on the target text prediction result and the target sample text in the sample text pair.
In some embodiments, sample text pairs corresponding to each of multiple target styles may first be collected separately, each pair being provided with its target style tag. In the training process of the model (namely, the second model), the original text sequence and target text sequence of a sample text pair and the corresponding target style label can be spliced into one sequence, which is embedded and then input into the model to obtain the text sequence prediction result of the corresponding target style output by the model; a loss function may then be calculated based on the target text sequence and the text sequence prediction result, so as to train the model based on the loss function.
In some embodiments, the above loss function may be a cross-entropy loss function. It will be appreciated that those skilled in the relevant art may also determine the loss function by themselves based on actual needs, without limitation.
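A corresponding sketch for the generative second model is given below; it differs from the previous one in that the spliced [style tag; original text; target text] sequence is the input and the supervision is over the vocabulary of the target text. The splicing layout and the assumption that the model returns logits only for the target-text span (e.g., a UniLM-style setup) are illustrative.

```python
import torch
import torch.nn as nn

vocab_loss_fn = nn.CrossEntropyLoss()  # cross-entropy over the vocabulary

def second_model_train_step(model, optimizer, style_tag_ids, original_ids, target_ids):
    # All arguments after `optimizer` are (batch, len) id tensors.
    spliced = torch.cat([style_tag_ids, original_ids, target_ids], dim=1)
    optimizer.zero_grad()
    vocab_logits = model(spliced)  # (batch, target_len, vocab_size); interface assumed
    loss = vocab_loss_fn(vocab_logits.reshape(-1, vocab_logits.size(-1)),
                         target_ids.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()
```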
In this way, by introducing style labels as guide information in the model prediction process, fusion training over corpora of multiple styles is realized, which improves training efficiency; the model can realize conversion into multiple styles, and the fluency of the generated text is further optimized, realizing more complex style conversion.
In some embodiments, as shown in fig. 9, there is provided a text data processing apparatus 900 including: a first generation unit 910 configured to generate, based on a user's input text, an original text for replying to the input text; a first obtaining unit 920 configured to obtain target style information, where the target style information includes at least one of a target style tag and a target style dictionary, the target style tag indicating a target style into which the original text is to be converted, and the target style dictionary including at least one corpus text corresponding to the target style; and a second generation unit 930 configured to generate a target text corresponding to the target style based on the original text and the target style information.
The operations of the units 910 to 930 in the apparatus 900 are similar to the operations of steps S201 to S203 of the text data processing method described above and are not repeated here.
In some embodiments, the second generating unit may include: a first determining subunit configured to determine, based on the target style tag, a first model corresponding to the target style, the first model obtained based on at least one first pair of sample texts and target style dictionary training, each first pair of sample texts of the at least one first pair of sample texts including a first original text and a first target text corresponding to the target style; a first acquisition subunit configured to acquire a character sequence of the original text, the character sequence including at least one character of the original text; a second obtaining subunit configured to perform sequence labeling on the character sequence based on the target style dictionary by using a first model to obtain a labeling sequence, where the labeling sequence includes at least one operation tag corresponding to at least one character respectively, the at least one operation tag includes a reservation tag for indicating a character corresponding to the reservation tag and an insertion tag corresponding to one of the at least one corpus text and for indicating insertion of the corresponding corpus text into the character sequence; and a first generation subunit configured to generate the target text based on the annotation sequence.
In some embodiments, the number of at least one character may be plural, and the at least one operation tag may include a reservation tag, and at least one of an insertion tag and a deletion tag for indicating deletion of the character corresponding to the deletion tag from the character sequence.
In some embodiments, the target style dictionary may further include reference information, the reference information may include at least one of a first operation tag corresponding to each of the at least one corpus text and a use probability, the first operation tag indicating that an operation corresponding to the corresponding corpus text is inserted or deleted, the use probability being determined based on occurrence frequencies of the corresponding corpus text in constructing the target style dictionary, and the second obtaining subunit may be further configured to: and based on at least one corpus text and reference information in the target style dictionary, performing sequence labeling on the character sequence by using the first model to obtain a labeling sequence.
In some embodiments, the second generating unit may include: a third obtaining subunit configured to input the target style tag and the original text into a second model to obtain the target text output by the second model, wherein the second model is obtained based on training of the plurality of style tags and at least one second sample text pair corresponding to each style tag in the plurality of style tags, and each second sample text pair in the at least one second sample text pair includes the second original text and the second target text corresponding to the corresponding style tag.
In some embodiments, as shown in fig. 10, there is provided a model training apparatus 1000, the model being used for converting an original text into a text of a target style, the apparatus 1000 including: a second obtaining unit 1010 configured to obtain a sample data set, wherein the sample data set includes at least one sample text pair, each of the at least one sample text pair including an original sample text and a target sample text corresponding to the target style; a third obtaining unit 1020 configured to obtain, for each of the at least one sample text pair, a labeling sequence corresponding to the sample text pair, the labeling sequence including at least one operation tag respectively corresponding to at least one character of the original sample text in the sample text pair, the at least one operation tag including a retention tag indicating characters that need to be retained compared with the target sample text in the sample text pair and a modification tag including an insertion tag indicating characters that need to be inserted into the original sample text compared with the target sample text; a determining unit 1030 configured to determine the characters corresponding to the insertion tags in the labeling sequence of each of the at least one sample text pair as corpus texts, so as to construct a target style dictionary corresponding to the target style; and a first execution unit 1040 configured to perform the operations of the following sub-units for each sample text pair in the sample data set, the first execution unit 1040 including: a first input subunit 1041 configured to input the corpus texts in the target style dictionary and the original sample text and the target sample text in the sample text pair into the model, so as to obtain a labeling-sequence prediction result output by the model; and a first training subunit 1042 configured to train the model based on the labeling-sequence prediction result and the labeling sequence corresponding to the sample text pair.
The operations of the units 1010 to 1040 and the subunits 1041 and 1042 in the apparatus 1000 are similar to the operations of steps S701 to S705 of the model training method described above and are not repeated here.
In some embodiments, as shown in fig. 11, there is provided a model training apparatus 1100 including: a fourth obtaining unit 1110 configured to obtain a sample data set, where the sample data set includes a plurality of target style tags and, for each of the plurality of target style tags, at least one corresponding sample text pair, each sample text pair including an original sample text and a target sample text having the corresponding target style; and a second execution unit 1120 configured to perform the operations of the following sub-units for each sample text pair in the sample data set, the second execution unit 1120 including: a second input subunit 1121 configured to input the original sample text and the target sample text in the sample text pair, together with the corresponding target style tag, into a model so as to obtain a target text prediction result output by the model; and a second training subunit 1122 configured to train the model based on the target text prediction result and the target sample text in the sample text pair.
The operations of the unit 1110, the unit 1120, the subunit 1121, and the subunit 1122 in the apparatus 1100 are similar to the operations of steps S801 to S803 of the model training method described above and are not repeated here.
According to embodiments of the present disclosure, there is also provided an electronic device, a readable storage medium and a computer program product.
With reference to fig. 12, a block diagram of an electronic device 1200 that may be a server or a client of the present disclosure, which is an example of a hardware device that may be applied to aspects of the present disclosure, will now be described. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 12, the electronic device 1200 includes a computing unit 1201 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1202 or a computer program loaded from a storage unit 1208 into a Random Access Memory (RAM) 1203. In the RAM 1203, various programs and data required for the operation of the electronic device 1200 may also be stored. The computing unit 1201, the ROM 1202, and the RAM 1203 are connected to each other via a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
Various components in the electronic device 1200 are connected to the I/O interface 1205, including: an input unit 1206, an output unit 1207, a storage unit 1208, and a communication unit 1209. The input unit 1206 may be any type of device capable of inputting information to the electronic device 1200; the input unit 1206 may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a trackpad, a trackball, a joystick, a microphone, and/or a remote control. The output unit 1207 may be any type of device capable of presenting information, and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. The storage unit 1208 may include, but is not limited to, magnetic disks and optical disks. The communication unit 1209 allows the electronic device 1200 to exchange information/data with other devices over computer networks, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
The computing unit 1201 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1201 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The computing unit 1201 performs the various methods and processes described above, such as the text data processing method or model training method described above. For example, in some embodiments, the text data processing method or model training method described above may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1208. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1200 via the ROM 1202 and/or the communication unit 1209. When the computer program is loaded into the RAM 1203 and executed by the computing unit 1201, one or more steps of the text data processing method or model training method described above may be performed. Alternatively, in other embodiments, the computing unit 1201 may be configured to perform the text data processing method or model training method described above in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, which may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure can be achieved; no limitation is imposed herein.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the foregoing methods, systems, and apparatuses are merely exemplary embodiments or examples, and that the scope of the present disclosure is not limited by these embodiments or examples, but only by the granted claims and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalents thereof. Furthermore, the steps may be performed in an order different from that described in the present disclosure. Further, various elements of the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (10)

1. A text data processing method, comprising:
generating, based on an input text of a user, an original text for replying to the input text;
acquiring target style information, wherein the target style information comprises a target style tag and a target style dictionary, the target style tag is used for indicating a target style into which the original text is to be converted, the target style dictionary comprises at least one corpus text corresponding to the target style and reference information, the reference information comprises a first operation tag and a use probability corresponding to each corpus text in the at least one corpus text, the first operation tag is used for indicating whether an operation corresponding to the corresponding corpus text is insertion or deletion, and the use probability is determined based on an occurrence frequency of the corresponding corpus text in constructing the target style dictionary; and
generating a target text corresponding to the target style based on the original text and the target style information, comprising:
determining, based on the target style tag, a first model corresponding to the target style, the first model being obtained by training based on at least one first sample text pair and the target style dictionary and being constructed based on a Transformer language model, each first sample text pair comprising a first original text and a first target text corresponding to the target style;
acquiring a character sequence of the original text, wherein the character sequence comprises at least one character of the original text;
performing sequence labeling on the character sequence by using the first model based on the target style dictionary to obtain a labeling sequence, wherein the labeling sequence comprises at least one operation tag corresponding respectively to the at least one character, the at least one operation tag comprises a retention tag and an insertion tag, the retention tag is used for indicating that the character corresponding to the retention tag is to be retained, the insertion tag corresponds to one corpus text of the at least one corpus text and is used for indicating that the corresponding corpus text is to be inserted into the character sequence, and the performing sequence labeling on the character sequence by using the first model based on the target style dictionary to obtain the labeling sequence comprises:
inputting a corpus sequence consisting of the at least one corpus text and the reference information in the target style dictionary, together with the character sequence, into the first model, so as to perform sequence labeling on the character sequence by using the first model to obtain the labeling sequence; and
generating the target text based on the labeling sequence.
2. The method of claim 1, wherein the at least one character comprises a plurality of characters, and the at least one operation tag comprises the retention tag and at least one of the insertion tag and a deletion tag, the deletion tag being used for indicating that the character corresponding to the deletion tag is to be deleted from the character sequence.
3. A model training method, wherein the model is used for converting an original text into a text of a target style and is constructed based on a Transformer language model, the method comprising:
obtaining a sample data set, wherein the sample data set comprises at least one sample text pair, each sample text pair of the at least one sample text pair comprising an original sample text and a target sample text corresponding to the target style;
for each sample text pair in the at least one sample text pair, acquiring a labeling sequence corresponding to the sample text pair, wherein the labeling sequence comprises at least one operation tag corresponding respectively to at least one character of the original sample text in the sample text pair, the at least one operation tag comprises a retention tag and a modification tag, the retention tag is used for indicating characters of the original sample text that need to be retained compared with the target sample text in the sample text pair, the modification tag comprises an insertion tag, and the insertion tag is used for indicating characters that need to be inserted into the original sample text compared with the target sample text;
determining characters corresponding to the modification tags in the labeling sequence corresponding to each sample text pair in the at least one sample text pair as corpus texts, so as to construct a target style dictionary corresponding to the target style, wherein the target style dictionary further comprises reference information, the reference information comprises the corresponding modification tag and a use probability of each corpus text, and the use probability is determined based on an occurrence frequency of the corresponding corpus text in constructing the target style dictionary; and
for each sample text pair in the sample data set, performing the following operations:
inputting a corpus sequence consisting of the corpus texts and the reference information in the target style dictionary, the original sample text in the sample text pair, and the target sample text into the model, so as to perform sequence labeling on a character sequence corresponding to the original sample text by using the model, and acquiring a labeling sequence prediction result output by the model; and
training the model based on the labeling sequence prediction result and the labeling sequence corresponding to the sample text pair.
4. The method of claim 3, wherein the at least one character comprises a plurality of characters, the modification tag comprises at least one of the insertion tag and a deletion tag, and the deletion tag is used for indicating characters that need to be deleted from the original sample text compared with the target sample text.
5. The method of claim 3 or 4, wherein constructing the target style dictionary corresponding to the target style further comprises:
counting an occurrence frequency of the characters corresponding to the insertion tags in the labeling sequence corresponding to each sample text pair in the at least one sample text pair, to obtain at least one first character ordered by occurrence frequency;
constructing the target style dictionary based on a preset number of first characters with the highest occurrence frequency, and deleting the remaining first characters;
deleting the sample text pairs corresponding to the deleted first characters to update the sample data set; and
training the model based on the updated sample data set.
6. A text data processing apparatus comprising:
a first generation unit configured to generate, based on an input text of a user, an original text for replying to the input text;
a first obtaining unit configured to obtain target style information, wherein the target style information comprises a target style tag and a target style dictionary, the target style tag is used for indicating a target style into which the original text is to be converted, the target style dictionary comprises at least one corpus text corresponding to the target style and reference information, the reference information comprises a first operation tag and a use probability corresponding to each corpus text in the at least one corpus text, the first operation tag is used for indicating whether an operation corresponding to the corresponding corpus text is insertion or deletion, and the use probability is determined based on an occurrence frequency of the corresponding corpus text in constructing the target style dictionary; and
a second generation unit configured to generate a target text corresponding to the target style based on the original text and the target style information, the second generation unit comprising:
a first determination subunit configured to determine, based on the target style tag, a first model corresponding to the target style, the first model being obtained by training based on at least one first sample text pair and the target style dictionary and being constructed based on a Transformer language model, each first sample text pair comprising a first original text and a first target text corresponding to the target style;
a first acquisition subunit configured to acquire a character sequence of the original text, the character sequence comprising at least one character of the original text;
a second obtaining subunit configured to perform sequence labeling on the character sequence by using the first model based on the target style dictionary to obtain a labeling sequence, wherein the labeling sequence comprises at least one operation tag corresponding respectively to the at least one character, the at least one operation tag comprises a retention tag used for indicating that the character corresponding to the retention tag is to be retained and an insertion tag corresponding to one corpus text of the at least one corpus text and used for indicating that the corresponding corpus text is to be inserted into the character sequence, the second obtaining subunit being further configured to:
input a corpus sequence consisting of the at least one corpus text and the reference information in the target style dictionary, together with the character sequence, into the first model, so as to perform sequence labeling on the character sequence by using the first model to obtain the labeling sequence; and
a first generation subunit configured to generate the target text based on the labeling sequence.
7. The apparatus of claim 6, wherein the at least one character comprises a plurality of characters, and the at least one operation tag comprises the retention tag and at least one of the insertion tag and a deletion tag, the deletion tag being used for indicating that the character corresponding to the deletion tag is to be deleted from the character sequence.
8. A model training apparatus, wherein the model is used for converting an original text into a text of a target style and is constructed based on a Transformer language model, the apparatus comprising:
a second acquisition unit configured to acquire a sample data set, wherein the sample data set includes at least one sample text pair, each sample text pair of the at least one sample text pair including an original sample text and a target sample text corresponding to the target style;
a third obtaining unit configured to obtain, for each sample text pair of the at least one sample text pair, a labeling sequence corresponding to the sample text pair, the labeling sequence comprising at least one operation tag corresponding respectively to at least one character of the original sample text in the sample text pair, the at least one operation tag comprising a retention tag and a modification tag, the retention tag being used for indicating characters of the original sample text that need to be retained compared with the target sample text in the sample text pair, and the modification tag comprising an insertion tag used for indicating characters that need to be inserted into the original sample text compared with the target sample text;
a determining unit configured to determine characters corresponding to the modification tags in the labeling sequence corresponding to each sample text pair of the at least one sample text pair as corpus texts, so as to construct a target style dictionary corresponding to the target style, wherein the target style dictionary further comprises reference information, the reference information comprises the corresponding modification tag and a use probability of each corpus text, and the use probability is determined based on an occurrence frequency of the corresponding corpus text in constructing the target style dictionary; and
a first execution unit configured to perform, for each sample text pair in the sample data set, the operations of the following subunits, the first execution unit comprising:
a first input subunit configured to input a corpus sequence consisting of the corpus texts and the reference information in the target style dictionary, the original sample text in the sample text pair, and the target sample text into the model, so as to perform sequence labeling on a character sequence corresponding to the original sample text by using the model, and to acquire a labeling sequence prediction result output by the model; and
a first training subunit configured to train the model based on the labeling sequence prediction result and the labeling sequence corresponding to the sample text pair.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
10. A non-transitory computer readable storage medium storing computer instructions, wherein the computer instructions are used for causing a computer to perform the method of any one of claims 1-5.
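As an illustrative note on the generation step recited in claim 1, the following minimal sketch turns a labeling sequence back into the target text. It reuses the tag convention of the earlier sketch, namely a KEEP or DELETE tag per character with insertions attached before the character they precede; that convention is an assumption made for illustration and is not the claimed encoding.

```python
def apply_labels(original: str, labels, tail: str = "") -> str:
    """Emit the target text from a labeling sequence: retain or delete each
    character of the character sequence, splicing in the corpus texts that
    the insertion tags attach before their characters."""
    out = []
    for ch, (op, insert_before) in zip(original, labels):
        if insert_before:      # insertion tag: corpus text goes in first
            out.append(insert_before)
        if op == "KEEP":       # retention tag: keep this character
            out.append(ch)
        # op == "DELETE": deletion tag, the character is dropped
    out.append(tail)           # insertions after the final character
    return "".join(out)

# Hypothetical round trip in the labels format of the earlier sketch:
labels = [["KEEP", ""], ["DELETE", "X"], ["KEEP", ""]]
assert apply_labels("abc", labels, tail="!") == "aXc!"
```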
CN202211737328.2A 2022-12-30 2022-12-30 Text data processing method, model training method, device and medium Active CN115879469B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211737328.2A CN115879469B (en) 2022-12-30 2022-12-30 Text data processing method, model training method, device and medium

Publications (2)

Publication Number Publication Date
CN115879469A CN115879469A (en) 2023-03-31
CN115879469B true CN115879469B (en) 2023-10-03

Family

ID=85757775

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116151194B (en) * 2023-04-04 2023-07-07 Shanghai Enflame Technology Co., Ltd. Method, device, equipment and storage medium for generating Chinese universal language

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090031211A1 (en) * 2007-07-23 2009-01-29 Yitao Yao Programming extension for authoring style rules

Patent Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017004051A (en) * 2015-06-04 2017-01-05 日本電信電話株式会社 Rewriting rule acquisition device, method, and program
CN108304436A (en) * 2017-09-12 2018-07-20 深圳市腾讯计算机系统有限公司 The generation method of style sentence, the training method of model, device and equipment
CN108305612A (en) * 2017-11-21 2018-07-20 腾讯科技(深圳)有限公司 Text-processing, model training method, device, storage medium and computer equipment
CN111414732A (en) * 2019-01-07 2020-07-14 北京嘀嘀无限科技发展有限公司 Text style conversion method and device, electronic equipment and storage medium
CN111797597A (en) * 2019-04-01 2020-10-20 国际商业机器公司 Controllable style-based text conversion
CN111865752A (en) * 2019-04-23 2020-10-30 北京嘀嘀无限科技发展有限公司 Text processing device, method, electronic device and computer readable storage medium
CN112016271A (en) * 2019-05-30 2020-12-01 北京三星通信技术研究有限公司 Language style conversion model training method, text processing method and device
KR20210000599A (en) * 2019-06-25 2021-01-05 주식회사 엔씨소프트 Mehtod and apparatus for learning style transfer
CN110738026A (en) * 2019-10-23 2020-01-31 腾讯科技(深圳)有限公司 Method and device for generating description text
JP2021106017A (en) * 2020-09-21 2021-07-26 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Method for creating text, device, apparatus, and storage medium
KR20220044011A (en) * 2020-09-29 2022-04-06 아주대학교산학협력단 Method and system for text style transfer, and learning method for implementing same
CN112528605A (en) * 2020-11-11 2021-03-19 北京百度网讯科技有限公司 Text style processing method and device, electronic equipment and storage medium
CN113094490A (en) * 2021-05-13 2021-07-09 重庆度小满优扬科技有限公司 Session interaction method and device, electronic equipment and storage medium
CN113205811A (en) * 2021-05-25 2021-08-03 上海汽车集团股份有限公司 Conversation processing method and device and electronic equipment
CN113822064A (en) * 2021-06-22 2021-12-21 腾讯科技(深圳)有限公司 Text style migration method and device, electronic equipment and storage medium
CN113468857A (en) * 2021-07-13 2021-10-01 北京百度网讯科技有限公司 Method and device for training style conversion model, electronic equipment and storage medium
CN114490967A (en) * 2021-12-28 2022-05-13 北京百度网讯科技有限公司 Training method of dialogue model, dialogue method and device of dialogue robot and electronic equipment
CN114757188A (en) * 2022-05-20 2022-07-15 大连大学 Standard medical text rewriting method based on generation of confrontation network
CN115017288A (en) * 2022-06-17 2022-09-06 平安科技(深圳)有限公司 Model training method, model training device, equipment and storage medium
CN115186056A (en) * 2022-07-01 2022-10-14 Oppo广东移动通信有限公司 Text style migration method and device, electronic equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Martina Toshevska et al., "A Review of Text Style Transfer Using Deep Learning," IEEE Transactions on Artificial Intelligence, Vol. 3, No. 5, pp. 669-684. *
Yang Shuo, "Tagging Without Rewriting: A Probabilistic Model for Unpaired Sentiment and Style Transfer," Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, pp. 293-303. *
Xu Minzhang et al., "Text Style Transfer between Classical and Modern Chinese through Prompt-based Reinforcement Learning," World Wide Web, Vol. 26, No. 2, pp. 733-750. *
Zhiqiang Hu et al., "Text Style Transfer: A Review and Experimental Evaluation," ACM SIGKDD Explorations Newsletter, Vol. 24, No. 1, pp. 14-45. *
Chen Xiaolong, "Research and Implementation of Text Style Transfer Based on Deep Learning," China Master's Theses Full-text Database, Information Science and Technology, No. 5, I138-1698. *

Similar Documents

Publication Publication Date Title
CN113807440B (en) Method, apparatus, and medium for processing multimodal data using neural networks
CN112579909A (en) Object recommendation method and device, computer equipment and medium
CN112532748B (en) Message pushing method, device, equipment, medium and computer program product
CN116303962A (en) Dialogue generation method, training method, device and equipment for deep learning model
CN115879469B (en) Text data processing method, model training method, device and medium
CN115470381A (en) Information interaction method, device, equipment and medium
CN116521841A (en) Method, device, equipment and medium for generating reply information
CN116541536B (en) Knowledge-enhanced content generation system, data generation method, device, and medium
CN115862031B (en) Text processing method, neural network training method, device and equipment
CN116361547A (en) Information display method, device, equipment and medium
CN114219046B (en) Model training method, matching method, device, system, electronic equipment and medium
CN114547270B (en) Text processing method, training method, device and equipment for text processing model
CN115964462A (en) Dialogue content processing method, and training method and device of dialogue understanding model
CN113722594B (en) Training method and device of recommendation model, electronic equipment and medium
CN115631251A (en) Method, apparatus, electronic device, and medium for generating image based on text
CN114201043A (en) Content interaction method, device, equipment and medium
CN115578451B (en) Image processing method, training method and device of image processing model
CN114861658B (en) Address information analysis method and device, equipment and medium
CN113722534B (en) Video recommendation method and device
CN114117046B (en) Data processing method, device, electronic equipment and medium
CN115879468B (en) Text element extraction method, device and equipment based on natural language understanding
CN113836939B (en) Text-based data analysis method and device
CN116756367A (en) Method and device for generating meeting summary, electronic equipment and medium
CN115617968A (en) Dialogue method and device, equipment and medium
CN116842156A (en) Data generation method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant