CN113850291B

CN113850291B - Text processing and model training method, device, equipment and storage medium

Info

Publication number: CN113850291B
Application number: CN202110947683.1A
Authority: CN
Inventors: 李若铭; 潘政林
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-08-18
Filing date: 2021-08-18
Publication date: 2023-11-24
Anticipated expiration: 2041-08-18
Also published as: CN113850291A

Abstract

The disclosure provides a text processing and model training method, a device, equipment and a storage medium, relates to the technical field of computers, and in particular relates to the artificial intelligence fields of speech synthesis, deep learning, natural language processing and the like. The text processing method comprises the following steps: detecting characters in the text; extracting gender-related text of the character from the text, wherein the gender-related text is text containing gender information of the character; and processing the gender-related text to determine the gender of the character. The present disclosure may determine the gender of a character in text.

Description

Text processing and model training method, device, equipment and storage medium

Technical Field

The present disclosure relates to the field of computer technology, and in particular, to the field of artificial intelligence such as speech synthesis, deep learning, and natural language processing, and more particularly, to a method, apparatus, device, and storage medium for text processing and model training.

Background

An audiobook is a derivative form of the traditional book: a book with a playback function that uses audio and magnetic recording media as its carrier and that developed alongside audio and magnetic recording technology. The most common kind of audiobook is the audio novel.

In the related art, an audio novel voices the dialogue of all characters with the same speaker.

Disclosure of Invention

The present disclosure provides a text processing and model training method, apparatus, device and storage medium.

According to an aspect of the present disclosure, there is provided a text processing method including: detecting a character in a text; extracting gender-related text of the character from the text, wherein the gender-related text is text containing gender information of the character; and processing the gender-related text to determine the gender of the character.

According to another aspect of the present disclosure, there is provided a training method for a gender prediction model, the gender prediction model being used to determine the gender of a character in a text, the method comprising: obtaining a training sample, the training sample comprising gender-related text of a character in a training text and tag information of the gender-related text, wherein the tag information identifies the gender corresponding to the gender-related text; and training the gender prediction model using the training sample.

According to another aspect of the present disclosure, there is provided a text processing apparatus including: a detection module for detecting a character in a text; an extraction module for extracting gender-related text of the character from the text, wherein the gender-related text is text containing gender information of the character; and a determining module for processing the gender-related text to determine the gender of the character.

According to another aspect of the present disclosure, there is provided a training apparatus for a gender prediction model, the gender prediction model being used to determine the gender of a character in a text, the apparatus comprising: an acquisition module for obtaining a training sample, the training sample comprising gender-related text of a character in a training text and tag information of the gender-related text, wherein the tag information identifies the gender corresponding to the gender-related text; and a training module for training the gender prediction model using the training sample.

According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the above aspects.

According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method according to any one of the above aspects.

According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method according to any of the above aspects.

According to the technical solution of the present disclosure, the gender of a character in a text can be determined.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;

FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;

FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure;

FIG. 4 is a schematic diagram according to a fourth embodiment of the present disclosure;

FIG. 5 is a schematic diagram according to a fifth embodiment of the present disclosure;

FIG. 6 is a schematic diagram according to a sixth embodiment of the present disclosure;

FIG. 7 is a schematic diagram according to a seventh embodiment of the present disclosure;

FIG. 8 is a schematic diagram according to an eighth embodiment of the present disclosure;

FIG. 9 is a schematic diagram of an electronic device for implementing either the text processing method or the training method of the gender prediction model according to the embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In the related art, an audio novel voices the dialogue of all characters with the same speaker. However, if the dialogue of each character is voiced by a speaker of the appropriate gender, the playback effect of the audiobook can be improved, and with it the user experience.

In order to improve the playing effect of the audio reading material, the present disclosure provides the following embodiments.

Fig. 1 is a schematic diagram of a first embodiment of the present disclosure, which provides a text processing method, including:

101. Detect characters in the text.

102. Extract gender-related text of the character from the text, wherein the gender-related text is text containing gender information of the character.

103. Process the gender-related text to determine the gender of the character.

The text refers to the text of an audiobook; taking an audio novel as an example, the text is the text of the novel. This embodiment does not limit the subject matter, field, style, form, length, and the like of the novel text. It should be understood that audiobooks are not limited to audio novels and may also be audio news, audio scripts, audio learning resources, and so on.

A character (also called a role) is a speaker in the text. Taking a novel text as an example, in the sentence A says: "The weather is good today", A is a person's name, and A is the character.

This embodiment may be executed by a text processing apparatus located in an electronic device. The electronic device may be a cloud device, a server device, a client device, and so on. The specific form of the apparatus is not limited: it may be hardware, software, or a combination of the two. In software form, it may be a web application (web APP), a mobile application (APP, such as the Baidu mobile app), a system application (OS APP, such as DuerOS), and the like. The client device may also be called a terminal device, and may include a mobile device (such as a mobile phone or a tablet computer), a wearable device (such as a smart watch or a smart bracelet), a smart home device (such as a smart television or a smart speaker), and the like.

As shown in fig. 2, a character detection model may be used to process the text to detect the characters in it.

The input of the character detection model is the text, and the output is the character words, such as the names of the characters. For example, the character detection model can detect A, B, and so on in the text, where A and B each represent a person's name.

The character detection model may be a deep neural network model, which may be trained using various related techniques, and is not described in detail herein.

After the characters in the text are detected, as shown in fig. 2, character-related text can be obtained by keyword search, and the gender-related text of each character can then be obtained from the character-related text, again by keyword search.

For example, if character A is detected, text content that includes A may be regarded as character-related text of A. Suppose one text is "A goes from the head to the head", another is "A wears a high ponytail", and a third is "A is playing basketball with a partner". Since all three texts involve A, all three are character-related texts of A.

After the character-related text is obtained, the gender-related text may be obtained from the character-related text based on a preset gender-related keyword (or referred to as term data).

Further, the gender-related text may include gender word text and reference word text. Gender word text is text containing gender words, and reference word text is text containing reference words. Gender words may include explicit gender words, such as "male" and "female", and implicit gender words, such as "wears a high ponytail" in the example above. The reference words include "he" and "she".
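The keyword-based extraction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the keyword lists, the period-based sentence splitter, the whole-word matching rule, and all function names are assumptions introduced here.

```python
from typing import Dict, List

# Hypothetical keyword lists; the source names only a few example words.
GENDER_WORDS = ["male", "female", "ponytail", "pregnant"]
REFERENCE_WORDS = ["he", "she"]


def split_sentences(text: str) -> List[str]:
    """Naive splitter on periods (an assumption made for illustration)."""
    return [s.strip() for s in text.split(".") if s.strip()]


def contains_word(sentence: str, word: str) -> bool:
    """Whole-word, case-insensitive match, so that 'he' does not match 'the'."""
    tokens = [t.strip(",;:!?\"'").lower() for t in sentence.split()]
    return word.lower() in tokens


def extract_gender_related(text: str, character: str) -> Dict[str, List[str]]:
    """Return the gender word texts and reference word texts of one character."""
    # Step 1: character-related text = sentences that mention the character.
    related = [s for s in split_sentences(text) if contains_word(s, character)]
    # Step 2: filter the character-related text by the preset keywords.
    return {
        "gender_word_texts": [
            s for s in related if any(contains_word(s, w) for w in GENDER_WORDS)
        ],
        "reference_word_texts": [
            s for s in related if any(contains_word(s, w) for w in REFERENCE_WORDS)
        ],
    }
```

A real system would need a proper sentence segmenter and a much larger term list; the two-step filtering order (character first, gender keywords second) is the part taken from the description.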

In some embodiments, the gender-related text comprises gender word text and reference word text, and processing the gender-related text to determine the gender of the character comprises: performing prediction on the gender word text using a gender model to determine a first gender; performing prediction on the reference word text using a reference model to determine a second gender; and, if the first gender and the second gender are the same, determining that shared gender to be the gender of the character.

As shown in fig. 2, the gender word text may be processed by the gender model to determine a first gender, and the reference word text may be processed by the reference model to determine a second gender. A comparison module may then determine whether the first gender and the second gender are the same. If they are the same, for example both are female, the gender of the character is determined to be female. Conversely, if they differ, for example one is male and the other is female, the first gender and the second gender may be sent to a manual processing module so that the gender of the character is labeled manually.

By comparing the gender predictions output by the gender model and the reference model, and using the prediction as the gender of the character only when the two agree, the accuracy of the determined gender can be improved.
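The comparison step can be sketched in a few lines. The function name and the use of `None` to signal "send to manual labeling" are assumptions for illustration; the source only states that disagreeing results go to a manual processing module.

```python
from typing import Optional


def decide_gender(first_gender: str, second_gender: str) -> Optional[str]:
    """Return the agreed gender, or None when the two models disagree.

    A None result stands in for handing the case to manual labeling.
    """
    if first_gender == second_gender:
        return first_gender
    return None
```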

Further, there are multiple pieces of gender word text, the gender information corresponding to each gender word text includes gender scores for the different genders, and determining the first gender based on the gender information corresponding to the gender word text comprises: aggregating the gender scores of the multiple gender word texts to obtain a total score for each gender; and taking the gender with the highest total score as the first gender.

After the gender word text is obtained, a gender model can be used to process the gender word text to determine the gender scores for the different genders.

The input of the gender model is gender word text, and the output is the gender information corresponding to that gender word text.

The gender information may be a probability value for each gender, i.e., a probability value for male and a probability value for female.

The probability value may be used directly as the gender score, or it may be converted into a gender score; for example, a probability value of 10% may be converted into a score of 10 points.

The aggregation may be addition or another operation.

As shown in fig. 3, taking gender scores as the gender information and addition as the aggregation, each gender word text may be processed by the gender model to obtain its gender scores; the gender scores of all gender word texts are then added per gender to obtain a total score for each gender, and the gender with the highest total score is taken as the first gender.

Generally, a character, such as character A, corresponds to multiple gender word texts, for example the two gender word texts "A wears a high ponytail" and "A is pregnant". Each of these gender word texts is processed by the gender model to obtain its score for each gender, i.e., a male score and a female score for each gender word text.

After the score of each gender word text for each gender is obtained, the scores of the multiple gender word texts are added per gender to determine the total score of that gender.

For example, suppose there are N gender word texts for character A, $S_{i,1}$ denotes the male gender score of the i-th text, and $S_{i,2}$ denotes the female gender score of the i-th text. The total score of character A for male is $\sum_{i=1}^{N} S_{i,1}$, and the total score for female is $\sum_{i=1}^{N} S_{i,2}$. The gender with the highest total score is then taken as the first gender of character A; for example, if the calculation shows that the female total score of character A is greater than the male total score, the first gender of character A is female.

By adopting the gender model, the gender information corresponding to the gender word text can be accurately obtained.

Further, by summarizing the gender scores corresponding to the plurality of gender word texts of the same gender to obtain the total score of the corresponding gender, and taking the gender with the highest total score as the first gender, the determination accuracy of the first gender can be improved.
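The score aggregation can be sketched as below, assuming addition as the aggregation and a dictionary of per-label scores as the model output for each text. The function names and data layout are illustrative assumptions. Because the labels are arbitrary strings, the same helper would also cover reference scores keyed by "he" and "she".

```python
from typing import Dict, List


def aggregate_scores(per_text_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Add up the scores of all texts, label by label."""
    totals: Dict[str, float] = {}
    for scores in per_text_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    return totals


def top_label(totals: Dict[str, float]) -> str:
    """Return the label with the highest total score."""
    return max(totals, key=lambda label: totals[label])


# Two gender word texts for one character, each scored by a (hypothetical) gender model.
scores_for_a = [
    {"male": 0.2, "female": 0.8},
    {"male": 0.4, "female": 0.6},
]
first_gender = top_label(aggregate_scores(scores_for_a))
```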

The above describes the gender model; the reference model may similarly aggregate the prediction results for multiple reference word texts to determine the second gender.

In some embodiments, there are multiple pieces of reference word text, the reference information corresponding to each reference word text includes reference scores for the different reference words, and determining the second gender based on the reference information corresponding to the reference word text comprises: aggregating the reference scores of the multiple reference word texts to obtain a total score for each reference word; and taking the gender corresponding to the reference word with the highest total score as the second gender.

After the reference text is obtained, the reference text can be processed by adopting a reference model to determine the reference scores corresponding to different reference words.

The input of the reference model is a reference word text, and the output is reference information corresponding to the reference word text.

The reference information may be a probability value for each reference word. The reference words here are those that distinguish gender, namely "he" and "she", so the reference model yields a probability value for "he" and a probability value for "she".

The probability value may be used directly as the reference score, or it may be converted into a reference score; for example, a probability value of 10% may be converted into a score of 10 points.

As shown in fig. 4, taking reference scores as the reference information and addition as the aggregation, each reference word text may be processed by the reference model to obtain its reference scores; the reference scores of all reference word texts are then added per reference word to obtain a total score for each reference word, and the gender corresponding to the reference word with the highest total score is taken as the second gender.

Generally, a character, such as character A, corresponds to multiple reference word texts, for example the two reference word texts "he is playing basketball" and "he is working". Each of these reference word texts can be processed by the reference model to obtain its score for each reference word, i.e., a score for "he" and a score for "she" for each reference word text.

After the score of each reference word text for each reference word is obtained, the scores of the multiple reference word texts are added per reference word to determine the total score of that reference word.

For example, suppose there are M reference word texts for character B, $S_{i,1}$ denotes the reference score of the i-th text for "he", and $S_{i,2}$ denotes the reference score of the i-th text for "she". The total score of character B for "he" is $\sum_{i=1}^{M} S_{i,1}$, and the total score for "she" is $\sum_{i=1}^{M} S_{i,2}$. The gender corresponding to the reference word with the highest total score is then taken as the second gender of character B; for example, if the calculation shows that the total score of character B for "he" is greater than that for "she", then, since "he" corresponds to male, the second gender of character B is male.

By adopting the reference model, the reference information corresponding to the reference word text can be accurately obtained.

Further, by aggregating the reference scores of the multiple reference word texts per reference word to obtain the total score of each reference word, and taking the gender corresponding to the reference word with the highest total score as the second gender, the accuracy of the second gender can be improved.

In some embodiments, the gender model comprises an input layer, a hidden layer, an attention layer and a classification layer, and performing prediction on the gender word text using the gender model to obtain the gender information corresponding to the gender word text comprises: converting the gender word text into an input vector using the input layer; converting the input vector into a hidden layer vector using the hidden layer; converting the hidden layer vector into an encoding vector using the attention layer, wherein the parameters of the attention layer include attention weights, and the attention weight at a position where the character appears is larger than the attention weight at a position where the character does not appear; and classifying the encoding vector using the classification layer to obtain the gender information corresponding to the gender word text.

The hidden layer may employ a pre-trained language model, such as the Bidirectional Encoder Representations from Transformers (BERT) model.

Through continuous processing of each layer of the gender model, gender information corresponding to the gender word text can be obtained.

The attention layer processes the hidden layer vector with attention weights. Because the attention weight at a position where the character appears is larger than the attention weight at a position where the character does not appear, the attention layer pays more attention to the positions where the character appears, which improves the accuracy of the determined gender information. The attention weights may be determined during the training phase; for how they are determined, see the related description of the training process.

By adopting the attention layer, the gender model can pay more attention to the positions where the character appears, so that the accuracy of the gender information can be improved.
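One way to picture an attention layer that favors character positions is sketched below. The source does not specify how the weights are computed, so the additive bias applied before the softmax, the parameter `bias`, and the function names are all assumptions. Note that a bias only pushes weights upward at character positions; it guarantees a larger weight there only when the raw scores are otherwise comparable.

```python
import math
from typing import List, Tuple


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden: List[List[float]],   # one hidden layer vector per token
    scores: List[float],         # raw attention score per token
    is_character: List[bool],    # True where the character appears
    bias: float = 1.0,           # hypothetical boost for character positions
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, encoding vector)."""
    biased = [s + (bias if c else 0.0) for s, c in zip(scores, is_character)]
    weights = softmax(biased)
    dim = len(hidden[0])
    # The encoding vector is the attention-weighted sum of the hidden vectors.
    encoded = [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]
    return weights, encoded
```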

The reference model may adopt the same structure as shown above. Its processing of the reference word text is similar to the gender model's processing of the gender word text, and is not described in detail herein.

Further, as shown in fig. 5, for the gender model, the classifying layer includes a text classifying layer and a name classifying layer, and the classifying the encoding vector by using the classifying layer to obtain gender information corresponding to the gender word text includes: classifying the coded vectors by adopting the text classification layer to obtain a first classification result; classifying the coded vectors by adopting the name classification layer to obtain a second classification result; and fusing the first classification result and the second classification result to obtain gender information corresponding to the gender word text.

Since both the gender word text and the reference word text are character-related text, and character-related text generally contains the name of the character, the gender word text generally contains the name of the character. Gender word text can therefore be classified not only at the text level but also at the name level. For example, for a piece of text such as "small just right", the character name is more likely to be a male name, so the name classification layer assigns the text a higher probability of male; the text classification layer, however, is trained on context, so it may assign the same text a higher probability of female.

The first classification result and the second classification result may be fused by weighted addition. For example, suppose the first classification result includes a first male score S11 and a first female score S12, and the second classification result includes a second male score S21 and a second female score S22. The first and second male scores are weighted and added to obtain the male score finally output by the gender model, i.e., male score = k1 × S11 + k2 × S21; the first and second female scores are weighted and added to obtain the female score finally output by the gender model, i.e., female score = k1 × S12 + k2 × S22. Here k1 and k2 are weights, which may be empirical values set according to actual requirements.

The classification layer comprises a text classification layer and a name classification layer, so that fusion of a text classification result and a name classification result can be realized, and the accuracy of gender information is further improved.
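The weighted fusion of the two classification results can be sketched as follows. The weights k1 = 0.7 and k2 = 0.3 are placeholder values chosen for illustration only; the source says they are empirical values and gives no numbers.

```python
from typing import Dict


def fuse_results(
    text_result: Dict[str, float],   # first classification result (text level)
    name_result: Dict[str, float],   # second classification result (name level)
    k1: float = 0.7,                 # hypothetical weight of the text classifier
    k2: float = 0.3,                 # hypothetical weight of the name classifier
) -> Dict[str, float]:
    """Weighted addition: fused score = k1 * text score + k2 * name score."""
    return {
        gender: k1 * text_result[gender] + k2 * name_result[gender]
        for gender in text_result
    }


# The name looks male, but the context points to female.
fused = fuse_results(
    {"male": 0.2, "female": 0.8},
    {"male": 0.9, "female": 0.1},
)
```

With these placeholder weights the context outweighs the name, which matches the case described above where the two layers disagree.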

In some embodiments, the method may further comprise: acquiring a voice corresponding to the gender of the character; and playing the dialogue content of the character using that voice.

For example, for the same dialogue content, voices of different speakers, such as a male voice and a female voice, may be recorded in advance. If the character is then determined to be male, the male voice is retrieved from the voice library and that piece of dialogue content is played using the male voice.

Alternatively, a voice synthesis technique may be used to perform a voice synthesis process based on gender and dialogue content to obtain a voice of the corresponding gender and play the voice.

By playing the dialogue content with a voice of the gender corresponding to the character, a voice of the appropriate gender is used and the playback effect is improved.
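The pre-recorded variant can be sketched as a lookup keyed by dialogue and gender. The library layout, the dialogue identifier, and the file paths are hypothetical; the source only says that voices of different speakers are recorded in advance for the same dialogue content.

```python
from typing import Dict, Tuple

# Hypothetical voice library: (dialogue id, gender) -> path of a recorded clip.
VoiceLibrary = Dict[Tuple[str, str], str]


def select_voice(library: VoiceLibrary, dialogue_id: str, gender: str) -> str:
    """Pick the recording of this dialogue made by a speaker of the given gender."""
    return library[(dialogue_id, gender)]


library: VoiceLibrary = {
    ("line_001", "male"): "voices/line_001_male.wav",
    ("line_001", "female"): "voices/line_001_female.wav",
}
clip = select_voice(library, "line_001", "male")
```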

In the embodiment of the disclosure, by determining the gender of the character in the text, the voice corresponding to the gender can be adopted based on the gender, so that the playing effect of the audio reading material can be improved, and the user experience is improved.

The above description relates to a gender prediction model, which may be trained in advance, and which may be trained as follows.

Fig. 6 is a schematic diagram of a sixth embodiment of the present disclosure, where the present embodiment provides a training method of a gender prediction model, the method including:

601. Obtain a training sample, the training sample comprising gender-related text of a character in a training text and tag information of the gender-related text, wherein the tag information identifies the gender corresponding to the gender-related text.

602. Train a gender prediction model using the training sample.

The gender prediction model may be used in the text processing described above, i.e., the gender prediction model is used to determine the gender of a character in text.

The gender prediction model may include: the gender model and the reference model, and the corresponding gender-related text is gender word text and reference word text respectively.

The training samples of the gender model may include gender word text and its corresponding tag information, i.e., one training sample may be represented as <gender word text, tag information>, and the gender model is obtained by training on a large number of such samples.

The training samples of the reference model may include reference word text and its corresponding tag information, i.e., one training sample may be represented as <reference word text, tag information>, and the reference model is obtained by training on a large number of such samples.

Taking novel text as an example, a large number of novel texts can be collected, a character detection model is applied to them to detect the characters in them, and the gender word texts and reference word texts of those characters are then obtained by keyword search.

After the gender word text and the reference word text are obtained, the corresponding tag information can be obtained by manual labeling. For example, for the gender word text "A wears a high ponytail", if tag values 1 and 0 represent female and male respectively, the tag information of this gender word text may be labeled as 1.

For the reference text, if "he" is included in the reference text, the corresponding tag information may be marked with 0, i.e., the corresponding gender is male.
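The <text, tag information> samples can be represented as in the sketch below, using the 1 = female, 0 = male convention from the example. The source describes manual labeling; the rule-based helper only mirrors the "he" → 0 example, and its handling of "she" and of ambiguous texts is an assumption, as are all names.

```python
from dataclasses import dataclass
from typing import Optional

FEMALE, MALE = 1, 0  # tag values used in the example above


@dataclass
class TrainingSample:
    text: str   # gender word text or reference word text
    label: int  # tag information: 1 for female, 0 for male


def label_reference_text(text: str) -> Optional[int]:
    """Suggest a tag for a reference word text, or None if it is unclear."""
    tokens = {t.strip(",.;:!?\"'").lower() for t in text.split()}
    has_he, has_she = "he" in tokens, "she" in tokens
    if has_he == has_she:
        return None  # neither or both pronouns: leave to manual labeling
    return MALE if has_he else FEMALE


sample = TrainingSample("A wears a high ponytail", FEMALE)
```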

The gender model and the reference model may both be deep neural network models.

In some embodiments, the gender-related text comprises gender word text, the gender prediction model comprises a gender model, and the gender model comprises an input layer, a hidden layer, an attention layer and a classification layer. The training sample further includes an attention identifier corresponding to the character, and training the gender prediction model using the training sample comprises: converting the gender word text into an input vector using the input layer; converting the input vector into a hidden layer vector using the hidden layer; determining an attention weight of the attention layer based on the hidden layer vector and the attention identifier, and converting the hidden layer vector into an encoding vector using the attention layer with that attention weight; classifying the encoding vector using the classification layer to determine gender prediction information corresponding to the gender-related text; and constructing a loss function based on the gender prediction information and the tag information, and training the gender model based on the loss function.

Further, the classifying layer includes a text classifying layer and a name classifying layer, and the classifying the encoding vector by using the classifying layer to determine gender prediction information corresponding to the gender-related text includes: classifying the coded vectors by adopting the text classification layer to obtain a first classification result; classifying the coded vectors by adopting the name classification layer to obtain a second classification result; and fusing the first classification result and the second classification result to obtain gender prediction information corresponding to the gender word text.

For example, during training, suppose a gender word text is "A wears a high ponytail, B is A's dad". If the current character is character A, character A may be given an attention identifier different from that of the other words: for example, the attention identifier of character A is set to 10, while the attention identifiers of other words, such as "junior" and "B", are set to 8. These different attention identifiers control the attention weights of the attention layer so that it focuses more on the current character: a word with a larger attention identifier also receives a larger attention weight. In the example above, the attention weight of "A" is larger than that of the other words, so the attention layer pays more attention to character "A".

The structure of the gender model can be seen in fig. 5, which corresponds to the prediction stage. Unlike the prediction stage, the training stage requires a loss function to be constructed. The form of the loss function can be set as needed. Based on the loss function, the model parameters are adjusted until the loss function converges or a preset number of iterations is reached, and the model parameters at the time the end condition is met are used as the final model.
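The two end conditions (loss convergence or a preset iteration count) can be sketched as follows. This is not the gender model itself: a toy logistic classifier with a cross-entropy loss stands in for it, and the function name `train_until_done`, the learning rate, the tolerance and the sample data are all hypothetical.

```python
import math

def train_until_done(samples, labels, lr=0.5, max_iters=1000, tol=1e-6):
    """Gradient descent on a toy logistic classifier with two end conditions."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    prev_loss = float("inf")
    n = len(samples)
    for iteration in range(1, max_iters + 1):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            # Cross-entropy between the predicted probability and the tag (0 or 1).
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1.0 - p + 1e-12)
            for j, xj in enumerate(x):
                grad_w[j] += (p - y) * xj
            grad_b += p - y
        loss /= n
        # End condition 1: the loss has converged.
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    # End condition 2 (loop exhausted): the preset iteration count was reached.
    # `loss` is the value measured at the start of the last iteration.
    return weights, bias, loss, iteration

# Hypothetical two-feature samples with binary tags.
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
labels = [1, 0, 1, 0]
final_weights, final_bias, final_loss, iterations = train_until_done(samples, labels)
print(final_loss, iterations)
```

The parameters returned when either condition is met play the role of the final model.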

Further, the classification layer comprises a text classification layer and a name classification layer, so that fusion of a text classification result and a name classification result can be realized, and the accuracy of gender information is further improved.
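The disclosure does not fix a particular fusion rule. As a minimal sketch, assuming that both classification layers output one probability per gender and that fusion is a weighted average, the fusion step might look like the following. The function name `fuse_classification_results`, the parameter `text_weight` and the sample probabilities are hypothetical.

```python
def fuse_classification_results(text_probs, name_probs, text_weight=0.5):
    """Fuse two per-gender probability dicts by a weighted average (assumed rule)."""
    if text_probs.keys() != name_probs.keys():
        raise ValueError("both results must cover the same genders")
    fused = {
        gender: text_weight * text_probs[gender]
        + (1.0 - text_weight) * name_probs[gender]
        for gender in text_probs
    }
    total = sum(fused.values())
    # Renormalize so the fused values again sum to 1.
    return {gender: value / total for gender, value in fused.items()}

# Hypothetical outputs of the text classification layer and the name classification layer.
first_result = {"male": 0.8, "female": 0.2}
second_result = {"male": 0.6, "female": 0.4}
fused_result = fuse_classification_results(first_result, second_result)
print(fused_result)
```

Other fusion rules, such as multiplying the two distributions or learning the weight, would fit the same interface.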

In this embodiment, gender-related text is obtained from the training text together with its tag information, and the gender prediction model is trained on the gender-related text and the tag information. The trained gender prediction model can then predict the gender of characters in a text, and a speaker of the corresponding gender can be used to voice the dialogue content of each character, which improves the voice playback effect and the user experience.

Fig. 7 is a schematic view of a seventh embodiment of the present disclosure, which provides a text processing apparatus. As shown in fig. 7, the apparatus 700 includes: a detection module 701, an extraction module 702 and a determination module 703.

The detection module 701 is used for detecting characters in the text; the extracting module 702 is configured to extract, from the text, a gender-related text of the character, where the gender-related text is a text containing gender information of the character; the determining module 703 is configured to process the gender-related text to determine the gender of the character.

In some embodiments, the gender-related text comprises gender word text and reference word text, and the determining module 703 includes a first prediction unit, a second prediction unit, and a determining unit.

The first prediction unit is used for performing prediction processing on the gender word text by adopting a gender model so as to obtain gender information corresponding to the gender word text, and determining a first gender based on the gender information corresponding to the gender word text; the second prediction unit is used for performing prediction processing on the reference word text by adopting a reference model so as to obtain gender information corresponding to the reference word text, and determining a second gender based on the gender information corresponding to the reference word text; and the determining unit is used for determining that the gender of the character is the same gender if the first gender and the second gender are the same.
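The agreement rule applied by the determining unit can be sketched as below. The passage only covers the case in which the two genders match; returning `None` when they differ is an assumption of this sketch, as is the function name `determine_gender`.

```python
from typing import Optional

def determine_gender(first_gender: str, second_gender: str) -> Optional[str]:
    """Return the shared gender when both predictions agree.

    The disagreement case is not specified in this passage, so this
    sketch simply returns None for it (an assumption).
    """
    if first_gender == second_gender:
        return first_gender
    return None

print(determine_gender("female", "female"))
print(determine_gender("female", "male"))
```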

In some embodiments, there are a plurality of pieces of gender word text, the gender information corresponding to each gender word text includes gender scores corresponding to different genders, and the first prediction unit is specifically configured to: sum the gender scores of the plurality of gender word texts for each gender to obtain a total score per gender; and take the gender with the highest total score as the first gender.
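A minimal sketch of this aggregation follows, assuming each gender word text yields a dictionary of per-gender scores. The function name `first_gender_from_scores` and the sample scores are hypothetical.

```python
from collections import defaultdict

def first_gender_from_scores(per_text_scores):
    """Sum the scores of each gender over all texts and pick the highest total."""
    totals = defaultdict(float)
    for scores in per_text_scores:
        for gender, score in scores.items():
            totals[gender] += score
    best = max(totals, key=totals.get)
    return best, dict(totals)

# Hypothetical gender scores for three gender word texts of one character.
gender_scores = [
    {"male": 0.9, "female": 0.1},
    {"male": 0.4, "female": 0.6},
    {"male": 0.7, "female": 0.3},
]
first_gender, gender_totals = first_gender_from_scores(gender_scores)
print(first_gender, gender_totals)
```

Summing across texts lets several moderately confident predictions outweigh a single contradictory one, as the second text in the sample shows.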

In some embodiments, there are a plurality of pieces of reference word text, the gender information corresponding to each reference word text includes reference scores corresponding to different reference words, and the second prediction unit is specifically configured to: sum the reference scores of the plurality of reference word texts for each reference word to obtain a total score per reference word; and take the gender corresponding to the reference word with the highest total score as the second gender.
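The same aggregation can be sketched for reference words, with one extra step that maps the winning reference word to a gender. The mapping `REFERENCE_WORD_TO_GENDER`, the function name `second_gender_from_scores` and the sample scores are assumptions for illustration only.

```python
from collections import defaultdict

# Assumed mapping from reference words (pronouns) to genders.
REFERENCE_WORD_TO_GENDER = {"he": "male", "she": "female"}

def second_gender_from_scores(per_text_scores):
    """Sum the scores of each reference word, then map the best word to a gender."""
    totals = defaultdict(float)
    for scores in per_text_scores:
        for word, score in scores.items():
            totals[word] += score
    best_word = max(totals, key=totals.get)
    return REFERENCE_WORD_TO_GENDER[best_word], dict(totals)

# Hypothetical reference scores for three reference word texts of one character.
reference_scores = [
    {"he": 0.8, "she": 0.2},
    {"he": 0.3, "she": 0.7},
    {"he": 0.9, "she": 0.1},
]
second_gender, reference_totals = second_gender_from_scores(reference_scores)
print(second_gender, reference_totals)
```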

In some embodiments, the gender model comprises an input layer, a hidden layer, an attention layer and a classification layer, and the first prediction unit is specifically configured to: convert the gender word text into an input vector using the input layer; convert the input vector into a hidden layer vector using the hidden layer; convert the hidden layer vector into an encoded vector using the attention layer, wherein the parameters of the attention layer comprise attention weights, and the attention weight corresponding to a character appearance position is larger than the attention weight corresponding to a non-character appearance position; and classify the encoded vector using the classification layer to obtain gender information corresponding to the gender word text.
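The last two steps of this forward pass (attention layer, then classification layer) can be sketched in plain Python. The sketch assumes that the attention layer forms the encoded vector as a weighted sum of hidden layer vectors and that the classification layer is a linear map followed by a softmax; neither detail is stated in the disclosure, and all names and numbers below are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_encode(hidden_vectors, weights):
    """Encoded vector as the attention-weighted sum of hidden layer vectors."""
    dim = len(hidden_vectors[0])
    return [
        sum(w * h[d] for w, h in zip(weights, hidden_vectors))
        for d in range(dim)
    ]

def classify(encoded, class_matrix):
    """Linear classification layer followed by softmax (assumed form)."""
    logits = [sum(c * e for c, e in zip(row, encoded)) for row in class_matrix]
    return softmax(logits)

# Two token positions: the first is the character position, so it gets the larger weight.
hidden_vectors = [[1.0, 0.0], [0.0, 1.0]]
position_weights = [0.75, 0.25]
encoded_vector = attention_encode(hidden_vectors, position_weights)

# Hypothetical classifier with one row per gender.
class_matrix = [[1.0, 0.0], [0.0, 1.0]]
gender_probs = classify(encoded_vector, class_matrix)
print(encoded_vector, gender_probs)
```

Because the character position carries the larger weight, its hidden layer vector dominates the encoded vector and therefore the classification.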

In some embodiments, the classification layer includes a text classification layer and a name classification layer, and the first prediction unit is further specifically configured to: classify the encoded vector using the text classification layer to obtain a first classification result; classify the encoded vector using the name classification layer to obtain a second classification result; and fuse the first classification result and the second classification result to obtain gender information corresponding to the gender word text.

In some embodiments, the apparatus 700 further comprises: an acquisition module for acquiring the voice corresponding to the gender of the character; and a playing module for playing the dialogue content of the character using the voice.

Fig. 8 is a schematic diagram of an eighth embodiment of the present disclosure, which provides a training apparatus for a gender prediction model. The gender prediction model is used to determine the gender of a character of text, and the apparatus 800 includes: an acquisition module 801 and a training module 802.

The acquisition module 801 is configured to obtain a training sample, where the training sample includes: gender-related text of a character in a training text, and tag information of the gender-related text, the tag information being used to identify the gender corresponding to the gender-related text. The training module 802 is configured to train a gender prediction model using the training sample.

In some embodiments, the gender-related text comprises gender word text; the gender prediction model comprises a gender model; the gender model comprises an input layer, a hidden layer, an attention layer and a classification layer; the training sample further comprises an attention identifier corresponding to the character; and the training module is specifically configured to: convert the gender word text into an input vector using the input layer; convert the input vector into a hidden layer vector using the hidden layer; determine an attention weight of the attention layer based on the hidden layer vector and the attention identifier, and convert the hidden layer vector into an encoded vector using the attention layer having that attention weight; classify the encoded vector using the classification layer to determine gender prediction information corresponding to the gender-related text; and construct a loss function based on the gender prediction information and the tag information, and train the gender model based on the loss function.

In some embodiments, the classification layer includes a text classification layer and a name classification layer, and the training module is further specifically configured to: classify the encoded vector using the text classification layer to obtain a first classification result; classify the encoded vector using the name classification layer to obtain a second classification result; and fuse the first classification result and the second classification result to obtain gender prediction information corresponding to the gender word text.

In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision and disclosure of users' personal information all comply with relevant laws and regulations and do not violate public order and good customs.

It is to be understood that in the embodiments of the disclosure, the same or similar content in different embodiments may be referred to each other.

It can be understood that "first", "second", etc. in the embodiments of the present disclosure are only used for distinguishing, and do not indicate the importance level, the time sequence, etc.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.

Fig. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 9, the electronic device 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the electronic device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

A number of components in the electronic device 900 are connected to the I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the electronic device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.

The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, for example, a text processing method or a training method of a gender prediction model. For example, in some embodiments, the text processing method or the training method of the gender prediction model may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the text processing method or the training method of the gender prediction model described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the text processing method or the training method of the gender prediction model in any other suitable way (e.g. by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service ("Virtual Private Server" or simply "VPS") are overcome. The server may also be a server of a distributed system or a server that incorporates a blockchain.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.

The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims

1. A text processing method, comprising:

detecting characters in the text;

extracting gender-related text of the character from the text, wherein the gender-related text is text containing gender information of the character; wherein the gender-related text includes: gender word text and reference word text;

predicting the input gender word text using a gender model so as to output gender information corresponding to the gender word text, and determining a first gender based on the gender information corresponding to the gender word text; wherein the gender model comprises an attention layer, the parameters of the attention layer comprise attention weights, and the attention weight corresponding to a character appearance position is greater than the attention weight corresponding to a non-character appearance position; the attention weight corresponding to the character appearance position is made greater than the attention weight corresponding to the non-character appearance position by means of set attention identifiers, the attention identifier corresponding to the character being greater than the attention identifier corresponding to the non-character;

predicting the input reference word text using a reference model so as to output gender information corresponding to the reference word text, and determining a second gender based on the gender information corresponding to the reference word text; if the first gender and the second gender are the same, determining that the gender of the character is that same gender;

wherein the gender information is a probability value corresponding to each gender;

the gender information corresponding to the gender word text comprises gender scores of different genders, and the determining a first gender based on the gender information corresponding to the gender word text comprises:

summarizing gender scores corresponding to the gender word texts to obtain total scores of the same gender;

the gender with the highest total score is taken as the first gender;

the determining a second gender based on the gender information corresponding to the reference word text comprises:

summarizing the reference scores corresponding to the plurality of reference word texts to obtain the total score of the same reference word;

taking the gender corresponding to the reference word with the highest total score as the second gender;

wherein the gender model further comprises an input layer, a hidden layer and a classification layer, the classification layer comprising a text classification layer and a name classification layer; the input layer is used for converting the gender word text into an input vector, the hidden layer is used for converting the input vector into a hidden layer vector, the attention layer is used for converting the hidden layer vector into an encoded vector, the text classification layer is used for classifying the input encoded vector to obtain a first classification result, the name classification layer is used for classifying the encoded vector to obtain a second classification result, and the gender information corresponding to the gender word text is obtained by fusing the first classification result and the second classification result.

2. The method of claim 1, further comprising:

acquiring the voice corresponding to the gender of the character;

and playing the dialogue content of the character using the voice.

3. A training method for a gender prediction model, the gender prediction model being used for determining a first gender of a character in a text, wherein gender scores corresponding to a plurality of gender word texts are summarized for each gender to obtain a total score of that gender, and the gender with the highest total score is taken as the first gender, the method comprising:

obtaining a training sample, the training sample comprising: gender-related text of a character in a training text, and tag information of the gender-related text, wherein the tag information is used for identifying the gender corresponding to the gender-related text; wherein the gender-related text includes: gender word text; the training sample further comprises: an attention identifier corresponding to the character;

training a gender prediction model using the training sample; wherein the gender prediction model comprises: a gender model; the gender model includes an attention layer, an attention weight of which is determined based on the attention identifier; the attention identifier of the character is greater than the attention identifier of the non-character, so that the attention weight corresponding to the character appearance position is greater than the attention weight corresponding to the non-character appearance position;

the gender model is used for carrying out prediction processing on the input gender word text so as to output gender prediction information corresponding to the gender word text;

wherein the gender model further comprises an input layer, a hidden layer and a classification layer, the classification layer comprising a text classification layer and a name classification layer; the input layer is used for converting the gender word text into an input vector, the hidden layer is used for converting the input vector into a hidden layer vector, the attention layer is used for converting the hidden layer vector into an encoded vector, the text classification layer is used for classifying the input encoded vector to obtain a first classification result, the name classification layer is used for classifying the encoded vector to obtain a second classification result, and gender prediction information corresponding to the gender word text is obtained by fusing the first classification result and the second classification result.

4. A method according to claim 3, wherein said training a gender prediction model using said training samples comprises:

converting the gender word text into an input vector by adopting the input layer;

converting the input vector into a hidden layer vector by adopting the hidden layer;

determining an attention weight of the attention layer based on the hidden layer vector and the attention identifier, and converting the hidden layer vector into an encoded vector using the attention layer having the attention weight;

constructing a loss function based on the gender prediction information and the tag information, and training the gender model based on the loss function.

5. A text processing apparatus, comprising:

the detection module is used for detecting characters in the text;

the extraction module is used for extracting gender-related text of the character from the text, wherein the gender-related text is text containing gender information of the character; wherein the gender-related text includes: gender word text and reference word text;

the determining module is used for processing the gender-related text to determine the gender of the character; the determining module includes:

the first prediction unit is used for performing prediction processing on the input gender word text using a gender model so as to output gender information corresponding to the gender word text, and determining a first gender based on the gender information corresponding to the gender word text; wherein the gender model comprises an attention layer, the parameters of the attention layer comprise attention weights, and the attention weight corresponding to a character appearance position is greater than the attention weight corresponding to a non-character appearance position; the attention weight corresponding to the character appearance position is made greater than the attention weight corresponding to the non-character appearance position by means of set attention identifiers, the attention identifier corresponding to the character being greater than the attention identifier corresponding to the non-character;

the second prediction unit is used for performing prediction processing on the input reference word text using a reference model so as to output gender information corresponding to the reference word text, and determining a second gender based on the gender information corresponding to the reference word text;

a determining unit configured to determine that the gender of the character is that same gender if the first gender and the second gender are the same;

the gender information corresponding to the gender word text comprises gender scores of different genders, and the first prediction unit is specifically used for:

the gender with the highest total score is taken as the first gender;

the second prediction unit is specifically configured to:

6. The apparatus of claim 5, further comprising:

the acquisition module is used for acquiring the voice corresponding to the gender of the character;

and the playing module is used for playing the dialogue content of the character using the voice.

7. A training device for a gender prediction model, the gender prediction model being used for determining a first gender of a character in a text, wherein gender scores corresponding to a plurality of gender word texts are summarized for each gender to obtain a total score of that gender, and the gender with the highest total score is taken as the first gender, the device comprising:

the acquisition module is used for acquiring a training sample, the training sample comprising: gender-related text of a character in a training text, and tag information of the gender-related text, wherein the tag information is used for identifying the gender corresponding to the gender-related text; wherein the gender-related text includes: gender word text; the training sample further comprises: an attention identifier corresponding to the character;

the training module is used for training a gender prediction model using the training sample; wherein the gender prediction model comprises: a gender model; the gender model includes an attention layer, an attention weight of which is determined based on the attention identifier; the attention identifier of the character is greater than the attention identifier of the non-character, so that the attention weight corresponding to the character appearance position is greater than the attention weight corresponding to the non-character appearance position;

8. The apparatus of claim 7, wherein the training module is specifically configured to:

9. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4.

10. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-4.