CN112069786A - Text information processing method and device, electronic equipment and medium - Google Patents

Text information processing method and device, electronic equipment and medium

Info

Publication number
CN112069786A
CN112069786A (application CN202010866385.5A)
Authority
CN
China
Prior art keywords
text
target text
model
target
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010866385.5A
Other languages
Chinese (zh)
Inventor
杜春赛
徐文铭
陈可蓉
杨晶生
赵田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202010866385.5A
Publication of CN112069786A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Abstract

The embodiments of the disclosure disclose a text information processing method and apparatus, an electronic device, and a medium, where the method includes: acquiring basic attribute information of a target text, and determining a target text processing model corresponding to the target text according to the basic attribute information; processing the target text to obtain a text feature vector corresponding to the target text; and processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text. According to the technical scheme of the embodiments of the disclosure, when the text corresponding to audio information is acquired, an intention identification model corresponding to the text information can be determined, and the intention corresponding to the text information can be determined based on that model, thereby improving interaction efficiency.

Description

Text information processing method and device, electronic equipment and medium
Technical Field
The embodiment of the disclosure relates to the technical field of computers, and in particular relates to a text information processing method and device, electronic equipment and a medium.
Background
Currently, the intention of each user is determined from the words spoken by the user or by reading the text information corresponding to the user.
If the user intention is determined based on the text information, a reader must read and analyze the text information before the intention can be determined, which is inefficient and cumbersome. Further, because the intention of the user is inferred from the reader's understanding of the text content, a misunderstanding of the text content may make it impossible to determine the text intention accurately.
Disclosure of Invention
The embodiments of the disclosure provide a text information processing method and apparatus, an electronic device, and a medium, so as to conveniently, efficiently, and reliably determine the text intention corresponding to text content.
In a first aspect, an embodiment of the present disclosure provides a text information processing method, where the method includes:
acquiring basic attribute information of a target text, and determining a target text processing model corresponding to the target text according to the basic attribute information;
processing the target text to obtain a text feature vector corresponding to the target text;
and processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
In a second aspect, an embodiment of the present disclosure further provides a text information processing apparatus, including:
the target text processing model determining module is used for acquiring basic attribute information of a target text and determining a target text processing model corresponding to the target text according to the basic attribute information;
the text characteristic vector determining module is used for processing the target text to obtain a text characteristic vector corresponding to the target text;
and the text intention determining module is used for processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the text information processing method according to any one of the embodiments of the present disclosure.
In a fourth aspect, the embodiments of the present disclosure further provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are used for executing the text information processing method according to any one of the embodiments of the present disclosure.
According to the technical scheme of the embodiments of the disclosure, the corresponding target text processing model is determined from the basic attribute information of the text information, and the content of the text information is processed based on the target text processing model, so that the text intention corresponding to the text information is determined conveniently and accurately, and the interaction efficiency of interactive users is improved.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
Fig. 1 is a schematic flowchart of a text information processing method according to a first embodiment of the disclosure;
fig. 2 is a schematic flow chart of a text information processing method according to a second embodiment of the disclosure;
fig. 3 is a schematic flow chart of a text information processing method according to a third embodiment of the present disclosure;
fig. 4 is a schematic flowchart of a text information processing method according to a fourth embodiment of the disclosure;
fig. 5 is a schematic flowchart of a text information processing method according to a fifth embodiment of the disclosure;
fig. 6 is a schematic structural diagram of a text information processing apparatus according to a sixth embodiment of the disclosure;
fig. 7 is a schematic structural diagram of an electronic device according to a seventh embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
Example one
Fig. 1 is a flowchart illustrating a text information processing method according to a first embodiment of the present disclosure. The method is applicable to determining the text intention corresponding to text content from the content displayed in a text, and may be performed by a text information processing apparatus, which may be implemented in software and/or hardware, for example in an electronic device such as a mobile terminal, a PC terminal, or a server. The technical scheme of the embodiment of the disclosure may be implemented by a client and/or a server in cooperation.
As shown in fig. 1, the method of the present embodiment includes:
s110, obtaining basic attribute information of the target text, and determining a target text processing model corresponding to the target text according to the basic attribute information.
The technical scheme of the embodiment of the disclosure can be applied to real-time interactive scenes, such as video conferences and live broadcasts. The real-time interactive interface is any interactive interface in a real-time interactive application scene. The real-time interactive application scene may be implemented through the internet and a computer device, for example as an interactive application implemented through a native program or a web program. In a real-time interactive interface, multiple interactive users may interact through various forms of interactive behavior, such as input of text, voice, or video, or sharing of content objects.
In the real-time interaction process, when the speaking user speaks, the server or the client can collect the audio information of the speaking user and convert the audio information into corresponding text information, so that the text intention of the speaking user is determined based on the text information, and the speaking intention of the speaking user corresponding to the text information is further determined.
To improve the timeliness of determining the text intention corresponding to text content, the technical scheme of the embodiment of the disclosure can be applied whenever the audio information of the speaking user is collected and converted into the corresponding text, so that the intention corresponding to the speech content is determined.
When the speaking user speaks, the audio information of the speaking user can be collected and converted into a corresponding text. The target text is the text obtained by converting the audio information into characters, and the intention of the text needs to be determined from its content. The basic attribute information may be, but is not limited to, the identity of the speaking user corresponding to the target text, a speaking timestamp, the number of characters in the text, and the like. The target text processing model is trained in advance and is used for determining the text intention corresponding to the target text; which model is used is determined by the basic attributes of the target text.
Specifically, after the target text corresponding to the audio information of the speaking user is obtained, basic attribute information of the target text may be obtained, and a target text processing model corresponding to the basic attribute information may be determined based on the basic attribute information, so as to determine a target text intention corresponding to the target text based on the target text processing model.
And S120, processing the target text to obtain a text feature vector corresponding to the target text.
Wherein the text feature vector is determined based on the text content of the target text.
Since the target text processing model is used for processing the vector, before the target text is input into the target text processing model, the text feature vector corresponding to the target text needs to be determined.
And S130, processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
The target text processing model may be obtained by pre-training and is used for processing the feature vector of the target text, so that the target text intention corresponding to the target text can be determined based on its output. The target text intention may be the purpose behind the speaking user's audio information; for example, the speaking user may speak with the intention to ask a question, praise, criticize, and/or encourage.
Specifically, after the text feature vector corresponding to the target text is input to a target text processing model obtained through pre-training, the target text processing model may process the feature vector to obtain a label corresponding to the feature vector, and the text intention corresponding to the target text may be determined based on the label.
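As a minimal illustration of this last step, the model's output label can be mapped to a text intention with a simple lookup. The label-to-intention table below is a hypothetical assumption; the disclosure does not enumerate the labels.

```python
# Hypothetical mapping from a model's output label to a text intention.
# The intentions listed (question, praise, criticize, encourage) come from
# the embodiment; the integer labels themselves are assumptions.
LABEL_TO_INTENTION = {0: "question", 1: "praise", 2: "criticize", 3: "encourage"}

def intention_from_label(label: int) -> str:
    """Return the text intention for a model output label."""
    return LABEL_TO_INTENTION.get(label, "no_intention")
```

An unknown label falls back to "no_intention", matching the negative (unintentional) case described later in the training section.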
According to the technical scheme, when audio information is collected, it can be converted into the corresponding text, the text processing model corresponding to that text can be determined based on the text's basic attribute information, and the text intention can be determined based on the output of the text processing model. The intention corresponding to the speech information is thus determined conveniently and efficiently.
On the basis of the above technical solution, before obtaining the basic attribute information of the target text, the method further includes: collecting audio information of a target speaking user, and converting the audio information into a corresponding text to obtain a text to be processed; deleting stop words in the text to be processed from the text to be processed to obtain the target text.
It should be noted that, no matter which user speaks, the server may collect that user's audio information, convert it into a corresponding text, and determine the speaking intention of the speaking user based on the content of the text.
The speaking user who is currently speaking can be taken as the target speaking user. That is, the server may take the speaking user corresponding to the acquired audio information as the target speaking user. The text to be processed is the text obtained by converting the audio information of the speaking user.
Further, in order to facilitate rapid determination of the target text intent corresponding to the target text, the text to be processed may be processed. Optionally, the text to be processed is processed, which may be to remove preset words, such as stop words, "yes", and the like, from the text to be processed. And taking the text obtained after the stop words are removed as a target text.
That is, because the text to be processed is converted directly from the audio information, some useless words may exist in it. To improve processing efficiency, these useless characters can be removed to obtain the target text, so that determining the text intention based on the target text is both convenient and accurate.
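The pre-processing described above can be sketched as follows. The stop-word list is an illustrative assumption, not the one used by the disclosure:

```python
# Sketch of the pre-processing step: the transcribed text ("text to be
# processed") has stop/filler words removed to yield the target text.
# This toy stop-word list is an assumption for demonstration only.
STOP_WORDS = {"um", "uh", "yes", "well", "like"}

def to_target_text(raw_transcript: str) -> str:
    """Remove stop words from an ASR transcript to produce the target text."""
    tokens = raw_transcript.lower().split()
    kept = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(kept)
```

In practice the stop-word list would be language-specific (the source patent targets Chinese text), and tokenization would use a word segmentation tool rather than whitespace splitting.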
Example two
Fig. 2 is a schematic flow chart of a text information processing method according to a second embodiment of the disclosure. On the basis of the foregoing embodiment, optimization may be performed on "processing the target text to obtain a text feature vector corresponding to the target text". The same or corresponding terms as those in the above embodiments are not described herein again.
The specific steps are shown in figure 2:
s210, obtaining basic attribute information of the target text, and determining a target text processing model corresponding to the target text according to the basic attribute information.
S220, dividing the target text into at least one keyword based on the word segmentation tool, determining a word vector of each keyword based on the word vector dictionary, and determining a text feature vector corresponding to the target text based on the word vector of the at least one keyword.
The word segmentation tool can be adopted to perform word segmentation processing on at least one sentence in the target text to obtain each keyword in the target text. The word vector dictionary includes vectors corresponding to respective words. Based on the word vector dictionary, a word vector corresponding to each keyword may be determined. In order to determine the speaking intention corresponding to the speaking content, namely, to determine the target text intention of the target text, each sentence spoken by the speaking user needs to be processed, and based on the word vector corresponding to each keyword, the sentence vector corresponding to the sentence to which each keyword belongs can be determined. And obtaining a text feature vector of the target text based on the sentence vector of each sentence.
In this embodiment, determining the sentence vector based on each keyword may be determining the keyword included in a sentence and the word vector of each keyword, and obtaining a sentence feature vector corresponding to the sentence by summing and averaging the word vectors. Based on the sentence feature vectors corresponding to the sentences, text feature vectors corresponding to the target text can be obtained.
In order to further improve the accuracy of determining the speaking intention corresponding to the speech content, a one-dimensional feature corresponding to the fluency of the text can be added to the text feature vector; that is, the fluency of the text is also taken as a factor in determining the intention corresponding to the speech content.
Optionally, a confusion (perplexity) feature vector corresponding to the target text is determined, and the text feature vector corresponding to the target text is obtained based on the word vector of the at least one keyword and the confusion feature vector; the confusion feature vector is used for representing the fluency of the text content in the target text.
The confusion feature vector may be determined based on a pre-trained language model: the text may be input into the language model to obtain the confusion feature vector corresponding to the text. Since the confusion feature vector represents the fluency of the sentences corresponding to the speech content, the text feature vector corresponding to the target text can be determined comprehensively from the confusion feature vector together with the word vectors of the keywords, and the target text intention can then be determined based on that text feature vector.
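A minimal sketch of this feature construction, assuming a toy word-vector dictionary and an externally supplied perplexity value (neither is specified by the disclosure):

```python
# Illustrative sketch of S220: sum-and-average the word vectors of a
# sentence's keywords, then append the one-dimensional confusion
# (perplexity) feature described in the embodiment. The toy word-vector
# dictionary and perplexity value are assumptions for demonstration only.
WORD_VECTOR_DICT = {
    "mute": [1.0, 0.0],
    "line": [0.0, 1.0],
}

def text_feature_vector(keywords, perplexity):
    """Average the keywords' word vectors, then append the perplexity feature."""
    vectors = [WORD_VECTOR_DICT[w] for w in keywords if w in WORD_VECTOR_DICT]
    dim = len(vectors[0])
    # sum-and-average the word vectors to get the sentence vector
    sentence_vec = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    # append the 1-D confusion (perplexity) feature
    return sentence_vec + [perplexity]
```

A real implementation would use a trained embedding table and obtain the perplexity from a pre-trained language model, as the embodiment describes.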
And S230, processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
According to the technical scheme of the embodiment of the disclosure, after the text feature vector corresponding to the target text is determined, the text feature vector can be input into the corresponding text processing model, so that the text feature vector is processed based on the text processing model to obtain the text intention corresponding to the target text, and the accuracy and convenience for determining the target text intention are improved.
EXAMPLE III
Fig. 3 is a flowchart illustrating a text information processing method according to a third embodiment of the disclosure. On the basis of the foregoing embodiment, optimization may be performed on "obtaining basic attribute information of a target text, and determining a target text processing model corresponding to the target text according to the basic attribute information" in the foregoing embodiment. Wherein explanations of the same or corresponding terms as those of the above-described embodiments are omitted.
The specific steps are shown in fig. 3:
s310, if the number of characters is larger than or equal to a preset character number threshold value, determining that a target text processing model corresponding to the target text is a first prediction evaluation model.
In this embodiment, the basic attribute information of the target text may be the number of characters in the target text.
The threshold of the number of characters may be set empirically or determined from a plurality of experimental results; optionally, the preset character-number threshold is 25 characters. The first prediction evaluation model is obtained by ensemble-learning training over a plurality of sub-models and is used for determining the text intention corresponding to the target text. Optionally, ensemble learning is performed in the bagging mode, which has the advantage of improving the accuracy of the model's judgment and hence the accuracy of determining the target text intention.
At least three multi-classification sub-evaluation models are ensemble-learned in the first prediction evaluation model. Optionally, there may be five sub-evaluation models: a logistic regression model, a gradient boosting decision tree model, a random forest model, a Bayesian model, and a support vector machine model. Each sub-evaluation model can process the text feature vector to obtain a to-be-processed text intention corresponding to the target text. Since the sub-evaluation models in the first prediction evaluation model are ensemble-learned, after each model outputs its to-be-processed text intention, the to-be-processed text intention with the most votes can be determined based on the voting mechanism in the first prediction evaluation model and taken as the target text intention.
And S320, processing the target text to obtain a text feature vector corresponding to the target text.
S330, respectively inputting the text feature vectors into at least three sub-evaluation models in the first prediction evaluation model to obtain at least three text intentions to be processed corresponding to the target text.
Specifically, after the text feature vector corresponding to the target text is obtained, the text feature vector may be input to the first prediction evaluation model. Each sub-evaluation model in the first prediction evaluation model can process the input text feature vector and output the text intention to be processed corresponding to the text feature vector.
S340, determining a target text intention of the target text from at least three text intentions to be processed based on a voting mechanism in the first prediction evaluation model.
It should be noted that the sub-evaluation models in the first prediction evaluation model are ensemble-learned in the bagging mode. Therefore, after each sub-evaluation model outputs its to-be-processed text intention, the to-be-processed text intention with the most votes can be selected as the target text intention based on the voting mechanism in the first prediction evaluation model. This improves the accuracy of the model's judgment of the text intention.
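The voting mechanism described above can be sketched as a simple hard vote over the sub-models' outputs. The candidate intents below are stand-ins, not outputs of the trained models:

```python
from collections import Counter

# Minimal sketch of the hard-voting mechanism: each sub-evaluation model
# (logistic regression, gradient boosting tree, random forest, Bayes, SVM
# in the patent) emits a candidate intent label, and the label receiving
# the most votes becomes the target text intention.
def majority_vote(candidate_intents):
    """Return the intent label that received the most votes."""
    counts = Counter(candidate_intents)
    return counts.most_common(1)[0][0]
```

With five sub-models, ties are rare but possible; this sketch breaks ties by first occurrence, and the patent does not specify a tie-breaking rule.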
According to the technical scheme of the embodiment of the disclosure, the text intentions to be processed corresponding to the text feature vectors can be respectively determined through the sub-evaluation models in the first prediction model, and the text intention to be processed with the highest ticket obtaining rate is selected from all the text intentions to be processed and used as the target text intention, so that the accuracy and convenience for determining the target text intention corresponding to the target text are improved. That is, by processing the text by using the model corresponding to the target text, the technical effects of convenience and accuracy of text processing can be improved.
On the basis of the technical scheme, before the target text intention is determined based on the first prediction evaluation model, the first prediction evaluation model needs to be trained.
Optionally, acquiring training sample data; the training sample data comprises a text characteristic vector, a text intention label and a preset output value; the training sample data comprises a positive sample and a negative sample, wherein the positive sample is an intentional sample; negative samples are unintended samples; inputting a text feature vector and a text intention identifier in each training sample data into each to-be-trained multi-class sub-evaluation model aiming at each to-be-trained multi-class sub-evaluation model in the first prediction evaluation model to obtain an initial evaluation value corresponding to each training sample data; based on the initial evaluation value and a preset output value, correcting a preset loss function value in the multi-class sub-evaluation model to be trained; taking the preset loss function value convergence as a training target, and training the multi-class sub-evaluation model to be trained to obtain a multi-class sub-evaluation model; integrally learning each multi-classification sub-evaluation model to obtain a first prediction evaluation model; the first predictive evaluation model is used to determine a textual intent of the target text.
To improve the accuracy of each sub-evaluation model in the first prediction evaluation model, the training sample data should be as plentiful and rich as possible. Both positive and negative samples are included in the training sample data. A positive sample is an intentional sample; optionally, the intention may be encouragement, praise, criticism, questioning, and the like, labels corresponding to different intentions may be set, and the output value of the sample, that is, its output label, may also be set. Negative samples are those without an intention. The initial evaluation value is the value obtained after the text feature vector and the label are processed by the to-be-trained multi-classification sub-evaluation model. The loss function is preset and is used for measuring whether the to-be-trained multi-classification sub-evaluation model is accurate.
It should be noted that, before the multi-class sub-evaluation model to be trained is trained, the training parameters may be set to default values. When the multi-class sub-evaluation model to be trained is trained, the training parameters in the model can be corrected based on the output result of the multi-class sub-evaluation model to be trained, that is, the training parameters in the model can be corrected based on the loss function value.
For each multi-class sub-evaluation model, the feature vector and the text intention label corresponding to the training sample data can be input into the multi-class sub-evaluation model to be trained to obtain an initial evaluation value corresponding to the training sample data. Based on the initial evaluation value and the output value set in the training sample data, the loss parameter in the sub-evaluation model to be trained can be determined, and the multi-classification model to be trained is corrected based on the loss parameter.
Specifically, the training error of the loss function, that is, the loss parameter, may be used as the condition for detecting whether the loss function has converged: for example, whether the training error is smaller than a preset error, whether the error change trend has stabilized, or whether the current iteration count equals a preset count. If the convergence condition is met, for example the training error of the loss function is smaller than the preset error or the error change has stabilized, the training of the to-be-trained multi-classification sub-evaluation model is finished, and the iterative training can be stopped. If the condition is not met, further training sample data can be acquired to continue training the model until the training error of the loss function is within the preset range. Once the training error of the loss function has converged, the to-be-trained multi-classification sub-evaluation model can be used as the multi-classification sub-evaluation model.
The above steps are repeated to obtain each multi-classification sub-evaluation model, and the sub-evaluation models are then combined through ensemble learning to obtain the first prediction evaluation model.
Example four
Fig. 4 is a schematic flowchart of a text information processing method according to a fourth embodiment of the present disclosure. On the basis of the foregoing embodiments, the step of "obtaining basic attribute information of a target text, and determining a target text processing model corresponding to the target text according to the basic attribute information" may be further refined. Explanations of terms that are the same as or correspond to those of the above embodiments are omitted here.
As shown in fig. 4, the method includes:
S410, if the number of words is smaller than a preset word-count threshold, determining that the target text processing model corresponding to the target text is a single classification model.
The basic attribute information of the text may be the number of words in the text. The preset word-count threshold may be, for example, 25 words. The single classification model is predetermined and may be, for example, a Deep SVDD model. The single classification model is used to determine the target text intention corresponding to the target text.
Specifically, when it is detected that the number of words in the target text is smaller than the preset word-count threshold, the corresponding single classification model may be obtained, so that the target text intention of the target text is determined based on the single classification model.
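The word-count-based model selection can be sketched as follows. The 25-word threshold comes from the embodiment above, while the returned model names are illustrative placeholders, not identifiers from the disclosure:

```python
WORD_COUNT_THRESHOLD = 25  # preset word-count threshold from the embodiment

def select_text_model(target_text: str) -> str:
    """Choose a processing model from the text's basic attribute (its length).

    Short texts go to the single classification model (e.g. a Deep SVDD-style
    one-class classifier); longer texts go to the first prediction evaluation
    model, the ensemble of multi-class sub-evaluation models.
    """
    if len(target_text) < WORD_COUNT_THRESHOLD:
        return "single_classification_model"
    return "first_prediction_evaluation_model"

short_choice = select_text_model("Any questions?")
long_choice = select_text_model("x" * 30)
```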
S420, processing the target text to obtain a text feature vector corresponding to the target text.
S430, processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
Specifically, the text feature vector may be processed based on the single classification model to determine the target text intention of the target text.
According to the technical solution of this embodiment of the present disclosure, the target text processing model corresponding to the target text can be determined from the text attribute information of the target text, which improves both the accuracy and the convenience of target text processing.
It should be noted that before the target text intention of the target text is determined based on the single classification model, the single classification model needs to be trained.
In this embodiment, training the single classification model may include: acquiring training sample data, where the training sample data includes a text feature vector, a text intention label, and a preset output value, and the training sample data includes positive samples, a positive sample being a sample with intention; and training the single classification model based on the training sample data. The single classification model is used to determine the text intention of the target text.
In this embodiment, the advantage of training a single classification model is that when the number of words is smaller than the preset word-count threshold, the single classification model can be used to process the text, which improves the pertinence and accuracy of text processing.
Example five
Fig. 5 is a schematic flowchart of a text information processing method according to a fifth embodiment of the present disclosure. On the basis of the foregoing embodiments, after the speaking intention corresponding to the speaking user is determined, the display content corresponding to that intention may also be determined and sent to the client, so that when triggering of the control corresponding to the display content is detected, the display content is displayed at the client. Explanations of terms that are the same as or correspond to those of the above embodiments are omitted here.
S510, obtaining basic attribute information of the target text, and determining a target text processing model corresponding to the target text according to the basic attribute information;
S520, processing the target text to obtain a text feature vector corresponding to the target text;
S530, processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
S540, determining the display content corresponding to the target text intention, and sending the display content to a client so that the client displays the display content.
The display content is a display icon and/or display text corresponding to the text intention. The display icons may be icons corresponding to applause, thumbs-up, raised hands, and the like; the display text may be, for example, "You're awesome" or "Please speak up".
Specifically, after the target text intention corresponding to the target text is determined, the display content corresponding to that text intention may be determined automatically, or determined from a pre-made comparison table, and then sent to the client so that the client displays it.
For example, if it is determined from the target text that the target text intention is "question", the display content corresponding to the speaking intention "question", such as a "raise hands" icon, may be determined automatically; or the display content corresponding to "question" may be determined based on a pre-established mapping table in which different intentions correspond to different actions or texts. After the display content is determined, it may be sent to the client; for example, the "raise hands" icon is sent to the client corresponding to each interactive user, so as to be displayed at the client.
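A minimal sketch of such a pre-made comparison table, with hypothetical intentions, icon names, and texts (the disclosure fixes the mechanism, not the concrete table contents):

```python
# Hypothetical comparison table from text intention to display content
# (display icon plus display text); entries are illustrative only.
INTENT_DISPLAY_TABLE = {
    "question": {"icon": "raise_hands", "text": "Someone has a question"},
    "praise": {"icon": "applause", "text": "You're awesome"},
    "encouragement": {"icon": "thumbs_up", "text": "Keep it up"},
}

def display_content_for(intent: str) -> dict:
    """Look up the display content for an intention; empty when unmapped."""
    return INTENT_DISPLAY_TABLE.get(intent, {})

question_content = display_content_for("question")
```

The server would send the looked-up entry to each interactive user's client, which then renders the icon and/or text.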
On the basis of the above technical solution, when it is detected that the control corresponding to the display content is triggered, the display content is displayed in a preset display area of the client.
The preset display area may be a main display interface for displaying the target text.
Specifically, after the client receives the display content, the display content may be displayed directly on the main display interface that shows the target text, or may pop up on the main display interface as a reminder; when the user triggers the display content, it may be displayed prominently in the preset display area.
For example, when an interactive user triggers the "raise hands" control displayed on the client, a "raise hands" icon may be displayed at a preset position to remind the other interactive users that this user has raised a hand to answer, which improves the interest and convenience of the interaction and thus the interaction efficiency.
According to the technical solution of this embodiment of the present disclosure, after the target intention corresponding to the target text is determined, the display content corresponding to the target intention can be determined, which improves the interest and convenience of the interaction and thus the interaction efficiency.
Example six
Fig. 6 is a schematic structural diagram of an information processing apparatus according to a sixth embodiment of the present disclosure. As shown in fig. 6, the apparatus includes: a target text processing model determination module 610, a text feature vector determination module 620, and a text intent determination module 630.
The target text processing model determining module 610 is configured to obtain basic attribute information of a target text, and determine a target text processing model corresponding to the target text according to the basic attribute information; a text feature vector determining module 620, configured to process the target text to obtain a text feature vector corresponding to the target text; a text intent determining module 630, configured to process the text feature vector based on the target text processing model, so as to obtain a target text intent corresponding to the target text.
On the basis of the technical scheme, the target text processing model comprises a first prediction evaluation model; the basic attribute information comprises word quantity information of a target text, and the target text processing model determining module is further used for determining that a target text processing model corresponding to the target text is a first prediction evaluation model if the word quantity is greater than or equal to a preset word quantity threshold; at least three multi-classification sub-evaluation models are integrated and learned in the first prediction evaluation model, and each multi-classification sub-evaluation model is used for determining the text intention of the target text to be processed; the first prediction evaluation model is also used for determining the target intention of the target text according to the text intention to be processed output by the at least three sub-evaluation models.
On the basis of the above technical solutions, the text intention determining module includes: the text to be processed intention determining unit is used for respectively inputting the text feature vectors into at least three sub-evaluation models in the first prediction evaluation model to obtain at least three text intentions to be processed corresponding to the target text; a target text intention determining unit, configured to determine a target text intention of the target text from the at least three text intents to be processed based on a voting mechanism in the first prediction evaluation model.
On the basis of the above technical solutions, the target text processing model further includes a single classification model, the basic attribute information includes word number information in the target text, and the target text processing model determining module: the single classification model is also used for determining a target text processing model corresponding to the target text as the single classification model if the number of the characters is less than the preset number of the characters; the single classification model is used for determining a target text intention of the target text.
On the basis of the above technical solutions, the text feature vector determination module includes: a keyword determination unit for dividing the target text into at least one keyword based on a word segmentation tool;
a word vector determination unit for determining a word vector for each keyword based on the word vector dictionary;
a feature vector determination unit, configured to determine a text feature vector corresponding to the target text based on the word vector of the at least one keyword.
On the basis of the above technical solutions, the feature vector determining unit includes:
a feature vector determination subunit, configured to determine a confusion feature vector corresponding to the target text; a text feature vector determining subunit, configured to obtain a text feature vector corresponding to the target text based on the word vector of the at least one keyword and the confusion feature vector; wherein the confusion feature vector is used for representing the smoothness degree of the text content in the target text.
On the basis of the above technical solutions, the apparatus further includes a to-be-processed text determining module, configured to, before the basic attribute information of the target text is obtained: collect audio information of a target speaking user and convert the audio information into corresponding text to obtain a text to be processed; and delete preset characters from the text to be processed to obtain the target text.
On the basis of the above technical solutions, the apparatus further includes: a first prediction evaluation model determining module, configured to train the first prediction evaluation model. Training the first prediction evaluation model includes: acquiring training sample data, where the training sample data includes a text feature vector, a text intention label, and a preset output value, and includes positive samples and negative samples, a positive sample being a sample with intention and a negative sample being a sample without intention; for each multi-class sub-evaluation model to be trained in the first prediction evaluation model, inputting the text feature vector and the text intention label of each item of training sample data into that model to obtain an initial evaluation value corresponding to each item of training sample data; correcting a preset loss function value in the multi-class sub-evaluation model to be trained based on the initial evaluation value and the preset output value; training the multi-class sub-evaluation model to be trained with convergence of the preset loss function value as the training target, to obtain the multi-class sub-evaluation model; and combining the multi-classification sub-evaluation models through ensemble learning to obtain the first prediction evaluation model. The first prediction evaluation model is used to determine the text intention of the target text.
On the basis of the above technical solutions, the apparatus further includes: the single classification model determining module is used for training a single classification model; training the single classification model comprises: acquiring training sample data; the training sample data comprises a text characteristic vector, a text intention label and a preset output value; the training sample data comprises a positive sample; the positive sample is an intentional sample; training the single classification model based on the training sample data; the single classification model is used to determine a textual intent of the target text.
On the basis of the above technical solutions, the text intention includes at least one of encouragement, question, praise, and criticism.
On the basis of the above technical solutions, after the obtaining of the target text intention corresponding to the target text, the method further includes: and determining the display content corresponding to the target text intention, and sending the display content to a client so that the client displays the display content.
On the basis of the technical solutions, the device further includes a display module, configured to display the display content in a preset display area of the client when it is detected that the control corresponding to the display content is triggered.
On the basis of the technical schemes, the preset display area is a main display interface for displaying the target text.
On the basis of the above technical solutions, the display content includes a display icon and/or a display text.
According to the above technical solution, the corresponding text processing model is determined from the attribute information of the text information, so that the speaking intention corresponding to a speaking sentence issued by a speaking user can be determined conveniently and accurately based on that text processing model, thereby improving the interaction efficiency of interactive users during interaction.
The information processing device provided by the embodiment of the disclosure can execute the information processing method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, the units and modules included in the apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the embodiments of the present disclosure.
Example seven
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., the terminal device or server of fig. 7) 700 suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., central processing unit, graphics processor, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage means 706 into a Random Access Memory (RAM) 703. In the RAM703, various programs and data necessary for the operation of the electronic apparatus 700 are also stored. The processing device 701, the ROM702, and the RAM703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 706 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 709, or may be installed from the storage means 706, or may be installed from the ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
The electronic device provided by the embodiment of the disclosure is the same as the text information processing method provided by the above embodiment, and technical details that are not described in detail in the embodiment can be referred to the above embodiment, and the embodiment has the same beneficial effects as the above embodiment.
Example eight
The disclosed embodiments provide a computer storage medium on which a computer program is stored, which when executed by a processor implements the text information processing method provided by the above-described embodiments.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the client and the server may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
acquiring basic attribute information of a target text, and determining a target text processing model corresponding to the target text according to the basic attribute information;
processing the target text to obtain a text feature vector corresponding to the target text;
and processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
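The three operations above can be sketched end to end. This is a toy illustration only: the featurization and the classifier are hypothetical stand-ins for the trained models, which the disclosure leaves unspecified, and the 25-word threshold is taken from the earlier embodiments.

```python
def process_text_information(target_text, classify, threshold=25):
    """End-to-end sketch: basic attribute -> model choice -> features -> intent.

    classify(model_name, feature_vector) stands in for the trained
    single classification model / first prediction evaluation model.
    """
    # Step 1: basic attribute information (here, text length) selects the model.
    model_name = "single" if len(target_text) < threshold else "ensemble"
    # Step 2: text feature vector (placeholder: character-frequency features).
    vocab = "abcdefghijklmnopqrstuvwxyz?"
    features = [target_text.lower().count(c) for c in vocab]
    # Step 3: the chosen model maps the feature vector to a target text intention.
    return classify(model_name, features)

# Toy classifier: flags any text containing "?" as a question.
intent = process_text_information(
    "Could you explain that again?",
    classify=lambda model, feats: "question" if feats[-1] > 0 else "statement",
)
```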
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of a unit/module does not in some cases constitute a limitation on the unit itself, for example, the target text processing model determination module may also be described as a "model determination module".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, [ example one ] there is provided a text information processing method, including:
acquiring basic attribute information of a target text, and determining a target text processing model corresponding to the target text according to the basic attribute information;
processing the target text to obtain a text feature vector corresponding to the target text;
and processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
According to one or more embodiments of the present disclosure, [ example two ] there is provided a text information processing method, further comprising:
optionally, the target text processing model comprises a first prediction evaluation model; the basic attribute information comprises the word number information of the target text, the basic attribute information of the target text is obtained, and the target text processing model corresponding to the target text is determined according to the basic attribute information, and the method comprises the following steps:
if the number of characters is larger than or equal to a preset character number threshold value, determining a target text processing model corresponding to the target text as a first prediction evaluation model;
at least three multi-classification sub-evaluation models are integrated in the first prediction evaluation model, and each multi-classification sub-evaluation model is used for determining the text intention of the target text to be processed; the first prediction evaluation model is also used for determining the target intention of the target text according to the text intention to be processed output by the at least three sub-evaluation models.
According to one or more embodiments of the present disclosure, [ example three ] there is provided a text information processing method, further comprising:
optionally, the processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text includes:
respectively inputting the text feature vectors into at least three sub-evaluation models in the first prediction evaluation model to obtain at least three text intentions to be processed corresponding to the target text;
determining a target text intent of the target text from the at least three pending text intents based on a voting mechanism in the first predictive assessment model.
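The voting mechanism can be sketched as a simple majority vote over the sub-model outputs. The disclosure does not fix the tie-breaking rule; the convention assumed here is that the intention seen first among the tied ones wins.

```python
from collections import Counter

def vote_intent(sub_model_intents):
    """Majority vote over the to-be-processed text intentions output by the
    (at least three) multi-class sub-evaluation models.

    With Counter.most_common, intentions with equal counts keep their
    first-encountered order, which implements the assumed tie-break.
    """
    return Counter(sub_model_intents).most_common(1)[0][0]

majority = vote_intent(["question", "praise", "question"])
tie_break = vote_intent(["praise", "question", "criticism"])
```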
According to one or more embodiments of the present disclosure, [ example four ] there is provided a text information processing method, further comprising:
optionally, the target text processing model further includes a single classification model, where the basic attribute information includes information on the number of words in the target text, and the obtaining of the basic attribute information of the target text and determining the target text processing model corresponding to the target text according to the basic attribute information includes:
if the number of characters is less than the preset number of characters, determining a target text processing model corresponding to the target text as the single classification model;
the single classification model is used for determining a target text intention of the target text.
According to one or more embodiments of the present disclosure, [ example five ] there is provided a text information processing method, further comprising:
optionally, the processing the target text to obtain a text feature vector corresponding to the target text includes:
dividing the target text into at least one keyword based on a word segmentation tool;
determining a word vector for each keyword based on the word vector dictionary;
determining a text feature vector corresponding to the target text based on the word vector of the at least one keyword.
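The three steps above can be sketched as follows, assuming whitespace splitting in place of a real word-segmentation tool (such as jieba), a tiny hypothetical word-vector dictionary, and averaging as the aggregation from word vectors to a text feature vector; real systems would use a trained embedding table and a proper segmenter.

```python
# Hypothetical 2-dimensional word-vector dictionary for illustration.
WORD_VECTORS = {
    "please": [0.1, 0.9],
    "speak": [0.8, 0.2],
    "up": [0.4, 0.4],
}

def text_feature_vector(target_text):
    """Segment the text into keywords, look up each keyword's word vector,
    and average the vectors into a single text feature vector."""
    keywords = target_text.lower().split()  # stand-in for a segmentation tool
    vectors = [WORD_VECTORS[w] for w in keywords if w in WORD_VECTORS]
    if not vectors:
        return None  # no known keywords, no feature vector
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

vec = text_feature_vector("please speak up")
```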
According to one or more embodiments of the present disclosure, [ example six ] there is provided a text information processing method, further comprising:
optionally, the determining a text feature vector corresponding to the target text based on the word vector of the at least one keyword includes:
determining a confusion feature vector corresponding to the target text;
obtaining a text feature vector corresponding to the target text based on the word vector of the at least one keyword and the confusion feature vector;
wherein the confusion feature vector is used for representing the smoothness degree of the text content in the target text.
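One common way to realize such a fluency ("confusion") feature is language-model perplexity: lower perplexity indicates smoother text. The sketch below is an assumption about how the feature could be computed, taking per-token probabilities from some language model as given and appending the resulting scalar to the averaged word vectors.

```python
import math

def perplexity_feature(token_probs):
    """Perplexity of the text under a language model, from per-token
    probabilities: exp of the negative mean log-probability."""
    n = len(token_probs)
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / n)

def combine_features(word_vector_avg, token_probs):
    """Concatenate the averaged word vectors with the fluency feature to
    obtain the final text feature vector."""
    return word_vector_avg + [perplexity_feature(token_probs)]

features = combine_features([0.4, 0.5], [0.5, 0.25, 0.5])
```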
According to one or more embodiments of the present disclosure, [ example seven ] there is provided a text information processing method, further comprising:
optionally, before obtaining the basic attribute information of the target text, the method further includes:
collecting audio information of a target speaking user, and converting the audio information into a corresponding text to obtain a text to be processed;
deleting stop words from the text to be processed to obtain the target text.
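The preprocessing step can be sketched as follows, with a hypothetical stop-word list standing in for whatever filler-word inventory an implementation would use on the speech-recognition transcript:

```python
# Hypothetical stop-word / filler-word list; production systems would
# use a curated list suited to the transcription language.
STOP_WORDS = {"um", "uh", "er"}

def preprocess_transcript(transcribed_text):
    """Turn a speech-recognition transcript into the target text by
    deleting stop words, as the preprocessing step describes."""
    tokens = transcribed_text.lower().split()
    return " ".join(t for t in tokens if t not in STOP_WORDS)

target = preprocess_transcript("um could you uh repeat that")
```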
According to one or more embodiments of the present disclosure, [ example eight ] there is provided a text information processing method, further comprising:
optionally, the first prediction evaluation model is obtained by training in the following manner:
acquiring training sample data, wherein the training sample data comprises a text feature vector, a text intention label and a preset output value; the training sample data comprises positive samples and negative samples, wherein a positive sample is a sample with intent and a negative sample is a sample without intent;
for each multi-class sub-evaluation model to be trained in the first prediction evaluation model, inputting the text feature vector and the text intention label of each piece of training sample data into that sub-evaluation model to obtain an initial evaluation value corresponding to each piece of training sample data;
correcting a preset loss function value in the multi-class sub-evaluation model to be trained based on the initial evaluation value and the preset output value;
taking the preset loss function value reaching convergence as a training target, and training the multi-class sub-evaluation model to be trained to obtain the multi-class sub-evaluation model;
performing ensemble learning on the multi-class sub-evaluation models to obtain the first prediction evaluation model;
the first predictive evaluation model is used to determine a textual intent of the target text.
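The training-and-integration flow above can be sketched schematically. A nearest-centroid classifier stands in for each unspecified multi-class sub-evaluation model, and fixed data slices stand in for real ensemble sampling:

```python
# Everything here is an invented stand-in: the classifier, the data, and
# the fixed subsets only illustrate the train-then-vote structure.

def train_centroid_model(samples):
    """samples: list of (feature_vector, intent_label) pairs."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    # One centroid per intent label.
    return {lbl: [x / counts[lbl] for x in acc] for lbl, acc in sums.items()}

def predict(centroids, vec):
    # Nearest centroid by squared Euclidean distance.
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(c, vec))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

def vote(models, vec):
    """Majority vote over the sub-models' outputs (the 'voting mechanism')."""
    votes = [predict(m, vec) for m in models]
    return max(set(votes), key=votes.count)

data = [([0.9, 0.1], "share_screen"), ([0.8, 0.2], "share_screen"),
        ([0.1, 0.9], "no_intent"), ([0.2, 0.8], "no_intent")]
# Each sub-model sees a different slice of the data, as ensemble learning
# would; the slices are fixed here so the example stays deterministic.
subsets = [[data[0], data[2]], [data[1], data[3]], data]
models = [train_centroid_model(s) for s in subsets]
print(vote(models, [0.85, 0.15]))  # -> "share_screen"
```

The loss-driven training loop of the patent is elided: centroid fitting needs no iterative optimization, which keeps the sketch short.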
According to one or more embodiments of the present disclosure, [ example nine ] there is provided a text information processing method, further comprising:
optionally, the single classification model is obtained by training in the following manner:
acquiring training sample data, wherein the training sample data comprises a text feature vector, a text intention label and a preset output value; the training sample data comprises positive samples, wherein a positive sample is a sample with intent;
training the single classification model based on the training sample data;
the single classification model is used to determine a textual intent of the target text.
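A minimal one-class sketch under the same assumption of invented data: the model is fit on positive (with-intent) samples only and accepts a new vector if it falls within a learned radius of their centroid (a simple stand-in for, e.g., a one-class SVM):

```python
# Illustrative one-class model: all numbers and the radius rule are
# assumptions, not details from the patent.

def train_one_class(positive_vectors, margin=1.5):
    """Fit a centroid-plus-radius acceptance region on positives only."""
    dim = len(positive_vectors[0])
    centroid = [sum(v[i] for v in positive_vectors) / len(positive_vectors)
                for i in range(dim)]

    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, centroid)) ** 0.5

    # Radius: largest training distance, scaled by a tolerance margin.
    radius = max(dist(v) for v in positive_vectors) * margin
    return centroid, radius, dist

def has_intent(model, vec):
    _centroid, radius, dist = model
    return dist(vec) <= radius

positives = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15]]
model = train_one_class(positives)
print(has_intent(model, [0.82, 0.18]))  # near the positives -> True
print(has_intent(model, [0.1, 0.9]))    # far from the positives -> False
```

Training on positives alone fits short texts well: negatives for short utterances are diverse and hard to enumerate, which is one plausible reason for the single-classification route.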
According to one or more embodiments of the present disclosure, [ example ten ] there is provided a text information processing method, further comprising:
optionally, after obtaining the target text intention corresponding to the target text, the method further includes:
determining the display content corresponding to the target text intention, and sending the display content to a client so that the client displays the display content.
According to one or more embodiments of the present disclosure, [ example eleven ] there is provided a text information processing method, further comprising: when it is detected that a control corresponding to the display content is triggered, displaying the display content in a preset display area of the client.
According to one or more embodiments of the present disclosure, [ example twelve ] there is provided a text information processing method, further comprising: the preset display area is a main display interface for displaying the target text.
According to one or more embodiments of the present disclosure, [ example thirteen ] there is provided a text information processing method, further comprising:
the display content comprises a display icon and/or display text.
According to one or more embodiments of the present disclosure, [ example fourteen ] there is provided an information processing apparatus comprising:
the target text processing model determining module is used for acquiring basic attribute information of a target text and determining a target text processing model corresponding to the target text according to the basic attribute information;
the text feature vector determining module is used for processing the target text to obtain a text feature vector corresponding to the target text;
the text intention determining module is used for processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
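The three modules can be read as a small pipeline; the class below is an illustrative outline in which both models are reduced to a keyword rule so the flow stays self-contained:

```python
class TextInfoProcessor:
    """Illustrative glue for the three modules; names and rules are assumed."""

    THRESHOLD = 5  # assumed preset character-count threshold

    def select_model(self, text: str) -> str:
        # Target text processing model determining module.
        return "ensemble" if len(text) >= self.THRESHOLD else "one_class"

    def feature_vector(self, text: str) -> dict:
        # Text feature vector determining module (toy keyword features).
        words = text.lower().split()
        return {"has_share": "share" in words, "n_words": len(words)}

    def intent(self, text: str) -> str:
        # Text intention determining module: route features to the chosen model.
        chosen = self.select_model(text)  # which model would handle this text
        feats = self.feature_vector(text)
        # Both toy models collapse to the same keyword rule in this sketch;
        # real systems would call the trained ensemble or one-class model.
        return "share_screen" if feats["has_share"] else "no_intent"

p = TextInfoProcessor()
print(p.intent("please share your screen"))  # -> "share_screen"
print(p.intent("hello everyone"))            # -> "no_intent"
```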
The foregoing description is merely a description of preferred embodiments of the present disclosure and of the technical principles employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to technical solutions formed by the particular combinations of the features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (17)

1. A text information processing method, comprising:
acquiring basic attribute information of a target text, and determining a target text processing model corresponding to the target text according to the basic attribute information;
processing the target text to obtain a text feature vector corresponding to the target text;
and processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
2. The method of claim 1, wherein the target text processing model comprises a first prediction evaluation model, and the basic attribute information comprises character-count information of the target text; the acquiring basic attribute information of the target text and determining the target text processing model corresponding to the target text according to the basic attribute information comprises:
if the character count is greater than or equal to a preset character-count threshold, determining that the target text processing model corresponding to the target text is the first prediction evaluation model;
wherein at least three multi-class sub-evaluation models are integrated in the first prediction evaluation model, and each multi-class sub-evaluation model is used for determining a to-be-processed text intention of the target text; the first prediction evaluation model is further used for determining the target text intention of the target text according to the to-be-processed text intentions output by the at least three sub-evaluation models.
3. The method of claim 2, wherein the processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text comprises:
respectively inputting the text feature vector into the at least three sub-evaluation models in the first prediction evaluation model to obtain at least three to-be-processed text intentions corresponding to the target text;
determining the target text intention of the target text from the at least three to-be-processed text intentions based on a voting mechanism in the first prediction evaluation model.
4. The method of claim 1, wherein the target text processing model further comprises a single classification model, and the basic attribute information comprises character-count information of the target text; the acquiring basic attribute information of the target text and determining the target text processing model corresponding to the target text according to the basic attribute information comprises:
if the character count is less than the preset character-count threshold, determining that the target text processing model corresponding to the target text is the single classification model;
the single classification model is used for determining a target text intention of the target text.
5. The method of claim 1, wherein the processing the target text to obtain a text feature vector corresponding to the target text comprises:
dividing the target text into at least one keyword based on a word segmentation tool;
determining a word vector for each keyword based on the word vector dictionary;
determining a text feature vector corresponding to the target text based on the word vector of the at least one keyword.
6. The method of claim 5, wherein determining a text feature vector corresponding to the target text based on the word vector of the at least one keyword comprises:
determining a confusion feature vector corresponding to the target text;
obtaining a text feature vector corresponding to the target text based on the word vector of the at least one keyword and the confusion feature vector;
wherein the confusion (perplexity) feature vector is used to represent how fluent the text content of the target text is.
7. The method according to claim 1, wherein before the acquiring basic attribute information of the target text, the method further comprises:
collecting audio information of a target speaking user, and converting the audio information into a corresponding text to obtain a text to be processed;
deleting the stop words from the text to be processed to obtain the target text.
8. The method of claim 3,
the first prediction evaluation model is obtained by training in the following way:
acquiring training sample data, wherein the training sample data comprises a text feature vector, a text intention label and a preset output value; the training sample data comprises positive samples and negative samples, wherein a positive sample is a sample with intent and a negative sample is a sample without intent;
for each multi-class sub-evaluation model to be trained in the first prediction evaluation model, inputting the text feature vector and the text intention label of each piece of training sample data into that sub-evaluation model to obtain an initial evaluation value corresponding to each piece of training sample data;
correcting a preset loss function value in the multi-class sub-evaluation model to be trained based on the initial evaluation value and the preset output value;
taking the preset loss function value reaching convergence as a training target, and training the multi-class sub-evaluation model to be trained to obtain the multi-class sub-evaluation model;
performing ensemble learning on the multi-class sub-evaluation models to obtain the first prediction evaluation model;
the first predictive evaluation model is used to determine a textual intent of the target text.
9. The method of claim 4,
the single classification model is obtained by training in the following way:
acquiring training sample data, wherein the training sample data comprises a text feature vector, a text intention label and a preset output value; the training sample data comprises positive samples, wherein a positive sample is a sample with intent;
training the single classification model based on the training sample data;
the single classification model is used to determine a textual intent of the target text.
10. The method according to any one of claims 1-6, further comprising, after said obtaining a target text intent corresponding to said target text:
determining the display content corresponding to the target text intention, and sending the display content to a client so that the client displays the display content.
11. The method of claim 10, further comprising:
when it is detected that the control corresponding to the display content is triggered, displaying the display content in a preset display area of the client.
12. The method of claim 11, wherein the preset display area is a main display interface for displaying the target text.
13. The method of claim 10, wherein the presentation comprises a presentation icon and/or a presentation text.
14. The method of any of claims 1-6, wherein the target text is textual information generated based on a multimedia video stream.
15. A text information processing apparatus characterized by comprising:
the target text processing model determining module is used for acquiring basic attribute information of a target text and determining a target text processing model corresponding to the target text according to the basic attribute information;
the text feature vector determining module is used for processing the target text to obtain a text feature vector corresponding to the target text;
the text intention determining module is used for processing the text feature vector based on the target text processing model to obtain a target text intention corresponding to the target text.
16. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the text information processing method of any one of claims 1-14.
17. A storage medium containing computer-executable instructions which, when executed by a computer processor, perform the text information processing method according to any one of claims 1-14.
CN202010866385.5A 2020-08-25 2020-08-25 Text information processing method and device, electronic equipment and medium Pending CN112069786A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010866385.5A CN112069786A (en) 2020-08-25 2020-08-25 Text information processing method and device, electronic equipment and medium


Publications (1)

Publication Number Publication Date
CN112069786A true CN112069786A (en) 2020-12-11

Family

ID=73660366

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010866385.5A Pending CN112069786A (en) 2020-08-25 2020-08-25 Text information processing method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN112069786A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522556A (en) * 2018-11-16 2019-03-26 北京九狐时代智能科技有限公司 A kind of intension recognizing method and device
CN110097886A (en) * 2019-04-29 2019-08-06 贵州小爱机器人科技有限公司 Intension recognizing method and device, storage medium, terminal
CN110162633A (en) * 2019-05-21 2019-08-23 深圳市珍爱云信息技术有限公司 Voice data is intended to determine method, apparatus, computer equipment and storage medium
CN110458207A (en) * 2019-07-24 2019-11-15 厦门快商通科技股份有限公司 A kind of corpus Intention Anticipation method, corpus labeling method and electronic equipment


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204632A (en) * 2021-06-01 2021-08-03 携程旅游信息技术(上海)有限公司 Text information processing method, medium, device and system
CN116628177A (en) * 2023-05-22 2023-08-22 福建省网络与信息安全测评中心 Interactive data processing method and system for network security platform
CN116628177B (en) * 2023-05-22 2023-11-14 福建省网络与信息安全测评中心 Interactive data processing method and system for network security platform

Similar Documents

Publication Publication Date Title
CN110969012B (en) Text error correction method and device, storage medium and electronic equipment
CN111428010B (en) Man-machine intelligent question-answering method and device
CN113470619B (en) Speech recognition method, device, medium and equipment
CN112509562B (en) Method, apparatus, electronic device and medium for text post-processing
WO2022111347A1 (en) Information processing method and apparatus, electronic device, and storage medium
CN110597965B (en) Emotion polarity analysis method and device for article, electronic equipment and storage medium
WO2020182123A1 (en) Method and device for pushing statement
CN113724709A (en) Text content matching method and device, electronic equipment and storage medium
CN112364829A (en) Face recognition method, device, equipment and storage medium
CN111883117A (en) Voice wake-up method and device
CN111414471B (en) Method and device for outputting information
CN112309384B (en) Voice recognition method, device, electronic equipment and medium
CN112069786A (en) Text information processing method and device, electronic equipment and medium
CN112883968A (en) Image character recognition method, device, medium and electronic equipment
WO2022161122A1 (en) Minutes of meeting processing method and apparatus, device, and medium
CN110634050A (en) Method, device, electronic equipment and storage medium for identifying house source type
CN111931494A (en) Method, apparatus, electronic device, and medium for generating prediction information
CN111090993A (en) Attribute alignment model training method and device
CN111259676A (en) Translation model training method and device, electronic equipment and storage medium
CN115292487A (en) Text classification method, device, equipment and medium based on naive Bayes
CN111339770B (en) Method and device for outputting information
CN115967833A (en) Video generation method, device and equipment meter storage medium
CN111859902A (en) Text processing method, device, equipment and medium
CN112632241A (en) Method, device, equipment and computer readable medium for intelligent conversation
CN112951274A (en) Voice similarity determination method and device, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201211