CN116089602B - Information processing method, apparatus, electronic device, storage medium, and program product - Google Patents

Information processing method, apparatus, electronic device, storage medium, and program product

Info

Publication number
CN116089602B
CN116089602B (application CN202111299940.1A)
Authority
CN
China
Prior art keywords
training
information
emotion
predicted
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111299940.1A
Other languages
Chinese (zh)
Other versions
CN116089602A (en)
Inventor
铁瑞雪 (Tie Ruixue)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111299940.1A priority Critical patent/CN116089602B/en
Publication of CN116089602A publication Critical patent/CN116089602A/en
Application granted granted Critical
Publication of CN116089602B publication Critical patent/CN116089602B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/3331 - Query processing
    • G06F16/3332 - Query translation
    • G06F16/3334 - Selection or weighting of terms from queries, including natural language queries
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/284 - Lexical analysis, e.g. tokenisation or collocates
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure provides an information processing method, an apparatus, an electronic device, and a computer-readable storage medium. The method relates to natural language processing technology in the field of artificial intelligence and may be applied to the vehicle-mounted field. The information processing method may include: acquiring predicted text information; performing feature extraction processing on the predicted text information to obtain a predicted text feature coding representation; determining, according to the predicted text feature coding representation, predicted position information corresponding to predicted emotion description information in the predicted text information; extracting, according to the predicted position information, a predicted emotion description feature coding representation corresponding to the predicted emotion description information from the predicted text feature coding representation; and determining predicted emotion type information corresponding to the predicted text information according to the predicted emotion description feature coding representation. Embodiments of the present disclosure can improve the accuracy of emotion classification of the predicted text information.

Description

Information processing method, apparatus, electronic device, storage medium, and program product
Technical Field
The present disclosure relates to the field of artificial intelligence, and more particularly, to an information processing method, apparatus, electronic device, computer readable storage medium, and computer program product.
Background
Emotion classification, also known as opinion mining, refers to predicting the emotion polarity corresponding to a text (for example, deterministic emotion or uncertain emotion, where deterministic emotion may include positive emotion, negative emotion, neutral emotion, etc.), with the purpose of extracting opinions from large amounts of unstructured text.
In recent years, with the continuous development of artificial intelligence technology, high-precision human-computer interaction has drawn increasing attention from researchers. Such interaction requires a computer not only to understand the emotion and intention of a user, but also to give different feedback and support to different users, environments, and tasks; the computer therefore needs to understand the user's emotion and express emotion effectively.
Existing emotion analysis methods can be divided into two types: (1) emotion dictionary analysis methods; (2) machine learning analysis methods.
The emotion dictionary analysis method mainly uses a word segmentation tool to segment the text, extracts keywords that express emotion, and finally maps the keywords into a corresponding emotion dictionary (for example, positive-emotion keywords such as "awesome"). The machine learning analysis method may first extract emotion keywords and then characterize them, for example using word2vec to produce word encodings; alternatively, after segmenting the whole sentence, it may skip keyword extraction and characterize the whole sentence directly. Finally, the distributional differences among word encodings representing different emotion types are learned in a supervised manner, as in the sketch below.
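As an illustrative aside, a minimal sketch of the machine learning route just described, using the public gensim library's word2vec as one common implementation (the toy corpus and all names here are assumptions, not part of the original disclosure):

    from gensim.models import Word2Vec

    # Toy segmented corpus (illustrative): each sentence is a list of words
    segmented_corpus = [
        ["i", "like", "this", "phone"],   # positive sample
        ["the", "screen", "is", "ugly"],  # negative sample
    ]
    # Train static word encodings; a supervised classifier would then learn
    # the distribution differences between encodings of different emotion types
    w2v = Word2Vec(segmented_corpus, vector_size=100, window=5, min_count=1)
    vec = w2v.wv["like"]  # one fixed 100-dim vector per keyword

Note that each keyword receives a single static vector regardless of context, which is exactly the polysemy limitation discussed next.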
The emotion dictionary analysis method (1) depends heavily on the word segmentation tool and the emotion dictionary: the segmentation tool may segment inaccurately or fail to cover new words, the emotion dictionary is not necessarily large enough in scale and must be maintained and updated regularly, and the dictionary method is simple keyword matching, so its emotion judgment accuracy is low. The machine learning analysis method (2) also relies on a word segmentation tool and therefore shares the problems of method (1). Word vector representation is an important link, but word vectors are often trained on a large-scale corpus, and one keyword corresponds to only one word vector representation, so polysemy cannot be resolved; for example, "apple" may denote a fruit, a company, or a product.
Therefore, how to classify the emotion of text data efficiently and accurately has a significant impact on the field of text classification.
Disclosure of Invention
The present disclosure is directed to an information processing method, apparatus, electronic device, and computer-readable storage medium, capable of improving accuracy of emotion classification of a predicted text.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
The embodiment of the disclosure provides an information processing method, which comprises the following steps: acquiring predicted text information; performing feature extraction processing on the predicted text information to obtain a predicted text feature coding representation; determining prediction position information corresponding to the prediction emotion description information in the prediction text information according to the prediction text feature coding representation; extracting a predicted emotion description feature coded representation corresponding to the predicted emotion description information from the predicted text feature coded representation according to the predicted position information; and determining the predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation.
An embodiment of the present disclosure provides an information processing apparatus including: the system comprises a predicted text information acquisition module, a predicted text feature code representation acquisition module, a predicted position information acquisition module, a predicted emotion description feature code representation acquisition module and a predicted emotion type information determination module.
The predictive text information acquisition module is used for acquiring predictive text information; the predicted text feature code representation acquisition module is used for carrying out feature extraction processing on the predicted text information so as to obtain a predicted text feature code representation; the prediction position information acquisition module is used for determining prediction position information corresponding to the prediction emotion description information in the prediction text information according to the prediction text feature coding representation; the predicted emotion description feature code representation acquisition module is used for extracting predicted emotion description feature code representations corresponding to the predicted emotion description information from the predicted text feature code representations according to the predicted position information; the predicted emotion type information determination module is used for determining predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation.
In some embodiments, the predictive emotion classification information determination module includes: the prediction emotion enhancement feature code representation acquisition sub-module and the prediction emotion enhancement feature code representation classification sub-module.
The prediction emotion enhancement feature coding representation acquisition submodule is used for carrying out feature fusion on the prediction emotion description feature coding representation and the prediction text feature coding representation so as to obtain a prediction emotion enhancement feature coding representation; the prediction emotion enhancement feature coding representation classification submodule is used for carrying out classification processing on the prediction emotion enhancement feature coding representation so as to determine prediction emotion type information corresponding to the prediction text information.
In some embodiments, the information processing apparatus further comprises: a predicted emotion description information absence determining module and a predicted text feature coding representation classification module.
The prediction emotion description information absence determining module is used for determining that the prediction emotion description information does not exist in the prediction text information according to the prediction text feature coding representation; the prediction text feature coding representation classification module is used for classifying the prediction text feature coding representation to determine prediction emotion type information corresponding to the prediction text information.
In some embodiments, the information processing method is implemented by a target neural network model; wherein the information processing apparatus further comprises: the training emotion type label acquisition module, the training text feature code representation determination module, the training position information determination module, the training emotion description feature code representation extraction module, the training emotion type information determination module, the target loss value determination module and the neural network model training module.
The training emotion type label acquisition module is used for acquiring training text information and training emotion type labels corresponding to the training text information; the training text feature code representation determining module is used for carrying out feature extraction processing on the training text information so as to obtain training text feature code representation; the training position information determining module is used for predicting training position information corresponding to training emotion description information in the training text information according to the training text feature code representation; the training emotion description feature code representation extraction module is used for extracting training emotion description feature code representations corresponding to the training emotion description information from the training text feature code representations according to the training position information; the training emotion type information determining module is used for predicting training emotion type information corresponding to the training text information according to the training emotion description characteristic code representation; the target loss value determining module is used for determining a target loss value according to the training emotion type label, the training emotion type information and the training position information; the neural network model training module is used for training the target neural network model according to the target loss value.
In some embodiments, the training emotion classification information determination module includes: the training emotion enhancement feature code representation determination submodule and the training emotion enhancement feature code representation classification processing submodule.
The training emotion enhancement feature code representation determining submodule is used for carrying out feature fusion on the training emotion description feature code representation and the training text feature code representation so as to obtain training emotion enhancement feature code representation; the training emotion enhancement feature code representation classification processing sub-module is used for classifying the training emotion enhancement feature code representation to determine training emotion type information corresponding to the training text information.
In some embodiments, the target loss value comprises a first loss value and a second loss value; wherein the target loss value determining module includes: the system comprises a first loss value determining sub-module, an actual position information obtaining sub-module, a second loss value determining sub-module and a target loss value generating sub-module.
The first loss value determining submodule is used for generating the first loss value according to the training emotion type label and the training emotion type information; the actual position information acquisition sub-module is used for acquiring an actual position information label corresponding to training emotion description information in the training text information; the second loss value determining submodule is used for generating the second loss value according to training position information corresponding to training emotion description information in the training text information and an actual position information label corresponding to the training emotion description information in the training text information; the target loss value generation submodule is used for generating the target loss value according to the first loss value and the second loss value.
In some embodiments, the second loss value determination submodule includes: the actual emotion description feature code represents a determining unit and a second loss value generating subunit.
The practical emotion description feature code representation determining unit is used for extracting practical emotion description feature code representation corresponding to training emotion description information from the training text feature code representation according to the practical position information label corresponding to the training emotion description information in the training text information; the second loss value generation subunit is configured to generate the second loss value according to the actual emotion description feature encoding representation and the training emotion description feature encoding representation.
In some embodiments, the actual position tag includes a first sequence for identifying a position corresponding to a first word of the training emotion description information and a last sequence for identifying a position corresponding to a last word of the training emotion description information.
The embodiment of the disclosure provides an electronic device, which comprises: one or more processors; and a storage means for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the information processing method of any of the above.
The embodiment of the present disclosure proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the information processing method as set forth in any one of the above.
Embodiments of the present disclosure propose a computer program product or a computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the computer device performs the above-described information processing method.
According to the information processing method, apparatus, electronic device, and computer-readable storage medium provided by the present disclosure, the predicted position information of the predicted emotion description information in the predicted text information is determined; the predicted emotion description coding representation corresponding to the predicted emotion description information is then extracted from the predicted text feature coding representation corresponding to the predicted text information; and finally the predicted emotion type corresponding to the predicted text information is determined according to the predicted emotion description coding representation. On the one hand, when emotion classification is performed on the predicted text information, prediction is made according to the predicted emotion description feature coding representation, which is highly relevant to the emotional tendency of the predicted text information, so prediction accuracy is improved. On the other hand, when the predicted emotion description feature coding representation is determined, it is not generated independently from the predicted emotion description information; instead, it is extracted, via the predicted position information, from the dynamically generated predicted text feature coding representation, so the extracted representation implicitly contains context information. Consequently, when the predicted emotion type information of the predicted text information is predicted from this representation, both the predicted emotion description information and its context are taken into account, which improves the emotion classification accuracy of the predicted text information; moreover, the predicted text feature coding representation and the predicted emotion description feature coding representation, being generated dynamically during training, can resolve polysemy.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
Fig. 1 shows a schematic diagram of an exemplary system architecture to which an information processing method or an information processing apparatus of an embodiment of the present disclosure may be applied.
Fig. 2 is a flowchart illustrating a method of information processing according to an exemplary embodiment.
FIG. 3 is a schematic diagram illustrating an encoded representation of predicted text information by a Bert model, according to an example embodiment.
Fig. 4 is a schematic diagram of a network framework of a feature extraction unit according to an exemplary embodiment.
Fig. 5 is a flowchart illustrating a method of information processing according to an exemplary embodiment.
FIG. 6 is a flowchart illustrating a training method of a target neural network model, according to an example embodiment.
FIG. 7a is a schematic diagram of a location tag shown according to an example embodiment.
Fig. 7b is a schematic diagram of a position tag shown according to an example embodiment.
FIG. 8 is a flowchart illustrating a training method of a target neural network model, according to an example embodiment.
Fig. 9 is a diagram illustrating a target loss value determination method according to an exemplary embodiment.
Fig. 10 is a schematic diagram of an information processing network architecture, according to an exemplary embodiment.
Fig. 11 is a block diagram of an information processing apparatus according to an exemplary embodiment.
Fig. 12 shows a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be embodied in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted.
The described features, structures, or characteristics of the disclosure may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present disclosure. However, those skilled in the art will recognize that the aspects of the present disclosure may be practiced with one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
The drawings are merely schematic illustrations of the present disclosure, in which like reference numerals denote like or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only, and not necessarily all of the elements or steps are included or performed in the order described. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
In the present specification, the terms "a," "an," "the," "said" and "at least one" are used to indicate the presence of one or more elements/components/etc.; the terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements/components/etc., in addition to the listed elements/components/etc.; the terms "first," "second," and "third," etc. are used merely as labels, and do not limit the number of their objects.
In order that the above-recited objects, features, and advantages of the present application can be more clearly understood, the application is described in further detail below with reference to specific embodiments illustrated in the accompanying drawings. It should be understood that the embodiments of the application, and the features in those embodiments, may be combined with each other without departing from the scope of the appended claims.
The technical solution of the application relates to the field of artificial intelligence (Artificial Intelligence, AI). Artificial intelligence is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline, covering a wide range of fields at both the hardware level and the software level. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes directions such as computer vision, speech processing, natural language processing, and machine learning/deep learning.
The technical solution uses natural language processing (Natural Language Processing, NLP) technology in the field of artificial intelligence: predicted text information is obtained through a target neural network model; feature extraction processing is performed on the predicted text information to obtain a predicted text feature coding representation; predicted position information corresponding to predicted emotion description information in the predicted text information is determined according to the predicted text feature coding representation; a predicted emotion description feature coding representation corresponding to the predicted emotion description information is extracted from the predicted text feature coding representation according to the predicted position information; and predicted emotion type information corresponding to the predicted text information is determined according to the predicted emotion description feature coding representation, thereby improving the prediction accuracy of the emotion type information in the predicted text information.
Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, that is, the language people use daily, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graph techniques, and the like.
Of course, the implementation of the technical solution of the application also involves machine learning (Machine Learning, ML) and deep learning technologies, so as to realize the training and use of the target neural network.
Machine learning and deep learning form a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. They specialize in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an exemplary system architecture to which an information processing method or an information processing apparatus of an embodiment of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including, but not limited to, smart phones, tablet computers, laptop computers, desktop computers, wearable devices, virtual reality devices, smart homes, smart voice interaction devices, smart home appliances, car terminals, etc.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like.
For example, the user may acquire the predicted text information using the terminal devices 101, 102, 103 and send the predicted text information to the server 105 via the network 104 so that the server processes the predicted text information.
For another example, the user may obtain predictive text information using the terminal devices 101, 102, 103; performing feature extraction processing on the predicted text information to obtain a predicted text feature coding representation; determining prediction position information corresponding to the prediction emotion description information in the prediction text information according to the prediction text feature coding representation; extracting a predicted emotion description characteristic coding representation corresponding to the predicted emotion description information from the predicted text characteristic coding representation according to the predicted position information; determining predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation; and finally, feeding back the predicted emotion type information corresponding to the predicted text information to the terminal equipment for displaying to the user.
The server 105 may be a server providing various services, such as a background management server providing support for the terminal devices 101, 102, 103 operated by users. The background management server can analyze and process received data such as requests, and feed the processing results back to the terminal devices.
For example, the server 105 may acquire predictive text information from the terminal device; performing feature extraction processing on the predicted text information to obtain a predicted text feature coding representation; determining prediction position information corresponding to the prediction emotion description information in the prediction text information according to the prediction text feature coding representation; extracting a predicted emotion description characteristic coding representation corresponding to the predicted emotion description information from the predicted text characteristic coding representation according to the predicted position information; determining predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation; and finally, feeding back the predicted emotion type information corresponding to the predicted text information to the terminal equipment for displaying to the user.
The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Networks), big data, and artificial intelligence platforms; the disclosure is not limited thereto.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative, and that the server 105 may be a server of one entity, or may be composed of a plurality of servers, and may have any number of terminal devices, networks and servers according to actual needs.
Fig. 2 is a flowchart illustrating a method of information processing according to an exemplary embodiment. The method provided by the embodiments of the present disclosure may be performed by any electronic device having computing processing capability, for example, the method may be performed by a server or a terminal device in the embodiment of fig. 1, or may be performed by both the server and the terminal device, and in the following embodiments, the server is taken as an example to illustrate an execution subject, but the present disclosure is not limited thereto.
In some embodiments, the technical solutions provided by the present disclosure may be implemented by a target neural network model. Referring to fig. 2, the information processing method provided by the embodiment of the present disclosure may include the following steps.
Step S202, obtaining predicted text information.
The predicted text information may refer to text information that requires emotion type prediction. The emotion type may be, for example, a deterministic emotion or an uncertain emotion; within deterministic emotion, it may be a positive, negative, or neutral emotion. Any emotion type that needs to be predicted may serve as the emotion type to be predicted according to the application, and it can be set by a person skilled in the art according to actual requirements.
The uncertain emotion may refer to an emotion expressed through uncertain content, in which the answer is uncertain; for example, "I just received a call, but I am not sure whether that call is a fraud call" and "I suspect that I like him" are both text information of the uncertain-emotion class, in which the answer is uncertain. The uncertain emotion may include various uncertain emotional tendencies such as suspecting, feeling, seeming, being uncertain, being unconfirmed, being surprised, and being unclear, which the present disclosure does not limit.
In some embodiments, an uncertain emotion may be understood as an emotion that is unknown and not easily classified.
It will be appreciated that the text information corresponding to the uncertain emotion may include some uncertain emotion description information, such as "I am not sure", "I suspect", "as if", and "whether or not", and the disclosure is not limited thereto.
Deterministic emotion may refer to an emotion conveyed through deterministic expression, such that the expressed answer is a definite, distinct emotion; for example, "I just received a call, and I am sure that call was a fraud call" and "I certainly like it" are text information of the deterministic-emotion class, in which the answer is definite.
Likewise, the text information corresponding to a deterministic emotion may include some deterministic emotion description information, such as "I am sure", "this is definitely not the case", and "this is absolutely not the case", which the present disclosure does not limit.
In some embodiments, deterministic emotions may in turn include positive emotions, negative emotions, neutral emotions, and the like.
Positive emotion may refer to praising, commending, approving, and the like, while negative emotion may refer to disliking, disapproving, and the like.
In some embodiments, the text information corresponding to positive emotion may include some positive emotion description information, such as words or phrases expressed positively, for example "I like it", "awesome", and "looks great".
Of course, the text information corresponding to negative emotion may also include some negative emotion description information, such as words or phrases expressed negatively, for example "I don't like it", "very ugly", "forget it", and "I don't want to go".
In summary, the emotion types according to the present application may be classified, by whether the emotion expression is definite, into deterministic emotion (emotion expressed with certainty) and uncertain emotion (emotion expressed with uncertainty), where deterministic emotion may include positive emotion, negative emotion, neutral emotion, and the like, which is not limited in this disclosure.
Step S204, performing feature extraction processing on the predicted text information to obtain a predicted text feature coding representation.
In some embodiments, the target neural network model may include a feature extraction unit, a position prediction unit, an emotion description coding representation extraction unit, and a classification unit.
The target neural network model can conduct feature extraction processing on the predicted text information through a feature extraction unit so as to obtain a predicted text feature coding representation. The predictive text feature coded representation may refer to a feature coded representation that is generated from predictive text information and that is capable of describing the content contained in the predictive text information.
The feature extraction unit of the target neural network model may be constructed from any network model capable of feature extraction, for example, a Bert (Bidirectional Encoder Representations from Transformers) model, an ALBERT (A Lite BERT) model, or an ERNIE (a pre-training model) model, which is not limited in this disclosure.
In some embodiments, the feature extraction unit of the target neural network model may convert the predicted text information into a high-dimensional vector form by an n-gram method, a word2vec method or the like in the process of performing feature extraction processing on the predicted text information, and then perform feature extraction on the high-dimensional vector by using the neural network model to generate a predicted text feature coding representation.
In other embodiments, in the process of performing feature extraction processing on the predicted text information by the feature extraction unit of the target neural network model, the word vector representation of the predicted text information can be randomly initialized by a deep learning method, then the word vector representation of the predicted text information is dynamically changed along with training, and the predicted text feature coding representation corresponding to the predicted text information can be obtained after the training is completed.
FIG. 3 is a schematic diagram illustrating an encoding representation Embedding of predicted-text information by a Bert model, according to one example embodiment.
As shown in fig. 3, the Bert model may vectorize the predicted text (e.g., "my dog is cute, he likes playing") from three angles to generate three different types of vectors: word vectors (token embeddings), classification vectors (segment embeddings), and position vectors (position embeddings).
As shown in fig. 3, the word vector representation of the predicted text information ([E[CLS], E_my, E_dog, E_is, E_cute, E[SEP], E_he, E_likes, E_play, E_##ing, E[SEP]]) may be used to characterize each word in the predicted text; the classification vector representation ([E_A, E_A, E_A, E_A, E_A, E_A, E_B, E_B, E_B, E_B, E_B]) may be used to distinguish the sentences in the predicted text (each sentence forming its own segment); and the position vector representation ([E0, E1, E2, E3, E4, E5, E6, E7, E8, E9, E10]) may be used to characterize the position of each word in each sentence of the predicted text.
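A minimal PyTorch sketch of how the three vector types can be combined into the model input by element-wise summation, as in BERT-style architectures (the vocabulary size, sequence length, hidden dimension, and token ids below are illustrative assumptions):

    import torch
    import torch.nn as nn

    vocab_size, max_len, hidden = 30522, 512, 768   # assumed BERT-style sizes
    token_emb = nn.Embedding(vocab_size, hidden)    # word vectors (token embeddings)
    segment_emb = nn.Embedding(2, hidden)           # classification vectors (segment A/B)
    position_emb = nn.Embedding(max_len, hidden)    # position vectors

    token_ids = torch.tensor([[101, 2026, 3899, 102]])            # e.g. [CLS] my dog [SEP]
    segment_ids = torch.zeros_like(token_ids)                     # all tokens in sentence A
    position_ids = torch.arange(token_ids.size(1)).unsqueeze(0)   # 0, 1, 2, 3

    # BERT-style input representation: element-wise sum of the three embeddings
    x = token_emb(token_ids) + segment_emb(segment_ids) + position_emb(position_ids)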
Fig. 4 is a diagram illustrating the framework structure of a Transformer (a model framework), according to an example embodiment.
In some embodiments, after the Bert model vectorizes the predicted text from the three angles (initial vectorization during the first training pass; vectorization according to the learned parameters during subsequent training and prediction), the word vectors (token embeddings), classification vectors (segment embeddings), and position vectors (position embeddings) may be input into the Transformer framework shown in fig. 4 to obtain the predicted text feature coding representation corresponding to the predicted text information.
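For illustration only, a hedged sketch using the public transformers library to obtain such a per-token predicted text feature coding representation (the checkpoint name is an assumption; the disclosure does not prescribe a specific pre-trained model):

    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("my dog is cute, he likes playing", return_tensors="pt")
    outputs = model(**inputs)
    H = outputs.last_hidden_state  # [1, seq_len, 768]: one context-aware vector per token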
Step S206, determining the prediction position information corresponding to the prediction emotion description information in the prediction text information according to the prediction text feature code representation.
In some embodiments, the predicted text feature encoded representation may be processed by a location prediction unit of the target neural network model to determine whether predicted emotion description information is included in the predicted text information and predicted location information corresponding to the predicted emotion description information.
In some embodiments, the position prediction unit may include two classifiers: one for determining the first position (i.e., the position of the first word) of the predicted emotion description information, and one for predicting the last position (i.e., the position of the last word) of the predicted emotion description information. Each classifier may be a sigmoid classifier.
The predicted position information of the predicted emotion description information in the predicted text information may be identified by a first sequence and a last sequence, where the first sequence identifies the position of the first word of the predicted emotion description information in the predicted text information, the last sequence identifies the position of the last word of the predicted emotion description information in the predicted text information, and both sequences may have the same length as the predicted text information (refer to fig. 7a specifically).
It will be appreciated that if there are multiple predicted emotion descriptors in the predicted text information, each predicted emotion descriptor may correspond to a first sequence and a last sequence.
In other embodiments, the first positions of different predicted emotion descriptors may be represented in the same first sequence, and the last positions of different predicted emotion descriptors may be represented in the same last sequence (see fig. 7b in particular), which is not limiting of the present disclosure.
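A minimal sketch of such a position prediction unit under the two-sigmoid-classifier design described above (the hidden size and the 0.5 decision threshold are illustrative assumptions):

    import torch
    import torch.nn as nn

    class PositionPredictor(nn.Module):
        """Two per-token sigmoid classifiers: one scoring first-word positions,
        one scoring last-word positions of emotion description spans."""
        def __init__(self, hidden=768):
            super().__init__()
            self.first_head = nn.Linear(hidden, 1)
            self.last_head = nn.Linear(hidden, 1)

        def forward(self, H):  # H: [batch, seq_len, hidden]
            first = torch.sigmoid(self.first_head(H)).squeeze(-1)  # [batch, seq_len]
            last = torch.sigmoid(self.last_head(H)).squeeze(-1)
            return first, last

    H = torch.randn(1, 12, 768)            # stand-in for the encoder output
    first, last = PositionPredictor()(H)
    first_seq = (first > 0.5).long()       # first sequence: 1 at predicted first words
    last_seq = (last > 0.5).long()         # last sequence: 1 at predicted last words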
The predicted emotion description information may be text information, identified in the prediction process, that is contained in the predicted text information and can be used to describe emotion.
The predicted emotion description information may describe uncertain emotion, determined emotion, active emotion, passive emotion and the like, and a person skilled in the art can set the emotion type to be predicted according to requirements.
It can be appreciated that if a prediction focuses on uncertain emotion, the predicted emotion description information identified in the prediction should be emotion description information corresponding to uncertain emotion; if a prediction focuses on positive emotion within deterministic emotion, it should be emotion description information corresponding to positive emotion; if a prediction focuses on negative emotion within deterministic emotion, it should be emotion description information corresponding to negative emotion; and so on. The present disclosure does not limit this.
Wherein the predicted emotion description information may refer to words, phrases, sentences, etc., which the present disclosure does not limit.
And step S208, extracting the predictive emotion description characteristic coding representation corresponding to the predictive emotion description information from the predictive text characteristic coding representation according to the predictive position information.
In some embodiments, the predicted emotion description feature encoded representation corresponding to the predicted emotion description information may be extracted from the predicted text feature encoded representation by the above-described predicted location information.
For example, if the predicted position information identifies that the predicted emotion description information is at the position of the n1 st to n2 nd words of the predicted text information, the feature code representation corresponding to the n1 st to n2 nd words may be extracted from the predicted text feature code representation as the predicted emotion description feature code representation, where n1 is less than or equal to n2, and n1 and n2 are integers greater than or equal to 1.
For example, the semantic representation corresponding to each word (token) in the predicted text, obtained by processing the predicted text information with the Bert model, may be denoted as the predicted text feature coding representation H, written as a sequence: H = [h[CLS], h1, h2, h3, …, hn, h[SEP]].
Where h [ CLS ] is a representation of the coded representation of the predicted text feature, hn is a representation of the nth word in the predicted text, and n is an integer greater than or equal to 1.
Assuming the predicted position information determined for the predicted emotion description information covers the positions of the 2nd to 5th words, [h2, h3, h4, h5] can be taken out to generate the predicted emotion description feature coding representation.
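In code, this extraction step amounts to slicing rows out of the per-token representation; a sketch assuming the indexing convention of the example above (h[CLS] at index 0, the kth word at index k):

    import torch

    H = torch.randn(1, 12, 768)     # stand-in for H = [h[CLS], h1, ..., hn, h[SEP]]
    n1, n2 = 2, 5                   # predicted span: 2nd to 5th words, as in the example
    span_repr = H[0, n1:n2 + 1]     # [h2, h3, h4, h5], shape [4, 768]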
Wherein the predictive emotion description feature coded representation can be used to describe emotion in the predicted text.
Step S210, determining the predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation.
In some embodiments, the predictive emotion descriptive feature encoded representation may be classified using a classifier to directly determine predictive emotion classification information (e.g., determined emotion or uncertain emotion; positive emotion, negative emotion or neutral emotion, etc.) corresponding to the predictive text information.
In some embodiments, the predicted emotion description feature encoded representation may be feature fused with the predicted text feature encoded representation, and then the feature fused encoded representation is classified using a classifier to determine predicted emotion classification information corresponding to the predicted text information.
In some embodiments, if it is determined, based on the predicted text feature coded representation, that no predicted emotion description information is present in the predicted text information, the predicted text feature coded representation may be classified by a classifier to determine the predicted emotion classification information corresponding to the predicted text information.
The classifier may be a softmax classifier.
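A sketch of the direct classification variant, mean-pooling the predicted emotion description feature coding representation before a softmax layer (the pooling choice and the three-class setup are assumptions for illustration):

    import torch
    import torch.nn as nn

    classifier = nn.Linear(768, 3)           # 3 classes, e.g. positive/negative/neutral
    span_repr = torch.randn(4, 768)          # predicted emotion description features

    pooled = span_repr.mean(dim=0)           # collapse the span into a single vector
    probs = torch.softmax(classifier(pooled), dim=-1)
    predicted_class = probs.argmax().item()  # predicted emotion type index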
According to the information processing method provided by the embodiments of the present disclosure, the predicted position information of the predicted emotion description information in the predicted text information is determined; the predicted emotion description coding representation corresponding to the predicted emotion description information is then extracted from the predicted text feature coding representation corresponding to the predicted text information; and finally the predicted emotion type corresponding to the predicted text information is determined according to the predicted emotion description coding representation. On the one hand, when emotion classification is performed on the predicted text information, prediction is made according to the predicted emotion description feature coding representation, which is highly relevant to the emotional tendency of the predicted text information, so prediction accuracy is improved. On the other hand, when the predicted emotion description feature coding representation is determined, it is not generated independently from the predicted emotion description information; instead, it is extracted, via the predicted position information, from the dynamically generated predicted text feature coding representation, so the extracted representation implicitly contains context information. Consequently, when the predicted emotion type information of the predicted text information is predicted from this representation, both the predicted emotion description information and its context are taken into account, which improves the emotion classification accuracy of the predicted text information.
Fig. 5 is a flowchart illustrating a method of information processing according to an exemplary embodiment.
In some embodiments, the technical solutions provided by the present disclosure may be implemented by a target neural network model.
Referring to fig. 5, the information processing method provided by the embodiment of the present disclosure may include the following steps.
Step S502, obtaining predictive text information.
Step S504, performing feature extraction processing on the predicted text information to obtain a predicted text feature code representation.
Step S506, determining the prediction position information corresponding to the prediction emotion description information in the prediction text information according to the prediction text feature code representation.
Step S508 extracts the predictive emotion description feature coded representation corresponding to the predictive emotion description information from the predictive text feature coded representation according to the predictive position information.
Step S510, feature fusion is carried out on the predictive emotion description feature coding representation and the predictive text feature coding representation to obtain a predictive emotion enhancement feature coding representation.
In some embodiments, the target neural network model may further include a feature fusion unit that can feature-fuse (e.g., concatenate, pool, etc.) the predicted emotion description feature encoded representation with the predicted text feature encoded representation to obtain a predicted emotion enhanced feature encoded representation.
Step S512, classifying the coded representation of the predicted emotion enhancement feature to determine predicted emotion type information corresponding to the predicted text information.
In some embodiments, if it is determined, based on the predicted text feature coded representation, that no predicted emotion description information is present in the predicted text information, the predicted text feature coded representation may be classified to determine the predicted emotion classification information corresponding to the predicted text information.
According to the technical solution provided by this embodiment, prediction of the emotion type information in the predicted text information is realized through the predicted emotion enhanced feature coding representation. This representation both characterizes the predicted text and fuses in the predicted emotion description feature coding representation, which is highly relevant to the emotional tendency, so as to strengthen the weight of the predicted emotion in the predicted text; the emotion classification information in the predicted text information can therefore be determined accurately and clearly through the predicted emotion enhanced feature coding representation.
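One plausible realization of the fusion in step S510 is concatenating the pooled span representation with the sentence-level representation h[CLS]; a sketch under that assumption (the disclosure mentions concatenation and pooling but does not fix a single fusion scheme):

    import torch
    import torch.nn as nn

    h_cls = torch.randn(768)           # sentence-level representation h[CLS]
    span_repr = torch.randn(4, 768)    # predicted emotion description features

    # emotion-enhanced feature coding representation: concat of sentence and span features
    fused = torch.cat([h_cls, span_repr.mean(dim=0)])  # shape [1536]
    classifier = nn.Linear(1536, 3)                    # 3 classes, illustrative
    probs = torch.softmax(classifier(fused), dim=-1)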
FIG. 6 is a training method of a target neural network model, according to an example embodiment.
In some embodiments, the training method of the target neural network model may be implemented in a training stage before the prediction stage, or may be implemented synchronously in the prediction process, which is not limited in this embodiment.
Referring to fig. 6, the training method of the target neural network model may include the following steps.
Step S602, training text information and training emotion type labels corresponding to the training text information are obtained.
In some embodiments, emotion type tags may be determined for each training text message in advance, and in this embodiment, the emotion type tags determined before training may be referred to as training emotion type tags.
In step S604, feature extraction processing is performed on the training text information to obtain a training text feature code representation.
In some embodiments, feature extraction processing may be performed on the training text information by a feature extraction unit of the target neural network model to obtain a training text feature encoded representation.
Step S606, training position information corresponding to the training emotion description information in the training text information is predicted according to the training text feature coded representation.
In some embodiments, the training position information corresponding to the training emotion description information may be represented by a head sequence and a tail sequence as shown in fig. 7a, where the position of "1" in the head sequence indicates the position of the first word of the training emotion description information, and the position of "1" in the tail sequence indicates the position of its last word.

It can be appreciated that if there are multiple pieces of training emotion description information in the training text information, each piece may correspond to its own head sequence and tail sequence.

In other embodiments, the head positions of different pieces of emotion description information may be marked in the same head sequence, and their tail positions in the same tail sequence (see fig. 7b), which is not limited by the present disclosure.
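For illustration only, the following sketch (plain Python; the sequence length and span boundaries are assumed inputs) builds shared head and tail label sequences for one or more emotion description spans:

    def head_tail_labels(seq_len, spans):
        # spans: list of (start, end) word indices of each emotion description
        # span, inclusive and 0-based for this example.
        head = [0] * seq_len
        tail = [0] * seq_len
        for start, end in spans:
            head[start] = 1  # mark the first word of the span
            tail[end] = 1    # mark the last word of the span
        return head, tail

    # Example with two spans in a 10-word text:
    # head_tail_labels(10, [(2, 5), (7, 8)])
    # -> ([0, 0, 1, 0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0, 1, 0])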
Step S608, extracting training emotion description characteristic coding representation corresponding to the training emotion description information from the training text characteristic coding representation according to the training position information.
In some embodiments, the training emotion description feature encoded representation corresponding to the training emotion description information may be extracted from the training text feature encoded representation by the training position information.
For example, if the training position information indicates that the training emotion description information occupies the n3-th to the n4-th words of the training text information, the feature coded representations corresponding to the n3-th to the n4-th words may be extracted from the training text feature coded representation and used as the training emotion description feature coded representation, where n3 ≤ n4, and n3 and n4 are integers greater than or equal to 1.
For example, the semantic representation corresponding to each word (token) in the training text, obtained by processing the training text information with the BERT model, may be denoted as the training text feature coded representation H, written as the sequence H = h[CLS], h1, h2, h3, …, hn, h[SEP].

Where h[CLS] is the representation corresponding to the [CLS] tag, i.e., a sentence-level representation of the training text, hn is the representation of the n-th word in the training text, and n is an integer greater than or equal to 1.

Assuming the training position information determined for the training emotion description information covers the 3rd to the 6th word, [h3, h4, h5, h6] can be taken out to generate the training emotion description feature coded representation.
Wherein the training emotion description feature encoded representation may be used to describe emotion in the training text.
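A minimal sketch of this extraction step (PyTorch; the hidden-state tensor and position variables are assumptions for the example):

    import torch

    def extract_span(hidden_states: torch.Tensor, start: int, end: int) -> torch.Tensor:
        # hidden_states: (seq_len, hidden) - one row per token, with row 0
        # holding h[CLS], so the i-th word of the text is row i.
        # start/end: inclusive word positions given by the head/tail pointers.
        return hidden_states[start:end + 1]  # (end - start + 1, hidden)

    # e.g., for the 3rd to the 6th word: span_repr = extract_span(H, 3, 6)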
Step S610, training emotion type information corresponding to the training text information is predicted according to the training emotion description feature coded representation.

In some embodiments, the training emotion description feature coded representation may be classified by a classifier to predict the training emotion type information corresponding to the training text information.
The classifier may be a softmax classifier.
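As a sketch only (PyTorch; the layer sizes are assumptions), such a classifier can be a single linear layer followed by softmax:

    import torch.nn as nn

    class EmotionTypeClassifier(nn.Module):
        def __init__(self, in_dim: int, num_classes: int):
            super().__init__()
            self.linear = nn.Linear(in_dim, num_classes)

        def forward(self, x):
            # Map the feature coded representation to a probability
            # distribution over the emotion types.
            return nn.functional.softmax(self.linear(x), dim=-1)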
Step S612, determining a target loss value according to the training emotion type label, the training emotion type information and the training position information.
In some embodiments, a loss value may be determined as the target loss value according to the training emotion type label, the training emotion type information, and the training position information; the present disclosure does not limit the specific method for determining the target loss value.
Step S614, training a target neural network model according to the target loss value.
In some embodiments, the parameters of the target neural network model may be adjusted under the guidance of the target loss value, so as to train the target neural network model.
According to the technical solution provided by this embodiment, on the one hand, when the target neural network model is trained to predict the emotional tendency of text, the training emotion description feature coded representation corresponding to the training emotion description information is extracted and then added back into the training text feature coded representation. This emphasizes the proportion of the emotion description features within the overall text features, implicitly introduces prior knowledge, and improves the accuracy of emotion classification. On the other hand, the training process considers not only the difference between the predicted training emotion type information and the training emotion type label, but also the difference between the predicted training position information and the actual position of the training emotion description information, so that the trained target neural network can accurately locate emotion description information in text and thus accurately extract the corresponding feature coded representation from the feature coded representation of the text.
Fig. 8 is a flowchart illustrating a training method of a target neural network model, according to an exemplary embodiment.
In some embodiments, the training method of the target neural network model may be performed in a training stage prior to prediction, or synchronously with the prediction process, which is not limited in this embodiment.
Referring to fig. 8, the training method of the target neural network model may include the following steps.
Step S802, training text information and training emotion type labels corresponding to the training text information are obtained.
Step S804, feature extraction processing is performed on the training text information to obtain a training text feature code representation.
Step S806, training position information corresponding to the training emotion description information in the training text information is predicted according to the training text feature coded representation.
Step S808 extracts training emotion description feature code representation corresponding to the training emotion description information from the training text feature code representation according to the training position information.
Step S810 performs feature fusion on the training emotion description feature encoded representation and the training text feature encoded representation to obtain a training emotion enhancement feature encoded representation.
Step S812, classifying the training emotion enhancement feature coded representation to determine training emotion type information corresponding to the training text information.
Step S814, determining a target loss value according to the training emotion type label, the training emotion type information and the training position information.
Step S816, training the target neural network model according to the target loss value.
According to this technical solution, during the training of the target neural network model, the prediction of emotion type information in the training text information is learned through the training emotion enhancement feature coded representation. This representation characterizes the training text while fusing in the training emotion description feature coded representation, which is highly correlated with emotional tendency, thereby strengthening the weight of the emotion description within the training text, so that the emotion type information of the training text information can be determined accurately and unambiguously through the training emotion enhancement feature coded representation.
Fig. 9 is a diagram illustrating a target loss value determination method according to an exemplary embodiment.
In some embodiments, the target loss value may include a first loss value and a second loss value.
Referring to fig. 9, the above-described target loss value determination method may include the following steps.
Step S902, a first loss value is generated according to the training emotion type label and the training emotion type information.
In some embodiments, the training emotion type label and the training emotion type information may be processed by a cross entropy loss function to determine the first loss value CE(p, y), where y may be the training emotion type label and p may be the probability value corresponding to the training emotion type information.
Wherein, the cross entropy loss formula is as follows:
CE(p, y) = -y·log(p) - (1-y)·log(1-p).
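Purely as a numerical illustration of this formula (NumPy; the values are invented):

    import numpy as np

    def cross_entropy(p: float, y: float) -> float:
        # p: predicted probability, y: label in {0, 1}
        return -y * np.log(p) - (1 - y) * np.log(1 - p)

    # e.g., cross_entropy(0.9, 1.0) ~= 0.105 - confident, correct prediction
    #       cross_entropy(0.1, 1.0) ~= 2.303 - confident, wrong prediction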
Step S904, obtaining an actual position information label corresponding to training emotion description information in the training text information.
In some embodiments, the actual position information label may include a head sequence and a tail sequence as shown in fig. 7a, where the head sequence may mark the position corresponding to the first word of the training emotion description information with the identifier "1" (other identifiers may also be used, which is not limited by the present disclosure), and the tail sequence may mark the position corresponding to the last word of the training emotion description information in the same way.
Step S906, generating a second loss value according to training position information corresponding to training emotion description information in the training text information and an actual position information label corresponding to the training emotion description information in the training text information.
In some embodiments, the actual emotion description feature code representation corresponding to the training emotion description information may be extracted from the training text feature code representation according to the actual position information tag corresponding to the training emotion description information in the training text information; and then generating a second loss value according to the actual emotion description feature coding representation and the training emotion description feature coding representation.
The extraction of the training emotion description information is essentially a classification task that faces a class imbalance problem: the words belonging to the emotion description information are far fewer than the other words, i.e., in the head and tail sequences the label "1" is far rarer than the label "0". Therefore, a focal loss (a loss function designed for class-imbalanced data) is used to calculate the second loss value, reducing the influence of this imbalance.
Wherein, the formula of the focal loss is as follows:
FL(p, y) = -y(1-p)^γ·log(p) - (1-y)·p^γ·log(1-p)
Wherein FL(p, y) represents the second loss value, p may represent the predicted information corresponding to the word at a certain position (e.g., the head position) of the training emotion description, y may represent the actual (label) information at the same position, and γ is a manually set parameter value.
The loss of this part is the average obtained by summing the losses of all the words in the training emotion description information.
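A minimal sketch of this focal loss (NumPy; the per-position predictions and labels are treated as arrays, and γ = 2 is an assumed example value), averaging over positions as described:

    import numpy as np

    def focal_loss(p: np.ndarray, y: np.ndarray, gamma: float = 2.0) -> float:
        # p: predicted probabilities per position, y: 0/1 labels per position
        eps = 1e-9  # avoid log(0)
        fl = (-y * (1 - p) ** gamma * np.log(p + eps)
              - (1 - y) * p ** gamma * np.log(1 - p + eps))
        return float(fl.mean())  # average over all positions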
Step S908 generates a target loss value from the first loss value and the second loss value.
In some embodiments, the first loss value and the second loss value may be weighted summed to determine the target loss value.
Since the present application focuses more on the prediction accuracy of the emotion classification part, the loss weight of the first loss value may, for example, be set to 0.4 and the loss weight of the second loss value to 0.6.
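Combining the two parts with the weights stated above gives a one-line sketch of the target loss (the function and variable names are assumed):

    def target_loss(first_loss: float, second_loss: float,
                    w1: float = 0.4, w2: float = 0.6) -> float:
        # Weighted sum of the classification loss and the position loss.
        return w1 * first_loss + w2 * second_loss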
According to the technical scheme provided by the embodiment, in the training process of the target neural network model, not only the difference between the predicted training emotion type information and the training emotion type label is considered, but also the difference between the predicted training position information and the actual position information of the training emotion description information is considered, so that the trained target neural network can accurately identify the position of the emotion description information from the text, and further the feature code representation corresponding to the emotion description information can be accurately extracted from the feature code representation corresponding to the text.
Complaint text is valuable data: it embodies a user's direct feedback on some process and is an important source of clues for discovering risk in time. For example, in the anti-money-laundering field, complaints about certain organizations or units are of great importance for discovering illegal activities such as money laundering.
Complaint text is characterized by large volume, irregular descriptions, and a relatively high proportion of meaningless complaints. In actual business scenarios, processing large numbers of complaints often demands very high labor cost, so complaint content is usually classified first and invalid information filtered out. However, on the one hand, the amount of remaining effective complaint information is still large and manual processing takes a long time; on the other hand, there is a need for further analysis of the complaint text, such as classification into secondary types or classification of the emotional tendency of the complaint content.
In practical applications, many complaint texts are found to contain content with suspected but unconfirmed emotional tendencies. Such uncertain emotion indicates that the user is not entirely sure about the complained content, which makes these texts very important references for manual review; combined with the complaint type, they can be further sorted into different response priorities. For example, in the anti-money-laundering field, complaint text containing uncertain emotion, such as "I am not sure whether they are money laundering", may be received, which is of great importance for discovering and locating money laundering behavior. Therefore, how to accurately find text containing uncertain emotion among massive texts is a very important subject in the text analysis field.
The present disclosure illustrates a method of emotion classification recognition of text containing uncertain emotion through the information processing network structure shown in fig. 10.
Referring to the network structure shown in fig. 10, the emotion classification recognition method provided by the present disclosure may include the following processes.
(1) Complaint text information is acquired (the complaint text information is called training text information during the training process and predicted text information during the prediction process), and preprocessing operations such as data cleaning and denoising are performed on the complaint text.
The data preprocessing stage mainly cleans the complaint text, including denoising (illegal characters, stop words, etc.), punctuation cleaning, typo correction, and emoji recognition and removal.
For example:
Original text: "I am not too sure whether I was cheated, they made me pay but did not ship, fraud. I am not sure whether this is a money laundering activity."

After preprocessing: "I am not too sure whether I was cheated, they made me pay but did not ship, fraud", "I am not sure whether this is a money laundering activity".
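A toy sketch of such cleaning (Python; the regular expressions and stop-word list are illustrative assumptions, not the actual rules of this application):

    import re

    STOP_WORDS = {"um", "uh"}  # illustrative stop words only

    def clean_complaint(text: str) -> str:
        text = re.sub(r"[\U0001F300-\U0001FAFF]", "", text)  # strip emoji
        text = re.sub(r"[!?]{2,}", "", text)                 # drop repeated punctuation
        words = [w for w in text.split() if w.lower() not in STOP_WORDS]
        return " ".join(words)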
In the training stage, the uncertain emotion description information in the complaint text (called training emotion description information during training and predicted emotion description information during prediction) also needs to be annotated, so as to generate the actual position information label of the uncertain emotion description information within the complaint text. Uncertain emotion description information refers to the short phrases used to express uncertain emotion; their function is similar to that of keywords, but at the phrase level rather than the word level, so they carry richer information. By extracting the uncertain emotion description information as a label, the emotional tendency of a complaint can be understood quickly.
The application can label the uncertain emotion description information by adopting the following method.
The beginning and end of each uncertain emotion description are marked based on the idea of head-to-tail pointer marking.
For example:
Complaint text: "I am not too sure whether I was cheated, they made me pay but did not ship, fraud."

Uncertain emotion description information: "not too sure".

The head-tail pointer annotation is shown in fig. 7a: the position of the first word of the uncertain emotion description information is marked as 1 in the head sequence, and the position corresponding to its last word is marked as 1 in the tail sequence.
(2) Feature extraction processing is performed on the complaint text information to obtain a complaint text feature code representation (referred to as a training text feature code representation in the training process and a prediction text feature code representation in the prediction process).
In this embodiment, feature extraction of the complaint text information may be implemented through an uncertain emotion classification model (which may be the target neural network model), mainly built on a BERT + softmax framework. As shown in fig. 10, the complaint text 1001 may be processed by the BERT model to obtain a high-level semantic representation of the complaint text, i.e., the complaint text feature coded representation 1002.

In processing the complaint text 1001 to obtain this representation, the BERT model performs lexical, syntactic, and bidirectional semantic feature extraction as follows.
1) Word segmentation, masking, and addition of the special tags [CLS]/[SEP] are performed.

2) Coded representations (embeddings) are generated (as shown in fig. 3, these may include the word coded representation token embeddings, the segment coded representation segment embeddings, and the position coded representation position embeddings).

3) Feature learning is performed by a bidirectional Transformer.

In some embodiments, the vectorized token embeddings, segment embeddings, and position embeddings may be input into the Transformer framework shown in fig. 4 to obtain the complaint text feature coded representation 1002.
In some embodiments, the semantic representation corresponding to each token of the BERT output may be represented as the sequence H = h[CLS], h1, h2, h3, …, hn, h[SEP].

Where h[CLS] represents the characterization of the [CLS] tag, i.e., the complaint text feature coded representation 1002, hn represents the characterization of the n-th word in the text, and n is an integer greater than or equal to 1.
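As an illustration of this encoding step only (using the Hugging Face transformers library; the checkpoint name and the input sentence are assumptions, not those of this application):

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
    model = BertModel.from_pretrained("bert-base-chinese")

    inputs = tokenizer("I am not too sure whether I was cheated", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    H = outputs.last_hidden_state  # (1, seq_len, hidden): h[CLS], h1, ..., hn, h[SEP]
    h_cls = H[:, 0]                # sentence-level representation of the complaint text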
(3) The position information of the uncertain emotion description information in the complaint text information (called training position information during training and predicted position information during prediction) is determined according to the complaint text feature coded representation.
In some embodiments, the complaint text feature coded representation 1002 of the complaint text may be classified by a position prediction unit of the target neural network (not shown in fig. 7a) to predict the position information of the uncertain emotion description information.
In some embodiments, the head position of the uncertain emotion description information may be marked by the identifier "1" in the head sequence 1003 shown in fig. 7a, and its tail position by the identifier "1" in the tail sequence 1004.
(4) The uncertain emotion description feature coded representation corresponding to the uncertain emotion description information is extracted from the complaint text feature coded representation according to the position information of the uncertain emotion description information in the complaint text information.
In some embodiments, the uncertain emotion description feature encoded representation of the uncertain emotion description in the complaint text feature encoded representation corresponding to the complaint text may be identified and extracted by a head sequence and a tail sequence as shown in fig. 7 a.
In this embodiment, the head and tail positions of the uncertain emotion description are obtained through head and tail pointer prediction (concretely expressed as the head and tail sequences in fig. 7a), and the representation corresponding to each word (token) of the uncertain emotion description is taken out according to these positions.

For example, given the semantic representation of the complaint text above, H = h[CLS], h1, h2, h3, …, hn, h[SEP], and the description "not too sure" located at the 2nd to the 5th word, h2, h3, h4, h5 are taken out to generate the uncertain emotion description feature coded representation.
(5) Feature fusion is performed on the uncertain emotion description feature coded representation and the complaint text feature coded representation to obtain an emotion enhancement feature coded representation (a training emotion enhancement feature coded representation in the training process and a prediction emotion enhancement feature coded representation in the prediction process).
For example, assume that the overall representation of the original text, i.e., the representation h[CLS] corresponding to the [CLS] tag, is spliced with the representation of the uncertain emotion description to obtain the emotion enhancement feature coded representation, as follows:

H^ = concat(h[CLS], h2, h3, h4, h5).
As shown in fig. 10, the complaint text feature coded representation of the complaint text is denoted 1002, and the uncertain emotion description feature coded representation corresponding to the uncertain emotion description information may be denoted [A, A, A, A]; splicing the uncertain emotion description feature coded representation with the complaint text feature coded representation yields the emotion enhancement feature coded representation 1005 ([A, B, A, B]) shown in fig. 10.
(6) The emotion enhancement feature coded representation is classified to determine the prediction/training emotion type information corresponding to the complaint text information.
In some embodiments, the stitched emotion enhancement feature encoded representation 1005 may be input into a softmax classifier for classification prediction.
In the classification process, softmax maps the outputs of multiple neurons into the (0, 1) interval, yielding a probability distribution over the classes; the class with the highest probability is finally selected as the predicted/training class.
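As a purely numerical illustration of this mapping (NumPy; the logits are invented):

    import numpy as np

    def softmax(z: np.ndarray) -> np.ndarray:
        e = np.exp(z - z.max())  # subtract the max for numerical stability
        return e / e.sum()

    probs = softmax(np.array([2.0, 0.5, -1.0]))
    # probs ~= [0.79, 0.18, 0.04]; class 0, with the highest probability,
    # is selected as the prediction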
In the training process, a first loss value may be generated according to the training emotion type label and the training emotion type information; the actual position information label corresponding to the training emotion description information in the training text information may be obtained, and a second loss value generated according to the training position information corresponding to the training emotion description information and that actual position information label; a target loss value is then generated from the first loss value and the second loss value, and the uncertain emotion classification model is finally trained according to the target loss value.
Therefore, the present application designs a corresponding head-tail pointer annotation scheme (implemented via sequences) based on the BERT pre-training model: the input text is first given a high-level semantic representation; an information extraction task is then added to extract the descriptions of uncertain emotion in the complaint content; these descriptions are fused with the semantic representation of the original complaint text; and the result is finally input into a classifier to obtain the uncertain emotion type.
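Purely as a sketch of this overall structure (PyTorch; all module names, layer sizes, the checkpoint, and the soft span pooling are assumptions for illustration, not the exact architecture of this application):

    import torch
    import torch.nn as nn
    from transformers import BertModel

    class UncertainEmotionClassifier(nn.Module):
        def __init__(self, num_classes: int, bert_name: str = "bert-base-chinese"):
            super().__init__()
            self.bert = BertModel.from_pretrained(bert_name)
            hidden = self.bert.config.hidden_size
            self.head_pointer = nn.Linear(hidden, 1)  # scores the head sequence
            self.tail_pointer = nn.Linear(hidden, 1)  # scores the tail sequence
            self.classifier = nn.Linear(hidden * 2, num_classes)

        def forward(self, input_ids, attention_mask):
            H = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
            head_logits = self.head_pointer(H).squeeze(-1)  # (batch, seq_len)
            tail_logits = self.tail_pointer(H).squeeze(-1)
            # Soft span representation: weight each token by its head/tail scores,
            # a differentiable stand-in for the hard extraction used at inference.
            w = torch.sigmoid(head_logits) + torch.sigmoid(tail_logits)
            span = (H * w.unsqueeze(-1)).sum(1) / w.sum(1, keepdim=True).clamp(min=1e-9)
            fused = torch.cat([H[:, 0], span], dim=-1)  # h[CLS] spliced with the span
            return head_logits, tail_logits, self.classifier(fused)

    # The head/tail logits would be trained against the actual position labels
    # (e.g., with a focal loss) and the classifier output against the emotion type label.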
In summary: (1) By adding the information extraction task, the technical solution automatically identifies emotion description keywords, avoids the problems of inaccurate word segmentation and the inability to discover new words automatically, and reduces the maintenance cost of an emotion dictionary. (2) By fine-tuning a pre-training model on the downstream task, no fixed word-vector dictionary needs to be queried for text representation; the representation adjusts dynamically with context semantics. (3) Increasing the weight of the identified emotion keywords and feeding it into the subsequent classification model training is equivalent to implicitly introducing prior knowledge, which improves the accuracy of the emotion classification task. (4) The whole model is trained end to end, so unlike emotion dictionary analysis methods, no subsequent mapping of emotion keywords is needed. (5) Research on uncertain emotion recognition is currently scarce, and corresponding corpora and dictionaries are lacking, so this solution can provide a new approach to uncertain emotion recognition.
According to this technical solution: (1) Unstructured complaint text can be organized into structured labels, helping reviewers quickly judge the emotional tendency of complaint text and greatly saving manual review time. (2) High-level semantic knowledge is learned actively by the deep learning model, so no large-scale keyword or feature library needs to be maintained. (3) Compared with keyword triggering, the accuracy and coverage of the model-based solution are significantly improved. (4) Adding the uncertain emotion description extraction task is equivalent to implicitly introducing prior knowledge and improves the overall effect. (5) Research on uncertain emotion recognition is currently scarce, and corresponding corpora and dictionaries are lacking, so this solution can provide a new approach to uncertain emotion recognition.
Fig. 11 is a block diagram of an information processing apparatus according to an exemplary embodiment. Referring to fig. 11, an information processing apparatus 1100 provided by an embodiment of the present disclosure may include: a predicted text information acquisition module 1101, a predicted text feature code representation acquisition module 1102, a predicted position information acquisition module 1103, an uncertain emotion description feature code representation acquisition module 1104, and a predicted emotion classification information determination module 1105.
Wherein the predicted-text-information obtaining module 1101 may be configured to obtain predicted text information; the predicted text feature code representation acquisition module 1102 may be configured to perform feature extraction processing on the predicted text information to obtain a predicted text feature code representation; the predicted position information obtaining module 1103 may be configured to determine predicted position information corresponding to the predicted emotion description information in the predicted text information according to the predicted text feature encoded representation; the predicted emotion description feature encoding representation acquisition module 1104 may be configured to extract, from the predicted text feature encoding representation, a predicted emotion description feature encoding representation corresponding to the predicted emotion description information according to the predicted position information; predictive emotion type information determination module 1105 may be configured to determine predictive emotion type information corresponding to the predicted text information based on the predictive emotion description feature encoded representation.
In some embodiments, predictive emotion classification information determination module 1105 may include: the prediction emotion enhancement feature code representation acquisition sub-module and the prediction emotion enhancement feature code representation classification sub-module.
The prediction emotion enhancement feature coding representation acquisition sub-module can be used for carrying out feature fusion on the prediction emotion description feature coding representation and the prediction text feature coding representation so as to obtain the prediction emotion enhancement feature coding representation; the prediction emotion enhancement feature encoded representation classification sub-module may be configured to classify the prediction emotion enhancement feature encoded representation to determine prediction emotion classification information corresponding to the prediction text information.
In some embodiments, the information processing apparatus 1100 may further include: a prediction emotion description information absence determining module and a prediction text feature coding representation classification module.
The prediction emotion description information absence determining module can be used for determining that prediction emotion description information does not exist in the prediction text information according to the prediction text feature coding representation; the predictive text feature coded representation classification module may be configured to classify the predictive text feature coded representation to determine predictive emotion classification information corresponding to the predictive text information.
In some embodiments, the information processing method is implemented by a target neural network model; the information processing apparatus 1100 may further include: the training emotion type label acquisition module, the training text feature code representation determination module, the training position information determination module, the training emotion description feature code representation extraction module, the training emotion type information determination module, the target loss value determination module and the neural network model training module.
The training emotion type label acquisition module can be used for acquiring training text information and training emotion type labels corresponding to the training text information; the training text feature code representation determining module can be used for carrying out feature extraction processing on training text information so as to obtain training text feature code representation; the training position information determining module can be used for representing training position information corresponding to training emotion description information in predicted training text information according to the training text characteristic code; the training emotion description feature code representation extraction module can be used for extracting training emotion description feature code representations corresponding to training emotion description information from training text feature code representations according to training position information; the training emotion type information determination module can be used for representing training emotion type information corresponding to predicted training text information according to training emotion description characteristic codes; the target loss value determining module can be used for determining a target loss value according to the training emotion type label, the training emotion type information and the training position information; the neural network model training module may be configured to train the target neural network model based on the target loss value.
In some embodiments, the training emotion classification information determination module may include: the training emotion enhancement feature code representation determination submodule and the training emotion enhancement feature code representation classification processing submodule.
The training emotion enhancement feature code representation determination submodule can be used for carrying out feature fusion on the training emotion description feature code representation and the training text feature code representation so as to obtain the training emotion enhancement feature code representation; the training emotion enhancement feature encoded representation classification processing sub-module may be configured to classify the training emotion enhancement feature encoded representation to determine training emotion category information corresponding to the training text information.
In some embodiments, the target loss value includes a first loss value and a second loss value; wherein the target loss value determination module may include: the system comprises a first loss value determining sub-module, an actual position information obtaining sub-module, a second loss value determining sub-module and a target loss value generating sub-module.
The first loss value determining submodule can be used for generating a first loss value according to the training emotion type label and training emotion type information; the actual position information acquisition sub-module can be used for acquiring an actual position information label corresponding to training emotion description information in training text information; the second loss value determining sub-module may be configured to generate a second loss value according to training position information corresponding to training emotion description information in the training text information and an actual position information tag corresponding to the training emotion description information in the training text information; the target loss value generation sub-module may be configured to generate a target loss value from the first loss value and the second loss value.
In some embodiments, the second loss value determination submodule may include: the actual emotion description feature code represents a determining unit and a second loss value generating subunit.
The actual emotion description feature code representation determining unit may be configured to extract an actual emotion description feature code representation corresponding to training emotion description information from the training text feature code representation according to an actual position information tag corresponding to training emotion description information in the training text information; the second penalty value generating subunit may be configured to generate the second penalty value based on the actual emotion description feature encoded representation and the training emotion description feature encoded representation.
In some embodiments, the actual position information label includes a head sequence for identifying the position corresponding to the first word of the training emotion description information and a tail sequence for identifying the position corresponding to the last word of the training emotion description information.
Since the functions of the apparatus 1100 are described in detail in the corresponding method embodiments, the disclosure is not repeated herein.
The modules and/or sub-modules and/or units involved in the embodiments of the present application may be implemented in software or in hardware. The described modules and/or sub-modules and/or units may also be provided in a processor. Wherein the names of the modules and/or sub-modules and/or units do not in some cases constitute a limitation of the module and/or sub-modules and/or units themselves.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Furthermore, the above-described figures are only schematic illustrations of processes included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
Fig. 12 shows a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure. It should be noted that the electronic device 1200 shown in fig. 12 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present disclosure.
As shown in fig. 12, the electronic apparatus 1200 includes a Central Processing Unit (CPU) 1201, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1202 or a program loaded from a storage section 1208 into a Random Access Memory (RAM) 1203. In the RAM 1203, various programs and data required for the operation of the electronic apparatus 1200 are also stored. The CPU 1201, ROM 1202, and RAM 1203 are connected to each other through a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output section 1207 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 1208 including a hard disk or the like; and a communication section 1209 including a network interface card such as a LAN card or a modem. The communication section 1209 performs communication processing via a network such as the Internet. The drive 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1210 as needed, so that a computer program read therefrom is installed into the storage section 1208 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network via the communication portion 1209, and/or installed from the removable media 1211. The above-described functions defined in the system of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 1201.
It should be noted that the computer readable storage medium shown in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable storage medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
As another aspect, the present application also provides a computer-readable storage medium that may be contained in the apparatus described in the above embodiments; or may be present alone without being fitted into the device. The computer-readable storage medium carries one or more programs which, when executed by a device, cause the device to perform functions including: acquiring predicted text information; performing feature extraction processing on the predicted text information to obtain a predicted text feature coding representation; determining prediction position information corresponding to the prediction emotion description information in the prediction text information according to the prediction text feature coding representation; extracting a predicted emotion description characteristic coding representation corresponding to the predicted emotion description information from the predicted text characteristic coding representation according to the predicted position information; and determining the predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the various alternative implementations of the above-described embodiments.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, aspects of the disclosed embodiments may be embodied in a software product, which may be stored on a non-volatile storage medium (which may be a CD-ROM, a U-disk, a mobile hard disk, etc.), comprising instructions for causing a computing device (which may be a personal computer, a server, a mobile terminal, or a smart device, etc.) to perform a method according to embodiments of the disclosure, e.g., one or more of the steps shown in fig. 5, 6, 8, or 9.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any adaptations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the disclosure is not to be limited to the details of construction, the manner of drawing, or the manner of implementation, which has been set forth herein, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (19)

1. An information processing method, characterized by comprising:
Acquiring predicted text information;
performing feature extraction processing on the predicted text information to obtain a predicted text feature coding representation;
Processing the predicted text feature coding representation through a position prediction unit of a target neural network model to determine predicted position information of predicted emotion description information in the predicted text information, wherein the position prediction unit comprises two classifiers, one classifier is used for predicting a head sequence in the predicted position information, the other classifier is used for predicting a tail sequence in the predicted position information, the head sequence is used for representing the position of a first word of the predicted emotion description information in the predicted text information, and the tail sequence is used for representing the position of a last word of the emotion description information in the predicted text information;
Extracting a predicted emotion description feature code representation corresponding to the predicted emotion description information from the predicted text feature code representation according to the head sequence and the tail sequence;
and determining the predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation.
2. The method of claim 1, wherein determining predicted emotion classification information corresponding to the predicted text information from the predicted emotion description feature encoded representation comprises:
performing feature fusion on the predicted emotion description feature coded representation and the predicted text feature coded representation to obtain a predicted emotion enhancement feature coded representation;
and classifying the predicted emotion enhancement feature coded representation to determine predicted emotion type information corresponding to the predicted text information.
3. The method according to claim 1, wherein the method further comprises:
Determining that the predicted emotion description information does not exist in the predicted text information according to the predicted text feature coding representation;
and classifying the predictive text feature coded representation to determine predictive emotion type information corresponding to the predictive text information.
4. The method of claim 1, wherein the information processing method is implemented by a target neural network model; wherein the method further comprises:
acquiring training text information and training emotion type labels corresponding to the training text information;
Performing feature extraction processing on the training text information to obtain training text feature coding representation;
Predicting training position information corresponding to training emotion description information in the training text information according to the training text feature code representation;
extracting training emotion description characteristic coding representation corresponding to the training emotion description information from the training text characteristic coding representation according to the training position information;
Predicting training emotion type information corresponding to the training text information according to the training emotion description characteristic coding representation;
determining a target loss value according to the training emotion type label, the training emotion type information and the training position information;
And training the target neural network model according to the target loss value.
5. The method of claim 4, wherein predicting training emotion classification information corresponding to the training text information based on the training emotion description feature encoded representation comprises:
performing feature fusion on the training emotion description feature coding representation and the training text feature coding representation to obtain a training emotion enhancement feature coding representation;
and classifying the training emotion enhancement feature coded representation to determine training emotion type information corresponding to the training text information.
6. The method of claim 4, wherein the target loss value comprises a first loss value and a second loss value; wherein determining a target loss value according to the training emotion type tag, the training emotion type information, and the training position information includes:
Generating the first loss value according to the training emotion type label and the training emotion type information;
acquiring an actual position information label corresponding to training emotion description information in the training text information;
generating the second loss value according to training position information corresponding to training emotion description information in the training text information and an actual position information label corresponding to the training emotion description information in the training text information;
And generating the target loss value according to the first loss value and the second loss value.
7. The method of claim 6, wherein generating the second penalty value based on training position information corresponding to training emotion description information in the training text information and an actual position information tag corresponding to training emotion description information in the training text information comprises:
extracting an actual emotion description feature code representation corresponding to the training emotion description information from the training text feature code representation according to an actual position information label corresponding to the training emotion description information in the training text information;
and generating the second loss value according to the actual emotion description feature coding representation and the training emotion description feature coding representation.
8. The method of claim 6, wherein the actual location information tag includes a head sequence for identifying a location corresponding to a first word of the training emotion description information and a tail sequence for identifying a location corresponding to a last word of the training emotion description information.
9. An information processing apparatus, characterized by comprising:
the predicted text information acquisition module is used for acquiring predicted text information;
the predictive text feature code representation acquisition module is used for carrying out feature extraction processing on the predictive text information so as to obtain a predictive text feature code representation;
A predicted position information obtaining module, configured to process the predicted text feature code representation through a position prediction unit of a target neural network model, so as to determine predicted position information corresponding to predicted emotion description information in the predicted text information, where the position prediction unit includes two classifiers, one classifier is used to predict a first sequence in the predicted position information, the other classifier is used to predict a tail sequence in the predicted position information, the first sequence is used to represent a position of a first word of the predicted emotion description information in the predicted text information, and the tail sequence is used to represent a position of a last word of the emotion description information in the predicted text information;
the predicted emotion description feature coding representation acquisition module is used for extracting predicted emotion description feature coding representations corresponding to the predicted emotion description information from the predicted text feature coding representations according to the first sequence and the tail sequence;
And the predicted emotion type information determining module is used for determining predicted emotion type information corresponding to the predicted text information according to the predicted emotion description characteristic coding representation.
10. The apparatus of claim 9, wherein the predictive emotion classification information determination module comprises:
The prediction emotion enhancement feature coding representation acquisition submodule is used for carrying out feature fusion on the prediction emotion description feature coding representation and the prediction text feature coding representation so as to obtain a prediction emotion enhancement feature coding representation;
And the prediction emotion enhancement feature coding representation classification sub-module is used for classifying the prediction emotion enhancement feature coding representation so as to determine the prediction emotion type information corresponding to the prediction text information.
11. The apparatus of claim 9, wherein the information processing apparatus further comprises:
a predicted emotion description information absence determining module, which is used for determining, according to the predicted text feature coding representation, that no predicted emotion description information exists in the predicted text information; and
a predicted text feature coding representation classification module, which is used for classifying the predicted text feature coding representation to determine the predicted emotion type information corresponding to the predicted text information.
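(Editor's illustration, not part of the claims: the fallback branch of claim 11, assuming span absence is signalled upstream — the claim does not say how absence is detected — and reusing the pooling convention of the earlier sketches.)

```python
# Illustrative fallback for claim 11: when no emotion description span is
# found, the pooled sentence encoding itself is classified. The helper
# names are assumptions.
import torch.nn as nn

def classify_text_only(text_encoding, text_classifier: nn.Linear):
    sentence_vec = text_encoding.mean(dim=0)  # pool the whole sentence
    return text_classifier(sentence_vec)      # emotion-type logits
```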
12. The apparatus of claim 9, wherein the information processing apparatus further comprises:
a training emotion type label acquisition module, which is used for acquiring training text information and a training emotion type label corresponding to the training text information;
a training text feature coding representation determining module, which is used for performing feature extraction processing on the training text information to obtain a training text feature coding representation;
a training position information determining module, which is used for predicting training position information corresponding to training emotion description information in the training text information according to the training text feature coding representation;
a training emotion description feature coding representation extraction module, which is used for extracting a training emotion description feature coding representation corresponding to the training emotion description information from the training text feature coding representation according to the training position information;
a training emotion type information determining module, which is used for predicting training emotion type information corresponding to the training text information according to the training emotion description feature coding representation;
a target loss value determining module, which is used for determining a target loss value according to the training emotion type label, the training emotion type information, and the training position information; and
a neural network model training module, which is used for training the target neural network model according to the target loss value.
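(Editor's illustration, not part of the claims: an end-to-end training step matching the pipeline of claim 12. The 1:1 loss weighting, the use of binary cross-entropy for the head/tail sequences, and all tensor shapes are assumptions, not taken from the patent.)

```python
# Illustrative training step for the pipeline in claim 12.
import torch
import torch.nn.functional as F

def training_step(encoder, position_unit, classifier, optimizer, batch):
    # Feature extraction: (batch, seq_len, hidden_size).
    text_encoding = encoder(batch["input_ids"])

    # Predict head/tail sequences for the emotion description span.
    head_logits, tail_logits = position_unit(text_encoding)

    # Second loss value: head/tail predictions vs. the 0/1 label sequences
    # of claim 8 (labels assumed to be float tensors of shape (batch, seq_len)).
    position_loss = (
        F.binary_cross_entropy_with_logits(head_logits, batch["head_labels"])
        + F.binary_cross_entropy_with_logits(tail_logits, batch["tail_labels"])
    )

    # Extract span encodings at the predicted positions and pool them;
    # fall back to the whole sentence when the span is degenerate.
    heads = head_logits.argmax(dim=-1)
    tails = tail_logits.argmax(dim=-1)
    span_vecs = torch.stack([
        enc[int(h) : int(t) + 1].mean(dim=0) if t >= h else enc.mean(dim=0)
        for enc, h, t in zip(text_encoding, heads, tails)
    ])
    logits = classifier(span_vecs)  # classifier assumed to be nn.Linear

    # First loss value: predicted emotion types vs. the training labels.
    class_loss = F.cross_entropy(logits, batch["emotion_labels"])

    # Target loss value = first loss value + second loss value (assumed weights).
    target_loss = class_loss + position_loss
    optimizer.zero_grad()
    target_loss.backward()
    optimizer.step()
    return float(target_loss)
```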
13. The apparatus of claim 12, wherein the training emotion type information determining module comprises:
a training emotion enhancement feature coding representation determining submodule, which is used for performing feature fusion on the training emotion description feature coding representation and the training text feature coding representation to obtain a training emotion enhancement feature coding representation; and
a training emotion enhancement feature coding representation classification submodule, which is used for classifying the training emotion enhancement feature coding representation to determine the training emotion type information corresponding to the training text information.
14. The apparatus of claim 12, wherein the target loss value comprises a first loss value and a second loss value, and the target loss value determining module comprises:
a first loss value determining submodule, which is used for generating the first loss value according to the training emotion type label and the training emotion type information;
an actual position information acquisition submodule, which is used for acquiring an actual position information label corresponding to the training emotion description information in the training text information;
a second loss value determining submodule, which is used for generating the second loss value according to the training position information corresponding to the training emotion description information in the training text information and the actual position information label corresponding to the training emotion description information in the training text information; and
a target loss value generation submodule, which is used for generating the target loss value according to the first loss value and the second loss value.
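(Editor's illustration, not part of the claims: claim 14 only states that the target loss value is generated from the two loss values; a weighted sum with an assumed coefficient is the simplest reading, and the training-step sketch after claim 12 used equal weights.)

```python
# Assumed combination rule for claim 14; the weight `lam` is not from the
# patent -- the claim leaves the combination unspecified.
def target_loss_value(first_loss, second_loss, lam: float = 1.0):
    return first_loss + lam * second_loss
```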
15. The apparatus of claim 14, wherein the second loss value determining submodule comprises:
an actual emotion description feature coding representation determining unit, which is used for extracting an actual emotion description feature coding representation corresponding to the training emotion description information from the training text feature coding representation according to the actual position information label corresponding to the training emotion description information in the training text information; and
a second loss value generation subunit, which is used for generating the second loss value according to the actual emotion description feature coding representation and the training emotion description feature coding representation.
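(Editor's illustration, not part of the claims: under claim 15 the second loss compares the span encoding extracted at the labelled positions with the one extracted at the predicted positions. Mean pooling and mean-squared error are assumed choices; the claim does not name a distance measure.)

```python
# Sketch of claim 15's representation-level second loss; MSE is an
# assumed distance measure.
import torch.nn.functional as F

def second_loss_value(actual_span_encoding, training_span_encoding):
    # Pool to fixed-size vectors so spans of different lengths compare.
    actual_vec = actual_span_encoding.mean(dim=0)
    training_vec = training_span_encoding.mean(dim=0)
    return F.mse_loss(training_vec, actual_vec)
```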
16. The apparatus of claim 14, wherein the actual position information label comprises a head sequence identifying the position of the first word of the training emotion description information and a tail sequence identifying the position of the last word of the training emotion description information.
17. An electronic device, comprising:
a memory; and
a processor coupled to the memory, the processor being configured to perform the information processing method of any one of claims 1 to 8 based on instructions stored in the memory.
18. A computer-readable storage medium having a program stored thereon which, when executed by a processor, implements the information processing method of any one of claims 1 to 8.
19. A computer program product comprising computer instructions stored in a computer-readable storage medium, characterized in that the computer instructions, when executed by a processor, implement the information processing method of any one of claims 1 to 8.
CN202111299940.1A 2021-11-04 2021-11-04 Information processing method, apparatus, electronic device, storage medium, and program product Active CN116089602B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111299940.1A CN116089602B (en) 2021-11-04 2021-11-04 Information processing method, apparatus, electronic device, storage medium, and program product

Publications (2)

Publication Number Publication Date
CN116089602A CN116089602A (en) 2023-05-09
CN116089602B (en) 2024-05-03

Family

ID=86204983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111299940.1A Active CN116089602B (en) 2021-11-04 2021-11-04 Information processing method, apparatus, electronic device, storage medium, and program product

Country Status (1)

Country Link
CN (1) CN116089602B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2521970A (en) * 2012-09-28 2015-07-08 HireIQ Solutions Inc System and method of scoring candidate audio responses for a hiring decision
US10318566B2 (en) * 2014-09-24 2019-06-11 International Business Machines Corporation Perspective data analysis and management
CN111274815B (en) * 2020-01-15 2024-04-12 北京百度网讯科技有限公司 Method and device for mining entity focus point in text

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557463A (en) * 2016-10-31 2017-04-05 东软集团股份有限公司 Sentiment analysis method and device
CN111339255A (en) * 2020-02-26 2020-06-26 腾讯科技(深圳)有限公司 Target emotion analysis method, model training method, medium, and device
CN111339305A (en) * 2020-03-20 2020-06-26 北京中科模识科技有限公司 Text classification method and device, electronic equipment and storage medium
CN111539212A (en) * 2020-04-13 2020-08-14 腾讯科技(武汉)有限公司 Text information processing method and device, storage medium and electronic equipment
CN111832292A (en) * 2020-06-03 2020-10-27 北京百度网讯科技有限公司 Text recognition processing method and device, electronic equipment and storage medium
CN111897933A (en) * 2020-07-27 2020-11-06 腾讯科技(深圳)有限公司 Emotional dialogue generation method and device and emotional dialogue model training method and device
CN111967261A (en) * 2020-10-20 2020-11-20 平安科技(深圳)有限公司 Cancer stage information processing method, device and storage medium
CN112364127A (en) * 2020-10-30 2021-02-12 重庆大学 Short document emotional cause pair extraction method, system and storage medium
CN112579778A (en) * 2020-12-23 2021-03-30 重庆邮电大学 Aspect-level emotion classification method based on multi-level feature attention
CN112784041A (en) * 2021-01-06 2021-05-11 河海大学 Chinese short text emotion orientation analysis method
CN112860901A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Emotion analysis method and device integrating emotion dictionaries
CN113239799A (en) * 2021-05-12 2021-08-10 北京沃东天骏信息技术有限公司 Training method, recognition method, device, electronic equipment and readable storage medium
CN113312530A (en) * 2021-06-09 2021-08-27 哈尔滨工业大学 Multi-mode emotion classification method taking text as core

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Text-based emotion detection:Advances,challenges, and opportunities";Chen Wenyu;《Engineering Reports》;20200528;第第2卷卷(第第7期期);1-24页 *
代建华 ; 邓育彬 ; .基于情感膨胀门控CNN的情感-原因对提取.数据分析与知识发现.(08),第102-110页. *
朱烨、陈世平."融合卷积神经网络和注意力的评论文本情感分析".《小型微型计算机系统》.第41卷(第03期),第551-557页. *

Similar Documents

Publication Publication Date Title
CN111444709B (en) Text classification method, device, storage medium and equipment
CN107679039B (en) Method and device for determining statement intention
CN111191428B (en) Comment information processing method and device, computer equipment and medium
CN111368548A (en) Semantic recognition method and device, electronic equipment and computer-readable storage medium
CN112528677B (en) Training method and device of semantic vector extraction model and electronic equipment
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN113128227A (en) Entity extraction method and device
CN112988979A (en) Entity identification method, entity identification device, computer readable medium and electronic equipment
CN113761190A (en) Text recognition method and device, computer readable medium and electronic equipment
CN111666500A (en) Training method of text classification model and related equipment
CN114519356B (en) Target word detection method and device, electronic equipment and storage medium
CN111159409A (en) Text classification method, device, equipment and medium based on artificial intelligence
CN113723105A (en) Training method, device and equipment of semantic feature extraction model and storage medium
CN113392179A (en) Text labeling method and device, electronic equipment and storage medium
CN112528654A (en) Natural language processing method and device and electronic equipment
CN116578688A (en) Text processing method, device, equipment and storage medium based on multiple rounds of questions and answers
Li et al. Intention understanding in human–robot interaction based on visual-NLP semantics
CN112528658A (en) Hierarchical classification method and device, electronic equipment and storage medium
CN115129862A (en) Statement entity processing method and device, computer equipment and storage medium
CN116662522B (en) Question answer recommendation method, storage medium and electronic equipment
CN117351336A (en) Image auditing method and related equipment
CN116089602B (en) Information processing method, apparatus, electronic device, storage medium, and program product
CN113657092A (en) Method, apparatus, device and medium for identifying label
CN113254635B (en) Data processing method, device and storage medium
CN116226478B (en) Information processing method, model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant