WO2022267454A1 - Method and apparatus for analyzing text, device and storage medium - Google Patents


Info

Publication number
WO2022267454A1
Authority
WO
WIPO (PCT)
Prior art keywords: analyzed, text, entities, entity, attribute
Prior art date: 2021-06-24
Application number: PCT/CN2022/071433
Other languages: French (fr), Chinese (zh)
Inventors: 陈凯, 徐冰, 汪伟
Original Assignee: 平安科技(深圳)有限公司
Priority date: 2021-06-24
Priority claimed from CN202110705319.4A (CN113420122B)
Application filed by 平安科技(深圳)有限公司
Publication of WO2022267454A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33: Querying
    • G06F 16/3331: Query processing
    • G06F 16/334: Query execution
    • G06F 16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/205: Parsing
    • G06F 40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars

Definitions

  • the present application belongs to the technical field of artificial intelligence, and in particular relates to a method, device, equipment and storage medium for analyzing text.
  • Sentiment analysis holds great promise in natural language processing applications. For example, users' satisfaction with products, companies, services, etc. can be evaluated through the comments posted by users on Internet platforms. Therefore, sentiment analysis is particularly important in natural language processing.
  • The inventor has realized that in existing sentiment analysis, the extracted analysis points are not comprehensive, which leads to inaccurate sentiment analysis results.
  • One of the purposes of the embodiments of the present application is to provide a method, apparatus, device and storage medium for analyzing text, so as to solve the problem in existing sentiment analysis that the extracted analysis points are not comprehensive, which leads to inaccurate sentiment analysis results.
  • In a first aspect, an embodiment of the present application provides a method for analyzing text, the method including: acquiring a text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities; identifying the at least two entities in the text to be analyzed; extracting attribute information from the text to be analyzed through a pre-trained attribute extraction model; and analyzing the at least two entities, the attribute information, and the text to be analyzed with a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • the embodiment of the present application provides a device for analyzing text, wherein the device includes:
  • an acquisition unit configured to acquire the text to be analyzed
  • An identification unit configured to identify at least two entities in the text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities;
  • An extraction unit configured to extract attribute information in the text to be analyzed through a pre-trained attribute extraction model
  • the analysis unit is configured to analyze the at least two entities, the attribute information, and the text to be analyzed by using a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • In a third aspect, an embodiment of the present application provides a device for analyzing text, including a memory, a processor, and a computer program stored in the memory and operable on the processor, where the processor, when executing the computer program, implements the following: acquiring a text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities; identifying the at least two entities in the text to be analyzed; extracting attribute information from the text to be analyzed through a pre-trained attribute extraction model; and analyzing the at least two entities, the attribute information, and the text to be analyzed with a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • the embodiment of the present application provides a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • The computer-readable storage medium stores a computer program that, when executed by a processor, implements the following: acquiring a text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities; identifying the at least two entities in the text to be analyzed; extracting attribute information from the text to be analyzed through a pre-trained attribute extraction model; and analyzing the at least two entities, the attribute information, and the text to be analyzed with a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • Compared with the prior art, the embodiments of the present application have the following beneficial effects: the text to be analyzed is obtained; at least two entities in the text to be analyzed are identified, where the text to be analyzed includes a comment sentence containing the at least two entities; the attribute information in the text to be analyzed is extracted through a pre-trained attribute extraction model; and the at least two entities, the attribute information, and the text to be analyzed are analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • In the above scheme, the entities in the text to be analyzed are identified and the attribute information in the text to be analyzed is extracted through the attribute extraction model; the entities, the attribute information, and the text to be analyzed are then analyzed through the sentiment analysis model. Because an attribute factor is added to the comparison process, the simple "entity-advantages/disadvantages" comparison of the prior art is converted into an "entity-attribute information-advantages/disadvantages" comparison; the extracted analysis points are comprehensive and accurate, so the final entity comparison results are more accurate.
  • Fig. 1 is a schematic flowchart of a method for analyzing text provided by an exemplary embodiment of the present application
  • FIG. 2 is a specific flowchart of step S102 of the method for analyzing text shown in an exemplary embodiment of the present application;
  • FIG. 3 is a schematic flowchart of a method for analyzing text provided by another embodiment of the present application.
  • Fig. 4 is a specific flowchart of step S204 of the method for analyzing text shown in an exemplary embodiment of the present application;
  • Fig. 5 is a schematic flowchart of a method for analyzing text shown in an exemplary embodiment of the present application
  • Fig. 6 is a schematic diagram of a device for analyzing text provided by an embodiment of the present application.
  • Fig. 7 is a schematic diagram of a device for analyzing text provided by another embodiment of the present application.
  • The term "and/or" describes an association relationship and indicates that three relationships may exist; for example, "A and/or B" means: A exists alone, A and B exist simultaneously, or B exists alone.
  • "Plural" refers to two or more.
  • The terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined with "first" or "second" may explicitly or implicitly include one or more of such features. In the description of this embodiment, unless otherwise specified, "plurality" means two or more.
  • Sentiment analysis holds great promise in natural language processing applications. For example, users' satisfaction with products, companies, services, etc. can be evaluated through the comments posted by users on Internet platforms. Therefore, sentiment analysis is particularly important in natural language processing.
  • To this end, the present application provides a method for analyzing text: the text to be analyzed is obtained; at least two entities in the text to be analyzed are identified, where the text to be analyzed includes a comment sentence containing the at least two entities; attribute information in the text to be analyzed is extracted through a pre-trained attribute extraction model; and the at least two entities, the attribute information, and the text to be analyzed are analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • In this scheme, the entities in the text to be analyzed are identified and the attribute information in the text to be analyzed is extracted through the attribute extraction model; the entities, the attribute information, and the text to be analyzed are then analyzed through the sentiment analysis model. Because an attribute factor is added to the comparison process, the simple "entity-advantages/disadvantages" comparison of the prior art is converted into an "entity-attribute information-advantages/disadvantages" comparison; the extracted analysis points are comprehensive and accurate, so the final entity comparison results are more accurate.
  • FIG. 1 is a schematic flowchart of a method for analyzing text provided by an exemplary embodiment of the present application.
  • The execution subject of the method for analyzing text provided in this application is a device for analyzing text, which includes but is not limited to terminals such as smartphones, tablet computers, computers, personal digital assistants (Personal Digital Assistant, PDA), and desktop computers, and may also include various types of servers.
  • In the following, a terminal is used as an example for illustration.
  • The method for analyzing text shown in FIG. 1 may include steps S101 to S104, specifically as follows:
  • S101 Obtain the text to be analyzed.
  • The text to be analyzed refers to text on whose entities sentiment analysis needs to be performed. Since the sentiment analysis in this embodiment refers to a comparison of entities, the comparison is meaningful only when there are at least two entities, so the text to be analyzed includes a comment sentence containing at least two entities. There is no limit on the length or number of comment sentences. For example, a text to be analyzed may be "the market value of company A exceeds that of company B", or "the market value of company A exceeds that of company B, but the reputation of company B exceeds that of company A", and so on.
  • the text to be analyzed may also be an article, a paragraph of text, etc. composed of comment sentences containing at least two entities. The description here is only for illustration and not for limitation.
  • In some embodiments, when the terminal detects an analysis instruction, it acquires the text to be analyzed.
  • The analysis instruction may be triggered by a user, for example, by the user clicking an analysis option in the terminal.
  • The text to be analyzed may be text uploaded by the user to the terminal, or the terminal may obtain, according to a file identifier contained in the analysis instruction, the text file corresponding to that identifier as the text to be analyzed.
  • S102 Identify at least two entities in the text to be analyzed.
  • Entities refer to things that exist objectively and can be distinguished from each other. All entities in the text to be analyzed can be identified through a pre-trained named entity recognition model.
  • S103 Extract the attribute information in the text to be analyzed through the pre-trained attribute extraction model.
  • Specifically, word segmentation processing is performed on the text to be analyzed to obtain multiple word segments.
  • Word segmentation processing refers to dividing a continuous word sequence in the text to be analyzed into multiple word sequences, that is, multiple word segmentations, through a word segmentation algorithm.
  • the attribute extraction model may include a word segmentation algorithm, through which the word segmentation process is performed on the text to be analyzed to obtain multiple word segments corresponding to the text to be analyzed. That is, the content in the text to be analyzed is divided into multiple word segmentations through a word segmentation algorithm.
  • A word segment may be a word or a single character.
  • multiple word segmentation methods corresponding to the text to be analyzed can be determined according to the word segmentation algorithm, and the most suitable word segmentation method is selected to perform word segmentation on the text to be analyzed to obtain multiple word segmentations corresponding to the text to be analyzed. For example, word segmentation processing is performed on "the market value of company A exceeds that of company B" to obtain "company A/market value/exceeded/company B".
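  • As an illustration only (not part of the original disclosure), the word segmentation step can be sketched with an off-the-shelf Chinese segmenter; the choice of the jieba library and the sample sentence below are assumptions made for demonstration, since the patent does not name a specific word segmentation algorithm.

```python
# Minimal word-segmentation sketch (assumption: the jieba segmenter is used).
import jieba

text_to_analyze = "A公司市值超过B公司"  # "the market value of company A exceeds that of company B"

# Divide the continuous character sequence into multiple word segments.
segments = list(jieba.cut(text_to_analyze))
print("/".join(segments))  # e.g. "A公司/市值/超过/B公司"
```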
  • The pre-trained attribute extraction model includes a BERT network, a Dense network, and a CRF network.
  • The BERT network is used to convert the multiple word segments corresponding to the text to be analyzed into a word vector for each word segment;
  • the Dense network is used to classify each word vector and to output the probability that each word vector belongs to the attribute-information category;
  • the CRF network is used to label the word vectors that belong to attribute information.
  • Specifically, the multiple word segments are input into the BERT network for processing; the BERT network maps each word segment into a common semantic space and outputs a word vector corresponding to each word segment.
  • the description here is only for illustration and not for limitation.
  • The word vector corresponding to each word segment is then input into the Dense network, which judges whether each word vector belongs to attribute information and outputs the probability that each word vector belongs to attribute information.
  • For example, the probabilities that the word vectors of "company A", "market value", "exceeds", and "company B" belong to attribute information are 0.2, 0.9, 0.1, and 0.2, respectively.
  • the output of the Dense network is input into the CRF network, and the CRF network labels the word vector with the highest probability, and outputs the attribute information corresponding to the word vector. For example, the probability corresponding to the market value is the highest, and it is most likely to be attribute information.
  • For example, the word vector corresponding to "market value" is marked with the "BIO" scheme through the CRF network, where B marks the initial character of the attribute information, I marks a middle character of the attribute information, and O marks non-attribute characters. That is, B marks the first character of "market value", I marks its remaining characters, and O marks the characters after "market value" and before "exceeds". This is only an exemplary description and is not a limitation.
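  • The following is a minimal sketch of an attribute extraction network with this BERT + Dense + CRF structure, given for illustration only. It assumes the HuggingFace transformers package and the pytorch-crf package; the pretrained model name, the three-tag BIO inventory, and all hyperparameters are assumptions, not the patent's actual implementation.

```python
# Sketch: BERT (word vectors) -> Dense (per-token attribute scores) -> CRF (BIO labels).
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF

TAGS = ["B", "I", "O"]  # B/I mark attribute-information characters, O marks the rest

class AttributeExtractor(nn.Module):
    def __init__(self, pretrained="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)                # word segments -> word vectors
        self.dense = nn.Linear(self.bert.config.hidden_size, len(TAGS))  # per-token tag scores
        self.crf = CRF(len(TAGS), batch_first=True)                      # labels the attribute tokens

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.dense(hidden)
        mask = attention_mask.bool()
        if tags is not None:
            # Training: return the negative log-likelihood of the gold BIO tags.
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        # Inference: return the best BIO tag sequence for each input.
        return self.crf.decode(emissions, mask=mask)
```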
  • S104 Analyze at least two entities, attribute information, and text to be analyzed by using a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to at least two entities.
  • One piece of attribute information corresponds to one sentiment analysis result; when there are multiple pieces of attribute information, multiple sentiment analysis results are output accordingly.
  • Each sentiment analysis result judges the relative advantage of the two entities with respect to one piece of attribute information. For example, if the text to be analyzed is "Company A's market value exceeds that of Company B, but Company B has a good reputation", the corresponding entities in the text to be analyzed are Company A and Company B, and the attribute information is market value and reputation.
  • The final sentiment analysis results may then be: the market value of Company A is better than that of Company B and the reputation of Company B is better than that of Company A; or, equivalently, the market value of Company A is better than that of Company B and the reputation of Company A is worse than that of Company B.
  • the description here is only for illustration and not for limitation.
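  • Purely as an illustration of the "entity-attribute information-advantages/disadvantages" output described above, one possible in-memory representation of a sentiment analysis result is sketched below; the field names and values are assumptions, not terminology from the patent.

```python
# Illustrative container for one sentiment analysis result per piece of attribute information.
from dataclasses import dataclass

@dataclass
class EntityComparisonResult:
    subject_entity: str   # e.g. "Company A"
    object_entity: str    # e.g. "Company B"
    attribute: str        # e.g. "market value"
    polarity: str         # e.g. "better" or "worse" for the subject entity

results = [
    EntityComparisonResult("Company A", "Company B", "market value", "better"),
    EntityComparisonResult("Company A", "Company B", "reputation", "worse"),
]
```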
  • In this embodiment, the text to be analyzed is obtained; at least two entities in the text to be analyzed are identified, where the text to be analyzed includes a comment sentence containing the at least two entities; the attribute information in the text to be analyzed is extracted through the pre-trained attribute extraction model; and the at least two entities, the attribute information, and the text to be analyzed are analyzed through the pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • In this way, the entities in the text to be analyzed are identified and the attribute information in the text to be analyzed is extracted through the attribute extraction model; the entities, the attribute information, and the text to be analyzed are then analyzed through the sentiment analysis model. Because an attribute factor is added to the comparison process, the simple "entity-advantages/disadvantages" comparison of the prior art is converted into an "entity-attribute information-advantages/disadvantages" comparison; the extracted analysis points are comprehensive and accurate, so the final entity comparison results are more accurate.
  • FIG. 2 is a specific flowchart of step S102 of the method for analyzing text shown in an exemplary embodiment of the present application; in some possible implementations of the present application, the above S102 may include S1021 to S1022, specifically as follows:
  • S1021 Perform word segmentation processing on the text to be analyzed to obtain multiple first word segments.
  • a word segmentation algorithm is used to perform word segmentation processing on the text to be analyzed to obtain a plurality of first word segments corresponding to the text to be analyzed.
  • For the word segmentation process, please refer to the description of word segmentation in S103; it will not be repeated here.
  • the text to be analyzed may also be preprocessed to obtain a preprocessing result.
  • preprocessing refers to extracting and removing redundant information in the text to be analyzed.
  • Redundant information refers to information that has no practical meaning in the text to be analyzed.
  • Redundant information may be stop words, punctuation marks, and the like in the text to be analyzed. Stop words are usually determiners, modal particles, adverbs, prepositions, conjunctions, English characters, numbers, mathematical characters, etc.; here, an English character means a letter that stands alone and has no practical meaning.
  • If an English string is a meaningful combination of letters, it is regarded as a valid character and is not removed.
  • For example, English strings such as CPU, MAC, and HR are retained as valid characters and are not removed.
  • Word segmentation processing is performed on the preprocessing result to obtain multiple first word segmentations.
  • The text to be analyzed is preprocessed and the redundant information in it is removed in advance, so that when the named entity recognition model subsequently processes the preprocessed text, the interference from redundant information is reduced, the processing speed of the named entity recognition model is increased, and the accuracy of the processing results is improved.
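  • As a sketch only, the preprocessing step described above (removing stop words, punctuation, and stand-alone letters while keeping meaningful strings such as CPU or HR) might look as follows; the stop-word list and the keep/drop rules are illustrative assumptions.

```python
# Sketch of preprocessing: strip punctuation, stop words, and single meaningless
# English letters, but keep meaningful letter combinations such as "CPU" or "HR".
import re
import string

STOP_WORDS = {"的", "了", "吗", "呢", "and", "or", "the"}  # illustrative stop-word list

def preprocess(text: str) -> str:
    # Replace common ASCII and Chinese punctuation marks with spaces.
    text = re.sub(r"[，。！？、；：,.!?;:'\"()（）]", " ", text)
    kept = []
    for token in text.split():
        if token in STOP_WORDS:
            continue                                      # drop stop words
        if len(token) == 1 and token in string.ascii_letters:
            continue                                      # drop stand-alone English letters
        kept.append(token)                                # keep valid tokens, e.g. "CPU", "HR"
    return " ".join(kept)

print(preprocess("the CPU of A公司 is good ， x"))  # -> "CPU of A公司 is good"
```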
  • S1022 Process the multiple first word segments based on the pre-trained named entity recognition model to obtain at least two entities in the text to be analyzed.
  • the named entity recognition model is used to identify entities in the text to be analyzed.
  • the type of the named entity recognition model is not limited.
  • the named entity recognition model can be a BERT+CRF model or a BERT+BiLSTM+CRF model.
  • The first word segments are input into the named entity recognition model; if too many first word segments are input, only the leading word segments are kept. For example, if the total length of all input first word segments exceeds a preset length, the first word segments within the preset length are kept. Alternatively, if the total number of characters of all input first word segments exceeds a preset character length, the first word segments within that character length are kept; for example, if the total exceeds 512 characters, the first word segments corresponding to the first 512 characters are kept.
  • The retained first word segments are input into the BERT network in the named entity recognition model for processing; the BERT network maps each first word segment into a common semantic space and outputs a word vector corresponding to each first word segment.
  • the output of the Bert network is input into the CRF network, and the CRF network in the named entity recognition model labels the entities in these word vectors and outputs the recognized entities.
  • For example, the word vectors corresponding to "Company A" are tagged with the "bio" scheme through the CRF network, where b marks the starting character of an entity, i marks a middle character of an entity, and o marks a non-entity character.
  • That is, b marks the first character of "Company A", i marks its remaining characters, and o marks the characters after "Company A" and before "market value". This is only an illustration and is not a limitation.
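  • To illustrate how the "bio" labels produced by the CRF can be turned back into entity spans, a small decoding helper is sketched below; this post-processing step is an assumption for illustration and is not quoted from the patent.

```python
# Sketch: recover entity strings from per-character "b" / "i" / "o" labels.
def decode_bio(chars, labels):
    """chars: list of characters; labels: list of 'b'/'i'/'o' labels of equal length."""
    entities, current = [], []
    for ch, lab in zip(chars, labels):
        if lab == "b":                 # start of a new entity
            if current:
                entities.append("".join(current))
            current = [ch]
        elif lab == "i" and current:   # continuation of the current entity
            current.append(ch)
        else:                          # 'o' (or a stray 'i'): close any open entity
            if current:
                entities.append("".join(current))
            current = []
    if current:
        entities.append("".join(current))
    return entities

print(decode_bio(list("A公司市值超过B公司"),
                 ["b", "i", "i", "o", "o", "o", "o", "b", "i", "i"]))
# -> ['A公司', 'B公司']
```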
  • training a named entity recognition model may also be included.
  • the named entity recognition model is obtained by training the training set using a machine learning algorithm.
  • a plurality of sample comment sentences are collected in advance, and entities in each sample comment sentence are marked.
  • a training set is formed based on these sample comment sentences and the labeled entities in the sample comment sentences.
  • a part of the data in the training set can also be used as a test set to facilitate subsequent testing of the model.
  • several sample comment sentences are selected in the training set, and the sample entities corresponding to these sample comment sentences are used as the test set.
  • each sample comment sentence in the training set is processed by an initial named entity recognition network (named entity recognition model before training), to obtain the entity corresponding to each sample comment sentence.
  • When the preset number of training iterations is reached, the initial named entity recognition network at that point is tested.
  • the sample comment sentence in the test set is input into the current initial named entity recognition network for processing, and the current initial named entity recognition network outputs the entity corresponding to the sample comment sentence.
  • a first loss value between the entity corresponding to the sample comment sentence and the sample entity corresponding to the sample comment sentence in the test set is calculated based on the loss function.
  • the loss function may be a cross-entropy loss function.
  • If the first loss value satisfies a first preset condition, the training of the initial named entity recognition network is stopped, and the trained network is used as the trained named entity recognition model.
  • For example, the first preset condition may be that the loss value is less than or equal to a preset loss-value threshold: when the first loss value is greater than the threshold, the parameters of the initial named entity recognition network are adjusted and training continues; when the first loss value is less than or equal to the threshold, training stops and the trained network is used as the trained named entity recognition model.
  • the loss function convergence means that the value of the loss function tends to be stable.
  • The named entity recognition model is obtained by training on the training set with a machine learning algorithm, and the entities in the text to be analyzed are then identified through the named entity recognition model; this identifies the entities accurately and quickly, which facilitates the subsequent sentiment analysis of the entities and thus yields accurate sentiment analysis results.
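  • A minimal sketch of the training scheme described above (train for a preset number of iterations, test, and stop once the first loss value meets the preset condition) is given below; the optimizer, learning rate, thresholds, batch format, and the assumption that the model's forward pass returns a loss when gold tags are supplied are all illustrative choices.

```python
# Sketch: periodic-test training loop with a loss-value threshold as the stop condition.
import torch

def evaluate(model, test_loader):
    with torch.no_grad():
        losses = [model(b["input_ids"], b["attention_mask"], b["tags"]).item()
                  for b in test_loader]
    return sum(losses) / max(len(losses), 1)

def train_ner(model, train_loader, test_loader,
              loss_threshold=0.1, test_every=1000, max_steps=100000, lr=2e-5):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    step = 0
    while step < max_steps:
        for batch in train_loader:
            loss = model(batch["input_ids"], batch["attention_mask"], batch["tags"])
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            step += 1
            if step % test_every == 0:                     # preset number of training iterations reached
                first_loss = evaluate(model, test_loader)  # loss value on the test set
                if first_loss <= loss_threshold:           # first preset condition satisfied
                    return model                           # stop training; model is trained
    return model
```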
  • In some possible implementations of the present application, the above S104 may include S1041 to S1044, specifically as follows:
  • S1041 Acquire an entity tag group, where the entity tag group includes tags corresponding to the entities to be compared.
  • At least two entities corresponding to the text to be analyzed include a group of entities to be compared.
  • these two entities are entities that can be compared, and it can be understood that these two entities are entities of different subjects.
  • at least one group of entities can be compared.
  • the entity label group refers to the labels corresponding to the two entities to be compared.
  • the text to be analyzed is "the market value of company A exceeds that of company B", and the corresponding entities are "company A” and "company B”.
  • “Company A” and “Company B” are a group of entities to be compared.
  • the entity tag group refers to the entity tag corresponding to "Company A” and the entity tag corresponding to "Company B".
  • Specifically, the entities in the text to be analyzed are identified by the named entity recognition model and marked with "bio" labels, through which the position of each entity in the text to be analyzed can be determined. Entity labels are set for each entity in the order in which the entities are determined, and the entity labels corresponding to the two entities to be compared are extracted.
  • Similarly, the attribute information in the text to be analyzed is extracted through the attribute extraction model and marked with "BIO" labels, through which the position of each piece of attribute information in the text to be analyzed can be determined.
  • An attribute tag is set for each piece of attribute information.
  • For example, if the text to be analyzed is "the market value of company A exceeds that of company B", the corresponding attribute information is "market value", and the attribute tag "<asp></asp>" is set for "market value".
  • The entity labels corresponding to the two entities are added to the text to be analyzed, and the attribute information together with its corresponding attribute tag is added to the beginning of the text to be analyzed to obtain the second target text to be analyzed.
  • Alternatively, the attribute information and its corresponding attribute tag can be added to the end of the text to be analyzed to obtain "<s>Company A</s> market value exceeds <o>Company B</o> <asp>market value</asp>".
  • the description here is only for illustration and not for limitation.
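  • The construction of the second target text to be analyzed from the entity tag group and the attribute tag might be sketched as follows; the helper name and parameters are assumptions, while the tag placement follows the example above.

```python
# Sketch: add <s></s> / <o></o> entity tags and an <asp></asp> attribute tag
# to the text to be analyzed, producing the second target text to be analyzed.
def build_second_target_text(text, subject_entity, object_entity, attribute,
                             attribute_at_end=True):
    tagged = text.replace(subject_entity, f"<s>{subject_entity}</s>", 1)
    tagged = tagged.replace(object_entity, f"<o>{object_entity}</o>", 1)
    asp = f"<asp>{attribute}</asp>"
    # The attribute tag may be appended at the end or prepended at the beginning.
    return f"{tagged} {asp}" if attribute_at_end else f"{asp} {tagged}"

print(build_second_target_text(
    "Company A market value exceeds Company B",
    "Company A", "Company B", "market value"))
# -> "<s>Company A</s> market value exceeds <o>Company B</o> <asp>market value</asp>"
```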
  • S1044 Analyze the second target text to be analyzed by using the sentiment analysis model, and obtain sentiment analysis results corresponding to at least two entities.
  • mapping processing is performed on the second target text to be analyzed to obtain a semantic vector corresponding to the second target text to be analyzed.
  • To classify the semantic vector is to judge which emotional tendency the semantic vector belongs to.
  • When the second target text to be analyzed is analyzed through the sentiment analysis model, because the second target text contains the attribute tag corresponding to the attribute information and the entity tags corresponding to the two entities to be compared, attribute factors are taken into account in the analysis process; the extracted analysis points are comprehensive and accurate, which makes the entity comparison results obtained by the analysis more accurate.
  • In some possible implementations of the present application, the above S1044 may include S10441 to S10444, specifically as follows:
  • S10441 Perform word segmentation processing on the second target text to be analyzed to obtain multiple third word segments.
  • S10442 Perform mapping processing on each third word segment through the sentiment analysis model to obtain a word vector corresponding to each third word segment.
  • Specifically, the multiple third word segments are input into the BERT network in the sentiment analysis model for processing; the BERT network maps each third word segment into a common semantic space and outputs a word vector corresponding to each third word segment.
  • S10443 Based on the processing sequence of performing word segmentation processing on the second target text to be analyzed, combine the word vectors corresponding to each third word segment to obtain a target word vector set.
  • Specifically, the word vectors corresponding to each third word segment are processed by a long short-term memory (LSTM) network in the sentiment analysis model; the LSTM combines the word vectors in the order in which the second target text to be analyzed was segmented and outputs a target word vector set.
  • S10444 Analyze the target word vector set to obtain a sentiment analysis result.
  • the target word vector set is input to the Dense network in the sentiment analysis model for processing.
  • the Dense network judges the probability that the target word vector set belongs to each emotional tendency, and outputs the emotional tendency with the highest probability, that is, the output sentiment analysis result.
  • For example, the final sentiment analysis result corresponding to the text to be analyzed may be: Company A's market value is higher than Company B's and Company A has the advantage; or Company B's market value is lower than Company A's and Company B is at a disadvantage; and so on.
  • the description here is only for illustration and not for limitation.
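  • A minimal sketch of a sentiment analysis network with this BERT, LSTM, and Dense structure is shown below, for illustration only; the pretrained model name, hidden sizes, the three-way sentiment label set, and the pooling choice are assumptions rather than the patent's implementation.

```python
# Sketch: sentiment analysis model = BERT (word vectors) -> LSTM (combine word
# vectors in segmentation order) -> Dense (probability of each sentiment tendency).
import torch
import torch.nn as nn
from transformers import BertModel

class SentimentAnalyzer(nn.Module):
    def __init__(self, pretrained="bert-base-chinese", num_labels=3):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, 256,
                            batch_first=True, bidirectional=True)
        self.dense = nn.Linear(2 * 256, num_labels)

    def forward(self, input_ids, attention_mask):
        word_vectors = self.bert(input_ids=input_ids,
                                 attention_mask=attention_mask).last_hidden_state
        combined, _ = self.lstm(word_vectors)      # target word vector set
        pooled = combined[:, 0, :]                 # simple pooling choice for this sketch
        probs = torch.softmax(self.dense(pooled), dim=-1)
        return probs                               # probability of each sentiment tendency
```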
  • In this way, when the second target text to be analyzed is analyzed through the sentiment analysis model, because the second target text contains the attribute tag corresponding to the attribute information and the entity tags corresponding to the two entities to be compared, attribute factors are taken into account in the analysis process; the extracted analysis points are comprehensive and accurate, which makes the entity comparison results obtained by the analysis more accurate.
  • Fig. 3 is a schematic flowchart of a method for analyzing text provided by another embodiment of the present application.
  • The method for analyzing text shown in FIG. 3 may include steps S201 to S206, specifically as follows:
  • S201 Obtain a text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities.
  • S202 Identify at least two entities in the text to be analyzed.
  • Specifically, the entities in the text to be analyzed are identified by the named entity recognition model and marked with "bio" labels, through which the position of each entity in the text to be analyzed can be determined. Entity labels are set for each entity in the order in which the entities are determined.
  • The entity label corresponding to each entity is then added to the text to be analyzed to obtain the first target text to be analyzed. For example, "<s></s>" and "<o></o>" are added to the text to be analyzed to obtain the first target text to be analyzed, that is, "<s>Company A</s> market value exceeds <o>Company B</o>".
  • the description here is only for illustration and not for limitation.
  • S206 Analyze at least two entities, attribute information, and text to be analyzed by using a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to at least two entities.
  • In this embodiment, entity tags are added to the entities in the text to be analyzed.
  • When the attribute information in the first target text to be analyzed is extracted through the attribute extraction model, the word segments carrying entity tags can be ignored and only the other word segments are processed; because the interference from the entities is removed, the accuracy and speed of extracting the attribute information are improved.
  • FIG. 4 is a specific flowchart of step S204 of the method for analyzing text shown in an exemplary embodiment of the present application; in some possible implementations of the present application, the above S204 may include S2041 to S2043, specifically as follows:
  • S2041 Perform word segmentation processing on the text to be analyzed to obtain multiple second word segments.
  • S2042 Perform mapping processing on each second word segment through the attribute extraction model to obtain a word vector corresponding to each second word segment.
  • Specifically, the multiple second word segments are input into the BERT network in the attribute extraction model for processing; the BERT network maps each second word segment into a common semantic space and outputs a word vector corresponding to each second word segment.
  • S2043 Add the entity label corresponding to each entity to each word vector to obtain the first target text to be analyzed.
  • Adding the entity label corresponding to each entity to each word vector strengthens the connection between each word vector and the entities, helps the attribute extraction model extract attribute information that is highly related to the entities in the text to be analyzed, and also improves the accuracy of extracting the attribute information.
  • Fig. 5 is a schematic flowchart of a method for analyzing text shown in an exemplary embodiment of the present application; it mainly involves the process of obtaining an attribute extraction model before executing the method for analyzing text as shown in Fig. 1 .
  • The method includes steps S301 to S303, specifically as follows:
  • S301 Obtain a sample training set, where the sample training set includes multiple sample texts and an attribute label corresponding to each sample text.
  • the sample training set may come from data published in the network. Collect multiple sample texts, and set attribute labels for the attribute information in each sample text. It is worth noting that the sample text here may be the same as or different from the sample comment sentences used in training the named entity recognition model, and there is no limitation on this.
  • a part of the data in the sample training set can also be used as a sample test set to facilitate subsequent testing of the attribute extraction model in training. For example, several sample texts are selected in the sample training set, and the respective attribute labels corresponding to these sample texts are used as the sample test set.
  • each sample text in the sample training set is processed through an initial attribute extraction network (attribute extraction model before training), to obtain attribute information corresponding to each sample text.
  • When the preset number of training iterations is reached, the initial attribute extraction network at that point is tested.
  • the sample text in the sample test set is input into the current initial attribute extraction network for processing, and the current initial attribute extraction network outputs the actual attribute information corresponding to the sample text.
  • a second loss value between the actual attribute information corresponding to the sample text and the attribute information corresponding to the sample text in the sample test set is calculated based on a loss function.
  • the loss function may be a cross-entropy loss function.
  • If the second loss value does not meet a second preset condition, the parameters of the initial attribute extraction network are adjusted (for example, the weight values corresponding to each network layer of the initial attribute extraction network are adjusted) and the initial attribute extraction network continues to be trained.
  • If the second loss value meets the second preset condition, the training of the initial attribute extraction network is stopped, and the trained initial attribute extraction network is used as the trained attribute extraction model.
  • the second preset condition is that the loss value is less than or equal to a preset loss value threshold. Then, when the second loss value is greater than the loss value threshold, adjust the parameters of the initial attribute extraction network, and continue to train the initial attribute extraction network. When the second loss value is less than or equal to the loss value threshold, stop training the initial attribute extraction network, and use the trained initial attribute extraction network as a trained attribute extraction model.
  • the description here is only for illustration and not for limitation.
  • the loss function convergence means that the value of the loss function tends to be stable.
  • the description here is only for illustration and not for limitation.
  • the method for analyzing text provided in this application may further include training a sentiment analysis model.
  • the sentiment analysis model is obtained by training the training set using a machine learning algorithm.
  • a plurality of sample sentiment analysis sentences containing emotional tendencies are collected in advance, and a sample sentiment analysis result corresponding to each sample sentiment analysis sentence is set.
  • a training set is formed based on these sample sentiment analysis sentences and sample sentiment analysis results corresponding to the sample sentiment analysis sentences.
  • a part of the data in the training set can also be used as a test set to facilitate subsequent testing of the sentiment analysis model. For example, several sample sentiment analysis sentences are selected in the training set, and the sample sentiment analysis results corresponding to these sample sentiment analysis sentences are used as the test set.
  • each sample sentiment analysis sentence in the training set is processed by an initial sentiment analysis network (sentiment analysis model before training), to obtain an actual sentiment analysis result corresponding to each sample sentiment analysis sentence.
  • When the preset number of training iterations is reached, the initial sentiment analysis network at that point is tested.
  • the sample sentiment analysis sentence in the test set is input into the current initial sentiment analysis network for processing, and the current initial sentiment analysis network outputs the actual sentiment analysis result corresponding to the sample sentiment analysis sentence.
  • a third loss value between the actual sentiment analysis result corresponding to the sample sentiment analysis sentence and the sample sentiment analysis result corresponding to the sample sentiment analysis sentence in the test set is calculated based on the loss function.
  • the loss function may be a cross-entropy loss function.
  • If the third loss value does not meet a third preset condition, the parameters of the initial sentiment analysis network are adjusted (for example, the weight values corresponding to each network layer of the initial sentiment analysis network are adjusted) and the initial sentiment analysis network continues to be trained.
  • If the third loss value meets the third preset condition, the training of the initial sentiment analysis network is stopped, and the trained initial sentiment analysis network is used as the trained sentiment analysis model.
  • For example, the third preset condition is that the loss value is less than or equal to a preset loss-value threshold: when the third loss value is greater than the threshold, the parameters of the initial sentiment analysis network are adjusted and training continues; when the third loss value is less than or equal to the threshold, the training of the initial sentiment analysis network is stopped, and the trained network is used as the trained sentiment analysis model.
  • the description here is only for illustration and not for limitation.
  • the loss function convergence means that the value of the loss function tends to be stable.
  • In other embodiments, the named entity recognition model, the attribute extraction model, and the sentiment analysis model may be trained simultaneously.
  • the training sample sets used by the three models can be similar. For example, they can all be sample analysis texts.
  • the labels corresponding to the sample analysis texts are different.
  • For the specific training process of each model, please refer to the process of training each model individually described above.
  • During simultaneous training, the loss values corresponding to the three models can be weighted and summed; if the weighted sum does not satisfy a fourth preset condition, the corresponding parameters of the three models are adjusted and the three models continue to be trained; if the weighted sum satisfies the fourth preset condition, the training of the three models is stopped, and the three trained models are obtained.
  • the fourth preset condition is that the loss value is less than or equal to a preset loss value threshold. Then, when the loss value after weighted superposition is greater than the loss value threshold, adjust the corresponding parameters of the three models during the training process, and continue to train the three models. When the loss value after weighted superposition is less than or equal to the loss value threshold, the training of these three models is stopped, and three trained models are obtained.
  • the description here is only for illustration and not for limitation.
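  • The weighted superposition of the three loss values can be sketched as follows; the weights and the threshold are illustrative assumptions.

```python
# Sketch: joint training criterion as a weighted sum of the three model losses.
def combined_loss(ner_loss, attribute_loss, sentiment_loss, weights=(1.0, 1.0, 1.0)):
    w1, w2, w3 = weights
    return w1 * ner_loss + w2 * attribute_loss + w3 * sentiment_loss

def satisfies_fourth_condition(ner_loss, attribute_loss, sentiment_loss, threshold=0.3):
    # Fourth preset condition: the weighted, superimposed loss value is less than
    # or equal to the preset loss-value threshold.
    return combined_loss(ner_loss, attribute_loss, sentiment_loss) <= threshold
```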
  • training the three models at the same time can improve the fit of the three models when processing data, and the three models supervise each other, so that in actual use, the entity comparison results obtained by analysis are more accurate.
  • FIG. 6 is a schematic diagram of an apparatus for analyzing text provided by an embodiment of the present application.
  • the units included in the device are used to execute the steps in the embodiments corresponding to FIG. 1 to FIG. 5 .
  • For details, please refer to the relevant descriptions in the embodiments corresponding to FIG. 1 to FIG. 5.
  • For ease of description, only the parts related to this embodiment are shown. Referring to FIG. 6, the apparatus includes:
  • An acquisition unit 410 configured to acquire text to be analyzed
  • An identification unit 420 configured to identify at least two entities in the text to be analyzed, where the text to be analyzed includes comment sentences containing at least two entities;
  • An extraction unit 430 configured to extract attribute information in the text to be analyzed through a pre-trained attribute extraction model
  • the analysis unit 440 is configured to analyze the at least two entities, the attribute information, and the text to be analyzed by using a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • the identification unit 420 is specifically configured to:
  • the device also includes:
  • a label acquisition unit configured to acquire an entity label corresponding to each entity
  • an adding unit configured to add an entity label corresponding to each entity to the text to be analyzed to obtain a first target text to be analyzed
  • the extraction unit 430 is specifically used for:
  • the attribute information in the first target text to be analyzed is extracted by using a pre-trained attribute extraction model.
  • the adding unit is specifically used for:
  • An entity label corresponding to each entity is added to each word vector to obtain the first target text to be analyzed.
  • the at least two entities include a group of entities to be compared, and the analysis unit 440 is specifically configured to:
  • the entity tag group includes tags corresponding to the entities to be compared;
  • the second target text to be analyzed is analyzed by the sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  • the analyzing unit 440 is also used for:
  • Each third word segment is mapped through the sentiment analysis model to obtain a word vector corresponding to each third word segment;
  • the word vectors corresponding to each third word segment are combined to obtain a target word vector set
  • the target word vector set is analyzed to obtain the sentiment analysis result.
  • the device also includes a training unit, specifically for:
  • the sample training set includes a plurality of sample texts, and an attribute label corresponding to each sample text;
  • the attribute extraction model is obtained.
  • FIG. 7 is a schematic diagram of a device for analyzing text provided by another embodiment of the present application.
  • the text analysis device 5 of this embodiment includes: a processor 50 , a memory 51 , and computer instructions 52 stored in the memory 51 and operable on the processor 50 .
  • When the processor 50 executes the computer instructions 52, the steps in the above embodiments of the method for analyzing text are implemented, for example, S101 to S104 shown in FIG. 1.
  • Alternatively, when the processor 50 executes the computer instructions 52, the functions of the units in the above apparatus embodiments are realized, for example, the functions of units 410 to 440 shown in FIG. 6.
  • the computer instruction 52 may be divided into one or more units, and the one or more units are stored in the memory 51 and executed by the processor 50 to complete the present application.
  • the one or more units may be a series of computer instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer instruction 52 in the text analysis device 5 .
  • the computer instruction 52 may be divided into an acquisition unit, an identification unit, an extraction unit and an analysis unit, and the specific functions of each unit are as described above.
  • Those skilled in the art can understand that the device for analyzing text may include, but is not limited to, the processor 50 and the memory 51.
  • FIG. 7 is only an example of the device 5 for analyzing text and does not constitute a limitation; the device may include more or fewer components than those shown in the figure, may combine certain components, or may use different components. For example, the device for analyzing text may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 50 may be a central processing unit (Central Processing Unit, CPU), and may also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • The memory 51 may be an internal storage unit of the device for analyzing text, such as a hard disk or memory of the device.
  • The memory 51 may also be an external storage device of the device for analyzing text, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the device for analyzing text.
  • the memory 51 may also include both an internal storage unit of the device for analyzing text and an external storage terminal.
  • the memory 51 is used to store the computer instructions and other programs and data required by the terminal.
  • the memory 51 can also be used to temporarily store data that has been output or will be output.
  • the embodiment of the present application also provides a computer storage medium.
  • the computer storage medium may be non-volatile or volatile.
  • The computer storage medium stores a computer program; when the computer program is executed by a processor, the steps in the above embodiments of the method for analyzing text are implemented.
  • the present application also provides a computer program product.
  • When the computer program product runs on the device, the device is caused to execute the steps in the above embodiments of the method for analyzing text.
  • The embodiment of the present application also provides a chip or integrated circuit, including a processor configured to call and run a computer program from a memory, so that a device equipped with the chip or integrated circuit executes the steps in the above embodiments of the method for analyzing text.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and apparatus for analyzing text, a device and a storage medium, which are applicable to the technical field of artificial intelligence. The method comprises: acquiring text to be analyzed (S101); identifying at least two entities in the text (S102), the text comprising a comment sentence which includes the at least two entities; extracting attribute information in the text by means of a pre-trained attribute extraction model (S103); and analyzing the at least two entities, the attribute information and the text by means of a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities (S104). According to the method, attribute factors are added during a comparison process, simple "entity-advantage and disadvantage" comparison in the prior art is converted into "entity-attribute information-advantage and disadvantage" comparison, and extracted analysis points are comprehensive and accurate, so that the entity comparison results obtained from the analysis are more accurate.

Description

Method, apparatus, device and storage medium for analyzing text
This application claims priority to the Chinese patent application No. 202110705319.4, entitled "Method, apparatus, device and storage medium for analyzing text", filed with the Patent Office of the State Intellectual Property Office of the People's Republic of China on June 24, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application belongs to the technical field of artificial intelligence, and in particular relates to a method, apparatus, device and storage medium for analyzing text.
Background
Sentiment analysis holds great promise in natural language processing applications. For example, users' satisfaction with products, companies, services, and so on can be evaluated through the comments they post on Internet platforms. Therefore, sentiment analysis is particularly important in natural language processing.
The inventor has realized that in existing sentiment analysis, the extracted analysis points are not comprehensive, which leads to inaccurate sentiment analysis results.
Technical Problem
One of the purposes of the embodiments of the present application is to provide a method, apparatus, device and storage medium for analyzing text, so as to solve the problem in existing sentiment analysis that the extracted analysis points are not comprehensive, which leads to inaccurate sentiment analysis results.
技术解决方案technical solution
第一方面,本申请实施例提供了一种分析文本的方法,其中,该方法包括:In the first aspect, the embodiment of the present application provides a method for analyzing text, wherein the method includes:
获取待分析文本,所述待分析文本包括包含至少两个实体的评论句;Obtaining the text to be analyzed, the text to be analyzed includes comment sentences containing at least two entities;
识别所述待分析文本中的至少两个实体;identifying at least two entities in the text to be analyzed;
通过预先训练好的属性抽取模型提取所述待分析文本中的属性信息;extracting attribute information in the text to be analyzed through a pre-trained attribute extraction model;
通过预先训练好的情感分析模型对所述至少两个实体、所述属性信息以及所述待分析文本进行分析,得到所述至少两个实体对应的情感分析结果。The at least two entities, the attribute information, and the text to be analyzed are analyzed by using a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
第二方面,本申请实施例提供了一种分析文本的装置,其中,该装置包括:In the second aspect, the embodiment of the present application provides a device for analyzing text, wherein the device includes:
获取单元,用于获取待分析文本;an acquisition unit, configured to acquire the text to be analyzed;
识别单元,用于识别所述待分析文本中的至少两个实体,所述待分析文本包括包含至少两个实体的评论句;An identification unit, configured to identify at least two entities in the text to be analyzed, the text to be analyzed includes commentary sentences containing at least two entities;
提取单元,用于通过预先训练好的属性抽取模型提取所述待分析文本中的属性信息;An extraction unit, configured to extract attribute information in the text to be analyzed through a pre-trained attribute extraction model;
分析单元,用于通过预先训练好的情感分析模型对所述至少两个实体、所述属性信息以及所述待分析文本进行分析,得到所述至少两个实体对应的情感分析结果。The analysis unit is configured to analyze the at least two entities, the attribute information, and the text to be analyzed by using a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
第三方面,本申请实施例提供了一种分析文本的设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,其中,所述处理器执行所述计算机程序时实现:In a third aspect, an embodiment of the present application provides a device for analyzing text, including a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein the processor executes the Realize when describing a computer program:
获取待分析文本,所述待分析文本包括包含至少两个实体的评论句;Obtaining the text to be analyzed, the text to be analyzed includes comment sentences containing at least two entities;
识别所述待分析文本中的至少两个实体;identifying at least two entities in the text to be analyzed;
通过预先训练好的属性抽取模型提取所述待分析文本中的属性信息;extracting attribute information in the text to be analyzed through a pre-trained attribute extraction model;
通过预先训练好的情感分析模型对所述至少两个实体、所述属性信息以及所述待分析文本进行分析,得到所述至少两个实体对应的情感分析结果。The at least two entities, the attribute information, and the text to be analyzed are analyzed by using a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, which may be non-volatile or volatile, and which stores a computer program that, when executed by a processor, implements:
acquiring text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities;
identifying at least two entities in the text to be analyzed;
extracting attribute information from the text to be analyzed through a pre-trained attribute extraction model; and
analyzing the at least two entities, the attribute information and the text to be analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
Beneficial Effects
Compared with the prior art, the embodiments of the present application have the following beneficial effects: text to be analyzed is acquired; at least two entities in the text to be analyzed are identified, where the text to be analyzed includes a comment sentence containing at least two entities; attribute information in the text to be analyzed is extracted through a pre-trained attribute extraction model; and the at least two entities, the attribute information and the text to be analyzed are analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities. In this scheme, the entities in the text to be analyzed are identified, the attribute information in the text to be analyzed is extracted through the attribute extraction model, and the entities, the attribute information and the text to be analyzed are then analyzed through the sentiment analysis model. Because attribute factors are taken into account during the analysis and comparison, the simple "entity-advantage/disadvantage" comparison of the prior art is converted into an "entity-attribute information-advantage/disadvantage" comparison, the extracted analysis points are comprehensive and accurate, and the final entity comparison results are more accurate.
Description of Drawings
In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for analyzing text provided by an exemplary embodiment of the present application;
Fig. 2 is a detailed flowchart of step S102 of the method for analyzing text shown in an exemplary embodiment of the present application;
Fig. 3 is a schematic flowchart of a method for analyzing text provided by another embodiment of the present application;
Fig. 4 is a detailed flowchart of step S204 of the method for analyzing text shown in an exemplary embodiment of the present application;
Fig. 5 is a schematic flowchart of a method for analyzing text shown in an exemplary embodiment of the present application;
Fig. 6 is a schematic diagram of an apparatus for analyzing text provided by an embodiment of the present application;
Fig. 7 is a schematic diagram of a device for analyzing text provided by another embodiment of the present application.
The realization of the purposes, the functional features and the advantages of the present application will be further described with reference to the embodiments and the accompanying drawings.
Embodiments of the Present Invention
To make the purposes, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
In the description of the embodiments of the present application, unless otherwise specified, "/" means "or"; for example, A/B may mean A or B. "And/or" herein only describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, in the description of the embodiments of the present application, "multiple" means two or more than two.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of such features. In the description of these embodiments, unless otherwise specified, "multiple" means two or more.
Sentiment analysis holds great promise in natural language processing applications. For example, users' satisfaction with products, companies, services and the like can be evaluated through the comments posted by users on Internet platforms. Sentiment analysis is therefore particularly important in natural language processing.
However, existing sentiment analysis often simplifies the problem to an "entity-advantage/disadvantage" comparison, so the extracted analysis points are not comprehensive, which leads to inaccurate sentiment analysis results. For example, in the comment sentence "Brand A's mobile phone is more expensive than Brand B's, but its performance is better", the compared entities are "Brand A" and "Brand B"; with respect to "price", Brand A is the inferior party, but with respect to "performance", Brand A is the superior party. The prior art pays no attention to the two pieces of attribute information "price" and "performance" and can only produce a single comparison result; that result is necessarily wrong for at least one of the two attributes, so the comparison result is not accurate.
In view of this, the present application provides a method for analyzing text: text to be analyzed is acquired; at least two entities in the text to be analyzed are identified, where the text to be analyzed includes a comment sentence containing at least two entities; attribute information in the text to be analyzed is extracted through a pre-trained attribute extraction model; and the at least two entities, the attribute information and the text to be analyzed are analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities. In this scheme, the entities in the text to be analyzed are identified, the attribute information in the text to be analyzed is extracted through the attribute extraction model, and the entities, the attribute information and the text to be analyzed are then analyzed through the sentiment analysis model. Because attribute factors are taken into account during the analysis and comparison, the simple "entity-advantage/disadvantage" comparison of the prior art is converted into an "entity-attribute information-advantage/disadvantage" comparison, the extracted analysis points are comprehensive and accurate, and the final entity comparison results are more accurate.
Please refer to Fig. 1, which is a schematic flowchart of a method for analyzing text provided by an exemplary embodiment of the present application. The method for analyzing text provided by the present application is executed by a device for analyzing text, where the device includes but is not limited to terminals such as smart phones, tablet computers, computers, personal digital assistants (Personal Digital Assistant, PDA) and desktop computers, and may also include various types of servers. In this example, a terminal is used for illustration. The method for analyzing text shown in Fig. 1 may include S101 to S104, which are described in detail as follows:
S101: Acquire text to be analyzed.
The text to be analyzed refers to text whose entities need to undergo sentiment analysis. Since the sentiment analysis in this embodiment refers to a comparison of entities, the comparison is only meaningful when at least two entities are present; the text to be analyzed therefore includes a comment sentence containing at least two entities. The length and number of comment sentences are not limited. For example, a text to be analyzed may be "Company A's market value exceeds Company B's" or "Company A's market value exceeds Company B's, but Company B's reputation exceeds Company A's". Optionally, the text to be analyzed may also be an article, a paragraph of text or the like composed of comment sentences containing at least two entities. The description here is only illustrative and not limiting.
Exemplarily, when the terminal detects an analysis instruction, it acquires the text to be analyzed. The analysis instruction may be triggered by a user, for example by clicking an analysis option on the terminal. The text to be analyzed may be uploaded to the terminal by the user, or the terminal may obtain, according to a file identifier contained in the analysis instruction, the text file corresponding to that file identifier to obtain the text to be analyzed.
S102: Identify at least two entities in the text to be analyzed.
An entity is a thing that exists objectively and can be distinguished from other things. All entities in the text to be analyzed can be identified through a pre-trained named entity recognition model.
S103: Extract attribute information from the text to be analyzed through a pre-trained attribute extraction model.
Word segmentation is performed on the text to be analyzed to obtain multiple word segments. Word segmentation refers to dividing the continuous character sequence of the text to be analyzed into multiple word sequences, that is, multiple word segments, through a word segmentation algorithm. The attribute extraction model may include a word segmentation algorithm, and word segmentation is performed on the text to be analyzed through this algorithm to obtain the multiple word segments corresponding to the text; in other words, the content of the text to be analyzed is divided into multiple word segments. A word segment may be a word or a single character. Exemplarily, multiple candidate segmentations of the text to be analyzed can be determined according to the word segmentation algorithm, and the most suitable one is selected to segment the text, yielding the multiple word segments corresponding to the text to be analyzed. For example, performing word segmentation on "A公司市值超过B公司" ("Company A's market value exceeds Company B's") yields "A公司 / 市值 / 超过 / B公司" (Company A / market value / exceeds / Company B).
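For illustration only, the word segmentation described above can be sketched in Python as follows; the open-source jieba segmenter used here is an assumption made for the example and is not prescribed by this embodiment. Any segmentation algorithm that produces the word segments described above may be substituted.

```python
# A minimal word-segmentation sketch. jieba is used here only as an example
# segmenter; the embodiment does not prescribe a particular algorithm.
import jieba

text = "A公司市值超过B公司"  # "Company A's market value exceeds Company B's"
tokens = jieba.lcut(text)    # roughly ['A公司', '市值', '超过', 'B公司'], depending on the dictionary
print("/".join(tokens))      # e.g. "A公司/市值/超过/B公司"
```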
The pre-trained attribute extraction model includes a Bert network, a Dense network and a CRF network. The Bert network converts the multiple word segments corresponding to the text to be analyzed into a word vector for each segment; the Dense network classifies each word vector and outputs the probability that each word vector belongs to the attribute-information category; and the CRF network labels the word vectors that belong to attribute information.
Exemplarily, the multiple word segments are input into the Bert network for processing; the Bert network maps each word segment to a common semantic space and outputs the word vector corresponding to each segment. The processing order of the segments is not limited: the segments may be input one by one in their original order and mapped to obtain their word vectors, or they may be input out of order and mapped in the same way. The description here is only illustrative and not limiting.
Because the pre-trained attribute extraction model has learned, during training, to judge whether each word segment belongs to attribute information, the word vector corresponding to each segment is input into the Dense network for processing; the Dense network judges whether each word vector belongs to attribute information and outputs the probability that it does. For example, for the word vectors corresponding to the segments "Company A", "market value", "exceeds" and "Company B", the probabilities of belonging to attribute information are 0.2, 0.9, 0.1 and 0.2, respectively.
The output of the Dense network is input into the CRF network; the CRF network labels the word vector with the highest probability and outputs the attribute information corresponding to that word vector. For example, "market value" (市值) has the highest probability and is most likely the attribute information, so the CRF network applies "BIO" labels to the word vector corresponding to "市值", where B marks the starting character of the attribute information, I marks a middle character of the attribute information, and O marks a non-attribute character. For example, B marks "市", I marks "值", and O marks the character after "值" and before "超". The description here is only illustrative and not limiting.
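A non-limiting sketch of such an attribute extraction model is given below using PyTorch and the Hugging Face transformers library (both are assumptions; the embodiment only requires a Bert network, a Dense network and a CRF network). For brevity the CRF decoding is reduced to a per-token argmax; a full implementation would decode the BIO sequence with Viterbi over learned transition scores.

```python
# Sketch of a Bert + Dense + (simplified) CRF attribute extractor.
# Assumes PyTorch and Hugging Face transformers; the tag set is BIO as described above.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

TAGS = ["O", "B", "I"]  # B/I mark attribute-information characters, O marks the rest

class AttributeExtractor(nn.Module):
    def __init__(self, bert_name="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)                   # word vectors
        self.dense = nn.Linear(self.bert.config.hidden_size, len(TAGS))    # per-token class scores
        # A real CRF layer would hold a transition matrix and run Viterbi decoding;
        # decoding is reduced to a per-token argmax to keep the sketch short.

    def forward(self, input_ids, attention_mask):
        hidden = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        logits = self.dense(hidden)        # (batch, seq_len, num_tags)
        return logits.argmax(dim=-1)       # predicted BIO tag id for every token

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = AttributeExtractor()
enc = tokenizer("A公司市值超过B公司", return_tensors="pt")
tags = model(enc["input_ids"], enc["attention_mask"])   # e.g. "市" -> B, "值" -> I after training
```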
S104: Analyze the at least two entities, the attribute information and the text to be analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
The label corresponding to each entity and the attribute label corresponding to the attribute information are obtained and added to the text to be analyzed; the labeled text is then input into the pre-trained sentiment analysis model for processing, and the sentiment analysis results are output.
Exemplarily, each piece of attribute information corresponds to one sentiment analysis result; when there are multiple pieces of attribute information, multiple sentiment analysis results are output accordingly. Each sentiment analysis result judges the relative strengths and weaknesses of the two entities with respect to one piece of attribute information. For example, if the text to be analyzed is "Company A's market value exceeds Company B's, but Company B has a good reputation", the corresponding entities are Company A and Company B, and the attribute information is market value and reputation; the final sentiment analysis results for this text may be: Company A's market value is better than Company B's and Company B's reputation is better than Company A's, or Company A's market value is better than Company B's and Company A's reputation is worse than Company B's, and so on. The description here is only illustrative and not limiting.
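One way to picture the output described above is one polarity per pair of compared entities and per piece of attribute information. The following data layout is purely illustrative; the field names and polarity labels are assumptions rather than part of the embodiment.

```python
# Illustrative output structure: one sentiment analysis result per piece of attribute information.
text = "A公司市值超过B公司，但B公司口碑好"
results = [
    # (subject entity, object entity, attribute information, polarity of subject versus object)
    ("A公司", "B公司", "市值", "better"),
    ("A公司", "B公司", "口碑", "worse"),
]
for subj, obj, aspect, polarity in results:
    print(f"{subj} is {polarity} than {obj} with respect to {aspect}")
```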
In the above embodiment, text to be analyzed is acquired; at least two entities in the text to be analyzed are identified, where the text to be analyzed includes a comment sentence containing at least two entities; attribute information in the text to be analyzed is extracted through a pre-trained attribute extraction model; and the at least two entities, the attribute information and the text to be analyzed are analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities. In this implementation, the entities in the text to be analyzed are identified, the attribute information is extracted through the attribute extraction model, and the entities, the attribute information and the text to be analyzed are then analyzed through the sentiment analysis model. Because attribute factors are taken into account during the analysis and comparison, the simple "entity-advantage/disadvantage" comparison of the prior art is converted into an "entity-attribute information-advantage/disadvantage" comparison, the extracted analysis points are comprehensive and accurate, and the final entity comparison results are more accurate.
Fig. 2 is a detailed flowchart of step S102 of the method for analyzing text shown in an exemplary embodiment of the present application. In some possible implementations of the present application, the above S102 may include S1021 to S1022, which are described in detail as follows:
S1021: Perform word segmentation on the text to be analyzed to obtain multiple first word segments.
Exemplarily, word segmentation is performed on the text to be analyzed through a word segmentation algorithm to obtain the multiple first word segments corresponding to the text. For the specific word segmentation process, reference may be made to the word segmentation process in S103, which is not repeated here.
Optionally, in one possible implementation, before S1021, the text to be analyzed may also be preprocessed to obtain a preprocessing result. Preprocessing refers to removing redundant information from the text to be analyzed, that is, information with no practical meaning, such as stop words and punctuation marks. Stop words are usually determiners, modal particles, adverbs, prepositions, conjunctions, English characters, digits, mathematical characters and the like. Here, English characters are letters that appear in isolation and have no practical meaning; if a combination of letters is meaningful, it is regarded as a valid character and is not removed. For example, English terms such as CPU, MAC and HR are kept as valid characters and are not removed. The description here is only illustrative and not limiting. Word segmentation is then performed on the preprocessing result to obtain the multiple first word segments.
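A minimal preprocessing sketch along these lines is shown below; for simplicity it filters an already-segmented token list, and the stop-word list and the whitelist of meaningful letter combinations are placeholders chosen for illustration.

```python
# Remove punctuation, digits, lone letters and stop words, but keep meaningful letter combinations.
import re

STOP_WORDS = {"的", "了", "呢", "吧", "啊"}   # placeholder stop-word list
KEEP_TERMS = {"CPU", "MAC", "HR"}             # meaningful letter combinations kept as valid characters

def preprocess(tokens):
    cleaned = []
    for tok in tokens:
        if tok in KEEP_TERMS:
            cleaned.append(tok)               # valid characters, not removed
        elif tok in STOP_WORDS or re.fullmatch(r"[\W\d_]+|[A-Za-z]", tok):
            continue                          # stop words, punctuation, digits, isolated letters
        else:
            cleaned.append(tok)
    return cleaned

print(preprocess(["A公司", "的", "市值", "超过", "B公司", "!"]))  # ['A公司', '市值', '超过', 'B公司']
```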
In this implementation, the text to be analyzed is preprocessed and its redundant information is removed in advance, so that when the subsequent named entity recognition model processes the preprocessed text it is not disturbed by redundant information; this speeds up the processing of the named entity recognition model and improves the accuracy of its results.
S1022: Process the multiple first word segments based on a pre-trained named entity recognition model to obtain the at least two entities in the text to be analyzed.
The named entity recognition model is used to identify the entities in the text to be analyzed. The type of the named entity recognition model is not limited; for example, it may be a BERT+CRF model or a BERT+BiLSTM+CRF model.
Exemplarily, the multiple first word segments are input into the named entity recognition model; if too many first word segments are input, only the leading segments are kept. For example, if the total length of all input first word segments exceeds a preset length, first word segments up to the preset length are taken. Alternatively, if the total number of characters of all input first word segments exceeds a preset character length, first word segments up to the preset character length are taken. For example, if the total number of characters of all input first word segments exceeds 512, the first word segments corresponding to the first 512 characters are taken.
The truncated first word segments are input into the Bert network of the named entity recognition model for processing; the Bert network maps each first word segment to a common semantic space and outputs the word vector corresponding to each first word segment. The output of the Bert network is input into the CRF network; the CRF network of the named entity recognition model labels the entities among these word vectors and outputs the recognized entities. For example, the CRF network applies "bio" labels to the word vectors corresponding to the entities, where b marks the starting character of an entity, i marks a middle character of an entity, and o marks a non-entity character; for instance, b marks "A", i marks "公", and o marks the character after "司" and before "市". The description here is only illustrative and not limiting.
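The truncation and the reading of the "bio" labels described above can be pictured as follows; the 512-character limit matches the example in the text, and the label-to-span decoding is a generic BIO reading rather than a prescribed implementation.

```python
# Truncate overly long input and turn per-character bio labels into entity spans.
MAX_CHARS = 512

def truncate(tokens, max_chars=MAX_CHARS):
    kept, used = [], 0
    for tok in tokens:
        if used + len(tok) > max_chars:
            break                      # keep only the word segments within the first max_chars characters
        kept.append(tok)
        used += len(tok)
    return kept

def decode_bio(chars, tags):
    """chars: list of characters; tags: list of 'b'/'i'/'o' labels of the same length."""
    entities, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "b":                 # start of a new entity
            if current:
                entities.append(current)
            current = ch
        elif tag == "i" and current:   # continuation of the current entity
            current += ch
        else:                          # non-entity character ends any open entity
            if current:
                entities.append(current)
            current = ""
    if current:
        entities.append(current)
    return entities

chars = list("A公司市值超过B公司")
tags  = ["b", "i", "i", "o", "o", "o", "o", "b", "i", "i"]
print(decode_bio(chars, tags))   # -> ['A公司', 'B公司']
```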
Optionally, before S1021, the method may further include training the named entity recognition model. The named entity recognition model is obtained by training on a training set using a machine learning algorithm. Exemplarily, multiple sample comment sentences are collected in advance and the entities in each sample comment sentence are labeled; the training set is composed of these sample comment sentences and the entities labeled in them.
Optionally, part of the data in the training set may also be used as a test set to facilitate subsequent testing of the model. For example, several sample comment sentences and their corresponding sample entities are selected from the training set as the test set.
Exemplarily, each sample comment sentence in the training set is processed through an initial named entity recognition network (the named entity recognition model before training) to obtain the entity corresponding to each sample comment sentence. For the specific process by which the initial named entity recognition network processes a sample comment sentence, reference may be made to S1021 to S1022 above, which is not repeated here.
When a preset number of training iterations is reached, the current initial named entity recognition network is tested. Exemplarily, a sample comment sentence in the test set is input into the current initial named entity recognition network for processing, and the network outputs the entity corresponding to that sample comment sentence. A first loss value between the entity corresponding to the sample comment sentence and the sample entity corresponding to that sentence in the test set is calculated based on a loss function, which may be a cross-entropy loss function.
When the first loss value does not satisfy a first preset condition, the parameters of the initial named entity recognition network are adjusted (for example, the weight values corresponding to the network layers of the initial named entity recognition network are adjusted) and training continues. When the first loss value satisfies the first preset condition, training of the initial named entity recognition network is stopped, and the trained network is used as the trained named entity recognition model. For example, assume the first preset condition is that the loss value is less than or equal to a preset loss threshold: when the first loss value is greater than the threshold, the parameters are adjusted and training continues; when the first loss value is less than or equal to the threshold, training stops and the trained network is used as the trained named entity recognition model. The description here is only illustrative and not limiting.
Optionally, during training of the initial named entity recognition network, the convergence of its loss function may instead be observed. When the loss function has not converged, the parameters of the initial named entity recognition network are adjusted and training continues based on the training set. When the loss function converges, training stops and the trained network is used as the trained named entity recognition model. Convergence of the loss function means that its value tends to be stable. The description here is only illustrative and not limiting.
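The training procedure described above (train for a preset number of iterations, evaluate on the test set, compare the loss value against a threshold, otherwise adjust the parameters and continue) can be summarized by a generic loop such as the following; the model, data loaders, threshold value and optimizer settings are placeholders and not prescribed by this embodiment.

```python
# Generic training loop with a cross-entropy loss and a loss-value threshold.
# model, train_loader and test_loader are assumed to yield (inputs, labels) batches
# where the model outputs class logits of shape (N, num_classes) and labels have shape (N,).
import torch
import torch.nn as nn

def evaluate(model, loader, criterion):
    # Average test-set loss (the "first loss value" compared against the threshold).
    model.eval()
    with torch.no_grad():
        losses = [criterion(model(x), y).item() for x, y in loader]
    model.train()
    return sum(losses) / len(losses)

def train_until_threshold(model, train_loader, test_loader,
                          eval_every=1000, loss_threshold=0.05, max_steps=100_000):
    criterion = nn.CrossEntropyLoss()                          # cross-entropy loss, as in the text
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)  # placeholder optimizer settings
    step = 0
    while step < max_steps:
        for x, y in train_loader:
            loss = criterion(model(x), y)
            optimizer.zero_grad()
            loss.backward()                                    # adjust the network parameters
            optimizer.step()
            step += 1
            if step % eval_every == 0:                         # preset number of iterations reached
                if evaluate(model, test_loader, criterion) <= loss_threshold:
                    return model                               # preset condition met: stop training
            if step >= max_steps:
                break
    return model
```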
In the above implementation, the named entity recognition model is obtained by training on the training set with a machine learning algorithm, and the entities in the text to be analyzed are then identified through this model. The entities can thus be identified accurately and quickly, which facilitates the subsequent sentiment analysis on these entities and yields accurate sentiment analysis results.
Optionally, in some possible implementations of the present application, the above S104 may include S1041 to S1044, which are described in detail as follows:
S1041: Obtain an entity label group, where the entity label group includes the labels corresponding to the entities to be compared.
In this embodiment, the at least two entities corresponding to the text to be analyzed include one group of entities to be compared. Exemplarily, when the text to be analyzed corresponds to two entities, those two entities can be compared, which can be understood as meaning that they are entities of different subjects. When the text to be analyzed corresponds to more than two entities, at least one group of them are entities that can be compared.
The entity label group refers to the labels corresponding to the two entities to be compared. For example, if the text to be analyzed is "Company A's market value exceeds Company B's", the corresponding entities are "Company A" and "Company B", which form a group of entities to be compared; the entity label group consists of the entity label corresponding to "Company A" and the entity label corresponding to "Company B".
When the entities in the text to be analyzed are identified through the named entity recognition model, they are marked with "bio" labels, from which the position of each entity in the text can be determined. An entity label is set for each entity in the order in which the entities were determined, and the entity labels corresponding to the two entities to be compared are extracted.
S1042: Obtain the attribute label corresponding to the attribute information.
When the attribute information in the text to be analyzed is extracted through the attribute extraction model, it is marked with "BIO" labels, from which the position of each piece of attribute information in the text can be determined. An attribute label is set for each piece of attribute information.
For example, if the text to be analyzed is "Company A's market value exceeds Company B's", the corresponding attribute information is "market value" (市值), and the attribute label "<asp></asp>" is set for "市值". The description here is only illustrative and not limiting.
S1043: Add the entity label group and the attribute label to the text to be analyzed to obtain a second target text to be analyzed.
According to the positions of the two entities to be compared in the text to be analyzed and their respective entity labels, the entity labels corresponding to the two entities are added to the text to be analyzed; at the same time, the attribute information together with its attribute label is added to the beginning of the text to be analyzed, yielding the second target text to be analyzed.
For example, adding "<s></s>", "<o></o>" and "<asp>市值</asp>" to the text to be analyzed yields "<asp>市值</asp><s>A公司</s>市值超过<o>B公司</o>".
Optionally, the attribute information and its attribute label may instead be added to the end of the text to be analyzed, yielding "<s>A公司</s>市值超过<o>B公司</o><asp>市值</asp>". The description here is only illustrative and not limiting.
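Constructing the second target text to be analyzed can be as simple as inserting the tags around the recorded entity and attribute positions; the sketch below reproduces the example above and shows only one possible way of inserting the tags.

```python
# Build the tagged input ("second target text to be analyzed") from the raw sentence,
# the two entities to be compared and the extracted attribute information.
def build_tagged_text(text, subject, obj, aspect, aspect_first=True):
    tagged = text.replace(subject, f"<s>{subject}</s>", 1)   # label the first compared entity
    tagged = tagged.replace(obj, f"<o>{obj}</o>", 1)          # label the second compared entity
    aspect_tag = f"<asp>{aspect}</asp>"                       # attribute label around the attribute
    return aspect_tag + tagged if aspect_first else tagged + aspect_tag

print(build_tagged_text("A公司市值超过B公司", "A公司", "B公司", "市值"))
# -> "<asp>市值</asp><s>A公司</s>市值超过<o>B公司</o>"
```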
S1044: Analyze the second target text to be analyzed through the sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities.
Exemplarily, mapping processing is performed on the second target text to be analyzed to obtain the semantic vector corresponding to it, and this semantic vector is then classified, that is, the sentiment tendency to which the semantic vector belongs is determined.
In the above implementation, the second target text to be analyzed is analyzed through the sentiment analysis model. Because the second target text contains the attribute label corresponding to the attribute information and the entity labels corresponding to the two entities to be compared, attribute factors are taken into account during the analysis, the extracted analysis points are comprehensive and accurate, and the entity comparison results obtained by the analysis are more accurate.
Optionally, in some possible implementations of the present application, the above S1044 may include S10441 to S10444, which are described in detail as follows:
S10441: Perform word segmentation on the second target text to be analyzed to obtain multiple third word segments.
For the specific process of segmenting the second target text to be analyzed into multiple third word segments, reference may be made to the word segmentation process in S103, which is not repeated here.
S10442: Perform mapping processing on each third word segment through the sentiment analysis model to obtain the word vector corresponding to each third word segment.
Exemplarily, the multiple third word segments are input into the Bert network of the sentiment analysis model for processing; the Bert network maps each segment to a common semantic space and outputs the word vector corresponding to each third word segment.
S10443: Based on the order in which the second target text to be analyzed was segmented, combine the word vectors corresponding to the third word segments to obtain a target word vector set.
Exemplarily, a long short-term memory network (Long Short-Term Memory, LSTM) may be used to process the word vector corresponding to each third word segment; this network combines the word vectors in the order in which the second target text was segmented and outputs the target word vector set.
S10444: Analyze the target word vector set to obtain the sentiment analysis result.
The target word vector set is input into the Dense network of the sentiment analysis model for processing; the Dense network determines the probability that the target word vector set belongs to each sentiment tendency and outputs the tendency with the highest probability, that is, the sentiment analysis result. For example, the final sentiment analysis result corresponding to the text to be analyzed may be: Company A's market value is better than Company B's, Company A is at an advantage, Company B's market value is worse than Company A's, Company B is at a disadvantage, and so on. The description here is only illustrative and not limiting.
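Steps S10441 to S10444 can be pictured as a Bert encoder followed by an LSTM and a Dense classification layer. The following PyTorch sketch is one possible realization under those assumptions; the set of sentiment tendencies, the pooling choice and the model names are placeholders rather than the prescribed implementation.

```python
# Sketch of the sentiment analysis model: Bert word vectors -> LSTM -> Dense softmax.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

CLASSES = ["subject better", "subject worse", "no comparison"]   # placeholder label set

class ComparativeSentimentModel(nn.Module):
    def __init__(self, bert_name="bert-base-chinese", hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)             # word vector per token (S10442)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)    # combine vectors in order (S10443)
        self.dense = nn.Linear(2 * hidden, len(CLASSES))             # per-tendency scores (S10444)

    def forward(self, input_ids, attention_mask):
        vectors = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        seq, _ = self.lstm(vectors)
        logits = self.dense(seq[:, -1, :])        # final LSTM step stands in for the target vector set
        return logits.softmax(dim=-1)             # probability of each sentiment tendency

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
model = ComparativeSentimentModel()
enc = tokenizer("<asp>市值</asp><s>A公司</s>市值超过<o>B公司</o>", return_tensors="pt")
probs = model(enc["input_ids"], enc["attention_mask"])
print(CLASSES[int(probs.argmax())])               # the tendency with the highest probability
```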
In the above implementation, the second target text to be analyzed is analyzed through the sentiment analysis model. Because the second target text contains the attribute label corresponding to the attribute information and the entity labels corresponding to the two entities to be compared, attribute factors are taken into account during the analysis, the extracted analysis points are comprehensive and accurate, and the entity comparison results obtained by the analysis are more accurate.
Fig. 3 is a schematic flowchart of a method for analyzing text provided by another embodiment of the present application. Exemplarily, in some possible implementations of the present application, the method for analyzing text shown in Fig. 3 may include S201 to S206, which are described in detail as follows:
S201: Acquire text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities.
S202: Identify at least two entities in the text to be analyzed.
For S201 to S202 in this example, reference may be made to the description of S101 to S102 in the embodiment corresponding to Fig. 1, which is not repeated here.
S203: Obtain the entity label corresponding to each entity.
When the entities in the text to be analyzed are identified through the named entity recognition model, they are marked with "bio" labels, from which the position of each entity in the text can be determined. An entity label is set for each entity in the order in which the entities were determined.
For example, if the text to be analyzed is "Company A's market value exceeds Company B's", the corresponding entities are "Company A" and "Company B"; the entity label "<s></s>" is set for "Company A" and the entity label "<o></o>" is set for "Company B". The description here is only illustrative and not limiting.
S204: Add the entity label corresponding to each entity to the text to be analyzed to obtain a first target text to be analyzed.
According to the position of each entity in the text to be analyzed and the entity label corresponding to each entity, the entity label corresponding to each entity is added to the text to be analyzed, yielding the first target text to be analyzed. For example, adding "<s></s>" and "<o></o>" to the text to be analyzed yields the first target text to be analyzed, namely "<s>A公司</s>市值超过<o>B公司</o>". The description here is only illustrative and not limiting.
S205: Extract attribute information from the first target text to be analyzed through the pre-trained attribute extraction model.
For the specific process of extracting the attribute information from the first target text to be analyzed through the attribute extraction model, reference may be made to the process in S103 of extracting attribute information from the text to be analyzed. It is worth noting that, because entity labels have been added to the entities in this embodiment, the word segments carrying entity labels can be skipped when the attribute extraction model extracts attribute information from the first target text to be analyzed, and only the other word segments need to be processed; with the interference of the entities removed, the accuracy and speed of attribute extraction are improved.
S206: Analyze the at least two entities, the attribute information and the text to be analyzed through the pre-trained sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities.
For S206 in this example, reference may be made to the description of S104 in the embodiment corresponding to Fig. 1, which is not repeated here.
In the above embodiment, entity labels are added to the entities; when the attribute extraction model extracts attribute information from the first target text to be analyzed, the word segments carrying entity labels can be skipped and only the other word segments are processed. With the interference of the entities removed, the accuracy and speed of attribute extraction are improved.
Fig. 4 is a detailed flowchart of step S204 of the method for analyzing text shown in an exemplary embodiment of the present application. In some possible implementations of the present application, the above S204 may include S2041 to S2043, which are described in detail as follows:
S2041: Perform word segmentation on the text to be analyzed to obtain multiple second word segments.
For the specific process of segmenting the text to be analyzed into multiple second word segments, reference may be made to the word segmentation process in S103, which is not repeated here.
S2042: Perform mapping processing on each second word segment through the attribute extraction model to obtain the word vector corresponding to each second word segment.
Exemplarily, the multiple second word segments are input into the Bert network of the attribute extraction model for processing; the Bert network maps each segment to a common semantic space and outputs the word vector corresponding to each second word segment.
S2043: Add the entity label corresponding to each entity to each word vector to obtain the first target text to be analyzed.
The entity label corresponding to each entity is added to the word vector corresponding to each second word segment, yielding the first target text to be analyzed. For example, the "<s></s>" and "<o></o>" entity labels are added to the word vectors corresponding to the second word segments to obtain the first target text to be analyzed. The description here is only illustrative and not limiting.
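Because this step attaches the entity labels at the word-vector level rather than to the raw characters, one plausible realization is to concatenate a small tag embedding to each word vector. The sketch below illustrates that idea; it is an assumption rather than the mechanism prescribed by this embodiment.

```python
# One possible way to attach entity labels at the word-vector level:
# concatenate a learned tag embedding ("<s>", "<o>" or none) to each token's word vector.
import torch
import torch.nn as nn

TAG_IDS = {"none": 0, "<s>": 1, "<o>": 2}

class EntityTagFusion(nn.Module):
    def __init__(self, tag_dim=16):
        super().__init__()
        self.tag_embedding = nn.Embedding(len(TAG_IDS), tag_dim)

    def forward(self, word_vectors, tag_ids):
        # word_vectors: (batch, seq_len, word_dim); tag_ids: (batch, seq_len)
        tags = self.tag_embedding(tag_ids)
        return torch.cat([word_vectors, tags], dim=-1)   # labeled "first target text" representation

fusion = EntityTagFusion()
vectors = torch.randn(1, 4, 768)                          # e.g. vectors for A公司 / 市值 / 超过 / B公司
tags = torch.tensor([[TAG_IDS["<s>"], 0, 0, TAG_IDS["<o>"]]])
print(fusion(vectors, tags).shape)                        # torch.Size([1, 4, 784])
```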
In this embodiment, the entity label corresponding to each entity is added to each word vector, which strengthens the connection between each word vector and the entities, so that the attribute information extracted from the text to be analyzed by the attribute extraction model is highly relevant to the entities; this also improves the accuracy of attribute extraction.
Fig. 5 is a schematic flowchart of a method for analyzing text shown in an exemplary embodiment of the present application, mainly involving the process of obtaining the attribute extraction model before the method for analyzing text shown in Fig. 1 is executed. The method includes S301 to S303, which are described in detail as follows:
S301: Obtain a sample training set, where the sample training set includes multiple sample texts and the attribute label corresponding to each sample text.
Exemplarily, the sample training set may come from data publicly available on the network. Multiple sample texts are collected, and an attribute label is set for the attribute information in each sample text. It is worth noting that the sample texts here may be the same as or different from the sample comment sentences used when training the named entity recognition model; this is not limited.
Optionally, part of the data in the sample training set may also be used as a sample test set to facilitate subsequent testing of the attribute extraction model during training. For example, several sample texts and their corresponding attribute labels are selected from the sample training set as the sample test set.
S302: Train an initial attribute extraction network based on the sample training set, and update the parameters of the initial attribute extraction network based on the training results.
Exemplarily, each sample text in the sample training set is processed through the initial attribute extraction network (the attribute extraction model before training) to obtain the attribute information corresponding to each sample text. For the specific process by which the initial attribute extraction network processes a sample text, reference may be made to S103 above, which is not repeated here.
When a preset number of training iterations is reached, the current initial attribute extraction network is tested. Exemplarily, a sample text in the sample test set is input into the current initial attribute extraction network for processing, and the network outputs the actual attribute information corresponding to that sample text. A second loss value between the actual attribute information corresponding to the sample text and the attribute information corresponding to that sample text in the sample test set is calculated based on a loss function, which may be a cross-entropy loss function.
When the second loss value does not satisfy a second preset condition, the parameters of the initial attribute extraction network are adjusted (for example, the weight values corresponding to the network layers of the initial attribute extraction network are adjusted) and training continues. When the second loss value satisfies the second preset condition, training of the initial attribute extraction network is stopped, and the trained network is used as the trained attribute extraction model.
For example, assume the second preset condition is that the loss value is less than or equal to a preset loss threshold: when the second loss value is greater than the threshold, the parameters of the initial attribute extraction network are adjusted and training continues; when the second loss value is less than or equal to the threshold, training stops and the trained network is used as the trained attribute extraction model. The description here is only illustrative and not limiting.
S303: When it is detected that the loss function corresponding to the initial attribute extraction network has converged, obtain the attribute extraction model.
Exemplarily, during training of the initial attribute extraction network, the convergence of its loss function may instead be observed. When the loss function has not converged, the parameters of the initial attribute extraction network are adjusted and training continues based on the sample training set. When the loss function converges, training stops and the trained network is used as the trained attribute extraction model. Convergence of the loss function means that its value tends to be stable. The description here is only illustrative and not limiting.
可选地,本申请提供的分析文本的方法还可包括训练情感分析模型。该情感分析模型是通过使用机器学习算法对训练集进行训练得到。示例性地,预先采集多个包含情感倾向的样本情感分析句,设置每个样本情感分析句对应的样本情感分析结果。基于这些样本情感分析句以及样本情感分析句对应的样本情感分析结果构成训练集。Optionally, the method for analyzing text provided in this application may further include training a sentiment analysis model. The sentiment analysis model is obtained by training the training set using a machine learning algorithm. Exemplarily, a plurality of sample sentiment analysis sentences containing emotional tendencies are collected in advance, and a sample sentiment analysis result corresponding to each sample sentiment analysis sentence is set. A training set is formed based on these sample sentiment analysis sentences and sample sentiment analysis results corresponding to the sample sentiment analysis sentences.
可选地,还可将训练集中的一部分数据作为测试集,便于后续对情感分析模型进行测试。例如,在训练集中选取若干个样本情感分析句,以及这些样本情感分析句各自对应的样本情感分析结果作为测试集。Optionally, a part of the data in the training set can also be used as a test set to facilitate subsequent testing of the sentiment analysis model. For example, several sample sentiment analysis sentences are selected in the training set, and the sample sentiment analysis results corresponding to these sample sentiment analysis sentences are used as the test set.
示例性地,通过初始情感分析网络(训练前的情感分析模型)对训练集中的每个样本情感分析句进行处理,得到每个样本情感分析句对应的实际情感分析结果。初始情感分析网络对样本情感分析句进行处理的具体过程,可参考上述S104中的具体过程,此处不再赘述。Exemplarily, each sample sentiment analysis sentence in the training set is processed by an initial sentiment analysis network (sentiment analysis model before training), to obtain an actual sentiment analysis result corresponding to each sample sentiment analysis sentence. For the specific process of processing the sample sentiment analysis sentence by the initial sentiment analysis network, refer to the specific process in S104 above, which will not be repeated here.
在达到预设的训练次数时,对此时的初始情感分析网络进行测试。示例性地,将测试集中的样本情感分析句输入此时的初始情感分析网络中进行处理,此时的初始情感分析网络输出该样本情感分析句对应的实际情感分析结果。基于损失函数计算该样本情感分析句对应的实际情感分析结果与测试集中该样本情感分析句对应的样本情感分析结果之间的第三损失值。其中,损失函数可以为交叉熵损失函数。When the preset number of training times is reached, the initial sentiment analysis network at this time is tested. Exemplarily, the sample sentiment analysis sentence in the test set is input into the current initial sentiment analysis network for processing, and the current initial sentiment analysis network outputs the actual sentiment analysis result corresponding to the sample sentiment analysis sentence. A third loss value between the actual sentiment analysis result corresponding to the sample sentiment analysis sentence and the sample sentiment analysis result corresponding to the sample sentiment analysis sentence in the test set is calculated based on the loss function. Wherein, the loss function may be a cross-entropy loss function.
When the third loss value does not satisfy a third preset condition, the parameters of the initial sentiment analysis network are adjusted (for example, the weight values corresponding to the network layers of the initial sentiment analysis network are adjusted), and training of the network continues. When the third loss value satisfies the third preset condition, training stops, and the trained network is used as the trained sentiment analysis model. For example, assume the third preset condition is that the loss value is less than or equal to a preset loss value threshold. Then, when the third loss value is greater than the threshold, the parameters are adjusted and training continues; when the third loss value is less than or equal to the threshold, training stops and the trained network is used as the trained sentiment analysis model. This is only an illustrative example and is not limiting.
Optionally, during training of the initial sentiment analysis network, the convergence of the loss function corresponding to the network may be observed instead. When the loss function has not converged, the parameters of the network are adjusted and training continues on the training set. When the loss function converges, training stops and the trained network is used as the trained sentiment analysis model. Here, convergence of the loss function means that its value tends to be stable. This is only an illustrative example and is not limiting.
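For illustration only, the following is a minimal sketch of the threshold-based training flow described above, assuming a PyTorch classifier; the model object, the two data loaders, and the loss threshold are hypothetical placeholders rather than part of the disclosed method.

```python
# A minimal sketch (not the claimed method) of the threshold-based training flow,
# assuming PyTorch; the model object and the two data loaders are placeholders.
import torch
import torch.nn as nn


@torch.no_grad()
def evaluate(model, test_loader, criterion):
    """Return the average loss on the test set (the 'third loss value')."""
    model.eval()
    total, count = 0.0, 0
    for sentences, labels in test_loader:
        total += criterion(model(sentences), labels).item() * len(labels)
        count += len(labels)
    model.train()
    return total / max(count, 1)


def train_sentiment_model(model, train_loader, test_loader,
                          eval_every=1000, loss_threshold=0.05,
                          max_steps=100_000, lr=1e-4):
    criterion = nn.CrossEntropyLoss()                 # cross-entropy loss mentioned above
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    step = 0
    while step < max_steps:
        for sentences, labels in train_loader:
            optimizer.zero_grad()
            logits = model(sentences)                 # actual sentiment analysis result
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()                          # adjust layer weights ("parameters")
            step += 1
            if step % eval_every == 0:                # preset number of training iterations
                third_loss = evaluate(model, test_loader, criterion)
                if third_loss <= loss_threshold:      # third preset condition satisfied
                    return model                      # trained sentiment analysis model
            if step >= max_steps:
                break
    return model
```

The convergence-based variant described above could reuse the same loop by checking whether the evaluated loss has stopped changing between checks instead of comparing it against a fixed threshold.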
Optionally, in a possible implementation, the named entity recognition model, the attribute extraction model, and the sentiment analysis model are trained simultaneously. In this case, the training sample sets used by the three models may be similar; for example, they may all consist of sample analysis texts, with each model using different labels for the same sample analysis texts. For the specific training process, refer to the process of training each model individually described above. It is worth noting that when the three models are trained jointly, the loss values of the three models may be combined by weighted summation and the weighted loss value compared against a fourth preset condition. If the fourth preset condition is not satisfied, the parameters of the three models are adjusted and training of the three models continues; if the weighted loss value satisfies the fourth preset condition, training of the three models stops and the three trained models are obtained.
Assume the fourth preset condition is that the loss value is less than or equal to a preset loss value threshold. Then, when the weighted loss value is greater than the threshold, the parameters of the three models are adjusted and training continues; when the weighted loss value is less than or equal to the threshold, training stops and the three trained models are obtained. This is only an illustrative example and is not limiting.
In the above implementation, training the three models simultaneously improves how well the three models fit together when processing data, and the three models supervise one another, so that in actual use the entity comparison results obtained from the analysis are more accurate.
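As an informal sketch of the joint training step described above, the weighted combination of the three losses might look as follows; the per-model loss() interfaces, the weight values, and an optimizer covering all three models' parameters are assumptions made for illustration only.

```python
# An informal sketch of one joint training step; the per-model loss() interfaces,
# the loss weights, and an optimizer covering all three models' parameters are
# assumptions made for illustration, not the claimed training procedure.
def joint_training_step(ner_model, attr_model, senti_model, batch, optimizer,
                        w_ner=1.0, w_attr=1.0, w_senti=1.0, loss_threshold=0.1):
    ner_loss = ner_model.loss(batch)          # named entity recognition loss
    attr_loss = attr_model.loss(batch)        # attribute extraction loss
    senti_loss = senti_model.loss(batch)      # sentiment analysis loss
    weighted = w_ner * ner_loss + w_attr * attr_loss + w_senti * senti_loss
    if weighted.item() <= loss_threshold:     # fourth preset condition satisfied
        return weighted.item(), True          # stop: the three trained models are obtained
    optimizer.zero_grad()
    weighted.backward()                       # adjust the parameters of all three models
    optimizer.step()
    return weighted.item(), False             # keep training
```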
Please refer to FIG. 6, which is a schematic diagram of an apparatus for analyzing text provided by an embodiment of the present application. The units included in the apparatus are used to execute the steps in the embodiments corresponding to FIG. 1 to FIG. 5; for details, refer to the related descriptions in those embodiments. For ease of description, only the parts related to this embodiment are shown. Referring to FIG. 6, the apparatus includes:
an acquisition unit 410, configured to acquire text to be analyzed;
an identification unit 420, configured to identify at least two entities in the text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities;
an extraction unit 430, configured to extract attribute information in the text to be analyzed through a pre-trained attribute extraction model; and
an analysis unit 440, configured to analyze the at least two entities, the attribute information, and the text to be analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
Optionally, the identification unit 420 is specifically configured to:
perform word segmentation processing on the text to be analyzed to obtain a plurality of first segmented words; and
process the plurality of first segmented words based on a pre-trained named entity recognition model to obtain the at least two entities in the text to be analyzed.
Optionally, the apparatus further includes:
a label acquisition unit, configured to acquire an entity label corresponding to each entity; and
an adding unit, configured to add the entity label corresponding to each entity to the text to be analyzed to obtain a first target text to be analyzed.
The extraction unit 430 is specifically configured to:
extract the attribute information in the first target text to be analyzed through the pre-trained attribute extraction model.
Optionally, the adding unit is specifically configured to:
perform word segmentation processing on the text to be analyzed to obtain a plurality of second segmented words;
perform mapping processing on each second segmented word through the attribute extraction model to obtain a word vector corresponding to each second segmented word; and
add the entity label corresponding to each entity to each word vector to obtain the first target text to be analyzed.
Optionally, the at least two entities include a group of entities to be compared, and the analysis unit 440 is specifically configured to:
acquire an entity label group, where the entity label group includes the labels corresponding to the entities to be compared;
acquire an attribute label corresponding to the attribute information;
add the entity label group and the attribute label to the text to be analyzed to obtain a second target text to be analyzed; and
analyze the second target text to be analyzed through the sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities.
Optionally, the analysis unit 440 is further configured to:
perform word segmentation processing on the second target text to be analyzed to obtain a plurality of third segmented words;
perform mapping processing on each third segmented word through the sentiment analysis model to obtain a word vector corresponding to each third segmented word;
combine the word vectors corresponding to the third segmented words, based on the order in which the second target text to be analyzed was segmented, to obtain a target word vector set; and
analyze the target word vector set to obtain the sentiment analysis results.
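A hedged illustration of this analysis flow is sketched below; the tag format and the segment, embed, and classify callables are placeholders and do not prescribe the actual tag scheme or models.

```python
# A hedged illustration of the flow above; the tag format and the segment, embed,
# and classify callables are placeholders, not the actual models or tag scheme.
def analyze_comparison(text, entity_tags, attribute_tag, segment, embed, classify):
    # Add the entity tag group and the attribute tag to get the second target text.
    second_target_text = " ".join(entity_tags) + " " + attribute_tag + " " + text
    # Segment the second target text into "third segmented words".
    third_words = segment(second_target_text)
    # Map each segmented word to a word vector, preserving segmentation order.
    target_word_vectors = [embed(word) for word in third_words]
    # Analyze the ordered target word vector set to get the sentiment analysis result.
    return classify(target_word_vectors)
```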
Optionally, the apparatus further includes a training unit, specifically configured to:
acquire a sample training set, where the sample training set includes a plurality of sample texts and an attribute label corresponding to each sample text;
train an initial attribute extraction network based on the sample training set, and update the parameters of the initial attribute extraction network based on the training results; and
obtain the attribute extraction model when it is detected that the loss function corresponding to the initial attribute extraction network has converged.
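For orientation only, the units of FIG. 6 could be composed in code roughly as follows; the class name and the methods assumed on the three pre-trained models are hypothetical placeholders, not the claimed apparatus.

```python
# An orientation-only sketch of how the FIG. 6 units could be composed; the class
# name and the methods assumed on the three pre-trained models are hypothetical.
class TextAnalysisApparatus:
    def __init__(self, ner_model, attr_model, senti_model):
        self.ner_model = ner_model      # pre-trained named entity recognition model
        self.attr_model = attr_model    # pre-trained attribute extraction model
        self.senti_model = senti_model  # pre-trained sentiment analysis model

    def acquire(self, raw_text):                    # acquisition unit 410
        return raw_text.strip()

    def identify(self, text):                       # identification unit 420
        first_words = self.ner_model.segment(text)
        return self.ner_model.recognize(first_words)        # at least two entities

    def extract(self, text, entities):              # extraction unit 430
        first_target_text = self.attr_model.add_entity_labels(text, entities)
        return self.attr_model.extract(first_target_text)   # attribute information

    def analyze(self, text, entities, attributes):  # analysis unit 440
        return self.senti_model.predict(text, entities, attributes)

    def run(self, raw_text):
        text = self.acquire(raw_text)
        entities = self.identify(text)
        attributes = self.extract(text, entities)
        return self.analyze(text, entities, attributes)
```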
Please refer to FIG. 7, which is a schematic diagram of a device for analyzing text provided by another embodiment of the present application. As shown in FIG. 7, the device 5 for analyzing text of this embodiment includes: a processor 50, a memory 51, and computer instructions 52 stored in the memory 51 and executable on the processor 50. When the processor 50 executes the computer instructions 52, the steps in the above embodiments of the method for analyzing text are implemented, for example S101 to S104 shown in FIG. 1. Alternatively, when the processor 50 executes the computer instructions 52, the functions of the units in the above embodiments are implemented, for example the functions of units 410 to 440 shown in FIG. 6.
Exemplarily, the computer instructions 52 may be divided into one or more units, and the one or more units are stored in the memory 51 and executed by the processor 50 to complete the present application. The one or more units may be a series of computer instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer instructions 52 in the device 5 for analyzing text. For example, the computer instructions 52 may be divided into an acquisition unit, an identification unit, an extraction unit, and an analysis unit, with the specific functions of each unit as described above.
The device for analyzing text may include, but is not limited to, the processor 50 and the memory 51. Those skilled in the art can understand that FIG. 7 is only an example of the device 5 for analyzing text and does not constitute a limitation on the device; the device may include more or fewer components than shown in the figure, combine certain components, or use different components. For example, the device for analyzing text may also include input/output devices, network access devices, buses, and so on.
The processor 50 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 51 may be an internal storage unit of the device for analyzing text, such as a hard disk or memory of the device. The memory 51 may also be an external storage terminal of the device for analyzing text, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the device. Further, the memory 51 may include both an internal storage unit of the device for analyzing text and an external storage terminal. The memory 51 is used to store the computer instructions and other programs and data required by the terminal, and may also be used to temporarily store data that has been output or will be output.
An embodiment of the present application further provides a computer storage medium, which may be non-volatile or volatile. The computer storage medium stores a computer program, and when the computer program is executed by a processor, the steps in the above embodiments of the method for analyzing text are implemented.
The present application further provides a computer program product. When the computer program product runs on the device, the device is caused to execute the steps in the above embodiments of the method for analyzing text.
An embodiment of the present application further provides a chip or integrated circuit, which includes a processor configured to call and run a computer program from a memory, so that a device equipped with the chip or integrated circuit executes the steps in the above embodiments of the method for analyzing text.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements for some of the technical features; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.

Claims (20)

  1. A method for analyzing text, comprising:
    acquiring text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities;
    identifying the at least two entities in the text to be analyzed;
    extracting attribute information in the text to be analyzed through a pre-trained attribute extraction model; and
    analyzing the at least two entities, the attribute information, and the text to be analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  2. The method according to claim 1, wherein identifying the at least two entities in the text to be analyzed comprises:
    performing word segmentation processing on the text to be analyzed to obtain a plurality of first segmented words; and
    processing the plurality of first segmented words based on a pre-trained named entity recognition model to obtain the at least two entities in the text to be analyzed.
  3. The method according to claim 1, wherein before extracting the attribute information in the text to be analyzed through the pre-trained attribute extraction model, the method further comprises:
    acquiring an entity label corresponding to each entity;
    adding the entity label corresponding to each entity to the text to be analyzed to obtain a first target text to be analyzed;
    wherein extracting the attribute information in the text to be analyzed through the pre-trained attribute extraction model comprises:
    extracting the attribute information in the first target text to be analyzed through the pre-trained attribute extraction model.
  4. The method according to claim 3, wherein adding the entity label corresponding to each entity to the text to be analyzed to obtain the first target text to be analyzed comprises:
    performing word segmentation processing on the text to be analyzed to obtain a plurality of second segmented words;
    performing mapping processing on each second segmented word through the attribute extraction model to obtain a word vector corresponding to each second segmented word; and
    adding the entity label corresponding to each entity to each word vector to obtain the first target text to be analyzed.
  5. The method according to claim 1, wherein the at least two entities include a group of entities to be compared, and analyzing the at least two entities, the attribute information, and the text to be analyzed through the pre-trained sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities comprises:
    acquiring an entity label group, where the entity label group includes labels corresponding to the entities to be compared;
    acquiring an attribute label corresponding to the attribute information;
    adding the entity label group and the attribute label to the text to be analyzed to obtain a second target text to be analyzed; and
    analyzing the second target text to be analyzed through the sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities.
  6. The method according to claim 5, wherein analyzing the second target text to be analyzed through the sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities comprises:
    performing word segmentation processing on the second target text to be analyzed to obtain a plurality of third segmented words;
    performing mapping processing on each third segmented word through the sentiment analysis model to obtain a word vector corresponding to each third segmented word;
    combining the word vectors corresponding to the third segmented words, based on the order in which the second target text to be analyzed was segmented, to obtain a target word vector set; and
    analyzing the target word vector set to obtain the sentiment analysis results.
  7. The method according to any one of claims 1 to 6, wherein before identifying the at least two entities in the text to be analyzed, the method further comprises:
    acquiring a sample training set, where the sample training set includes a plurality of sample texts and an attribute label corresponding to each sample text;
    training an initial attribute extraction network based on the sample training set, and updating parameters of the initial attribute extraction network based on training results; and
    obtaining the attribute extraction model when it is detected that a loss function corresponding to the initial attribute extraction network has converged.
  8. An apparatus for analyzing text, comprising:
    an acquisition unit, configured to acquire text to be analyzed;
    an identification unit, configured to identify at least two entities in the text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities;
    an extraction unit, configured to extract attribute information in the text to be analyzed through a pre-trained attribute extraction model; and
    an analysis unit, configured to analyze the at least two entities, the attribute information, and the text to be analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  9. A device for analyzing text, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the following is implemented:
    acquiring text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities;
    identifying the at least two entities in the text to be analyzed;
    extracting attribute information in the text to be analyzed through a pre-trained attribute extraction model; and
    analyzing the at least two entities, the attribute information, and the text to be analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  10. The device according to claim 9, wherein identifying the at least two entities in the text to be analyzed comprises:
    performing word segmentation processing on the text to be analyzed to obtain a plurality of first segmented words; and
    processing the plurality of first segmented words based on a pre-trained named entity recognition model to obtain the at least two entities in the text to be analyzed.
  11. The device according to claim 9, wherein before extracting the attribute information in the text to be analyzed through the pre-trained attribute extraction model, the method further comprises:
    acquiring an entity label corresponding to each entity;
    adding the entity label corresponding to each entity to the text to be analyzed to obtain a first target text to be analyzed;
    wherein extracting the attribute information in the text to be analyzed through the pre-trained attribute extraction model comprises:
    extracting the attribute information in the first target text to be analyzed through the pre-trained attribute extraction model.
  12. The device according to claim 11, wherein adding the entity label corresponding to each entity to the text to be analyzed to obtain the first target text to be analyzed comprises:
    performing word segmentation processing on the text to be analyzed to obtain a plurality of second segmented words;
    performing mapping processing on each second segmented word through the attribute extraction model to obtain a word vector corresponding to each second segmented word; and
    adding the entity label corresponding to each entity to each word vector to obtain the first target text to be analyzed.
  13. The device according to claim 9, wherein the at least two entities include a group of entities to be compared, and analyzing the at least two entities, the attribute information, and the text to be analyzed through the pre-trained sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities comprises:
    acquiring an entity label group, where the entity label group includes labels corresponding to the entities to be compared;
    acquiring an attribute label corresponding to the attribute information;
    adding the entity label group and the attribute label to the text to be analyzed to obtain a second target text to be analyzed; and
    analyzing the second target text to be analyzed through the sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities.
  14. The device according to claim 13, wherein analyzing the second target text to be analyzed through the sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities comprises:
    performing word segmentation processing on the second target text to be analyzed to obtain a plurality of third segmented words;
    performing mapping processing on each third segmented word through the sentiment analysis model to obtain a word vector corresponding to each third segmented word;
    combining the word vectors corresponding to the third segmented words, based on the order in which the second target text to be analyzed was segmented, to obtain a target word vector set; and
    analyzing the target word vector set to obtain the sentiment analysis results.
  15. The device according to any one of claims 9 to 14, wherein before identifying the at least two entities in the text to be analyzed, the method further comprises:
    acquiring a sample training set, where the sample training set includes a plurality of sample texts and an attribute label corresponding to each sample text;
    training an initial attribute extraction network based on the sample training set, and updating parameters of the initial attribute extraction network based on training results; and
    obtaining the attribute extraction model when it is detected that a loss function corresponding to the initial attribute extraction network has converged.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the following is implemented:
    acquiring text to be analyzed, where the text to be analyzed includes a comment sentence containing at least two entities;
    identifying the at least two entities in the text to be analyzed;
    extracting attribute information in the text to be analyzed through a pre-trained attribute extraction model; and
    analyzing the at least two entities, the attribute information, and the text to be analyzed through a pre-trained sentiment analysis model to obtain sentiment analysis results corresponding to the at least two entities.
  17. The computer-readable storage medium according to claim 16, wherein identifying the at least two entities in the text to be analyzed comprises:
    performing word segmentation processing on the text to be analyzed to obtain a plurality of first segmented words; and
    processing the plurality of first segmented words based on a pre-trained named entity recognition model to obtain the at least two entities in the text to be analyzed.
  18. The computer-readable storage medium according to claim 16, wherein before extracting the attribute information in the text to be analyzed through the pre-trained attribute extraction model, the method further comprises:
    acquiring an entity label corresponding to each entity;
    adding the entity label corresponding to each entity to the text to be analyzed to obtain a first target text to be analyzed;
    wherein extracting the attribute information in the text to be analyzed through the pre-trained attribute extraction model comprises:
    extracting the attribute information in the first target text to be analyzed through the pre-trained attribute extraction model.
  19. The computer-readable storage medium according to claim 18, wherein adding the entity label corresponding to each entity to the text to be analyzed to obtain the first target text to be analyzed comprises:
    performing word segmentation processing on the text to be analyzed to obtain a plurality of second segmented words;
    performing mapping processing on each second segmented word through the attribute extraction model to obtain a word vector corresponding to each second segmented word; and
    adding the entity label corresponding to each entity to each word vector to obtain the first target text to be analyzed.
  20. The computer-readable storage medium according to claim 16, wherein the at least two entities include a group of entities to be compared, and analyzing the at least two entities, the attribute information, and the text to be analyzed through the pre-trained sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities comprises:
    acquiring an entity label group, where the entity label group includes labels corresponding to the entities to be compared;
    acquiring an attribute label corresponding to the attribute information;
    adding the entity label group and the attribute label to the text to be analyzed to obtain a second target text to be analyzed; and
    analyzing the second target text to be analyzed through the sentiment analysis model to obtain the sentiment analysis results corresponding to the at least two entities.
PCT/CN2022/071433 2021-06-24 2022-01-11 Method and apparatus for analyzing text, device and storage medium WO2022267454A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110705319.4A CN113420122B (en) 2021-06-24 Method, device, equipment and storage medium for analyzing text
CN202110705319.4 2021-06-24

Publications (1)

Publication Number Publication Date
WO2022267454A1 true WO2022267454A1 (en) 2022-12-29

Family

ID=77717595

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071433 WO2022267454A1 (en) 2021-06-24 2022-01-11 Method and apparatus for analyzing text, device and storage medium

Country Status (1)

Country Link
WO (1) WO2022267454A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116011447A (en) * 2023-03-28 2023-04-25 杭州实在智能科技有限公司 E-commerce comment analysis method, system and computer readable storage medium
CN116069938A (en) * 2023-04-06 2023-05-05 中电科大数据研究院有限公司 Text relevance analysis method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080249764A1 (en) * 2007-03-01 2008-10-09 Microsoft Corporation Smart Sentiment Classifier for Product Reviews
CN103207855A (en) * 2013-04-12 2013-07-17 广东工业大学 Fine-grained sentiment analysis system and method specific to product comment information
CN104268197A (en) * 2013-09-22 2015-01-07 中科嘉速(北京)并行软件有限公司 Industry comment data fine grain sentiment analysis method
CN105912720A (en) * 2016-05-04 2016-08-31 南京大学 Method for analyzing emotion-involved text data in computer
CN113420122A (en) * 2021-06-24 2021-09-21 平安科技(深圳)有限公司 Method, device and equipment for analyzing text and storage medium



Also Published As

Publication number Publication date
CN113420122A (en) 2021-09-21


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22826965

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE