CN111144126A - Training method of semantic analysis model, semantic analysis method and device


Info

Publication number
CN111144126A
Authority
CN
China
Prior art keywords
semantic, attribute, sample text, analysis model, determining
Prior art date
Legal status
Withdrawn
Application number
CN201911348581.7A
Other languages
Chinese (zh)
Inventor
杨扬
王金刚
任磊
步佳昊
张富峥
王仲远
Current Assignee
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN201911348581.7A
Publication of CN111144126A
Legal status: Withdrawn


Abstract

In the training method of the semantic analysis model provided by the embodiments of this specification, a self-attention matrix representing the correlation among different semantic attributes is added to the model. A second feature representation vector of the sample text for each semantic attribute is obtained through this self-attention matrix, the sample text is classified based on the second feature representation vectors, and the semantic analysis model is trained according to the classification result. The model thereby learns the influence of the correlation among different semantic attributes on the classification result, so that when the trained model classifies a text to be analyzed, it also takes these correlation factors into account, making the semantic analysis more accurate.

Description

Training method of semantic analysis model, semantic analysis method and device
Technical Field
The application relates to the technical field of the internet, and in particular to a training method for a semantic analysis model, a semantic analysis method, and corresponding devices.
Background
With the development of internet technology, using machines instead of humans to analyze the semantics of text has become a major trend. Among semantic analysis tasks, emotion analysis is an important component and is widely used in many fields.
For example, emotion analysis can be performed on a user's comment about a restaurant to determine the semantic attributes involved in the comment and the emotion polarity on each attribute. Semantic attributes may include dish taste, restaurant environment, and the like; emotion polarity may be positive, negative, or neutral. Suppose the comment is "the shredded fish at this restaurant is particularly delicious, but the seats are somewhat small". It can be analyzed that the comment relates to dish taste with positive emotion polarity, and to restaurant environment with negative emotion polarity.
In the prior art, a semantic analysis model based on deep learning can be used to perform semantic analysis on text. For the emotion analysis in the above example, such a model treats semantic analysis as a multi-class classification problem: different semantic attributes are defined as different classes, and the same semantic attribute with different emotion polarities is likewise defined as different classes. The model outputs a classification result for the text, and one text can belong to several different classes at the same time, as the sketch below illustrates.
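To make the class definition concrete, the following minimal Python sketch enumerates every (semantic attribute, emotion polarity) pair as one class and shows the restaurant comment carrying two class labels at once. The attribute and polarity names are illustrative assumptions, not taken from this specification:

    # Each (semantic attribute, emotion polarity) pair is one class.
    ATTRIBUTES = ["dish_taste", "restaurant_environment"]  # assumed names
    POLARITIES = ["positive", "neutral", "negative"]

    CLASSES = {(a, p): i for i, (a, p) in enumerate(
        (a, p) for a in ATTRIBUTES for p in POLARITIES)}

    # The comment above involves two attributes with opposite polarities,
    # so it belongs to two classes simultaneously.
    comment_labels = [CLASSES[("dish_taste", "positive")],
                      CLASSES[("restaurant_environment", "negative")]]
    print(comment_labels)  # [0, 5]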
However, in the prior art, different semantic attributes are often treated as completely independent when the semantic analysis model is trained, and the correlation between different semantic attributes is ignored, so the trained semantic analysis model's accuracy is low when it is applied.
Disclosure of Invention
The embodiments of this specification provide a training method for a semantic analysis model, a semantic analysis method, and corresponding devices, which are intended to solve the problem that semantic analysis models trained with prior-art methods have low semantic analysis accuracy when applied.
The embodiment of the specification adopts the following technical scheme:
the training method of the semantic analysis model provided by this specification comprises the following steps:
obtaining a sample text, and determining each word segment contained in the sample text;
determining a word vector corresponding to each word segment according to a semantic analysis model to be trained;
for each semantic attribute, determining a first feature representation vector of the sample text for that semantic attribute according to the attention matrix corresponding to that semantic attribute contained in the semantic analysis model to be trained and the word vector corresponding to each word segment;
determining a second feature representation vector of the sample text for each semantic attribute according to a self-attention matrix, contained in the semantic analysis model to be trained, that represents the correlation between different semantic attributes, and the first feature representation vector of the sample text for each semantic attribute;
determining a classification result output by the semantic analysis model to be trained according to the semantic analysis model to be trained and the second feature representation vector of the sample text for each semantic attribute, wherein the classification result comprises the semantic attribute to which the sample text belongs and the emotion polarity corresponding to that semantic attribute;
and adjusting model parameters in the semantic analysis model according to the classification result and the label preset for the sample text, so as to complete the training of the semantic analysis model.
Optionally, determining a word vector corresponding to each word segment according to the semantic analysis model to be trained specifically includes:
inputting each word segment into a semantic representation layer in the semantic analysis model, and obtaining the bidirectional semantic representation vector corresponding to each word segment output by the semantic representation layer as the word vector corresponding to that word segment, wherein the semantic representation layer at least comprises a sub-model for outputting bidirectional semantic representation vectors, and the sub-model comprises a BERT model.
Optionally, determining the first feature representation vector of the sample text for the semantic attribute specifically includes:
inputting the word vector corresponding to each word segment into an attribute representation layer in the semantic analysis model;
performing attention weighting on the word vector corresponding to each word segment through the attention matrix corresponding to the semantic attribute contained in the attribute representation layer;
and determining the first feature representation vector of the sample text for the semantic attribute according to the attention-weighted word vector corresponding to each word segment.
Optionally, determining the second feature representation vector of the sample text for each semantic attribute specifically includes:
inputting the first feature representation vector of the sample text for each semantic attribute into an attribute relevance representation layer in the semantic analysis model;
performing self-attention weighting on the first feature representation vector of the sample text for each semantic attribute through the self-attention matrix, contained in the attribute relevance representation layer, that represents the correlation between different semantic attributes;
and determining the second feature representation vector of the sample text for each semantic attribute according to the self-attention-weighted first feature representation vectors.
Optionally, determining the classification result output by the semantic analysis model to be trained specifically includes:
inputting the second feature representation vector of the sample text for each semantic attribute into a classification layer in the semantic analysis model;
and classifying the sample text according to each second feature representation vector and the classification parameters, contained in the classification layer, corresponding to each semantic attribute, to obtain the classification result output by the classification layer.
Optionally, adjusting the model parameters in the semantic analysis model specifically includes:
determining a first loss corresponding to the classification result according to the classification result and the label preset for the sample text;
for each attribute combination consisting of any two semantic attributes, determining the correlation coefficient between the two semantic attributes in the attribute combination and the difference between the classification parameters, contained in the classification layer, corresponding to those two semantic attributes, and determining a second loss corresponding to the attribute combination according to the correlation coefficient and the difference;
determining a comprehensive loss according to the first loss and the second loss corresponding to each attribute combination;
and training the semantic analysis model with minimization of the comprehensive loss as the training target.
Optionally, determining the correlation coefficient between the two semantic attributes in the attribute combination specifically includes:
determining the training set in which the sample text is located;
determining the preset label of each text in the training set;
determining, from the preset label of each text, the labeled values corresponding to the emotion polarities on the two semantic attributes in the attribute combination;
and determining the correlation coefficient between the two semantic attributes in the attribute combination according to the labeled values.
The semantic analysis method provided by this specification comprises the following steps:
acquiring a text to be analyzed;
inputting the text to be analyzed into a semantic analysis model trained with the above training method of the semantic analysis model;
and obtaining the classification result output by the semantic analysis model, wherein the classification result comprises the semantic attribute to which the text to be analyzed belongs and the emotion polarity corresponding to that semantic attribute.
This specification provides a training apparatus for a semantic analysis model, comprising:
an acquisition module, configured to acquire a sample text and determine each word segment contained in the sample text;
a semantic representation module, configured to determine a word vector corresponding to each word segment according to a semantic analysis model to be trained;
an attribute representation module, configured to determine, for each semantic attribute, a first feature representation vector of the sample text for that semantic attribute according to the attention matrix corresponding to that semantic attribute contained in the semantic analysis model to be trained and the word vector corresponding to each word segment;
an attribute relevance representation module, configured to determine a second feature representation vector of the sample text for each semantic attribute according to a self-attention matrix, contained in the semantic analysis model to be trained, that represents the correlation between different semantic attributes, and the first feature representation vector of the sample text for each semantic attribute;
a classification module, configured to determine a classification result output by the semantic analysis model to be trained according to the semantic analysis model to be trained and the second feature representation vector of the sample text for each semantic attribute, wherein the classification result comprises the semantic attribute to which the sample text belongs and the emotion polarity corresponding to that semantic attribute;
and a training module, configured to adjust model parameters in the semantic analysis model according to the classification result and the label preset for the sample text, so as to complete the training of the semantic analysis model.
This specification provides a semantic analysis device, comprising:
an acquisition module, configured to acquire a text to be analyzed;
an input module, configured to input the text to be analyzed into a semantic analysis model trained with the method of any one of claims 1 to 7;
and an analysis module, configured to obtain the classification result output by the semantic analysis model, wherein the classification result comprises the semantic attribute to which the text to be analyzed belongs and the emotion polarity corresponding to that semantic attribute.
This specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above training method of the semantic analysis model or the above semantic analysis method.
This specification provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above training method of the semantic analysis model or the above semantic analysis method when executing the program.
The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:
in the training method of the semantic analysis model provided by the embodiments of this specification, a self-attention matrix representing the correlation among different semantic attributes is added to the model. A second feature representation vector of the sample text for each semantic attribute is obtained through this self-attention matrix, the sample text is classified based on the second feature representation vectors, and the semantic analysis model is trained according to the classification result. The model thereby learns the influence of the correlation among different semantic attributes on the classification result, so that when the trained model classifies a text to be analyzed, it also takes these correlation factors into account, making the semantic analysis more accurate.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a training process of a semantic analysis model provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a semantic analysis model architecture provided by an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a training apparatus for a semantic analysis model provided in an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a semantic analysis apparatus provided in an embodiment of the present disclosure;
FIG. 5 is a schematic view of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In a practical application scenario, different semantic attributes have a certain correlation. For example, suppose a user's comment is "this shop is hard to find", and the semantic attributes include "whether the location is prominent", "whether the shop is in a business district", and "whether transportation is convenient". According to the prior-art semantic analysis method, these three semantic attributes are treated independently, so the classification result for this comment is only that it belongs to the semantic attribute "whether the location is prominent", with negative emotion polarity. In fact, the comment may also imply that the shop is not in a business district (negative emotion polarity) and that transportation is inconvenient (negative emotion polarity). It can be seen that when a comment belongs to one semantic attribute, it may also belong to other correlated semantic attributes with similar emotion polarities. Clearly, treating semantic attributes independently, as in the prior art, makes the result of semantic analysis insufficiently accurate.
The training method of the semantic analysis model provided by this specification aims to mine the correlation among different semantic attributes, so that when a text belongs to one semantic attribute, the probability that it belongs to other correlated semantic attributes increases accordingly and is reflected in the final classification result.
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, the technical solutions of the present disclosure will be clearly and completely described below with reference to the specific embodiments of the present disclosure and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person skilled in the art without making any inventive step based on the embodiments in the description belong to the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
FIG. 1 shows a training process of a semantic analysis model provided in an embodiment of this specification, which may specifically include the following steps:
s100: a sample text is obtained and each participle contained in the sample text is determined.
In this embodiment of the present specification, a plurality of texts may be acquired from a corpus, and a training set composed of the acquired plurality of texts may be determined, so that each text in the training set may be used as a sample text, and steps S100 to S110 shown in fig. 1 may be performed on the sample text.
Specifically, in step S100, when determining the word segmentation included in a sample text, the word segmentation process may be performed on the sample text to obtain each word segmentation included in the sample text. When the sample text is subjected to word segmentation, any word segmentation method can be adopted, and of course, each character in the sample text can also be treated as a word segmentation, namely, a single word is formed into words. The present specification does not limit the way in which words are processed.
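A minimal Python sketch of the two segmentation options mentioned above (the jieba segmenter is an assumption; this specification does not name any particular tool):

    text = "这家店的鱼香肉丝特别好吃"

    # Option 1: treat every character as one word segment.
    char_segments = list(text)

    # Option 2: use a word-level segmenter if one is available.
    try:
        import jieba
        word_segments = jieba.lcut(text)
    except ImportError:
        word_segments = char_segments  # fall back to character segments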
S102: a word vector corresponding to each word segment is determined according to the semantic analysis model to be trained.
In the embodiments of this specification, the semantic analysis model may include at least four layers, namely: a semantic representation layer, an attribute representation layer, an attribute relevance representation layer, and a classification layer, as shown in FIG. 2.
The semantic representation layer at least includes a sub-model for outputting bidirectional semantic representation vectors, such as a BERT (Bidirectional Encoder Representations from Transformers) model. Then, in step S102, each word segment may be input into the semantic representation layer of the semantic analysis model, and the bidirectional semantic representation vector corresponding to each word segment output by that layer is obtained as the word vector corresponding to that word segment.
Of course, models other than the BERT model described above can also output bidirectional semantic representation vectors, and this specification does not limit this.
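A minimal sketch of such a semantic representation layer, assuming the Hugging Face transformers library and the bert-base-chinese checkpoint (neither is named by this specification):

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    bert = BertModel.from_pretrained("bert-base-chinese")

    inputs = tokenizer("这家店的鱼香肉丝特别好吃", return_tensors="pt")
    with torch.no_grad():
        outputs = bert(**inputs)

    # One bidirectional semantic representation vector per token, used as
    # the word vector of the corresponding word segment.
    word_vectors = outputs.last_hidden_state  # shape: (1, seq_len, 768)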
S104: for each semantic attribute, a first feature representation vector of the sample text for that semantic attribute is determined according to the attention matrix corresponding to that semantic attribute contained in the semantic analysis model to be trained and the word vector corresponding to each word segment.
In the semantic analysis model shown in FIG. 2, the attribute representation layer at least includes one attention matrix per semantic attribute.
Then, in step S104, the word vector corresponding to each word segment may be input into the attribute representation layer of the semantic analysis model. The attention matrix corresponding to the semantic attribute contained in the attribute representation layer is used to perform attention weighting on the word vector of each word segment, and the first feature representation vector of the sample text for that semantic attribute is determined from the attention-weighted word vectors.
It should be noted that the first feature representation vector can characterize both the probability that the sample text involves the semantic attribute and the emotion polarity on that attribute. A sketch of such an attribute representation layer follows.
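A minimal PyTorch sketch (PyTorch itself, and the simplification of each per-attribute "attention matrix" to a single learned query vector, are assumptions):

    import torch
    import torch.nn as nn

    class AttributeAttention(nn.Module):
        def __init__(self, num_attributes: int, hidden: int = 768):
            super().__init__()
            # One learned attention query per semantic attribute.
            self.queries = nn.Parameter(torch.randn(num_attributes, hidden))

        def forward(self, word_vectors):
            # word_vectors: (batch, seq_len, hidden)
            scores = torch.einsum("ah,bsh->bas", self.queries, word_vectors)
            weights = scores.softmax(dim=-1)  # attention over word segments
            # First feature representation vectors: (batch, num_attributes, hidden)
            return torch.einsum("bas,bsh->bah", weights, word_vectors)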
S106: a second feature representation vector of the sample text for each semantic attribute is determined according to the self-attention matrix, contained in the semantic analysis model to be trained, that represents the correlation between different semantic attributes, and the first feature representation vector of the sample text for each semantic attribute.
In the embodiments of this specification, the attribute relevance representation layer in the semantic analysis model at least includes a self-attention matrix, which represents the correlation between different semantic attributes. The form of the self-attention matrix may be: the element R_ij of the matrix represents the correlation between the i-th semantic attribute and the j-th semantic attribute; the stronger the correlation, the larger the value of R_ij, and vice versa.
Then, in step S106, the first feature representation vectors of the sample text for each semantic attribute may be input into the attribute relevance representation layer of the semantic analysis model, self-attention weighting may be performed on them through the self-attention matrix contained in that layer, and the second feature representation vector of the sample text for each semantic attribute may be determined from the self-attention-weighted first feature representation vectors.
It should be noted that the second feature representation vector can also characterize the probability that the sample text involves each semantic attribute and the emotion polarity on that attribute. The difference is that the first feature representation vectors are obtained by weighting the word vectors with the mutually independent attention matrices of the individual semantic attributes, so the probabilities and emotion polarities they characterize do not take the correlation between different semantic attributes into account. The second feature representation vectors, by contrast, are obtained by weighting the first feature representation vectors with the self-attention matrix that represents the correlation between different semantic attributes, which is equivalent to introducing the correlation factor through the self-attention matrix; the probabilities and emotion polarities characterized by the second feature representation vectors therefore do take the correlation between different semantic attributes into account. A sketch follows.
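A minimal PyTorch sketch of such an attribute relevance representation layer, continuing the assumptions above:

    import torch
    import torch.nn as nn

    class AttributeRelevance(nn.Module):
        def __init__(self, num_attributes: int):
            super().__init__()
            # Self-attention matrix; entry (i, j) models the correlation
            # between the i-th and j-th semantic attributes.
            self.R = nn.Parameter(torch.randn(num_attributes, num_attributes))

        def forward(self, first_vectors):
            # first_vectors: (batch, num_attributes, hidden)
            weights = self.R.softmax(dim=-1)  # row-normalised relevance
            # Second feature representation vectors mix information across
            # correlated attributes: (batch, num_attributes, hidden)
            return torch.einsum("ij,bjh->bih", weights, first_vectors)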
S108: the classification result output by the semantic analysis model to be trained is determined according to the semantic analysis model to be trained and the second feature representation vector of the sample text for each semantic attribute.
In FIG. 2, the classification layer at least includes a hidden layer, a fully connected layer, and a softmax layer (not shown in FIG. 2). In step S108, the second feature representation vector of the sample text for each semantic attribute may be input, in turn, into the hidden layer, the fully connected layer, and the softmax layer of the classification layer, and the sample text is classified according to each second feature representation vector and the classification parameters corresponding to each semantic attribute contained in these layers, to obtain the classification result output by the classification layer.
The classification result described in this specification at least includes the semantic attribute to which the sample text belongs and the emotion polarity corresponding to the semantic attribute to which the sample text belongs.
Specifically, the emotional polarity may be quantified by a numerical value, e.g., a numerical value closer to 1 indicates that the emotional polarity tends to be more positive, a numerical value closer to-1 indicates that the emotional polarity tends to be more negative, and a numerical value closer to 0 indicates that the emotional polarity tends to be neutral.
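A minimal PyTorch sketch of such a classification layer (the layer sizes and the use of one "not mentioned" class alongside three polarity classes are assumptions):

    import torch
    import torch.nn as nn

    class AttributeClassifier(nn.Module):
        def __init__(self, num_attributes: int, hidden: int = 768,
                     num_classes: int = 4):  # not-mentioned / pos / neu / neg
            super().__init__()
            self.hidden_layer = nn.Linear(hidden, hidden)
            # Classification parameters w_t, one set per semantic attribute.
            self.w = nn.Parameter(torch.randn(num_attributes, hidden, num_classes))

        def forward(self, second_vectors):
            # second_vectors: (batch, num_attributes, hidden)
            h = torch.tanh(self.hidden_layer(second_vectors))
            logits = torch.einsum("bah,ahc->bac", h, self.w)
            return logits.softmax(dim=-1)  # per-attribute class distribution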
S110: model parameters in the semantic analysis model are adjusted according to the classification result and the label preset for the sample text, so as to complete the training of the semantic analysis model.
In the embodiments of this specification, the model parameters to be adjusted at least include the classification parameters described above, and may further include the attention matrices and the self-attention matrix described above. The model parameters can be adjusted with a conventional training method: the loss corresponding to the classification result (hereinafter referred to as the first loss) is determined directly from the classification result obtained in step S108 and the label preset for the sample text, and the model parameters are adjusted with minimization of the first loss as the training target, so as to complete the training of the semantic analysis model.
Because the self-attention matrix used for expressing the correlation among different semantic attributes is added into the semantic analysis model, the semantic analysis model obtained by training through the traditional training method can analyze the semantics of the text to be analyzed more accurately.
Further, to improve the semantic analysis capability of the trained model, the correlation between different semantic attributes may also be taken into account in the loss, in addition to the first loss.
Specifically, for each attribute combination consisting of any two semantic attributes, the correlation coefficient between the two semantic attributes in the combination and the difference between the classification parameters, contained in the classification layer, corresponding to those two attributes are determined, and a second loss corresponding to the attribute combination is determined from the correlation coefficient and the difference. Finally, the comprehensive loss is determined from the first loss and the second losses corresponding to all attribute combinations, and the model parameters in the semantic analysis model are adjusted with minimization of the comprehensive loss as the training target, so as to complete the training of the semantic analysis model.
The comprehensive loss may be calculated as: $L = L_{classification} + \sum_{t,s} r_{ts} \left| w_t - w_s \right|$,
where $L$ is the comprehensive loss, $L_{classification}$ is the first loss, $r_{ts}$ is the correlation coefficient between the t-th semantic attribute and the s-th semantic attribute, $w_t$ is the classification parameter, contained in the classification layer, corresponding to the t-th semantic attribute, and $w_s$ is the classification parameter, contained in the classification layer, corresponding to the s-th semantic attribute.
The core idea of the above method for determining the second loss is: if the correlation between two semantic attributes is strong, the values of the classification parameters corresponding to those two attributes contained in the classification layer should be relatively close; conversely, they should differ relatively more. During the iterative training of the semantic analysis model, the values of the classification parameters corresponding to each semantic attribute contained in the classification layer are continuously adjusted (the attention matrices corresponding to each semantic attribute in the attribute representation layer and the self-attention matrix in the attribute relevance representation layer are also continuously adjusted), while the correlation coefficient $r_{ts}$ between two semantic attributes is determined from the labels of the training set (as described below) and stays fixed. Therefore, when the comprehensive loss is calculated with the above formula and its minimization is taken as the training target, the classification parameter values corresponding to two strongly correlated semantic attributes are driven to be relatively close, which achieves the goal.
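A minimal PyTorch sketch of this comprehensive loss (the cross-entropy form of the first loss and the per-attribute integer targets are assumptions):

    import torch
    import torch.nn.functional as F

    def comprehensive_loss(probs, targets, w, r):
        # probs:   (batch, num_attributes, num_classes) from the classifier
        # targets: (batch, num_attributes) integer class labels
        # w:       (num_attributes, d) flattened classification parameters w_t
        # r:       (num_attributes, num_attributes) fixed correlation coefficients
        first_loss = F.nll_loss(probs.flatten(0, 1).log(), targets.flatten())
        diffs = torch.cdist(w, w, p=1)       # |w_t - w_s| for every pair
        second_loss = (r * diffs).sum() / 2  # each unordered pair counted once
        return first_loss + second_loss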
Furthermore, the correlation coefficient between two semantic attributes may be determined as follows: determine the training set in which the sample text is located, determine the preset label of each text in the training set, determine the labeled values, contained in each text's preset label, corresponding to the emotion polarities on the two semantic attributes in the attribute combination, and determine the correlation coefficient between the two semantic attributes in the attribute combination from those labeled values.
Specifically, the Pearson coefficient between the labeled values corresponding to the emotion polarities on the two semantic attributes may be computed over the labels of all texts in the training set, and this Pearson coefficient may be used as the correlation coefficient between the two semantic attributes.
Of course, other methods may be used to determine the correlation coefficient between the two semantic attributes, as long as the determined correlation coefficient is positively correlated with the correlation between the two semantic attributes.
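For the Pearson-coefficient approach, a minimal numpy sketch (the numeric encoding of the labeled values is an assumption):

    import numpy as np

    # labels[i, t]: labeled polarity of text i on attribute t,
    # e.g. 1 = positive, -1 = negative, 0 = neutral or not mentioned.
    labels = np.array([[ 1, -1,  0],
                       [ 1,  0, -1],
                       [-1, -1,  1],
                       [ 0,  1,  1]])

    # Pearson coefficients between attribute columns.
    r = np.corrcoef(labels, rowvar=False)  # (num_attributes, num_attributes)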
Correspondingly, the embodiments of this specification further provide a semantic analysis method, which may specifically be: obtain a text to be analyzed, input the text to be analyzed into a semantic analysis model trained with the above training method, and obtain the classification result output by the semantic analysis model, where the classification result includes the semantic attribute to which the text to be analyzed belongs and the emotion polarity corresponding to that semantic attribute.
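Chaining the hypothetical modules sketched above, analysis of a new text could look as follows (all module names are assumptions, not part of this specification):

    import torch

    def analyze(text, tokenizer, bert, attr_attention, attr_relevance, classifier):
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            word_vectors = bert(**inputs).last_hidden_state
            first = attr_attention(word_vectors)   # per-attribute vectors
            second = attr_relevance(first)         # correlation-aware vectors
            probs = classifier(second)             # (1, num_attributes, num_classes)
        # Classification result: predicted class per semantic attribute.
        return probs.argmax(dim=-1).squeeze(0)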
Based on the same idea, the embodiments of this specification further provide a corresponding training apparatus for the semantic analysis model and a corresponding semantic analysis device, as shown in FIG. 3 and FIG. 4.
Fig. 3 is a schematic structural diagram of a training apparatus for a semantic analysis model provided in an embodiment of the present specification, including:
an acquisition module 301, configured to acquire a sample text and determine each word segment contained in the sample text;
a semantic representation module 302, configured to determine, according to a semantic analysis model to be trained, a word vector corresponding to each word segment;
an attribute representation module 303, configured to determine, for each semantic attribute, a first feature representation vector of the sample text for that semantic attribute according to the attention matrix corresponding to that semantic attribute contained in the semantic analysis model to be trained and the word vector corresponding to each word segment;
an attribute relevance representation module 304, configured to determine a second feature representation vector of the sample text for each semantic attribute according to the self-attention matrix, contained in the semantic analysis model to be trained, that represents the correlation between different semantic attributes, and the first feature representation vector of the sample text for each semantic attribute;
a classification module 305, configured to determine a classification result output by the semantic analysis model to be trained according to the semantic analysis model to be trained and the second feature representation vector of the sample text for each semantic attribute, wherein the classification result includes the semantic attribute to which the sample text belongs and the emotion polarity corresponding to that semantic attribute;
a training module 306, configured to adjust the model parameters in the semantic analysis model according to the classification result and the label preset for the sample text, so as to complete the training of the semantic analysis model.
The semantic representation module 302 is specifically configured to input each word segment into the semantic representation layer of the semantic analysis model and obtain the bidirectional semantic representation vector corresponding to each word segment output by that layer as the word vector corresponding to that word segment, wherein the semantic representation layer at least includes a sub-model for outputting bidirectional semantic representation vectors, and the sub-model includes a BERT model.
The attribute representation module 303 is specifically configured to input the word vector corresponding to each word segment into the attribute representation layer of the semantic analysis model; perform attention weighting on the word vector corresponding to each word segment through the attention matrix corresponding to the semantic attribute contained in the attribute representation layer; and determine the first feature representation vector of the sample text for that semantic attribute according to the attention-weighted word vector corresponding to each word segment.
The attribute relevance representation module 304 is specifically configured to input the first feature representation vector of the sample text for each semantic attribute into the attribute relevance representation layer of the semantic analysis model; perform self-attention weighting on those first feature representation vectors through the self-attention matrix, contained in the attribute relevance representation layer, that represents the correlation between different semantic attributes; and determine the second feature representation vector of the sample text for each semantic attribute according to the self-attention-weighted first feature representation vectors.
The classification module 305 is specifically configured to input the second feature representation vector of the sample text for each semantic attribute into the classification layer of the semantic analysis model; and classify the sample text according to each second feature representation vector and the classification parameters, contained in the classification layer, corresponding to each semantic attribute, to obtain the classification result output by the classification layer.
The training module 306 is specifically configured to determine, according to the classification result and the label preset for the sample text, a first loss corresponding to the classification result; for each attribute combination consisting of any two semantic attributes, determine the correlation coefficient between the two semantic attributes in the attribute combination and the difference between the classification parameters, contained in the classification layer, corresponding to those two attributes, and determine a second loss corresponding to the attribute combination according to the correlation coefficient and the difference; determine the comprehensive loss according to the first loss and the second loss corresponding to each attribute combination; and train the semantic analysis model with minimization of the comprehensive loss as the training target.
The training module 306 is further specifically configured to determine the training set in which the sample text is located; determine the preset label of each text in the training set; determine, from the preset label of each text, the labeled values corresponding to the emotion polarities on the two semantic attributes in the attribute combination; and determine the correlation coefficient between the two semantic attributes in the attribute combination according to those labeled values.
Fig. 4 is a schematic structural diagram of a semantic analysis apparatus provided in an embodiment of the present specification, including:
an obtaining module 401, configured to obtain a text to be analyzed;
an input module 402, configured to input the text to be analyzed into a semantic analysis model obtained by training with the above training method of the semantic analysis model;
an analysis module 403, configured to obtain the classification result output by the semantic analysis model, where the classification result includes the semantic attribute to which the text to be analyzed belongs and the emotion polarity corresponding to that semantic attribute.
This specification further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program can be used to execute the training method of the semantic analysis model provided in FIG. 1 or the semantic analysis method described above.
The embodiments of this specification also provide a schematic structural diagram of the electronic device shown in FIG. 5. As shown in FIG. 5, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile storage, and may of course also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile storage into the memory and runs it to implement the training method of the semantic analysis model described in FIG. 1 or the semantic analysis method described above. Of course, besides a software implementation, this specification does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the processing flow is not limited to logic units and may also be hardware or logic devices.
In the 1990s, an improvement in a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement in circuit structures such as diodes, transistors, or switches) or an improvement in software (an improvement in a method flow). However, as technology has developed, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by hardware entity modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. Designers program a digital system onto a PLD by themselves, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, nowadays such programming is mostly implemented with "logic compiler" software rather than by manually fabricating integrated circuit chips. This software is similar to the compiler used in program development, and the source code to be compiled must be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present. Those skilled in the art will also appreciate that a hardware circuit implementing a logical method flow can easily be obtained merely by slightly logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art also know that, in addition to implementing the controller purely as computer-readable program code, the same functionality can be implemented entirely by logically programming the method steps, so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing the functions may be regarded both as software modules for performing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a(n) ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. A method for training a semantic analysis model, the method comprising:
determining a word vector corresponding to each word segment in a sample text according to a semantic analysis model to be trained;
for each semantic attribute, determining a first feature representation vector of the sample text for that semantic attribute according to the attention matrix corresponding to that semantic attribute contained in the semantic analysis model to be trained and the word vector corresponding to each word segment;
determining a second feature representation vector of the sample text for each semantic attribute according to a self-attention matrix, contained in the semantic analysis model to be trained, that represents the correlation between different semantic attributes, and the first feature representation vector of the sample text for each semantic attribute;
determining a classification result output by the semantic analysis model to be trained according to the semantic analysis model to be trained and the second feature representation vector of the sample text for each semantic attribute, wherein the classification result comprises the semantic attribute to which the sample text belongs and the emotion polarity corresponding to that semantic attribute;
and adjusting model parameters in the semantic analysis model according to the classification result and the label preset for the sample text, so as to complete the training of the semantic analysis model.
2. The method of claim 1, wherein determining a word vector corresponding to each word segment according to the semantic analysis model to be trained specifically comprises:
inputting each word segment into a semantic representation layer in the semantic analysis model, and obtaining the bidirectional semantic representation vector corresponding to each word segment output by the semantic representation layer as the word vector corresponding to that word segment, wherein the semantic representation layer at least comprises a sub-model for outputting bidirectional semantic representation vectors, and the sub-model comprises a BERT model.
3. The method of claim 1, wherein determining the first feature representation vector of the sample text for the semantic attribute specifically comprises:
inputting the word vector corresponding to each word segment into an attribute representation layer in the semantic analysis model;
performing attention weighting on the word vector corresponding to each word segment through the attention matrix corresponding to the semantic attribute contained in the attribute representation layer;
and determining the first feature representation vector of the sample text for the semantic attribute according to the attention-weighted word vector corresponding to each word segment.
4. The method of claim 1, wherein determining the second feature representation vector of the sample text for each semantic attribute specifically comprises:
inputting the first feature representation vector of the sample text for each semantic attribute into an attribute relevance representation layer in the semantic analysis model;
performing self-attention weighting on the first feature representation vector of the sample text for each semantic attribute through the self-attention matrix, contained in the attribute relevance representation layer, that represents the correlation between different semantic attributes;
and determining the second feature representation vector of the sample text for each semantic attribute according to the self-attention-weighted first feature representation vectors.
5. The method according to claim 1, wherein determining the classification result output by the semantic analysis model to be trained specifically comprises:
inputting the second feature representation vector of the sample text for each semantic attribute into a classification layer in the semantic analysis model;
and classifying the sample text according to each second feature representation vector and the classification parameters, contained in the classification layer, corresponding to each semantic attribute, to obtain the classification result output by the classification layer.
6. The method of claim 5, wherein adjusting the model parameters in the semantic analysis model specifically comprises:
determining a first loss corresponding to the classification result according to the classification result and the label preset for the sample text;
for each attribute combination consisting of any two semantic attributes, determining the correlation coefficient between the two semantic attributes in the attribute combination and the difference between the classification parameters, contained in the classification layer, corresponding to the two semantic attributes, and determining a second loss corresponding to the attribute combination according to the correlation coefficient and the difference;
determining a comprehensive loss according to the first loss and the second loss corresponding to each attribute combination;
and adjusting the model parameters in the semantic analysis model with minimization of the comprehensive loss as the training target.
7. The method of claim 6, wherein determining the correlation coefficient between the two semantic attributes in the attribute combination specifically comprises:
determining the training set in which the sample text is located;
determining the label preset for each text in the training set;
determining, from the preset label of each text, the labeled values corresponding to the emotion polarities on the two semantic attributes in the attribute combination;
determining the correlation coefficient between the two semantic attributes in the attribute combination according to the labeled values.
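Claim 7 leaves the correlation statistic open; a minimal sketch, assuming the labeled polarity values are encoded numerically and the Pearson correlation coefficient is computed over the training set:

```python
import numpy as np

def attribute_correlations(label_matrix: np.ndarray) -> np.ndarray:
    # Sketch of claim 7. label_matrix[n, k] is the labeled polarity value of
    # training text n on semantic attribute k (assumed encoding, e.g.
    # 1 = positive, -1 = negative, 0 = neutral or not mentioned).
    # Pearson correlation across the training set is an assumed choice.
    return np.corrcoef(label_matrix, rowvar=False)  # (K, K)
```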
8. A semantic analysis method, the method comprising:
acquiring a text to be analyzed;
inputting the text to be analyzed into a semantic analysis model trained according to the method of any one of claims 1 to 7;
obtaining a classification result output by the semantic analysis model, wherein the classification result comprises the semantic attribute to which the text to be analyzed belongs and the emotion polarity corresponding to that semantic attribute.
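Chaining the sketches above gives an illustrative version of the analysis method of claim 8; the attribute count, hidden size, and module names are all assumptions carried over from the earlier sketches, not values fixed by the patent.

```python
import torch

K, H = 8, 768  # assumed number of semantic attributes and BERT hidden size
attr_layer = AttributeRepresentationLayer(K, H)  # from the claim 3 sketch
rel_layer = AttributeRelevanceLayer(K)           # from the claim 4 sketch
cls_layer = ClassificationLayer(K, H)            # from the claim 5 sketch

with torch.no_grad():
    first = attr_layer(word_vectors)  # word_vectors from the claim 2 sketch
    second = rel_layer(first)
    # Predicted emotion polarity per semantic attribute of the analyzed text.
    polarity = cls_layer(second).argmax(dim=-1)  # (batch, K)
```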
9. An apparatus for training a semantic analysis model, the apparatus comprising:
an acquisition module, configured to acquire a sample text and determine each participle contained in the sample text;
a semantic representation module, configured to determine a word vector corresponding to each participle according to the semantic analysis model to be trained;
an attribute representation module, configured to determine a first feature representation vector of the sample text related to each semantic attribute according to the attention matrix, contained in the semantic analysis model to be trained, corresponding to that semantic attribute and the word vector corresponding to each participle;
an attribute relevance representation module, configured to determine a second feature representation vector of the sample text related to each semantic attribute according to the self-attention matrix, contained in the semantic analysis model to be trained, for representing the relevance between different semantic attributes and the first feature representation vector of the sample text related to each semantic attribute;
a classification module, configured to determine, according to the semantic analysis model to be trained and the second feature representation vector of the sample text related to each semantic attribute, the classification result output by the semantic analysis model to be trained, wherein the classification result comprises the semantic attribute to which the sample text belongs and the emotion polarity corresponding to that semantic attribute;
a training module, configured to adjust the model parameters in the semantic analysis model according to the classification result and the label preset for the sample text, so as to complete training of the semantic analysis model.
10. A semantic analysis apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a text to be analyzed;
an input module, configured to input the text to be analyzed into a semantic analysis model trained according to the method of any one of claims 1 to 7;
an analysis module, configured to obtain a classification result output by the semantic analysis model, wherein the classification result comprises the semantic attribute to which the text to be analyzed belongs and the emotion polarity corresponding to that semantic attribute.
11. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7 or the method of claim 8.
12. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 7 or the method of claim 8.
CN201911348581.7A 2019-12-24 2019-12-24 Training method of semantic analysis model, semantic analysis method and device Withdrawn CN111144126A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911348581.7A CN111144126A (en) 2019-12-24 2019-12-24 Training method of semantic analysis model, semantic analysis method and device

Publications (1)

Publication Number Publication Date
CN111144126A true CN111144126A (en) 2020-05-12

Family

ID=70520109

Country Status (1)

Country Link
CN (1) CN111144126A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108959246A (en) * 2018-06-12 2018-12-07 北京慧闻科技发展有限公司 Answer selection method, device and electronic equipment based on improved attention mechanism
CN109543722A (en) * 2018-11-05 2019-03-29 中山大学 A kind of emotion trend forecasting method based on sentiment analysis model
CN109543180A (en) * 2018-11-08 2019-03-29 中山大学 A kind of text emotion analysis method based on attention mechanism
CN109815490A (en) * 2019-01-04 2019-05-28 平安科技(深圳)有限公司 Text analyzing method, apparatus, equipment and storage medium
CN109948165A (en) * 2019-04-24 2019-06-28 吉林大学 Fine granularity feeling polarities prediction technique based on mixing attention network
CN110321932A (en) * 2019-06-10 2019-10-11 浙江大学 A kind of whole city city air quality index estimation method based on depth multisource data fusion

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762303A (en) * 2020-11-23 2021-12-07 北京沃东天骏信息技术有限公司 Image classification method and device, electronic equipment and storage medium
CN112579765A (en) * 2020-12-18 2021-03-30 中国平安人寿保险股份有限公司 Data screening method, device, equipment and storage medium based on Boolean expression
CN112699297A (en) * 2020-12-23 2021-04-23 平安银行股份有限公司 Service recommendation method, device and equipment based on user portrait and storage medium
CN112699237A (en) * 2020-12-24 2021-04-23 百度在线网络技术(北京)有限公司 Label determination method, device and storage medium
CN114707591A (en) * 2022-03-28 2022-07-05 北京百度网讯科技有限公司 Data processing method and training method and device of data processing model
CN114707591B (en) * 2022-03-28 2023-06-02 北京百度网讯科技有限公司 Data processing method and training method and device of data processing model
CN115174947A (en) * 2022-06-28 2022-10-11 广州博冠信息科技有限公司 Live video extraction method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200512