CN113378543B - Data analysis method, method for training data analysis model and electronic equipment - Google Patents


Info

Publication number
CN113378543B
Authority
CN
China
Prior art keywords
statement
setting
vector
subsequence
position identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110717930.9A
Other languages
Chinese (zh)
Other versions
CN113378543A (en)
Inventor
郑少杰
彭明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202110717930.9A priority Critical patent/CN113378543B/en
Publication of CN113378543A publication Critical patent/CN113378543A/en
Application granted granted Critical
Publication of CN113378543B publication Critical patent/CN113378543B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a data analysis method, a method for training a data analysis model, and an electronic device. The data analysis method includes: inputting at least one first set statement into the data analysis model to obtain a prediction category, a start position identifier and an end position identifier corresponding to a set word in each first set statement; calculating a loss value of the data analysis model with a set loss function, based on the calibration category, first calibration position identifier and second calibration position identifier corresponding to the set word in each first set statement, and on the prediction category, start position identifier and end position identifier corresponding to that set word; updating model parameters of the data analysis model based on the loss value; outputting the trained data analysis model to obtain a first model; and inputting a first sentence into the first model to obtain the emotion category, start position identifier and end position identifier corresponding to the first keyword in the first sentence.

Description

Data analysis method, method for training data analysis model and electronic equipment
Technical Field
The invention relates to the technical field of computers, in particular to a data analysis method, a method for training a data analysis model and electronic equipment.
Background
With the development of computer technology, more and more technologies (e.g., big data, artificial intelligence and blockchain) are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology. However, the financial industry's requirements for security and real-time performance also place higher demands on these technologies. In the field of financial technology, in order to understand users' evaluations of a certain product or service, the entity names and the keywords evaluating those entities need to be extracted from user feedback, where an entity may be a person name, place name, organization name, product, service, and the like.
In the related art, Natural Language Processing (NLP) technology is used to process customer feedback and extract the keywords evaluating an entity from it, but the accuracy of the extracted keywords is low, so the user's real evaluation of the entity cannot be obtained.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data analysis method, a method for training a data analysis model, and an electronic device, so as to solve the technical problem that the accuracy of extracted keywords in the related art is low and the real evaluation of a user on an entity cannot be obtained.
To this end, the technical solution of the invention is realized as follows:
the embodiment of the invention provides a data analysis method, which comprises the following steps:
inputting at least one first setting statement into a data analysis model to obtain a prediction category, a starting position identifier and a stopping position identifier corresponding to a setting word in each first setting statement in the at least one first setting statement; the first setting statement is obtained by splicing a setting entity and a setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of the evaluation set entity;
calculating a loss value of the data analysis model by using a set loss function based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the set word in each of the at least one first set sentence, and based on the prediction category, the start position identifier and the stop position identifier corresponding to the corresponding set word;
updating model parameters of the data analysis model based on the loss values;
outputting the trained data analysis model to obtain a first model;
inputting a first sentence into the first model to obtain an emotion category, a starting position identifier and a stopping position identifier corresponding to a first keyword in the first sentence; the first statement is obtained by splicing a first entity and corresponding evaluation information.
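The training and inference steps above can be outlined in a short sketch. This is a hypothetical illustration, not the patent's implementation: `model`, `loss_fn` and `update` are placeholder callables standing in for the data analysis model, the set loss function, and the parameter update based on the loss value.

```python
def train_data_analysis_model(model, samples, loss_fn, update, epochs=1):
    # Hypothetical outline of the training steps; all names are placeholders.
    for _ in range(epochs):
        for first_statement, calibration_labels in samples:
            # Step 1: predict category + start/end position identifiers
            prediction = model(first_statement)
            # Step 2: loss between calibration labels and predictions
            loss = loss_fn(calibration_labels, prediction)
            # Step 3: update model parameters based on the loss value
            update(model, loss)
    return model  # Step 4: output the trained model as the "first model"

def analyze(first_model, first_sentence):
    # Step 5: inference on a spliced "entity + evaluation" first sentence
    return first_model(first_sentence)
```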
In the above solution, the set loss function includes a first sub-function, a second sub-function, a third sub-function, a first weight, a second weight, and a third weight;
the calculating the loss value of the data analysis model by using the set loss function based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the setting word in each of the at least one first setting statement, and based on the prediction category, the start position identifier and the end position identifier corresponding to the corresponding setting word includes:
calculating a first loss value between a calibration category corresponding to a set word in a first set statement and a corresponding prediction category based on the first subfunction and the first weight;
calculating a second loss value between a first calibration position identifier corresponding to a set word in a first set statement and a corresponding initial position identifier based on the second subfunction and the second weight;
calculating a third loss value between a second calibration position identifier corresponding to the set word in the first set statement and a corresponding cut-off position identifier based on the third subfunction and the third weight;
calculating a loss value of the data analysis model based on a first loss value, a second loss value and a third loss value corresponding to a first set statement; wherein the second weight and the third weight are both greater than the first weight.
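One plausible reading of this weighted three-part loss is sketched below. The weight values `w1`, `w2`, `w3` and the function name are illustrative assumptions; the text only requires that the two position-loss weights both exceed the category-loss weight.

```python
def set_loss(first_loss, second_loss, third_loss, w1=0.2, w2=0.4, w3=0.4):
    # Weighted combination of the category loss and the two position
    # losses; w2 and w3 (position weights) must exceed w1 (category weight).
    assert w2 > w1 and w3 > w1, "position weights must exceed the category weight"
    return w1 * first_loss + w2 * second_loss + w3 * third_loss
```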
In the above scheme, the data analysis model includes a feature extraction model and a full connection layer; the inputting at least one first setting statement into the data analysis model to obtain the prediction category, the start position identifier and the end position identifier corresponding to the setting word in each first setting statement in the at least one first setting statement includes:
inputting at least one first set statement into the feature extraction model for processing to obtain a first vector sequence corresponding to each first set statement in the at least one first set statement; the first vector sequence comprises a first vector and a first subsequence, the first vector is used for representing the global features of a first set statement, and the first subsequence is formed by vectors corresponding to each word in the set statement;
inputting a first vector corresponding to each first setting statement into the full-connection layer to obtain a prediction category corresponding to the setting words in each first setting statement;
and determining a starting position identifier and a stopping position identifier corresponding to the set words in each first set statement based on the first subsequence corresponding to each first set statement.
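A minimal sketch of this forward pass, with a random stand-in for the feature extraction model (the patent uses BERT) and a single fully connected layer applied to the first vector; all names, dimensions and the number of classes are assumptions.

```python
import numpy as np

def fake_feature_extractor(tokens, dim=8, seed=0):
    # Stand-in for BERT: one vector per token.  The vector at index 0
    # (for [CLS]) plays the role of the "first vector" carrying the
    # global features of the first set statement.
    rng = np.random.default_rng(seed)
    return rng.standard_normal((len(tokens), dim))

def predict_category(tokens, n_classes=3, dim=8, seed=0):
    seq = fake_feature_extractor(tokens, dim, seed)  # first vector sequence
    first_vector = seq[0]                            # global-feature vector
    rng = np.random.default_rng(seed + 1)
    W = rng.standard_normal((dim, n_classes))        # fully connected layer
    logits = first_vector @ W
    # Return the prediction category and the remaining (sub)sequence,
    # from which the position identifiers would be determined.
    return int(np.argmax(logits)), seq[1:]
```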
In the above scheme, the determining, based on the first subsequence corresponding to each first setting statement, the start position identifier and the end position identifier corresponding to the setting word in each first setting statement includes:
converting a first subsequence corresponding to the first setting statement into at least two second vectors;
adding each vector in the first subsequence corresponding to the first setting statement to the third vector to obtain a second subsequence; the third vector is the model parameter and represents a randomly initialized vector;
calculating attention degree vector pairs corresponding to each vector in the corresponding second subsequence based on each second vector and the corresponding second subsequence of the first setting statement; the attention vector pair comprises a first attention vector for representing the probability of the starting position and a second attention vector for representing the probability of the cut-off position;
determining an initial position identifier corresponding to a set word in a first set statement based on the mean value of the first attention degree vector corresponding to each vector in a second subsequence corresponding to the first set statement;
and determining a cut-off position identifier corresponding to the set word in the first set statement based on the mean value of the second attention degree vector corresponding to each vector in the second subsequence corresponding to the first set statement.
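One possible reading of this attention-based localization is sketched below, under stated assumptions: the first and second setting functions are taken to be bilinear scoring functions with unshared weight matrices, the softmax outputs play the role of the attention degree vector pairs, and the start/end identifiers are the argmax positions of the averaged first and second attention vectors.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def locate_keyword(second_vectors, second_subsequence, seed=0):
    # second_vectors: pooled query vectors (e.g. mean/min/max of the
    # first subsequence); second_subsequence: (n, d), one row per word.
    n, d = second_subsequence.shape
    rng = np.random.default_rng(seed)
    W_start = rng.standard_normal((d, d))  # parameters of the first and
    W_end = rng.standard_normal((d, d))    # second set functions: not shared
    starts, ends = [], []
    for q in second_vectors:
        # one attention-degree vector pair per query vector
        starts.append(softmax(second_subsequence @ (W_start @ q)))
        ends.append(softmax(second_subsequence @ (W_end @ q)))
    start_id = int(np.argmax(np.mean(starts, axis=0)))  # mean of first vectors
    end_id = int(np.argmax(np.mean(ends, axis=0)))      # mean of second vectors
    return start_id, end_id
```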
In the above solution, the converting the first subsequence corresponding to the first setting statement into at least two second vectors includes at least two of the following:
calculating the mean value of vectors in a first subsequence corresponding to the first set statement to obtain a second vector;
determining a minimum value corresponding to each dimension from vectors in a first subsequence corresponding to a first set statement to obtain a second vector;
and determining the maximum value corresponding to each dimension from the vectors in the first subsequence corresponding to the first setting statement to obtain a second vector.
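The three conversions above can be sketched directly; `to_second_vectors` is a hypothetical name, and the first subsequence is assumed to be an (n, d) array with one row per word.

```python
import numpy as np

def to_second_vectors(first_subsequence):
    # The three conversions named above: the element-wise mean of the
    # vectors, and the per-dimension minimum and maximum across them.
    return [first_subsequence.mean(axis=0),
            first_subsequence.min(axis=0),
            first_subsequence.max(axis=0)]
```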
In the above solution, the calculating, based on each second vector and the second subsequence corresponding to the first setting statement, a focus vector pair corresponding to each vector in the corresponding second subsequence includes:
respectively calculating, by using a first setting function and a second setting function, a first attention degree vector and a second attention degree vector corresponding to each vector in the corresponding second subsequence, based on a second vector and the second subsequence corresponding to the first setting statement; wherein
the model parameters of the first setting function and the second setting function are not shared.
In the above scheme, the method further includes: determining a second keyword from a second set sentence based on the start position identifier and the end position identifier corresponding to the second keyword in the second set sentence;
calculating the accuracy of the data analysis model based on the first word number and the second word number corresponding to the second setting statement;
under the condition that the accuracy is greater than or equal to a set threshold value, outputting a trained data analysis model;
the first word number represents the word number included by the intersection of the set words corresponding to the second set statement and the second keyword; the second word number represents the word number included in the union set of the setting words corresponding to the second setting statement and the second keyword.
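The accuracy computation for a single second setting statement might look as follows; treating the calibrated set words and the extracted second keyword as word sets is an assumption, as is scoring each statement by the intersection-over-union ratio of the two word counts.

```python
def span_accuracy(set_words, predicted_keyword):
    # set_words / predicted_keyword: sets of words for one second
    # setting statement.  The first word number is the size of their
    # intersection; the second word number is the size of their union.
    first_word_number = len(set_words & predicted_keyword)
    second_word_number = len(set_words | predicted_keyword)
    return first_word_number / second_word_number if second_word_number else 1.0
```

The model would be output once this score, aggregated over the validation statements, meets the set threshold.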
The embodiment of the invention also provides a method for training the data analysis model, which comprises the following steps:
inputting at least one first setting statement into a data analysis model to obtain a prediction category, a starting position identifier and a stopping position identifier corresponding to a setting word in each first setting statement in the at least one first setting statement; the first setting statement is obtained by splicing a setting entity and the setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of an evaluation set entity;
calculating a loss value of the data analysis model based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the setting word in each of the at least one first setting statement, and based on the prediction category, the start position identifier and the stop position identifier corresponding to the corresponding setting word;
updating model parameters of the data analysis model based on the loss values.
An embodiment of the present invention further provides an electronic device, including:
the training unit is used for inputting at least one first set statement into the data analysis model to obtain a prediction category, an initial position identifier and a stop position identifier corresponding to a set word in each first set statement in the at least one first set statement; the first setting statement is obtained by splicing a setting entity and the setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of the evaluation set entity;
a calculating unit, configured to calculate a loss value of the data analysis model based on the calibration category, the first calibration position identifier, and the second calibration position identifier corresponding to the setting word in each of the at least one first setting statement, and based on the prediction category, the start position identifier, and the end position identifier corresponding to the corresponding setting word;
an updating unit for updating model parameters of the data analysis model based on the loss values;
the output unit is used for outputting the trained data analysis model to obtain a first model;
the extraction unit is used for inputting a first sentence into the first model to obtain an emotion category, a starting position mark and a cut-off position mark corresponding to a first keyword in the first sentence; the first statement is obtained by splicing a first entity and corresponding evaluation information.
An embodiment of the present application further provides an electronic device, including: a processor and a memory for storing a computer program capable of running on the processor,
wherein the processor is configured to execute the steps of the above-mentioned method when running the computer program.
Embodiments of the present application further provide a storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above-mentioned data analysis method or method for training a data analysis model.
In the embodiment of the invention, the prediction type, the initial position identifier and the ending position identifier corresponding to the set words in the first set sentence are used for training the data analysis model, so that the trained data analysis model can accurately output the emotion type to which the keyword of the entity to be evaluated belongs in the first sentence to be analyzed, and the initial position identifier and the ending position identifier located in the first sentence, thereby determining the corresponding keyword from the first sentence based on the output initial position identifier and ending position identifier, improving the accuracy of the extracted keyword, determining the real evaluation of the entity by the user based on the determined keyword and the emotion type to which the keyword belongs, and improving the reliability of the obtained real evaluation.
Drawings
Fig. 1 is a schematic flow chart illustrating an implementation of a data analysis method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of sample data in a sample library according to an embodiment of the present invention;
fig. 3 is a schematic diagram illustrating an implementation flow of a data analysis model processing a first setting statement in the data analysis method according to the embodiment of the present invention;
FIG. 4 is a diagram illustrating a data analysis model processing a first configuration statement according to an embodiment of the present invention;
fig. 5 is a schematic flow chart illustrating an implementation of determining a location identifier in a data analysis method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a portion of the test results provided by an embodiment of the present invention;
FIG. 7 is a flow chart illustrating an implementation of a method for training a data analysis model according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to another embodiment of the present invention;
fig. 10 is a schematic diagram of a hardware component structure of an electronic device according to an embodiment of the present invention.
Detailed Description
In an application scenario of an intelligent dialog, an entity name and a keyword of an evaluation entity need to be extracted from voice dialog content, and the keyword of the evaluation entity includes: fast speed, good quality, convenient traffic, low cost and the like.
Illustratively, the intelligent dialog is as follows:
the robot comprises: will you be mr. From now on, i.e. the return visit specialist of company, want to make a simple return visit with you?
Customer: kaki, yes.
The robot comprises: asking you to go to the building, how well the status of the business consultant?
Customer: the beam is very good.
The robot comprises: that do you feel that he is not professional?
The client: go back to the bar.
The robot comprises: how well do you feel about the building?
Customer: the environment of the cell may also be such that traffic is not very convenient.
From this intelligent dialog, the relevant evaluations of the building need to be extracted: the keyword evaluating the environment is "OK", and the keyword evaluating the traffic is "not very convenient". In some scenarios, it may also be necessary to determine whether an evaluation is positive or negative.
In the data analysis method in the related art, the features of the several words before or after a keyword are ignored during keyword extraction, so the extracted features are incomplete; moreover, the influence of positive or negative evaluations on keyword extraction is not considered. As a result, the accuracy of the extracted keywords is low, and the user's real evaluation of the entity cannot be obtained.
Based on this, the embodiment of the present invention provides a data analysis method, in the training process, a prediction category, a start position identifier and a stop position identifier corresponding to a word set in a first set sentence are used to train a data analysis model, so that the trained data analysis model can accurately output an emotion category to which a keyword of an evaluation entity in the first sentence to be analyzed belongs, and the start position identifier and the stop position identifier in the first sentence, so that a corresponding keyword can be determined from the first sentence based on the output start position identifier and the stop position identifier, the accuracy of the extracted keyword is improved, the real evaluation of a user on the entity is determined based on the determined keyword and the emotion category to which the keyword belongs, and the reliability of the obtained real evaluation is improved.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
Fig. 1 is a schematic diagram of an implementation process of a data analysis method according to an embodiment of the present invention, where the execution subject of the process is an electronic device such as a terminal or a server. As shown in Fig. 1, the data analysis method includes:
step 101: inputting at least one first set statement into a data analysis model to obtain a prediction category, an initial position identifier and a cut-off position identifier corresponding to a set word in each first set statement in the at least one first set statement; the first setting statement is obtained by splicing a setting entity and the setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of the evaluation set entity.
Here, the electronic device inputs at least one first setting statement into the data analysis model, and processes each first setting statement with the data analysis model to obtain the prediction category, start position identifier and end position identifier corresponding to the setting word in each first setting statement; wherein,
the first setting statement is obtained by splicing the setting entity and the corresponding setting statement through the first setting identification and the second setting identification. The first setting identification is positioned in front of the setting entity; the second setting mark is positioned between the setting entity and the corresponding setting statement and is used for separating the setting entity from the corresponding setting statement. In actual application, the format of the first setting statement is as follows: [ CLS ] setting entity [ SEP ] setting statement. The CLS is corresponding to a first setting identification, and the SEP is corresponding to a second setting identification.
Illustratively, the first set statement may be: [CLS] concealer function [SEP] It was good, the concealer function was a bit worse, and overall it was good. The set word in this first set statement is "a bit worse", with a corresponding start position identifier of 13 and end position identifier of 16.
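The splicing format and position identifiers can be illustrated with a small sketch. The helper names are hypothetical, and 0-based character indexing is an assumption; the example's own indices of 13 and 16 follow the original text's indexing convention and need not match this sketch.

```python
def splice_first_statement(set_entity, set_statement):
    # The splicing format described above: [CLS] set entity [SEP] set statement
    return "[CLS]" + set_entity + "[SEP]" + set_statement

def keyword_positions(set_statement, set_word):
    # Start and end position identifiers of a set word within the set
    # statement, using 0-based character positions.
    start = set_statement.find(set_word)
    return start, start + len(set_word) - 1
```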
The first setting statement is determined from a sample library that stores a plurality of samples, each comprising a setting entity, a setting statement, and the calibration category, first calibration position identifier and second calibration position identifier corresponding to the setting word in the setting statement. The first calibration position identifier corresponds to the calibrated start position, and the second calibration position identifier corresponds to the calibrated end position. The sample library may reside in a local database of the electronic device or in a remote database.
Illustratively, the training samples stored in the sample library are shown in Fig. 2. The calibrated emotion categories in Fig. 2 include a positive category, a neutral category and a negative category. In actual application, 1 represents the positive category, i.e. the keyword evaluating the setting entity belongs to the positive category and is a positive evaluation; -1 represents the negative category, i.e. the keyword evaluating the setting entity belongs to the negative category and is a negative evaluation; and 0 represents the neutral category.
In Fig. 2, the calibration category corresponding to set words such as "fast", "inexpensive", "particularly good" and "very nice" is 1; the calibration category corresponding to set words such as "poor" and "not very convenient" is -1; and the calibration category corresponding to set words such as "average" and "so-so" is 0.
It should be noted that, when the start position identifier corresponding to the set word is greater than the end position identifier, this indicates that the setting statement in the first setting statement does not include a keyword evaluating the setting entity.
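This convention amounts to a one-line check (the helper name is hypothetical):

```python
def statement_has_keyword(start_id, end_id):
    # Per the convention above, start > end signals that the set
    # statement contains no keyword evaluating the set entity.
    return start_id <= end_id
```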
Processing each first setting statement in at least one first setting statement by using a data analysis model, wherein the processing comprises the following steps:
vectorizing the first set statement to obtain a vector sequence corresponding to the first set statement, and encoding the vector sequence corresponding to the first set statement at least once to obtain an encoded vector sequence; processing a first vector in the coded vector sequence to obtain a prediction category corresponding to a set word in a first set statement; and processing the vector corresponding to the setting statement included in the first setting statement in the coded vector sequence to obtain the initial position identifier and the ending position identifier corresponding to the setting word in the first setting statement. Wherein the first vector characterizes a global feature of the first set statement.
The vector sequence is composed of a plurality of word vectors, which are the results of vectorizing words in a sentence. The first setting identification and the second setting identification correspond to a word vector respectively, and each word in the setting entity and the setting statement corresponds to a word vector. The purpose of encoding the vector sequence corresponding to the first setting statement is to generate information interaction among all vectors, and the encoded vector sequence represents the characteristic information of the corresponding first setting statement.
In practical application, in the vector sequence corresponding to the first setting statement, the vector corresponding to the first setting identifier is located at the forefront end of the vector sequence. And in the process of coding the word vector sequence by the data analysis model, converging the extracted global features of the first setting statement into the word vector corresponding to the first setting identifier.
In practical application, the feature extraction model is Bidirectional Encoder Representations from Transformers (BERT); wherein,
BERT is derived from the Transformer model, and the attention mechanism in BERT is bidirectional; that is, the feature information of each word or character string in a sentence input to BERT can be fused with the feature information of the words preceding it as well as the words following it.
In the field of natural language processing, the attention mechanism is embodied in fusing the information of a word in a sentence with the character vectors (Char Vectors) of the other words in the sentence.
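A minimal numeric illustration of this bidirectional fusion, stripped of BERT's learned projections and multiple heads: each word vector is rebuilt as a softmax-weighted mix of every position in the sentence, before and after it alike. This is a sketch of the mechanism only, not BERT itself.

```python
import numpy as np

def self_attention(X):
    # X: (n, d), one row per word.  Every row is replaced by a convex
    # combination of ALL rows, so information flows in both directions.
    scores = X @ X.T                                 # pairwise similarities
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ X
```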
In some embodiments, the data analysis model includes a feature extraction model and a full connection layer, as shown in fig. 3, the inputting at least one first setting statement into the data analysis model to obtain a prediction category, a start position identifier, and a stop position identifier corresponding to a setting word in each of the at least one first setting statement includes:
step 301: inputting at least one first set statement into the feature extraction model for processing to obtain a first vector sequence corresponding to each first set statement in the at least one first set statement; the first vector sequence comprises a first vector and a first subsequence, the first vector represents the global features of the first setting statement, and the first subsequence is formed by vectors corresponding to each word in the setting statement.
Here, the electronic device inputs at least one first setting statement into the feature extraction model, performs vectorization on each first setting statement through the feature extraction model to obtain at least one vector sequence corresponding to each first setting statement, and performs at least one encoding on the at least one vector sequence corresponding to each first setting statement to obtain an encoded first vector sequence corresponding to each first setting statement. Wherein the first vector sequence comprises a first vector, a first subsequence, and a third subsequence. The first vector characterizes a global feature of the first set statement; the first subsequence consists of vectors corresponding to each word in the set statements in the first set statements; the third subsequence is composed of the setting entity in the first setting statement and the vector corresponding to the second setting identifier.
In practical application, the first vector corresponds to a first vector in a first vector sequence, and in the first vector sequence, the first vector, the third subsequence and the first subsequence are sequentially arranged.
In actual use, the feature extraction model is BERT, and Fig. 4 shows a schematic diagram of the data analysis model processing the first set statement. As shown in Fig. 4, the data analysis model vectorizes the first set statement to obtain 3 vector sequences, and encodes the 3 vector sequences at least once to obtain the first vector sequence. Vector sequence 1 represents the position of each word or setting identifier in the first statement; the setting identifiers include the first setting identifier and the second setting identifier. Vector sequence 2 characterizes the set entity and consists of the vector corresponding to each word in the set entity. Vector sequence 3 characterizes the features of the set statement in the first set statement; it is formed by a vector E_CLS characterizing the global features of the set statement and the vector corresponding to each word in the set statement.
In practical applications, vector sequence 1, vector sequence 2, and vector sequence 3 in fig. 4 are aligned, so that some of the vectors in vector sequence 2 and vector sequence 3 are empty.
Since the first setting identifier CLS is set in front of the setting entity in the first setting statement and the second setting identifier SEP is set behind the setting entity, in fig. 4 the first vector is T_cls, the first subsequence consists of T_5, T_6, T_7, T_8 and T_n, and the third subsequence consists of T_1, T_2, T_3, T_4 and the T_sep adjacent to T_4.
In some embodiments, the position of each word or setting identifier in the first setting statement is represented by two position vector sequences, that is, two subsequences indicating where each word or setting identifier is located in the corresponding first string. Therefore, in the process of training the data analysis model, the electronic device can adjust the vectors in the two subsequences separately, so that the start position identifier and the cut-off position identifier corresponding to the set word in the first setting statement can be trained separately. The first string is the first setting identifier, the setting entity, the second setting identifier, or a string obtained by splitting the setting statement in the first setting statement at punctuation marks.
The subsequence 1 corresponding to the position vector sequence comprises a vector for representing the position 1, and the subsequence 2 corresponding to the position vector sequence comprises a vector for representing the position 2. Of course vectors other than the ones characterizing position 1 and position 2 may be in sub-sequence 1 or sub-sequence 2 corresponding to the sequence of position vectors. Position 1 indicates that the corresponding word is the first word in the corresponding first string, and position 2 indicates that the corresponding word is the last word in the corresponding first string; the position 1 and the position 2 corresponding to the first setting identifier are the same, and the position 1 and the position 2 corresponding to the second setting identifier are also the same.
Step 302: and inputting the first vector corresponding to each first setting statement into the full-connection layer to obtain the prediction category corresponding to the setting words in each first setting statement.
Here, when the electronic device acquires the first vector sequence corresponding to the first setting statement output by the feature extraction model, the electronic device inputs the first vector in the first vector sequence corresponding to the first setting statement to the full connection layer for processing, obtains a probability distribution of the calibration categories corresponding to the setting terms, and determines the calibration category corresponding to the maximum probability value in the probability distribution as the prediction category corresponding to the setting terms in the first setting statement.
Step 303: and determining a starting position identifier and a stopping position identifier corresponding to the set words in each first set statement based on the first subsequence corresponding to each first set statement.
When the electronic equipment acquires a first subsequence in a first vector sequence corresponding to a first set statement output by the feature extraction model, processing vectors in the first subsequence by using an attention mechanism, and determining probability distribution of a starting position and probability distribution of a stopping position corresponding to a set word in the first set statement; determining the initial position corresponding to the maximum probability value in the probability distribution of the initial position as the initial position corresponding to the corresponding set word, and outputting the initial position identification corresponding to the set word in the first set sentence; and determining the cut-off position corresponding to the maximum probability value in the probability distribution of the cut-off positions as the cut-off position corresponding to the corresponding set word, and outputting a cut-off position identifier corresponding to the set word in the first set statement. And the position corresponding to the maximum probability value is the focus position concerned by the corresponding set entity.
In this embodiment, because the first vector characterizes the global feature of the first setting statement, processing the first vector through the full connection layer can accurately determine the prediction category corresponding to the set word in the first setting statement. Processing the first subsequence with the attention mechanism can accurately determine the focus position attended to by the setting entity; the start position identifier and the cut-off position identifier corresponding to the set word are determined based on that focus position, which locates the set word in the first setting statement, so that the keyword evaluating the setting entity is extracted from the first setting statement and the accuracy of the extracted set word can be improved. Because the start and cut-off position identifiers are determined from the focus position attended to by the setting entity, the association features between the setting entity and the set word are captured; therefore, when the trained data analysis model is used to extract keywords of an evaluated entity from evaluation information, even if the evaluation information includes multiple setting entities and multiple keywords, the keyword corresponding to each setting entity can be accurately identified, improving the accuracy of the extracted keywords. In addition, the data analysis model performs both a classification task and a set-word extraction task, which can improve the robustness of the model.
The vectors in the first vector sequence corresponding to the first setting statement are column vectors. In some embodiments, as shown in fig. 5, the determining, based on the first subsequence corresponding to each first setting statement, a start position identifier and a cut-off position identifier corresponding to the set word in each first setting statement includes:
step 501: and converting the first subsequence corresponding to the first setting statement into at least two second vectors.
Here, the electronic device performs a combining process on elements contained in the vectors in the first subsequence, thereby converting the first subsequence into at least two second vectors.
In some embodiments, the converting the first subsequence corresponding to the first setting statement into at least two second vectors includes at least two of:
calculating the mean value of vectors in a first subsequence corresponding to the first set statement to obtain a second vector;
determining a minimum value corresponding to each dimension from vectors in a first subsequence corresponding to a first set statement to obtain a second vector;
and determining the maximum value corresponding to each dimension from the vectors in the first subsequence corresponding to the first setting statement to obtain a second vector.
Here, the electronic device calculates a mean value for each dimension based on the value for each dimension in each vector in the first subsequence, resulting in a second vector.
The electronic equipment determines the minimum value corresponding to each dimension based on the value of each dimension in each vector in the first subsequence, and combines the determined minimum values corresponding to each dimension to obtain a second vector.
The electronic equipment determines a maximum value corresponding to each dimension based on the value of each dimension in each vector in the first subsequence, and combines the determined maximum values corresponding to each dimension to obtain a second vector.
In practical application, all of the above 3 modes are adopted, so that 3 second vectors are determined.
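The three pooling modes above can be sketched as follows (a minimal NumPy illustration under assumed array shapes, not the patented implementation):

```python
import numpy as np

def pool_second_vectors(first_subsequence: np.ndarray) -> list:
    """Collapse an (L, d) subsequence into three d-dimensional "second vectors"
    by per-dimension mean, minimum and maximum pooling."""
    return [
        first_subsequence.mean(axis=0),  # mean of each dimension
        first_subsequence.min(axis=0),   # per-dimension minimum
        first_subsequence.max(axis=0),   # per-dimension maximum
    ]

# Example: a first subsequence of 5 vectors, each of dimension 4
subseq = np.arange(20, dtype=float).reshape(5, 4)
second_vectors = pool_second_vectors(subseq)
print(len(second_vectors))   # 3
print(second_vectors[0])     # column means: 8, 9, 10, 11
```

Each pooled vector summarizes the whole subsequence along a different statistic, which is why all three are computed rather than just one.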
Step 502: adding each vector in the first subsequence corresponding to the first set statement and the third vector to obtain a second subsequence; the third vector is the model parameter and represents a randomly initialized vector.
The electronic equipment determines a random vector to obtain a third vector; and adding each vector in the first subsequence to the third vector to obtain a second subsequence. Wherein the dimension of the third vector is the same as the dimension of each vector in the first subsequence.
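The addition in step 502 is a simple broadcast over the subsequence; a hedged NumPy sketch (dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
L, d = 6, 8                                # illustrative subsequence length and dimension
first_subseq = rng.normal(size=(L, d))     # first subsequence: L vectors of dimension d
third_vector = rng.normal(size=d)          # randomly initialized model parameter

# Broadcast add: the same third vector is added to every vector in the subsequence
second_subseq = first_subseq + third_vector
print(second_subseq.shape)  # (6, 8)
```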
Step 503: calculating attention degree vector pairs corresponding to each vector in the corresponding second subsequence based on each second vector and the corresponding second subsequence of the first setting statement; the attention vector pair includes a first attention vector characterizing the start position probability and a second attention vector characterizing the stop position probability.
Here, the electronic device calculates, for each of the at least two second vectors, a corresponding attention vector pair for each vector in the corresponding second subsequence based on the second vector corresponding to the first setting statement and each vector in the second subsequence using the set attention formula, thereby obtaining at least two attention vector pairs for each vector in the second subsequence. The number of attention degree vector pairs corresponding to each vector is equal to the number of second vectors.
In practical application, the set attention formula is as follows:

a_i = exp(e_i) / Σ_{j=1}^{L} exp(e_j)    (Equation 1)

e_i = V^T · tanh(W·h_i + U·e_entity)    (Equation 2)

where a_i is the attention degree vector characterizing the attention of the second vector to the ith vector in the second subsequence; exp(e_i) is the exponential function of e_i; V denotes the third vector and V^T its transpose; V, W and U are model parameters to be trained; h_i denotes the ith vector in the second subsequence; e_entity denotes the second vector; and L is the total number of vectors in the second subsequence. It should be noted that each vector in the first vector sequence, the third vector and the attention degree vector have the same size; in practical application, they are all 768-dimensional column vectors.
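Under the assumption that Equation 2 takes the standard additive-attention form with a tanh nonlinearity, the computation might be sketched as follows (NumPy; all shapes and parameter values are illustrative, not the trained model's):

```python
import numpy as np

def softmax(x):
    x = x - x.max()                      # numerical stability
    return np.exp(x) / np.exp(x).sum()

def additive_attention(H, e_entity, V, W, U):
    """Score each vector h_i in the second subsequence against the pooled
    entity vector (Equation 2), then normalize with softmax (Equation 1).
    The tanh nonlinearity is assumed from standard additive attention."""
    scores = np.array([V @ np.tanh(W @ h + U @ e_entity) for h in H])
    return softmax(scores)               # a_i for i = 1..L

rng = np.random.default_rng(0)
L, d = 6, 8
H = rng.normal(size=(L, d))              # second subsequence
e_entity = H.mean(axis=0)                # one of the pooled "second vectors"
V = rng.normal(size=d)                   # trainable parameters V, W, U
W = rng.normal(size=(d, d))
U = rng.normal(size=(d, d))

a = additive_attention(H, e_entity, V, W, U)
print(a.shape, round(a.sum(), 6))        # (6,) 1.0 — one weight per subsequence vector
```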
In order to improve the accuracy of the start position identifier and the stop position identifier corresponding to the set word output by the data analysis model, in some embodiments, the calculating, based on each second vector and the second subsequence corresponding to the first set statement, a pair of attention vectors corresponding to each vector in the corresponding second subsequence includes:
respectively calculating a first attention degree vector and a second attention degree vector corresponding to each vector in the corresponding second subsequence based on a second vector and a second subsequence corresponding to the first setting statement by adopting a first setting function and a second setting function; wherein the model parameters in the first and second set functions are not shared.
The electronic equipment calculates a first attention degree vector corresponding to each vector in the corresponding second subsequence based on the second vector and the second subsequence corresponding to the first setting statement by adopting a first setting function; and calculating a second attention vector corresponding to each vector in the corresponding second subsequence by adopting a second setting function.
In actual application, the first setting function and the second setting function correspond to equations 1 and 2 in step 503; the model parameters not shared by the first setting function and the second setting function refer to the three model parameters V, W, and U in the above equation 2.
It should be noted that the initial values of the model parameters in the first setting function and the second setting function may be the same, and the model parameters in the first setting function and the second setting function are not shared in the process of training the data analysis model. Adjusting model parameters in a first set function based on a loss value between a first calibration position identifier corresponding to a corresponding set word and a corresponding initial position identifier; and adjusting the model parameters in the second setting function based on the loss value between the second calibration position identifier corresponding to the corresponding setting word and the corresponding cut-off position identifier.
In this embodiment, the first setting function and the second setting function do not share the model parameter, so that the start position identifier and the stop position identifier corresponding to the set word can be trained separately, and the training efficiency can be improved.
Step 504: and determining the starting position identifier corresponding to the setting words in the first setting statement based on the average value of the first attention degree vector corresponding to each vector in the second subsequence corresponding to the first setting statement.
Here, when determining at least two attention degree vector pairs corresponding to each vector in the second subsequence corresponding to the first setting statement, the electronic device determines a first attention degree vector corresponding to each vector in the second subsequence from each of the two attention degree vector pairs; performing mean value operation on all the first attention degree vectors corresponding to each vector in the second subsequence to obtain the mean value of the first attention degree vectors corresponding to each vector in the second subsequence; determining the maximum mean value from the mean values of the first attention degree vectors corresponding to each vector in the second subsequence, determining the position of the vector corresponding to the maximum mean value in the first setting statement as the initial position of the setting word, and outputting the initial position identification corresponding to the setting word in the first setting statement.
In practical application, because the total number of vectors in the first vector sequence corresponding to the first setting statement is equal to the total number of words and setting identifiers in the first setting statement, and the vectors in the first vector sequence are arranged according to the sorting order of the words or the setting identifiers in the first setting statement, the electronic device can output the starting position identifier corresponding to the setting word in the first setting statement by using the sequence number of the vector corresponding to the maximum average value of the first attention vector in the corresponding first vector sequence corresponding to the first setting statement.
Step 505: and determining a cut-off position identifier corresponding to the set word in the first set statement based on the mean value of the second attention degree vector corresponding to each vector in the second subsequence corresponding to the first set statement.
Here, the electronic device performs a mean operation on all second attention vectors corresponding to each vector in the second subsequence to obtain a mean value of the second attention vectors corresponding to each vector in the second subsequence; determining the maximum mean value from the mean values of the second attention degree vectors corresponding to each vector in the second subsequence, determining the position of the vector corresponding to the maximum mean value in the first set statement as the cut-off position of the set word, and outputting the cut-off position identification corresponding to the set word in the first set statement.
In practical application, the electronic device may output, as the cut-off position identifier corresponding to the set word in the first setting statement, the sequence number, in the corresponding first vector sequence, of the vector corresponding to the maximum mean value of the second attention degree vectors.
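Steps 504 and 505 — averaging the attention distributions obtained from each second vector, then taking the argmax — can be sketched as follows (the numbers are illustrative, not real model outputs):

```python
import numpy as np

def predict_span(start_attn_list, end_attn_list):
    """Average the attention distributions from the (three) second vectors,
    then take the position of the maximum mean as the start / cut-off
    position identifiers."""
    start_mean = np.mean(start_attn_list, axis=0)
    end_mean = np.mean(end_attn_list, axis=0)
    return int(start_mean.argmax()), int(end_mean.argmax())

# Three first-attention and three second-attention distributions over 6 tokens
start_attn = [np.array([.10, .50, .20, .10, .05, .05]),
              np.array([.20, .40, .20, .10, .05, .05]),
              np.array([.10, .60, .10, .10, .05, .05])]
end_attn   = [np.array([.05, .10, .20, .50, .10, .05]),
              np.array([.05, .10, .30, .40, .10, .05]),
              np.array([.05, .05, .20, .55, .10, .05])]
print(predict_span(start_attn, end_attn))  # (1, 3): span from token 1 to token 3
```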
In this embodiment, the vectors in the first subsequence corresponding to the first setting statement are combined, so that the first subsequence is converted into at least two second vectors; each vector in the first subsequence is added to the third vector to obtain the second subsequence; each of the at least two second vectors and the second subsequence are processed with the attention mechanism to determine at least two attention degree vector pairs corresponding to each vector in the second subsequence, and the start position identifier and the cut-off position identifier corresponding to the set word in the first setting statement are determined based on those attention degree vector pairs. Because the vectors in the first subsequence are combined when determining the second vectors, the features of the words before and after the set word are comprehensively considered, so that the features extracted for the set word in the first setting statement are more complete; determining the start and cut-off position identifiers based on these complete features can improve the accuracy of the extracted set word. In addition, considering that most keywords evaluating a setting entity are adjectives, adding the third vector to each vector in the first subsequence is equivalent to adding extra features on the basis of the first subsequence, which can reduce cases where the extracted keyword has too many or too few characters.
Step 102: and calculating a loss value of the data analysis model by using a set loss function based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the set word in each of the at least one first set statement, and based on the prediction category, the start position identifier and the stop position identifier corresponding to the corresponding set word.
The electronic device calculates a first loss value between the calibration category and the prediction category corresponding to the setting word in the first setting statement by using the set loss function, calculates a second loss value between the first calibration position identifier and the corresponding start position identifier corresponding to the setting word in the first setting statement, calculates a third loss value between the second calibration position identifier and the corresponding stop position identifier corresponding to the setting word in the first setting statement, and calculates the sum of the first loss value, the second loss value and the third loss value to obtain a loss value of the data analysis model.
In some embodiments, the set penalty function comprises a first sub-function, a second sub-function, a third sub-function, a first weight, a second weight, and a third weight; the calculating the loss value of the data analysis model by using the set loss function based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the set word in each of the at least one first set sentence, and based on the prediction category, the start position identifier and the stop position identifier corresponding to the corresponding set word includes:
calculating a first loss value between a calibration category corresponding to a set word in a first set statement and a corresponding prediction category based on the first subfunction and the first weight;
calculating a second loss value between a first calibration position identifier corresponding to a set word in a first set statement and a corresponding initial position identifier based on the second subfunction and the second weight;
calculating a third loss value between a second calibration position identifier corresponding to a set word in the first set statement and a corresponding cut-off position identifier based on the third subfunction and the third weight;
calculating a loss value of the data analysis model based on a first loss value, a second loss value and a third loss value corresponding to a first set statement; wherein the second weight and the third weight are both greater than the first weight.
Here, the first, second, and third sub-functions are all cross-entropy functions. The sum of the first weight, the second weight, and the third weight is equal to 1, and the second weight and the third weight are both greater than the first weight. The first weight, the second weight, and the third weight are adjustable during the training process.
The electronic equipment substitutes the calibration type and the corresponding prediction type corresponding to the set words in the first set sentence into the first subfunction to calculate a first function value, and determines the product of the first function value and the first weight as a first loss value.
And the electronic equipment substitutes the first calibration position identification corresponding to the set word in the first set sentence and the corresponding initial position identification into a second subfunction to calculate a second function value, and determines the product of the second function value and the second weight as a second loss value.
And the electronic equipment substitutes the second calibration position identifier corresponding to the set word in the first setting statement and the corresponding cut-off position identifier into the third subfunction to calculate a third function value, and determines the product of the third function value and the third weight as the third loss value.
Illustratively, the set loss function is:

Loss = w_1·L_1 + w_2·L_2 + w_3·L_3

where w_1 is the first weight, and the first subfunction is

L_1 = -(1/N) Σ_{i=1}^{N} p(x_i)·log q(x_i)

p(x_i) characterizes the calibration category corresponding to the set word in the ith first setting statement; q(x_i) characterizes the prediction category corresponding to the set word in the ith first setting statement; N characterizes the total number of first setting statements input into the data analysis model in the same batch. w_2 is the second weight, and the second subfunction is

L_2 = -(1/N) Σ_{i=1}^{N} p(y_i)·log q(y_i)

p(y_i) characterizes the first calibration position identifier corresponding to the set word in the ith first setting statement, and q(y_i) characterizes the start position identifier corresponding to the set word in the ith first setting statement. w_3 is the third weight, and the third subfunction is

L_3 = -(1/N) Σ_{i=1}^{N} p(z_i)·log q(z_i)

p(z_i) characterizes the second calibration position identifier corresponding to the set word in the ith first setting statement, and q(z_i) characterizes the cut-off position identifier corresponding to the set word in the ith first setting statement.
In actual application, the second weight and the third weight are the same, with w_1 : w_2 : w_3 = 1 : 5 : 5.
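A minimal sketch of the weighted cross-entropy loss (the 1:5:5 ratio is normalized so the weights sum to 1, as stated above; the one-hot labels and toy distributions are assumptions):

```python
import numpy as np

def weighted_loss(p_cls, q_cls, p_start, q_start, p_end, q_end,
                  w=(1 / 11, 5 / 11, 5 / 11)):
    """Weighted sum of three cross-entropy terms: category, start position,
    cut-off position. Weights follow w1:w2:w3 = 1:5:5, normalized to 1."""
    def cross_entropy(p, q):
        # p: one-hot calibration labels, q: predicted distributions, shape (N, C)
        return -np.mean(np.sum(p * np.log(q), axis=1))
    return (w[0] * cross_entropy(p_cls, q_cls)
            + w[1] * cross_entropy(p_start, q_start)
            + w[2] * cross_entropy(p_end, q_end))

# One sample; the same toy label/prediction pair reused for all three terms
p = np.array([[0., 1., 0.]])
q = np.array([[.1, .8, .1]])
loss = weighted_loss(p, q, p, q, p, q)
print(round(loss, 4))  # 0.2231, i.e. -log(0.8) since the weights sum to 1
```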
In this embodiment, by adjusting the first weight, the second weight, and the third weight, semantic information in a setting sentence in the first setting sentence is fully utilized, accuracy of an initial position identifier and a final position identifier corresponding to an output setting word is improved, and accuracy of the extracted keyword is further improved.
Step 103: updating model parameters of the data analysis model based on the loss values.
Here, the electronic device updates the model parameters of the data analysis model based on the calculated loss values to improve the accuracy of the prediction result output by the data analysis model. In practical application, the electronic device updates the third vector in the data analysis model, V, W and U in the above equation 2, and adjusts and sets the first weight, the second weight and the third weight in the loss function based on the loss value.
Here, an update stop condition may be set; when the update stop condition is satisfied, the weight parameters obtained by the last update are determined as the weight parameters used by the trained data analysis model. The update stop condition may be a set number of training iterations. Of course, the update stop condition is not limited to this, and may also be, for example, a set mean Average Precision (mAP).
in the process of training the data analysis model, adam is used as an optimizer to optimize the model parameters of the data analysis model. Fruit of Chinese wolfberryWhen the method is applied, the learning rate of the data analysis model is 3e -5 . The larger the learning rate is, the larger the influence of the output error on the model parameters is, and the faster the model parameters are updated.
After the data analysis model is trained, step 104 is performed to put the data analysis model into use.
In some embodiments, after training the data analysis model, the method further comprises:
determining a second keyword from a second set sentence based on a starting position mark and a stop position mark corresponding to the second keyword in the second set sentence;
calculating the accuracy of the data analysis model based on the first word number and the second word number corresponding to the second setting statement;
outputting a trained data analysis model under the condition that the accuracy is greater than or equal to a set threshold;
the first word number represents the word number included by the intersection of the set words corresponding to the second set statement and the second keyword; the second word number represents the word number included in the union set of the setting words corresponding to the second setting statement and the second keyword.
Here, the second setting statement is a test sample, and the electronic device tests the accuracy of the trained data analysis model using the second setting statement.
The electronic equipment inputs the second set sentence into the trained data analysis model to obtain a prediction category, an initial position identifier and a cut-off position identifier corresponding to a second keyword in the second set sentence; and extracting the second keyword from the second set statement based on the starting position mark and the ending position mark corresponding to the second keyword.
When the second keyword is determined from the second setting statement, the electronic device segments both the set word corresponding to the second setting statement and the second keyword into single characters, determines the characters common to both to obtain the intersection of the set word and the second keyword, and counts the characters in the intersection to obtain the first word number; it then deduplicates all the segmented characters to obtain the union of the set word and the second keyword, and counts the characters in the union to obtain the second word number.
Under the condition that the first word number and the second word number corresponding to the second setting statement are calculated, determining the quotient of the first word number and the second word number as the accuracy of the data analysis model, and judging whether the accuracy is smaller than a setting threshold value or not; under the condition that the accuracy is smaller than the set threshold, executing the steps 101 to 103 again, and continuing to train the data analysis model; and stopping training the data analysis model when the accuracy is greater than or equal to the set threshold value, and outputting the trained data analysis model.
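The retrain-until-threshold loop described here can be sketched abstractly (all names, the threshold value, and the toy accuracy curve are illustrative, not the patent's implementation):

```python
def train_until_accurate(train_step, evaluate, threshold, max_rounds=100):
    """Repeat the training steps (101-103) until the Jaccard-based accuracy
    returned by evaluate() reaches the set threshold, then stop."""
    for rounds in range(1, max_rounds + 1):
        train_step()                      # one round of forward, loss, update
        if evaluate() >= threshold:       # accuracy on the second setting statements
            return True, rounds
    return False, max_rounds

# Toy stand-ins: accuracy improves by 0.25 per round of training
state = {"acc": 0.0}
ok, rounds = train_until_accurate(
    lambda: state.__setitem__("acc", state["acc"] + 0.25),
    lambda: state["acc"],
    threshold=0.75,
)
print(ok, rounds)  # True 3
```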
And the electronic equipment takes the weight parameter obtained after the final update as the weight parameter used by the trained data analysis model.
It should be noted that the setting words, the first keywords, and the second keywords mentioned in the embodiments of the present invention generally refer to words or phrases.
In practical application, the accuracy of the data analysis model is calculated using the formula:

J(A, B) = |A ∩ B| / |A ∪ B|

where J(A, B) is called the Jaccard similarity coefficient and represents the similarity between A and B; A corresponds to the set word in the second setting statement and B corresponds to the second keyword; A ∩ B represents the intersection of the set word and the second keyword; and A ∪ B represents the union of the set word and the second keyword.
Illustratively, if the set word A segments into four single characters and the second keyword B segments into the first three of those characters, then |A ∩ B| = 3 and |A ∪ B| = 4, giving J(A, B) = 3/4 = 0.75.
Wherein, the larger the Jaccard similarity coefficient, the more similar the two character strings; the smaller the coefficient, the more dissimilar they are. Evaluating the model with the Jaccard similarity coefficient reflects well the keyword-extraction effect of the data analysis model.
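A character-level Jaccard computation consistent with the formula above might look like this (set semantics are assumed for the intersection as well, matching the deduplicated union described in the text):

```python
def jaccard(a: str, b: str) -> float:
    """Character-level Jaccard similarity: J(A, B) = |A ∩ B| / |A ∪ B|.
    Repeated characters collapse under set semantics."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# A set word of four distinct characters vs. a keyword missing the last one
print(jaccard("wxyz", "wxy"))  # 0.75
```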
In practical application, when the trained data analysis model is tested on a test set, the Jaccard similarity coefficient is 0.798. Some of the test results are shown in fig. 6; the keywords extracted in the black frames are correct, except for one error.
Step 104: and outputting the trained data analysis model to obtain a first model.
Step 105: inputting a first sentence into the first model to obtain an emotion category, a starting position identifier and a cut-off position identifier corresponding to a first keyword in the first sentence; the first statement is obtained by splicing a first entity and corresponding evaluation information.
Step 105 is similar to step 101, and the implementation process of step 105 refers to the related description in step 101, which is not described herein again.
In the embodiment, in the training process, the prediction type, the starting position identifier and the ending position identifier corresponding to the words are set in the first set sentence, and the data analysis model is trained, so that the trained data analysis model can accurately output the emotion type to which the keyword of the entity to be evaluated belongs in the first sentence to be analyzed, and the starting position identifier and the ending position identifier located in the first sentence, and therefore the corresponding keyword can be extracted from the first sentence based on the output starting position identifier and the ending position identifier, the accuracy of the extracted keyword is improved, the real evaluation of the entity by the user is determined based on the determined keyword and the emotion type to which the keyword belongs, and the reliability of the obtained real evaluation is improved.
Fig. 7 is a schematic diagram of an implementation process of the method for training a data analysis model according to the embodiment of the present invention, where an execution subject of the process is an electronic device such as a terminal and a server. It should be noted that the electronic device in the embodiment corresponding to the training data analysis model in this embodiment may be the same as or different from the electronic device executing the data analysis method. As shown in fig. 7, the method of training a data analysis model includes:
step 701: inputting at least one first set statement into a data analysis model to obtain a prediction category, an initial position identifier and a cut-off position identifier corresponding to a set word in each first set statement in the at least one first set statement; the first setting statement is obtained by splicing a setting entity and the setting statement, and the setting statement represents evaluation information of the setting entity; the setting words are keywords of the evaluation setting entity.
Step 702: calculating a loss value of the data analysis model based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the setting word in each of the at least one first setting statement, and based on the prediction category, the start position identifier and the ending position identifier corresponding to the corresponding setting word.
Step 703: updating model parameters of the data analysis model based on the loss values.
Steps 701 to 703 are the same as steps 101 to 103 in the embodiment corresponding to fig. 1, and the relevant description in steps 101 to 103 is referred to for the implementation process, which is not repeated herein.
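Steps 701 to 703 amount to one forward-loss-update cycle. The following is a minimal sketch of that cycle, using a toy linear classifier as a hypothetical stand-in for the data analysis model; the dimensions, learning rate, and iteration count are illustrative assumptions, not values disclosed in this embodiment:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3)) * 0.1      # model parameters, randomly initialized
x = rng.normal(size=(4,))              # features of one first setting statement
target = 2                             # calibration category of its setting word

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.5
for _ in range(200):
    probs = softmax(x @ W)             # step 701: obtain the prediction category
    loss = -np.log(probs[target])      # step 702: loss against the calibration
    grad_logits = probs.copy()
    grad_logits[target] -= 1.0         # gradient of softmax + negative log-likelihood
    W -= lr * np.outer(x, grad_logits) # step 703: update the model parameters

assert int(np.argmax(softmax(x @ W))) == target
```

After enough updates the toy model reproduces the calibrated category, mirroring how the loss value of step 702 drives the parameter updates of step 703.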
In this embodiment, in the training process, the data analysis model is trained by using the prediction category, the start position identifier, and the ending position identifier corresponding to the words set in the first set sentence, so as to obtain the trained data analysis model. The trained data analysis model can accurately output the emotion category to which the keyword of the entity to be evaluated belongs in the first sentence to be analyzed, together with the keyword's start position identifier and stop position identifier within the first sentence. The corresponding keyword can then be determined from the first sentence based on the output start and stop position identifiers, which improves the accuracy of the extracted keyword; the user's real evaluation of the entity is then determined based on the determined keyword and the emotion category to which it belongs, which improves the reliability of the obtained real evaluation.
In order to implement the data analysis method according to the embodiment of the present invention, an embodiment of the present invention further provides an electronic device, as shown in fig. 8, where the electronic device includes:
a training unit 81, configured to input at least one first setting statement to a data analysis model, to obtain a prediction category, a start position identifier, and a stop position identifier corresponding to a setting word in each first setting statement in the at least one first setting statement; the first setting statement is obtained by splicing a setting entity and the setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of an evaluation set entity;
a calculating unit 82, configured to calculate a loss value of the data analysis model based on the calibration category, the first calibration position identifier, and the second calibration position identifier corresponding to the setting word in each of the at least one first setting statement, and based on the prediction category, the start position identifier, and the end position identifier corresponding to the corresponding setting word;
an updating unit 83, configured to update model parameters of the data analysis model based on the loss values;
an output unit 84, configured to output the trained data analysis model to obtain a first model;
the extracting unit 85 is configured to input a first sentence into the first model, and obtain an emotion category, a start position identifier, and a stop position identifier corresponding to a first keyword in the first sentence; the first statement is obtained by splicing a first entity and corresponding evaluation information.
In some embodiments, the set loss function comprises a first sub-function, a second sub-function, a third sub-function, a first weight, a second weight, and a third weight; the calculating unit 82 is specifically configured to:
calculating a first loss value between a calibration category corresponding to a set word in a first set statement and a corresponding prediction category based on the first subfunction and the first weight;
calculating a second loss value between a first calibration position identifier corresponding to a set word in a first set statement and a corresponding initial position identifier based on the second subfunction and the second weight;
calculating a third loss value between a second calibration position identifier corresponding to the set word in the first set statement and a corresponding cut-off position identifier based on the third subfunction and the third weight;
calculating a loss value of the data analysis model based on a first loss value, a second loss value and a third loss value corresponding to a first set statement; wherein the second weight and the third weight are both greater than the first weight.
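The weighted combination above can be sketched as follows. The cross-entropy form of the three sub-functions and the concrete weight values are illustrative assumptions; the embodiment only requires three sub-functions and that the second and third weights both exceed the first:

```python
import numpy as np

def cross_entropy(probs, label):
    # Assumed sub-function: negative log-likelihood of the calibrated label.
    return -np.log(probs[label])

# Hypothetical model outputs for one first setting statement.
pred_category = np.array([0.1, 0.7, 0.2])      # prediction category distribution
start_probs   = np.array([0.6, 0.2, 0.1, 0.1]) # start position distribution
end_probs     = np.array([0.1, 0.1, 0.2, 0.6]) # cut-off position distribution

# Calibration category, first and second calibration position identifiers.
cal_category, cal_start, cal_end = 1, 0, 3

# The second and third weights are both greater than the first weight,
# emphasizing accurate keyword boundaries over the emotion category.
w1, w2, w3 = 0.2, 0.4, 0.4

loss = (w1 * cross_entropy(pred_category, cal_category)  # first loss value
        + w2 * cross_entropy(start_probs, cal_start)     # second loss value
        + w3 * cross_entropy(end_probs, cal_end))        # third loss value
assert w2 > w1 and w3 > w1
```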
In some embodiments, the data analysis model comprises a feature extraction model and a fully connected layer; the training unit 81 is specifically configured to:
inputting at least one first setting statement into the feature extraction model for processing to obtain a first vector sequence corresponding to each first setting statement in the at least one first setting statement; the first vector sequence comprises a first vector and a first subsequence, the first vector represents the global features of the first setting statement, and the first subsequence is formed by vectors corresponding to each word in the first setting statement;
inputting a first vector corresponding to each first setting statement into the full-connection layer to obtain a prediction category corresponding to a setting word in each first setting statement;
and determining a starting position identifier and a stopping position identifier corresponding to the set words in each first set statement based on the first subsequence corresponding to each first set statement.
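A minimal sketch of how the first vector feeds the fully connected layer; the BERT-style layout of the first vector sequence (a global vector followed by one vector per word) and all dimensions are hypothetical assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical output of the feature extraction model for one first setting
# statement of 6 words: 7 vectors of dimension 8.
first_vector_sequence = rng.normal(size=(7, 8))
first_vector = first_vector_sequence[0]        # global features of the statement
first_subsequence = first_vector_sequence[1:]  # one vector per word

# Fully connected layer mapping the first vector to 3 emotion categories.
W = rng.normal(size=(8, 3))
b = np.zeros(3)
logits = first_vector @ W + b
prediction_category = int(np.argmax(logits))   # prediction category of the setting word
```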
In some embodiments, the training unit 81 is specifically configured to:
converting a first subsequence corresponding to the first setting statement into at least two second vectors;
adding each vector in the first subsequence corresponding to the first setting statement to the third vector to obtain a second subsequence; the third vector is the model parameter and represents a randomly initialized vector;
calculating attention degree vector pairs corresponding to each vector in the corresponding second subsequence based on each second vector and the corresponding second subsequence of the first setting statement; the attention vector pair comprises a first attention vector for representing the starting position probability and a second attention vector for representing the ending position probability;
determining an initial position identifier corresponding to a set word in a first set statement based on the mean value of the first attention degree vector corresponding to each vector in a second subsequence corresponding to the first set statement;
and determining a cut-off position identifier corresponding to the setting words in the first setting statement based on the mean value of the second attention degree vector corresponding to each vector in the second subsequence corresponding to the first setting statement.
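The steps above can be sketched as follows. The scaled dot-product form of the attention, the pooling used for the second vectors, and the dimensions are assumptions; the embodiment only fixes that at least two second vectors query a second subsequence (the first subsequence shifted by a learned third vector) through two functions with non-shared parameters, and that the start and cut-off position identifiers come from the means of the resulting attention degree vectors:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
n_words, dim = 6, 8
first_subsequence = rng.normal(size=(n_words, dim))  # one vector per word

# At least two second vectors from the first subsequence (mean and max pooling).
second_vectors = [first_subsequence.mean(axis=0), first_subsequence.max(axis=0)]

# Third vector: a randomly initialized model parameter added to every vector.
third_vector = rng.normal(size=(dim,)) * 0.1
second_subsequence = first_subsequence + third_vector

# Two setting functions with non-shared model parameters (separate projections).
W_start = rng.normal(size=(dim, dim))
W_end   = rng.normal(size=(dim, dim))

def attention(query, keys, W):
    # One attention degree vector: a distribution over subsequence positions.
    return softmax((keys @ W) @ query / np.sqrt(dim))

first_attention  = [attention(q, second_subsequence, W_start) for q in second_vectors]
second_attention = [attention(q, second_subsequence, W_end)   for q in second_vectors]

# Position identifiers from the means of the attention degree vectors.
start_id = int(np.argmax(np.mean(first_attention, axis=0)))   # start position
stop_id  = int(np.argmax(np.mean(second_attention, axis=0)))  # cut-off position
```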
In some embodiments, the training unit 81 is specifically configured to perform at least two of the following:
calculating the average value of vectors in a first subsequence corresponding to the first set statement to obtain a second vector;
determining a minimum value corresponding to each dimension from vectors in a first subsequence corresponding to a first set statement to obtain a second vector;
and determining the maximum value corresponding to each dimension from the vectors in the first subsequence corresponding to the first setting statement to obtain a second vector.
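These three conversions are ordinary mean, per-dimension minimum, and per-dimension maximum pooling over the word vectors, shown here on a tiny hypothetical first subsequence:

```python
import numpy as np

first_subsequence = np.array([[1.0, 4.0],
                              [3.0, 2.0],
                              [2.0, 6.0]])    # one vector per word, dimension 2

# Three ways to obtain a second vector from the first subsequence:
mean_vector = first_subsequence.mean(axis=0)  # average of the vectors -> [2.0, 4.0]
min_vector  = first_subsequence.min(axis=0)   # per-dimension minimum  -> [1.0, 2.0]
max_vector  = first_subsequence.max(axis=0)   # per-dimension maximum  -> [3.0, 6.0]
```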
In some embodiments, training unit 81 is specifically configured to:
respectively calculating a first attention degree vector and a second attention degree vector corresponding to each vector in the corresponding second subsequence based on a second vector and a second subsequence corresponding to the first setting statement by adopting a first setting function and a second setting function; wherein the model parameters in the first setting function and the second setting function are not shared.
In some embodiments, the electronic device further comprises a test unit for:
determining a second keyword from a second set sentence based on a starting position mark and a stop position mark corresponding to the second keyword in the second set sentence;
calculating the accuracy of the data analysis model based on the first word number and the second word number corresponding to the second setting statement;
outputting a trained data analysis model under the condition that the accuracy is greater than or equal to a set threshold;
the first word number represents the number of words included in the intersection of the set words corresponding to the second set statement and the second keyword; the second word number represents the word number included in the union set of the setting words corresponding to the second setting sentence and the second keyword.
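The accuracy defined by the first and second word numbers is a word-level Jaccard ratio (intersection size over union size). The following is a minimal sketch with hypothetical words and threshold:

```python
def keyword_accuracy(set_words, extracted_keyword):
    # First word number: words in the intersection of set words and keyword;
    # second word number: words in their union.
    a, b = set(set_words), set(extracted_keyword)
    first_word_number = len(a & b)
    second_word_number = len(a | b)
    return first_word_number / second_word_number

# Hypothetical calibrated set words vs. the second keyword the model extracted.
acc = keyword_accuracy(["great", "battery", "life"], ["battery", "life"])
# 2 shared words out of 3 in the union, so acc == 2/3.

threshold = 0.6                      # hypothetical set threshold
ready_to_output = acc >= threshold   # output the trained model if accuracy suffices
```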
In practical applications, each unit included in the electronic device may be implemented by a processor in the electronic device, such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Micro Control Unit (MCU), or a Field-Programmable Gate Array (FPGA), or jointly by a processor and a communication interface in the electronic device. Of course, the processor needs to run the program stored in the memory to realize the functions of the above program modules.
It should be noted that when the electronic device in the above embodiment performs data analysis, the division into the above program modules is merely illustrative. In practical applications, the processing may be allocated to different program modules as needed; that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the electronic device and the data analysis method provided by the above embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which are not repeated here.
In order to implement the method for training a data analysis model according to the embodiment of the present invention, an embodiment of the present invention further provides an electronic device, as shown in fig. 9, where the electronic device includes:
a training unit 91, configured to input at least one first setting statement to a data analysis model, to obtain a prediction category, a start position identifier, and a stop position identifier corresponding to a setting word in each first setting statement in the at least one first setting statement; the first setting statement is obtained by splicing a setting entity and the setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of an evaluation set entity;
a calculating unit 92, configured to calculate a loss value of the data analysis model based on the calibration category, the first calibration position identifier, and the second calibration position identifier corresponding to the setting word in each of the at least one first setting statement, and based on the prediction category, the start position identifier, and the end position identifier corresponding to the corresponding setting word;
an updating unit 93, configured to update model parameters of the data analysis model based on the loss values.
In practical applications, the training unit 91, the calculating unit 92 and the updating unit 93 may be implemented by a processor in an electronic device, such as a CPU, a DSP, an MCU or an FPGA.
It should be noted that when the electronic device provided in the above embodiment trains the data analysis model, the division into the above program modules is merely illustrative. In practical applications, the processing may be allocated to different program modules as needed; that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the electronic device provided in the above embodiment and the method embodiment for training the data analysis model belong to the same concept; for the specific implementation process, refer to the method embodiment, which is not repeated here.
Based on the hardware implementation of the program module, in order to implement the method according to the embodiment of the present invention, an embodiment of the present invention further provides an electronic device. Fig. 10 is a schematic diagram of a hardware component structure of an electronic device according to an embodiment of the present invention, and as shown in fig. 10, the electronic device 10 includes:
a communication interface 101 capable of performing information interaction with other devices such as network devices and the like;
and a processor 102, connected with the communication interface 101 to implement information interaction with other devices, and configured to execute, when running a computer program, the data analysis method or the method for training a data analysis model provided by one or more of the above technical solutions. The computer program is stored in the memory 103.
Of course, in practice, the various components in the electronic device 10 are coupled together by the bus system 104. It is understood that the bus system 104 is used to enable connection and communication between these components. In addition to a data bus, the bus system 104 includes a power bus, a control bus, and a status signal bus. However, for clarity of illustration, the various buses are labeled as the bus system 104 in fig. 10.
The memory 103 in embodiments of the present invention is used to store various types of data to support the operation of the electronic device 10. Examples of such data include: any computer program for operating on the electronic device 10.
It will be appreciated that the memory 103 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 103 described in the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
The method disclosed by the above-mentioned embodiments of the present invention may be applied to the processor 102, or may be implemented by the processor 102. The processor 102 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 102. The processor 102 described above may be a general purpose processor, DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, or the like. Processor 102 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 103, and the processor 102 reads the program in the memory 103 and performs the steps of the foregoing method in combination with the hardware thereof.
Optionally, when the processor 102 executes the program, the corresponding process implemented by the terminal in each method according to the embodiment of the present invention is implemented, and for brevity, is not described again here.
In an exemplary embodiment, the embodiment of the present invention further provides a storage medium, i.e. a computer storage medium, specifically a computer-readable storage medium, for example, including the memory 103 storing a computer program, which is executable by the processor 102 of the terminal to perform the steps of the foregoing method. The computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. The device embodiments described above are merely illustrative. For example, the division of the units is only a logical functional division, and there may be other divisions in actual implementation, such as: multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may be separately used as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps of implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer-readable storage medium, and when executed, executes the steps including the method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The technical means and the technical features described in the embodiments of the present invention may be arbitrarily combined without conflict.
In addition, in the present examples, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a particular order or sequence.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (9)

1. A method of data analysis, comprising:
inputting at least one first set statement into a data analysis model to obtain a prediction category, an initial position identifier and a cut-off position identifier corresponding to a set word in each first set statement in the at least one first set statement; the first setting statement is obtained by splicing a setting entity and a setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of the evaluation set entity;
calculating a loss value of the data analysis model by using a set loss function based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the set word in each of the at least one first set sentence, and based on the prediction category, the start position identifier and the stop position identifier corresponding to the corresponding set word;
updating model parameters of the data analysis model based on the loss values;
outputting the trained data analysis model to obtain a first model;
inputting a first sentence into the first model to obtain an emotion category, a starting position identifier and a stopping position identifier corresponding to a first keyword in the first sentence; the first statement is obtained by splicing a first entity and corresponding evaluation information; wherein,
the data analysis model comprises a feature extraction model; the starting position identification and the ending position identification corresponding to the set words in each first set statement are obtained in the following modes:
inputting at least one first set statement into the feature extraction model for processing to obtain a first vector sequence corresponding to each first set statement in the at least one first set statement; the first vector sequence comprises a first vector and a first subsequence, the first vector is used for representing the global features of the first setting statement, and the first subsequence is formed by vectors corresponding to each word in the first setting statement;
based on the first subsequence corresponding to each first setting statement, determining the starting position identifier and the ending position identifier corresponding to the setting words in each first setting statement, including:
converting a first subsequence corresponding to the first setting statement into at least two second vectors;
adding each vector in the first subsequence corresponding to the first set statement and the third vector to obtain a second subsequence; the third vector is the model parameter and represents a randomly initialized vector;
calculating attention degree vector pairs corresponding to each vector in the corresponding second subsequence based on each second vector and the corresponding second subsequence of the first setting statement; the attention vector pair comprises a first attention vector for representing the starting position probability and a second attention vector for representing the ending position probability;
determining an initial position identifier corresponding to a set word in a first set statement based on the mean value of the first attention degree vector corresponding to each vector in a second subsequence corresponding to the first set statement;
and determining a cut-off position identifier corresponding to the setting words in the first setting statement based on the mean value of the second attention degree vector corresponding to each vector in the second subsequence corresponding to the first setting statement.
2. The method of claim 1, wherein the set loss function comprises a first sub-function, a second sub-function, a third sub-function, a first weight, a second weight, and a third weight;
the calculating the loss value of the data analysis model by using the set loss function based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the set word in each of the at least one first set sentence, and based on the prediction category, the start position identifier and the stop position identifier corresponding to the corresponding set word includes:
calculating a first loss value between a calibration category corresponding to a set word in a first set statement and a corresponding prediction category based on the first subfunction and the first weight;
calculating a second loss value between a first calibration position identifier corresponding to a set word in a first set statement and a corresponding initial position identifier based on the second subfunction and the second weight;
calculating a third loss value between a second calibration position identifier corresponding to the set word in the first set statement and a corresponding cut-off position identifier based on the third subfunction and the third weight;
calculating a loss value of the data analysis model based on a first loss value, a second loss value and a third loss value corresponding to a first set statement; wherein the second weight and the third weight are both greater than the first weight.
3. The method of claim 1 or 2, wherein the data analysis model further comprises a fully connected layer; inputting at least one first setting statement into a data analysis model to obtain a prediction category corresponding to a setting word in each first setting statement in the at least one first setting statement, wherein the prediction category comprises:
and inputting the first vector corresponding to each first setting statement into the full-connection layer to obtain the prediction category corresponding to the setting words in each first setting statement.
4. The method of claim 1, wherein converting the first subsequence corresponding to the first configuration statement into at least two second vectors comprises at least two of:
calculating the average value of vectors in a first subsequence corresponding to the first set statement to obtain a second vector;
determining a minimum value corresponding to each dimension from vectors in a first subsequence corresponding to a first set statement to obtain a second vector;
and determining the maximum value corresponding to each dimension from the vectors in the first subsequence corresponding to the first setting statement to obtain a second vector.
5. The method according to claim 4, wherein calculating, based on each second vector and the second subsequence corresponding to the first setting statement, a corresponding attention vector pair for each vector in the corresponding second subsequence comprises:
respectively calculating a first attention degree vector and a second attention degree vector corresponding to each vector in the corresponding second subsequence based on a second vector and a second subsequence corresponding to the first setting statement by adopting a first setting function and a second setting function; wherein the model parameters in the first setting function and the second setting function are not shared.
6. The method of claim 1, further comprising:
determining a second keyword from a second set sentence based on a starting position mark and a stop position mark corresponding to the second keyword in the second set sentence;
calculating the accuracy of the data analysis model based on the first word number and the second word number corresponding to the second setting statement;
outputting a trained data analysis model under the condition that the accuracy is greater than or equal to a set threshold;
the first word number represents the word number included by the intersection of the set words corresponding to the second set statement and the second keyword; the second word number represents the word number included in the union set of the setting words corresponding to the second setting sentence and the second keyword.
7. A method of training a data analysis model, comprising:
inputting at least one first set statement into a data analysis model to obtain a prediction category, an initial position identifier and a cut-off position identifier corresponding to a set word in each first set statement in the at least one first set statement; the first setting statement is obtained by splicing a setting entity and the setting statement, and the setting statement represents evaluation information of the setting entity; the set words are keywords of an evaluation set entity;
calculating a loss value of the data analysis model based on the calibration category, the first calibration position identifier and the second calibration position identifier corresponding to the setting word in each of the at least one first setting sentence, and based on the prediction category, the start position identifier and the stop position identifier corresponding to the corresponding setting word;
updating model parameters of the data analysis model based on the loss values; wherein,
the data analysis model comprises a feature extraction model; the starting position identification and the ending position identification corresponding to the set words in each first set statement are obtained in the following modes:
inputting at least one first set statement into the feature extraction model for processing to obtain a first vector sequence corresponding to each first set statement in the at least one first set statement; the first vector sequence comprises a first vector and a first subsequence, the first vector is used for representing the global features of the first setting statement, and the first subsequence is formed by vectors corresponding to each word in the first setting statement;
based on the first subsequence corresponding to each first setting statement, determining the starting position identifier and the ending position identifier corresponding to the setting words in each first setting statement, including:
converting a first subsequence corresponding to the first setting statement into at least two second vectors;
adding each vector in the first subsequence corresponding to the first set statement and the third vector to obtain a second subsequence; the third vector is the model parameter and represents a randomly initialized vector;
calculating attention degree vector pairs corresponding to each vector in the corresponding second subsequence based on each second vector and the corresponding second subsequence of the first setting statement; the attention vector pair comprises a first attention vector for representing the starting position probability and a second attention vector for representing the ending position probability;
determining the starting position identifier corresponding to the set word in the first setting statement based on the mean value of the first attention vectors corresponding to the vectors in the second subsequence corresponding to the first setting statement;
and determining the ending position identifier corresponding to the set word in the first setting statement based on the mean value of the second attention vectors corresponding to the vectors in the second subsequence corresponding to the first setting statement.
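The span-extraction steps of this claim (convert the token subsequence into at least two second vectors, add a learned third vector, score start/end attention pairs, and locate the set word from the averaged attention) can be sketched roughly as follows. All shapes, the linear projection, and the single-distribution-per-role simplification are illustrative assumptions, not the patent's actual implementation; the claim's averaging of per-token attention values is collapsed here into an argmax over one start distribution and one end distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def locate_set_word(first_subseq, num_queries=2):
    """Sketch of the claimed start/end location step.

    first_subseq: (seq_len, dim) array, one vector per word of the
    first setting statement (the "first subsequence").
    Returns (start_idx, end_idx) position identifiers.
    """
    seq_len, d = first_subseq.shape

    # Convert the first subsequence into "at least two second vectors";
    # here an assumed linear projection, keeping the first num_queries rows.
    W_q = rng.standard_normal((d, d)) / np.sqrt(d)
    second_vectors = (first_subseq @ W_q)[:num_queries]   # (num_queries, d)

    # "Third vector": a randomly initialized vector that is itself a model
    # parameter; added to every token vector to form the second subsequence.
    third_vector = rng.standard_normal(d) * 0.02
    second_subseq = first_subseq + third_vector           # (seq_len, d)

    # One attention distribution per role: the first second vector scores
    # starting positions, the second scores ending positions.
    start_scores = softmax(second_subseq @ second_vectors[0] / np.sqrt(d))
    end_scores = softmax(second_subseq @ second_vectors[1] / np.sqrt(d))

    return int(np.argmax(start_scores)), int(np.argmax(end_scores))

tokens = rng.standard_normal((6, 16))   # a 6-word statement, 16-dim embeddings
start, end = locate_set_word(tokens)
print(start, end)
```

A production model would learn `W_q` and the third vector jointly with the feature extractor rather than sampling them, but the data flow above matches the order of operations the claim recites.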
8. An electronic device, comprising:
the training unit is used for inputting at least one first setting statement into the data analysis model to obtain a prediction category, a starting position identifier and an ending position identifier corresponding to a set word in each first setting statement in the at least one first setting statement; the first setting statement is obtained by splicing a set entity and a setting statement, and the setting statement represents evaluation information of the set entity; the set words are keywords evaluating the set entity;
a calculating unit, configured to calculate a loss value of the data analysis model based on the calibration category, the first calibration position identifier, and the second calibration position identifier corresponding to the set word in each of the at least one first setting statement, and on the prediction category, the starting position identifier, and the ending position identifier corresponding to the corresponding set word;
an updating unit for updating model parameters of the data analysis model based on the loss values;
the output unit is used for outputting the trained data analysis model to obtain a first model;
the extraction unit is used for inputting a first sentence into the first model to obtain an emotion category, a starting position identifier and an ending position identifier corresponding to a first keyword in the first sentence; the first sentence is obtained by splicing a first entity and corresponding evaluation information; wherein:
the data analysis model comprises a feature extraction model, and the training unit is specifically configured to:
inputting at least one first set statement into the feature extraction model for processing to obtain a first vector sequence corresponding to each first set statement in the at least one first set statement; the first vector sequence comprises a first vector and a first subsequence, the first vector represents the global features of the first setting statement, and the first subsequence is formed by vectors corresponding to each word in the first setting statement;
based on the first subsequence corresponding to each first setting statement, determining the starting position identifier and the ending position identifier corresponding to the setting words in each first setting statement, including:
converting a first subsequence corresponding to the first setting statement into at least two second vectors;
adding a third vector to each vector in the first subsequence corresponding to the first setting statement to obtain a second subsequence; the third vector is a model parameter and is a randomly initialized vector;
calculating an attention vector pair for each vector in the second subsequence based on each second vector and the second subsequence corresponding to the first setting statement; the attention vector pair comprises a first attention vector representing the starting position probability and a second attention vector representing the ending position probability;
determining the starting position identifier corresponding to the set word in the first setting statement based on the mean value of the first attention vectors corresponding to the vectors in the second subsequence corresponding to the first setting statement;
and determining the ending position identifier corresponding to the set word in the first setting statement based on the mean value of the second attention vectors corresponding to the vectors in the second subsequence corresponding to the first setting statement.
9. An electronic device, comprising: a processor and a memory for storing a computer program capable of running on the processor,
wherein the processor is configured to execute at least one of the following when running the computer program:
the steps of the method of any one of claims 1 to 6;
the method steps of claim 7.
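Taken together, the claims describe a standard supervised fine-tuning loop: splice the set entity onto its review text, predict a category plus start/end identifiers, score the predictions against the calibration labels, and update parameters from the combined loss. The toy below sketches only the loss computation; the stand-in model, token embeddings, and label values are all illustrative assumptions, and a real implementation would backpropagate this loss through the feature extraction model rather than use this hand-rolled forward pass.

```python
import numpy as np

rng = np.random.default_rng(1)

# One training sample: a "first setting statement" made by splicing the
# set entity onto its review text, plus calibrated labels (all assumed).
sample = {
    "statement": ["battery", "[SEP]", "the", "battery", "drains", "fast"],
    "category": 0,   # calibration category (e.g. negative sentiment)
    "start": 3,      # first calibration position identifier
    "end": 3,        # second calibration position identifier
}

dim, n_cat = 8, 3
params = {w: rng.standard_normal(dim) * 0.1 for w in set(sample["statement"])}
W_cat = rng.standard_normal((dim, n_cat)) * 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def forward(statement):
    """Stand-in for the data analysis model: returns category, start and
    end probability distributions for one first setting statement."""
    H = np.stack([params[w] for w in statement])   # per-word vectors
    pooled = H.mean(axis=0)                        # crude global feature
    cat_probs = softmax(pooled @ W_cat)            # prediction category
    start_probs = softmax(H @ pooled)              # starting position scores
    end_probs = softmax(H @ pooled * 0.5)          # ending position scores
    return cat_probs, start_probs, end_probs

# Loss value = sum of cross-entropies over the category and the two
# position identifiers, mirroring the claimed "calculate a loss value
# based on the calibration category and the calibration position identifiers".
cat_p, start_p, end_p = forward(sample["statement"])
loss = (-np.log(cat_p[sample["category"]])
        - np.log(start_p[sample["start"]])
        - np.log(end_p[sample["end"]]))
print(round(float(loss), 3))
```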
CN202110717930.9A 2021-06-28 2021-06-28 Data analysis method, method for training data analysis model and electronic equipment Active CN113378543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110717930.9A CN113378543B (en) 2021-06-28 2021-06-28 Data analysis method, method for training data analysis model and electronic equipment


Publications (2)

Publication Number Publication Date
CN113378543A CN113378543A (en) 2021-09-10
CN113378543B true CN113378543B (en) 2022-12-27

Family

ID=77579557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110717930.9A Active CN113378543B (en) 2021-06-28 2021-06-28 Data analysis method, method for training data analysis model and electronic equipment

Country Status (1)

Country Link
CN (1) CN113378543B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111400494A (en) * 2020-03-16 2020-07-10 江南大学 Sentiment analysis method based on GCN-Attention
CN112800768A (en) * 2021-02-03 2021-05-14 北京金山数字娱乐科技有限公司 Training method and device for nested named entity recognition model

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200159863A1 (en) * 2018-11-20 2020-05-21 Sap Se Memory networks for fine-grain opinion mining
US10726207B2 (en) * 2018-11-27 2020-07-28 Sap Se Exploiting document knowledge for aspect-level sentiment classification
CN110415071B (en) * 2019-07-03 2024-02-27 西南交通大学 Automobile competitive product comparison method based on viewpoint mining analysis
CN110489523B (en) * 2019-07-31 2021-12-17 西安理工大学 Fine-grained emotion analysis method based on online shopping evaluation
CN110502626B (en) * 2019-08-27 2023-04-07 重庆大学 Aspect level emotion analysis method based on convolutional neural network
US11501187B2 (en) * 2019-09-24 2022-11-15 International Business Machines Corporation Opinion snippet detection for aspect-based sentiment analysis
CN110955750A (en) * 2019-11-11 2020-04-03 北京三快在线科技有限公司 Combined identification method and device for comment area and emotion polarity, and electronic equipment
CN111274398B (en) * 2020-01-20 2022-06-14 福州大学 Method and system for analyzing comment emotion of aspect-level user product
CN112069320B (en) * 2020-09-10 2022-06-28 东北大学秦皇岛分校 Span-based fine-grained sentiment analysis method
CN112699240A (en) * 2020-12-31 2021-04-23 荆门汇易佳信息科技有限公司 Intelligent dynamic mining and classifying method for Chinese emotional characteristic words


Also Published As

Publication number Publication date
CN113378543A (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
WO2022088672A1 (en) Machine reading comprehension method and apparatus based on bert, and device and storage medium
CN110276023B (en) POI transition event discovery method, device, computing equipment and medium
CN111444340A (en) Text classification and recommendation method, device, equipment and storage medium
CN111931517B (en) Text translation method, device, electronic equipment and storage medium
CN109902301B (en) Deep neural network-based relationship reasoning method, device and equipment
CN109857846B (en) Method and device for matching user question and knowledge point
CN111159412B (en) Classification method, classification device, electronic equipment and readable storage medium
CN111783450B (en) Phrase extraction method and device in corpus text, storage medium and electronic equipment
WO2020232898A1 (en) Text classification method and apparatus, electronic device and computer non-volatile readable storage medium
CN111930914A (en) Question generation method and device, electronic equipment and computer-readable storage medium
CN113407677B (en) Method, apparatus, device and storage medium for evaluating consultation dialogue quality
CN112613293B (en) Digest generation method, digest generation device, electronic equipment and storage medium
CN112183102A (en) Named entity identification method based on attention mechanism and graph attention network
JP2023536773A (en) Text quality evaluation model training method and text quality determination method, device, electronic device, storage medium and computer program
CN114742016B (en) Chapter-level event extraction method and device based on multi-granularity entity different composition
CN116304748A (en) Text similarity calculation method, system, equipment and medium
CN114661881A (en) Event extraction method, device and equipment based on question-answering mode
CN111368066B (en) Method, apparatus and computer readable storage medium for obtaining dialogue abstract
CN114202443A (en) Policy classification method, device, equipment and storage medium
CN117828024A (en) Plug-in retrieval method, device, storage medium and equipment
CN113705207A (en) Grammar error recognition method and device
CN112599211A (en) Medical entity relationship extraction method and device
US20230070966A1 (en) Method for processing question, electronic device and storage medium
CN113378543B (en) Data analysis method, method for training data analysis model and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant