CN117574916A - Temporary report semantic analysis method and system - Google Patents


Publication number
CN117574916A
Authority
CN
China
Prior art keywords
semantic
report
text
features
semantic features
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311706316.8A
Other languages
Chinese (zh)
Other versions
CN117574916B (en)
Inventor
王钊
吴晨阳
蒋翠清
丁勇
陈波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Application filed by Hefei University of Technology
Priority to CN202311706316.8A
Publication of CN117574916A
Application granted
Publication of CN117574916B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention provides a temporary report semantic analysis method and system. Temporary report analysis data is obtained by performing content analysis on temporary report sample data. Temporary report semantic features of the sample data and analysis data semantic features of the analysis data are then acquired. Based on these two sets of features, a first text semantic feature and a second text semantic feature are acquired, each characterizing a different kind of semantic information in the temporary report sample data. A target loss function is obtained from the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features, and a pre-constructed semantic prediction model is trained with this loss so that the trained model can perform semantic prediction on a temporary report to be analyzed. Both the surface semantics and the latent semantics of the temporary report can thereby be extracted accurately, improving the accuracy of temporary report semantic analysis.

Description

Temporary report semantic analysis method and system
Technical Field
The invention relates to the technical field of temporary report semantic analysis, in particular to a temporary report semantic analysis method and a temporary report semantic analysis system.
Background
An enterprise's temporary report is an important document that is disclosed promptly when a major event occurs that may significantly affect the price of its securities. Compared with periodic reports, temporary reports place more emphasis on the timeliness and importance of information, and they cover a wide range of information. Temporary reports serve as an important source of non-financial information about the enterprise, including important information on its performance and business activities. Mining valid semantic information from temporary reports therefore makes it possible to grasp the significant events and business conditions of a company.
When performing semantic mining on temporary reports, the prior art mainly relies on topic modeling and word vector modeling. It may obtain a word vector for each word in each temporary report document and weight the words according to their importance, then take the average embedding vector of all temporary reports of a company as the semantic feature.
However, quantifying the complete semantics of a temporary report in this way may introduce spurious correlations and even noise, and it cannot refine the core semantics of the text. Moreover, a temporary report generally contains two layers of semantic information, surface semantics and latent semantics, and the latent semantics can represent the degree of influence of the event in the report. The prior art can extract only the surface semantics; it cannot extract the latent semantics, so it ignores the influence of the events in the temporary report, which leads to poor semantic analysis accuracy.
Disclosure of Invention
(I) Technical problem to be solved
Aiming at the defects of the prior art, the invention provides a temporary report semantic analysis method and a temporary report semantic analysis system, which solve the technical problem that the prior art cannot extract the latent semantics of a temporary report, resulting in poor semantic analysis.
(II) Technical solution
In order to achieve the above purpose, the invention is realized by the following technical scheme:
The invention provides a temporary report semantic analysis method for solving the above technical problem. The method is executed by a computer and comprises the following steps:
acquiring temporary report sample data, and performing content analysis on the temporary report sample data to acquire temporary report analysis data;
acquiring temporary report semantic features of the temporary report sample data and acquiring analysis data semantic features of the temporary report analysis data;
acquiring a first text semantic feature and a second text semantic feature based on the temporary report semantic feature and the analysis data semantic feature; the first text semantic features are used for representing first semantic information of the temporary report sample data, and the second text semantic features are used for representing second semantic information of the temporary report sample data;
acquiring a target loss function according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
training a pre-constructed semantic prediction model based on the target loss function so as to carry out semantic prediction on the temporary report to be analyzed according to the trained semantic prediction model.
Preferably, the acquiring the temporary report semantic feature of the temporary report sample data includes:
acquiring a first document embedding vector of the temporary report sample data based on a BERT pre-training model;
processing the first document embedding vector based on the BiGRU model to generate temporary report semantic features.
Preferably, the acquiring the analysis data semantic feature of the temporary report analysis data includes:
acquiring a second document embedding vector of the temporary report analysis data based on a BERT pre-training model;
processing the second document embedding vector based on the BiGRU model to generate analysis data semantic features.
Preferably, the acquiring the first text semantic feature and the second text semantic feature based on the temporary report semantic feature and the analysis data semantic feature includes:
inputting the temporary report semantic features and the analysis data semantic features into a preset first function, and acquiring first text semantic features according to the first function;
the first function is:
$$I(c^{(CR)}, c^{(Emph)*}) + I(c^{(AI)}, c^{(Emph)*}) \geq I(c^{(CR)}, c^{(Emph)}) + I(c^{(AI)}, c^{(Emph)})$$
wherein,
$I(\cdot)$ represents mutual information;
$c^{(CR)}$ represents the temporary report semantic feature, and $c^{(AI)}$ represents the analysis data semantic feature;
$c^{(Emph)}$ represents the first text semantic feature, and $c^{(Emph)*}$ represents the optimal solution of the first text semantic feature;
inputting the temporary report semantic features and the analysis data semantic features into a preset second function, and acquiring second text semantic features according to the second function;
the second function is:
$$I(c^{(AI)}, c^{(Insight)*}) - I(c^{(CR)}, c^{(Insight)*}) \geq I(c^{(AI)}, c^{(Insight)}) - I(c^{(CR)}, c^{(Insight)})$$
wherein,
$c^{(Insight)}$ represents the second text semantic feature, and $c^{(Insight)*}$ represents the optimal solution of the second text semantic feature.
Preferably, obtaining the objective loss function according to the temporary report semantic feature, the analysis data semantic feature, the first text semantic feature and the second text semantic feature includes:
acquiring semantic feature mutual information according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
and acquiring a target loss function based on the semantic feature mutual information.
Preferably, the obtaining semantic feature mutual information includes:
acquiring first mutual information based on the temporary report semantic features and the first text semantic features;
acquiring second mutual information based on the analysis data semantic features and the first text semantic features;
acquiring third mutual information based on the analysis data semantic features and the second text semantic features;
fourth mutual information is acquired based on the temporary report semantic features and the second text semantic features.
Preferably, the obtaining the objective loss function based on the semantic feature mutual information includes:
acquiring convergence mutual information based on the first mutual information and the second mutual information, and acquiring divergence mutual information according to the third mutual information and the fourth mutual information;
$$MI_{Convergent} = I_1 + I_2$$
$$MI_{Divergent} = I_3 - I_4$$
wherein,
$MI_{Convergent}$ represents the convergent mutual information, and $MI_{Divergent}$ represents the divergent mutual information;
$I_1$ represents the first mutual information, $I_2$ the second mutual information, $I_3$ the third mutual information, and $I_4$ the fourth mutual information;
acquiring a target loss function based on the divergent mutual information and the convergent mutual information; the objective loss function is:
$$Loss_{MI} = -(MI_{Convergent} + MI_{Divergent})$$
wherein,
$Loss_{MI}$ represents the target loss function.
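The arithmetic of the target loss can be sketched in a few lines of Python. This is only an illustration of the formulas above: the function name `mi_loss` and the numeric values are assumptions, and in practice the four inputs would be mutual information estimates rather than constants.

```python
def mi_loss(i1, i2, i3, i4):
    """Return Loss_MI = -(MI_Convergent + MI_Divergent)."""
    mi_convergent = i1 + i2  # first + second mutual information
    mi_divergent = i3 - i4   # third - fourth mutual information
    return -(mi_convergent + mi_divergent)


# Hypothetical mutual information estimates, for illustration only.
loss = mi_loss(i1=0.8, i2=0.6, i3=0.5, i4=0.25)
```

Because the loss is the negated sum, minimizing it raises the first three mutual information terms and lowers the fourth.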
Preferably, training a pre-constructed semantic prediction model based on the objective loss function includes:
acquiring an initial loss function of a pre-constructed semantic prediction model;
integrating the initial loss function and the target loss function to obtain a model loss function;
training a pre-constructed semantic prediction model according to the model loss function.
Preferably, the pre-constructed semantic prediction model comprises a large language model, a BERT pre-training model, a BiGRU model, a text semantic feature generation unit and a semantic prediction unit;
the large language model is used for carrying out content analysis on the temporary report to be analyzed so as to generate analysis data;
the BERT pre-training model is used for acquiring the temporary report to be analyzed and the document embedding vector of the analysis data;
the BiGRU model is used for processing the document embedding vectors to generate the temporary report semantic features and the analysis data semantic features;
the text semantic feature generation unit is used for processing the temporary report semantic features and the analysis data semantic features to generate text semantic features;
the semantic prediction unit comprises three fully connected layers connected in sequence and is used for performing semantic prediction on the text semantic features.
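As a rough sketch of a semantic prediction unit with three fully connected layers in sequence, the forward pass might look like the following. The source does not state the layer sizes or the activation function, so the ReLU between layers, the toy weights, and the names `dense`, `relu` and `predict` are all assumptions made for illustration.

```python
def dense(x, weights, bias):
    """One fully connected layer: y_j = sum_i x_i * W[i][j] + b_j."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]


def relu(v):
    return [max(0.0, a) for a in v]


def predict(features, layers):
    """Apply the layers in sequence, with ReLU between them (assumed)."""
    h = features
    for k, (weights, bias) in enumerate(layers):
        h = dense(h, weights, bias)
        if k < len(layers) - 1:  # no activation after the last layer
            h = relu(h)
    return h


# Hypothetical toy weights with shapes 2 -> 2 -> 2 -> 1.
layers = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.1, -0.1]),
    ([[2.0], [1.0]], [0.5]),
]
score = predict([1.0, 2.0], layers)
```

In a trained model the weights would be learned by backpropagation together with the rest of the network.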
The invention provides a temporary report semantic analysis system for solving the technical problem, which comprises the following components:
the data acquisition module is configured to acquire temporary report sample data and conduct content analysis on the temporary report sample data so as to acquire temporary report analysis data;
a semantic feature acquisition module configured to acquire provisional report semantic features of the provisional report sample data and to acquire analysis data semantic features of the provisional report analysis data;
a text semantic feature acquisition module configured to acquire a first text semantic feature and a second text semantic feature based on the provisional report semantic feature and the analysis data semantic feature; the first text semantic features are used for representing first semantic information of the temporary report sample data, and the second text semantic features are used for representing second semantic information of the temporary report sample data;
the loss function acquisition module is configured to acquire a target loss function according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
the model training module is configured to train a pre-constructed semantic prediction model based on the target loss function so as to carry out semantic prediction on the temporary report to be analyzed according to the trained semantic prediction model.
(III) Beneficial effects
The invention provides a temporary report semantic analysis method and a temporary report semantic analysis system. Compared with the prior art, the method has the following beneficial effects:
The invention obtains temporary report analysis data by performing content analysis on temporary report sample data. Temporary report semantic features of the sample data and analysis data semantic features of the analysis data are then acquired. Based on these two sets of features, a first text semantic feature and a second text semantic feature are acquired, each characterizing a different kind of semantic information in the temporary report sample data. A target loss function is obtained from the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features, and a pre-constructed semantic prediction model is trained with this loss so that the trained model can perform semantic prediction on a temporary report to be analyzed. Both the surface semantics and the latent semantics of the temporary report can thereby be extracted accurately, improving the accuracy of temporary report semantic analysis.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for semantic analysis of a temporary report according to an embodiment of the present invention;
FIG. 2 illustrates an overall schematic of a semantic prediction model in some embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions in the embodiments of the present invention are clearly and completely described, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the application solves the problem of poor semantic analysis effect in the prior art by providing the method and the system for the semantic analysis of the temporary report, and improves the accuracy of the semantic analysis of the temporary report.
To solve the above technical problem, the general idea of the technical solution in the embodiments of the application is as follows:
According to the embodiment of the invention, temporary report analysis data is obtained by performing content analysis on temporary report sample data. Temporary report semantic features of the sample data and analysis data semantic features of the analysis data are then acquired. Based on these two sets of features, a first text semantic feature and a second text semantic feature are acquired, each characterizing a different kind of semantic information in the temporary report sample data. A target loss function is obtained from the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features, and a pre-constructed semantic prediction model is trained with this loss so that the trained model can perform semantic prediction on a temporary report to be analyzed. Both the surface semantics and the latent semantics of the temporary report can thereby be extracted accurately, improving the accuracy of temporary report semantic analysis.
In order to better understand the above technical solutions, the following detailed description will refer to the accompanying drawings and specific embodiments.
The embodiment of the invention provides a temporary report semantic analysis method, which is executed by a computer, and fig. 1 is a schematic flow chart of the temporary report semantic analysis method provided by the embodiment of the invention. The method comprises the following steps:
S1, acquiring temporary report sample data, and performing content analysis on the temporary report sample data to acquire temporary report analysis data;
S2, acquiring temporary report semantic features of the temporary report sample data and acquiring analysis data semantic features of the temporary report analysis data;
S3, acquiring a first text semantic feature and a second text semantic feature based on the temporary report semantic feature and the analysis data semantic feature; the first text semantic features are used for representing first semantic information of the temporary report sample data, and the second text semantic features are used for representing second semantic information of the temporary report sample data;
S4, acquiring a target loss function according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
S5, training a pre-constructed semantic prediction model based on the target loss function so as to perform semantic prediction on the temporary report to be analyzed according to the trained semantic prediction model.
Specific analyses are performed for each step as follows.
In step S1, provisional report sample data is acquired, and content analysis is performed on the provisional report sample data to acquire provisional report analysis data.
A temporary report conveys a company's business operations and trends, so it is necessary to perform semantic analysis on the temporary report to obtain the important information in it.
However, temporary reports typically contain two layers of semantic information. One is the surface semantics of the temporary report, i.e., an objective statement of a significant company-related event; the other is the latent semantics of the temporary report, i.e., the impact of the event described in it. If only the semantics stated in the temporary report itself are extracted, the surface semantics can be obtained but the latent semantics cannot.
Therefore, in the embodiment of the application, the content of the temporary report is analyzed to extract its latent semantics, which are converted into the form of content analysis text.
A portion of the temporary report may be acquired as sample data. And obtaining analysis data of the temporary report by carrying out content analysis on the sample data. In the embodiment of the application, the large language model can be utilized to analyze the contents of the temporary report sample data. The large language model may employ ChatGPT.
To better guide the large language model and ensure the accuracy and reliability of the generated analysis reports, the invention introduces prompt engineering, designing prompts that focus the context by explicitly specifying the instruction, context, input and output indicator. The invention passes n temporary reports to the large language model together with predefined instructions, uses the model to analyze the content of each temporary report, and collects the model's analysis report for each temporary report.
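A prompt built from the four elements named above (instruction, context, input, output indicator) might be assembled as in the sketch below. The wording of each element, the function names `build_prompt` and `collect_analyses`, and the `ask_llm` callable are hypothetical; the source does not give the actual prompt text or the interface used to call the large language model.

```python
def build_prompt(report_text):
    """Assemble a prompt from instruction, context, input and output indicator."""
    parts = [
        ("Instruction", "Analyze the content of the temporary report below."),
        ("Context", "The report was disclosed by a listed company after a major event."),
        ("Input", report_text),
        ("Output indicator", "Describe the likely impact of the event in plain text."),
    ]
    return "\n".join(f"{name}: {value}" for name, value in parts)


def collect_analyses(reports, ask_llm):
    """Send each report to the model and keep one analysis per report.

    ask_llm is a hypothetical callable that takes a prompt string and
    returns the model's reply as a string.
    """
    return [ask_llm(build_prompt(text)) for text in reports]
```

Keeping the model call behind a plain callable means the same collection loop works with any client, or with a stub during testing.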
In step S2, provisional report semantic features of the provisional report sample data are acquired, and analysis data semantic features of the provisional report analysis data are acquired.
Both the temporary report sample data and the temporary report analysis data are text, so they can be converted into feature vectors for subsequent model processing.
In some embodiments, when acquiring the provisional report semantic feature, a first document embedding vector of the provisional report sample data may be acquired first based on a BERT pre-training model.
In the embodiments of the present application, the document embedding vectors of the n temporary reports disclosed sequentially by a company over a period of time (e.g., one year) each have dimension d and are generated by the BERT pre-training model.
The first document embedding vector may be processed using a BiGRU model to generate the temporary report semantic features.
The BiGRU model, in full the bidirectional gated recurrent unit (Bidirectional Gated Recurrent Unit) model, is a neural network architecture for processing sequence data. It combines the gating of the GRU with the context awareness of a bidirectional network, can consider forward and backward information simultaneously when processing sequence data, and is commonly used in natural language processing and time series analysis.
The BiGRU module incorporates context information into the report-level semantic representation using two GRU units that run in opposite directions. In this process, GRU(·) denotes a GRU unit: the forward and backward GRU units each produce a hidden output for every report, and the report-level representation is the combination of these two hidden outputs.
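For orientation, a bidirectional GRU layer is commonly written in the following form. This is the usual textbook formulation, given as an assumption about the general shape of the process. It is not the source's own formula, and the symbols $x_t$ (the embedding of the t-th report) and $h_t$ (its report-level hidden state) are chosen here for illustration.

```latex
\overrightarrow{h_t} = \mathrm{GRU}\left(x_t, \overrightarrow{h_{t-1}}\right), \qquad
\overleftarrow{h_t} = \mathrm{GRU}\left(x_t, \overleftarrow{h_{t+1}}\right), \qquad
h_t = \left[\overrightarrow{h_t};\, \overleftarrow{h_t}\right], \quad t = 1, \dots, n
```

The forward unit reads the reports in disclosure order and the backward unit reads them in reverse, so each $h_t$ carries context from both earlier and later reports.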
The report-level input first passes through a linear layer and a nonlinear activation function to obtain the candidate state $u_{t1}$. The SoftMax function is then applied to normalize the attention scores. The final company-level temporary report semantic feature $c^{(CR)}$ is a weighted average of all report-level feature inputs. The generation of $c^{(CR)}$ uses a nonlinear activation function $f(\cdot)$ and the trainable parameters $W_{u1}$, $b_{u1}$ and $u_{w1}$.
The first document embedding vector may be processed by the GRU unit and the global attention mechanism to generate temporary report semantic features.
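The attention pooling described above (linear layer, nonlinear activation, SoftMax, weighted average) can be sketched in plain Python as follows. The exact formula is not available in this text, so this follows a common additive-attention form as an assumption: `tanh` stands in for the unspecified activation $f(\cdot)$, and the arguments `W`, `b` and `u_w` play the roles of the trainable parameters $W_{u1}$, $b_{u1}$ and $u_{w1}$.

```python
import math


def attention_pool(hidden, W, b, u_w):
    """Pool n report-level vectors into one company-level vector.

    hidden: list of n vectors, one per report.
    W, b:   weights (list of rows) and bias of the linear layer.
    u_w:    context vector used to score each candidate state.
    """
    scores = []
    for h in hidden:
        # Candidate state u_t = f(W h_t + b), with tanh as the assumed f.
        u = [math.tanh(sum(w * x for w, x in zip(row, h)) + b_i)
             for row, b_i in zip(W, b)]
        scores.append(sum(a * c for a, c in zip(u, u_w)))
    # SoftMax over the n scores, shifted by the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Weighted average of the report-level inputs.
    dim = len(hidden[0])
    pooled = [sum(a * h[k] for a, h in zip(alphas, hidden)) for k in range(dim)]
    return pooled, alphas
```

The same routine would apply to the analysis data branch with its own parameter set.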
In some embodiments, a second document embedding vector of the temporary report analysis data may be first obtained based on the BERT pre-training model when obtaining the analysis data semantic features.
In the embodiments of the present application,the document embedding vectors representing n analytical reports based on the large language model, each vector having a dimension d, are generated by the BERT pre-training model.
And processing the second document embedded vector through a BiGRU model to generate analysis data semantic features.
The BiGRU module incorporates context information into the report-level semantic features using two GRU units that run in opposite directions. In this process, GRU(·) denotes a GRU unit: the forward and backward GRU units each produce a hidden output for every analysis report, and the report-level representation is the combination of these two hidden outputs.
The analysis-report-level input generated by the large language model first passes through a linear layer and a nonlinear activation function to obtain the candidate state $u_{t2}$. The SoftMax function is then applied to normalize the attention scores. The final company-level analysis data semantic feature $c^{(AI)}$ is a weighted average of all analysis-report-level inputs. The generation of $c^{(AI)}$ uses a nonlinear activation function $f(\cdot)$ and the trainable parameters $W_{u2}$, $b_{u2}$ and $u_{w2}$.
The second document embedding vector may be processed by the GRU unit and the global attention mechanism to generate analytical data semantic features.
In step S3, a first text semantic feature and a second text semantic feature are obtained based on the provisional report semantic feature and the analysis data semantic feature.
The first text semantic features are used for representing first semantic information of the temporary report sample data, and the second text semantic features are used for representing second semantic information of the temporary report sample data.
In this embodiment of the present application, the first semantic information is set to be the surface semantics of the temporary report, which directly expresses the important information in the report; the first text semantic feature is therefore also called the text semantics representing emphasis, as it emphasizes the significant events in the temporary report. The second semantic information is set to be the latent semantics of the temporary report, which can be inferred only after content analysis of the report; the second text semantic feature is therefore also called the text semantics representing reasoning, as it infers the influence of the events in the temporary report.
After acquiring the temporary report semantic feature $c^{(CR)}$ and the analysis data semantic feature $c^{(AI)}$ of the analysis reports generated by the large language model, they are combined as $[c^{(CR)}, c^{(AI)}]$ to obtain the first text semantic feature (denoted $c^{(Emph)}$) and the second text semantic feature (denoted $c^{(Insight)}$).
Specifically, the text semantic features are obtained by the following steps:
Given the company-level semantic features of the temporary reports and of the analysis content generated by the large language model, the semantic feature representing emphasis is the feature whose mutual information with these two semantic features has the highest sum, and the semantic feature representing reasoning is the feature for which the mutual information with the analysis content generated by the large language model exceeds the mutual information with the temporary report by the largest margin.
Thus, given $[c^{(CR)}, c^{(AI)}]$ and a first function $f_{Emph}(\cdot)$ for generating $c^{(Emph)}$, the convergent mutual information criterion is formulated to find the optimal $c^{(Emph)*}$ satisfying the following condition.
Inputting the temporary report semantic features and the analysis data semantic features into a preset first function, and acquiring first text semantic features according to the first function.
The first function is:
$$I(c^{(CR)}, c^{(Emph)*}) + I(c^{(AI)}, c^{(Emph)*}) \geq I(c^{(CR)}, c^{(Emph)}) + I(c^{(AI)}, c^{(Emph)})$$
wherein,
$I(\cdot)$ represents mutual information;
$c^{(CR)}$ represents the temporary report semantic feature, and $c^{(AI)}$ represents the analysis data semantic feature;
$c^{(Emph)}$ represents the first text semantic feature, and $c^{(Emph)*}$ represents the optimal solution of the first text semantic feature.
Given $[c^{(CR)}, c^{(AI)}]$ and a second function $f_{Insight}(\cdot)$ for generating $c^{(Insight)}$, the divergent mutual information criterion is formulated to find the optimal $c^{(Insight)*}$ satisfying the following condition.
Inputting the temporary report semantic features and the analysis data semantic features into a preset second function, and acquiring second text semantic features according to the second function.
The second function is:
$$I(c^{(AI)}, c^{(Insight)*}) - I(c^{(CR)}, c^{(Insight)*}) \geq I(c^{(AI)}, c^{(Insight)}) - I(c^{(CR)}, c^{(Insight)})$$
wherein,
$c^{(Insight)}$ represents the second text semantic feature, and $c^{(Insight)*}$ represents the optimal solution of the second text semantic feature.
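The two inequality criteria can also be read as optimization problems. The following is a restatement under the assumption that a maximizer exists over the candidate features, not a formula quoted from the source:

```latex
\begin{aligned}
c^{(Emph)*} &= \arg\max_{c^{(Emph)}} \left[ I\left(c^{(CR)}, c^{(Emph)}\right) + I\left(c^{(AI)}, c^{(Emph)}\right) \right] \\
c^{(Insight)*} &= \arg\max_{c^{(Insight)}} \left[ I\left(c^{(AI)}, c^{(Insight)}\right) - I\left(c^{(CR)}, c^{(Insight)}\right) \right]
\end{aligned}
```

Read this way, the emphasis feature is pulled toward information shared by both inputs, while the insight feature is pulled toward information present in the analysis data but not in the temporary report itself.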
In some embodiments, the mutual information may be obtained as follows.
For a family of functions modeled by a neural network with parameters θ ∈ Θ, the mutual information neural estimator $I_{\Theta}(X; Z)$ is defined as a lower bound on the mutual information over this family.
The neural-network-based mutual information estimator $I_{\Theta}(X; Z)$ is computed as follows. First, a mini-batch of b samples is drawn from the joint distribution of X and Z. At the same time, a mini-batch of b samples is drawn from the marginal distribution of Z; these are obtained by reordering the joint samples and simply discarding x. The lower bound defined above is then computed on the two mini-batches.
The parameters of the neural network estimator are then optimized by gradient ascent on this lower bound.
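The formulas for this estimator did not survive in the text above. The description matches the widely used Donsker-Varadhan form of neural mutual information estimation, in which a statistics network T is scored as the mean of T over joint samples minus the log of the mean of exp(T) over shuffled samples; that this is the exact bound used here is an assumption. The sketch below computes that bound in plain Python from already-evaluated scores. `shuffle_marginal` and `dv_lower_bound` are hypothetical helper names, and the toy statistic `x * z` stands in for a trained network.

```python
import math
import random


def shuffle_marginal(pairs, rng):
    """Pair each x with a z taken from a shuffled copy of the batch.

    Reordering z breaks the dependence between x and z, so the result
    approximates samples from the product of the marginals.
    """
    zs = [z for _, z in pairs]
    rng.shuffle(zs)
    return [(x, z) for (x, _), z in zip(pairs, zs)]


def dv_lower_bound(t_joint, t_marginal):
    """Donsker-Varadhan bound: mean(T on joint) - log(mean(exp(T) on marginal))."""
    mean_joint = sum(t_joint) / len(t_joint)
    # Numerically stable log-mean-exp over the marginal scores.
    m = max(t_marginal)
    mean_exp = sum(math.exp(t - m) for t in t_marginal) / len(t_marginal)
    return mean_joint - (m + math.log(mean_exp))


# Toy batch in which z is fully determined by x (illustrative data only).
rng = random.Random(0)
joint = [(v, v) for v in (-1.0, -0.5, 0.5, 1.0)]
marginal = shuffle_marginal(joint, rng)
bound = dv_lower_bound([x * z for x, z in joint],
                       [x * z for x, z in marginal])
```

In a real implementation the scores would come from a network written in a deep learning framework, so that its parameters can be updated by gradient ascent on `bound`.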
In step S4, a target loss function is acquired according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features. Specifically, this step comprises the following:
S401, acquiring semantic feature mutual information according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features.
In the embodiments of the present application, the neural network estimator is used to estimate the mutual information.
First mutual information is acquired based on the temporary report semantic features and the first text semantic features, denoted as $I(c^{(CR)}, c^{(Emph)*})$.
Second mutual information is acquired based on the analysis data semantic features and the first text semantic features, denoted as $I(c^{(AI)}, c^{(Emph)*})$.
Third mutual information is acquired based on the analysis data semantic features and the second text semantic features, denoted as $I(c^{(AI)}, c^{(Insight)*})$.
Fourth mutual information is acquired based on the temporary report semantic features and the second text semantic features, denoted as $I(c^{(CR)}, c^{(Insight)*})$.
S402, acquiring a target loss function based on the semantic feature mutual information.
First, convergent mutual information is acquired based on the first mutual information and the second mutual information, and divergent mutual information is acquired according to the third mutual information and the fourth mutual information.
$MI_{Convergent} = I_1 + I_2$
$MI_{Divergent} = I_3 - I_4$
wherein,
$MI_{Convergent}$ represents the convergent mutual information, and $MI_{Divergent}$ represents the divergent mutual information;
$I_1$ represents the first mutual information, $I_2$ represents the second mutual information, $I_3$ represents the third mutual information, and $I_4$ represents the fourth mutual information.
This can also be expressed as:
$MI_{Convergent} = I(c^{(CR)}, c^{(Emph)*}) + I(c^{(AI)}, c^{(Emph)*})$
$MI_{Divergent} = I(c^{(AI)}, c^{(Insight)*}) - I(c^{(CR)}, c^{(Insight)*})$
A target loss function is acquired based on the divergent mutual information and the convergent mutual information. The target loss function is:
$Loss_{MI} = -(MI_{Convergent} + MI_{Divergent})$
wherein,
$Loss_{MI}$ represents the target loss function.
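The combination of the four mutual information terms can be sketched as a small function. The function name `mi_loss` and the numeric inputs are illustrative assumptions; in practice each argument would be an estimate produced by the neural network estimator.

```python
def mi_loss(i1, i2, i3, i4):
    """Target loss from the four mutual information estimates.

    i1: I(c_CR, c_Emph*)     i2: I(c_AI, c_Emph*)
    i3: I(c_AI, c_Insight*)  i4: I(c_CR, c_Insight*)
    """
    mi_convergent = i1 + i2   # emphasis feature should agree with both inputs
    mi_divergent = i3 - i4    # insight feature should favor the analysis data
    return -(mi_convergent + mi_divergent)


# Illustrative values only: -((1.0 + 0.5) + (0.75 - 0.25)) = -2.0
example = mi_loss(1.0, 0.5, 0.75, 0.25)
```

Because the loss is the negated sum, minimizing it raises the first three terms and lowers the fourth, which is the only term that enters the loss with a positive sign.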
In step S5, a pre-constructed semantic prediction model is trained based on the target loss function, so that semantic prediction can be performed on the temporary report to be analyzed according to the trained semantic prediction model. Specifically, this step comprises the following:
S501, acquiring an initial loss function of the pre-constructed semantic prediction model.
In the embodiment of the application, a semantic prediction model is pre-constructed to analyze a temporary report to be analyzed. The temporary report to be analyzed may be a publicly disclosed report of an enterprise.
FIG. 2 illustrates an overall schematic of a semantic prediction model in some embodiments. The semantic prediction model comprises a large language model, a BERT pre-training model, a BiGRU model, a text semantic feature generation unit and a semantic prediction unit.
The large language model analyzes the content of the temporary report to be analyzed. For example, generative AI is used to obtain an analysis report of the temporary report to be analyzed, so that latent semantics can subsequently be extracted from the analysis report.
The temporary report to be analyzed and the analysis data are input into the BERT pre-training model, which obtains a document embedding vector for each of them; both document embedding vectors are then input into the BiGRU model.
The BiGRU model processes the document embedding vectors to generate the temporary report semantic features and the analysis data semantic features. Using GRU units and a global attention mechanism, the embedding vectors of the two documents are processed separately, so that the temporary report semantic features and the analysis data semantic features are obtained.
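The attention step that turns a sequence of GRU hidden states into a single feature vector can be sketched as follows. The application does not give the attention formula, so the dot-product scoring against a single vector used here is an assumption, as are the names `attention_pool` and `score_vector`; plain Python lists stand in for framework tensors.

```python
import math


def attention_pool(hidden_states, score_vector):
    """Collapse a sequence of hidden states into one vector.

    Each hidden state is scored by its dot product with score_vector,
    the scores are normalized with a softmax, and the hidden states are
    averaged using those weights.
    """
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, score_vector))
              for h in hidden_states]
    m = max(scores)                         # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights


# With a zero score vector every position gets equal weight,
# so the pooled vector is the plain average of the hidden states.
pooled, weights = attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

In a trained model the score vector would be learned, so positions that matter more for the downstream task receive larger weights.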
The text semantic feature generating unit is used for processing the temporary report semantic features and the analysis data semantic features to generate text semantic features.
In particular, the temporary report semantic features may be processed by a pre-generated mutual information network (the first function) representing the emphasis semantic features, to generate first text semantic features characterizing emphasis. The first text semantic feature is a text feature of the surface semantics of the temporary report.
The analysis data semantic features may be processed by a pre-generated mutual information network (the second function) representing the inference semantic features, to generate second text semantic features characterizing inference. The second text semantic feature is a text feature of the latent semantics of the temporary report.
Semantic prediction is performed on the text semantic features by a final prediction layer, namely the semantic prediction unit. The semantic prediction unit comprises three fully connected layers connected in sequence. The first text semantic feature, representing the emphasized content, and the second text semantic feature, representing the inferred content, are concatenated and input into the three fully connected layers, which include activation functions. The output of the third fully connected layer is the prediction result of the final downstream task (such as classification or regression).
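The prediction unit described above can be sketched as a forward pass through three dense layers. The layer sizes, the choice of ReLU, the decision to leave the last layer without an activation, and the names `dense`, `relu` and `predict` are all assumptions made for illustration; the application states only that there are three fully connected layers with activation functions.

```python
def relu(v):
    return max(0.0, v)


def dense(x, weights, bias, activation=None):
    """One fully connected layer: each output is a weighted sum plus a bias."""
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [activation(o) for o in out] if activation else out


def predict(c_emph, c_insight, layers):
    """Concatenate the two text features and pass them through the layers."""
    x = list(c_emph) + list(c_insight)
    for i, (weights, bias) in enumerate(layers):
        is_last = i == len(layers) - 1
        x = dense(x, weights, bias, None if is_last else relu)
    return x


# Tiny hand-written weights (illustrative only): 2 inputs -> 1 -> 1 -> 2 outputs.
layers = [
    ([[1.0, 1.0]], [0.0]),
    ([[2.0]], [-1.0]),
    ([[1.0], [-1.0]], [0.0, 0.0]),
]
output = predict([1.0], [2.0], layers)
```

For a classification task the final outputs would typically be passed through a softmax or sigmoid before the loss is computed.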
The pre-constructed semantic prediction model may have an initial loss function, which may be expressed as:
$Loss_{FL} = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$
wherein,
$p_t$ is the predicted probability, $\alpha_t$ is the balance parameter, and $\gamma$ is the focusing parameter.
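The initial loss has the form of a focal loss and can be evaluated for a single prediction as in the sketch below. The default values `alpha_t=0.25` and `gamma=2.0` are common choices assumed for illustration; the application does not specify them.

```python
import math


def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """Focal loss for one sample, where p_t is the probability of the true class.

    p_t must lie in (0, 1]; the factor (1 - p_t) ** gamma shrinks the loss
    of samples that are already predicted well.
    """
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9)   # confident, correct prediction: small loss
hard = focal_loss(0.1)   # poorly predicted sample: much larger loss
```

With `gamma` set to 0 and `alpha_t` set to 1 the expression reduces to the ordinary cross-entropy term `-log(p_t)`.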
S502, integrating the initial loss function and the target loss function to obtain a model loss function. $Loss_{MI}$ and $Loss_{FL}$ are combined as the total loss of the model, which is used to optimize the model parameters during training. The model loss function is expressed as:
$Loss_{Main} = Loss_{FL} + \lambda Loss_{MI}$
where λ is a weight set as a hyperparameter.
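As a minimal sketch, the model loss is a weighted sum of the two terms. The function name `model_loss` and the default `lam=0.1` are assumptions; the application leaves the value of λ open.

```python
def model_loss(loss_fl, loss_mi, lam=0.1):
    """Total training loss: prediction loss plus weighted mutual information loss."""
    return loss_fl + lam * loss_mi


# Illustrative values only.
total = model_loss(loss_fl=0.5, loss_mi=-2.0, lam=0.1)
```

Setting `lam` to 0 recovers training on the prediction loss alone, which makes λ a convenient knob for checking how much the mutual information criteria contribute.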
S503, training a pre-constructed semantic prediction model according to the model loss function.
The invention designs a dual semantic representation module, guided by mutual information convergence and divergence criteria, to accurately generate text semantic features representing emphasis and inference. Specifically, the emphasis representation is guided by the convergence criterion to maximize the sum of its mutual information with the representation of the temporary report and its mutual information with the representation of the analysis content generated by the large language model. The inference representation is guided by the divergence criterion to maximize the difference between its mutual information with the representation of the analysis content generated by the large language model and its mutual information with the representation of the temporary report. The invention also designs a neural-network-based mutual information estimator to calculate the required mutual information, and trains it together with the main network using an alternating training strategy.
The embodiment of the invention also provides a temporary report semantic analysis system, which comprises:
the data acquisition module is configured to acquire temporary report sample data and conduct content analysis on the temporary report sample data so as to acquire temporary report analysis data;
a semantic feature acquisition module configured to acquire temporary report semantic features of the temporary report sample data and to acquire analysis data semantic features of the temporary report analysis data;
a text semantic feature acquisition module configured to acquire a first text semantic feature and a second text semantic feature based on the temporary report semantic features and the analysis data semantic features; the first text semantic features are used for representing first semantic information of the temporary report sample data, and the second text semantic features are used for representing second semantic information of the temporary report sample data;
the loss function acquisition module is configured to acquire a target loss function according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
the model training module is configured to train a pre-constructed semantic prediction model based on the target loss function so as to carry out semantic prediction on the temporary report to be analyzed according to the trained semantic prediction model.
It can be understood that the above temporary report semantic analysis system provided by the embodiment of the present invention corresponds to the above temporary report semantic analysis method, and the explanation, the examples, the beneficial effects, and the like of the relevant content may refer to the corresponding content in the temporary report semantic analysis method, which is not repeated herein.
In summary, compared with the prior art, the method has the following beneficial effects:
The invention applies the information extraction and information reasoning capabilities of the large language model to extract surface semantics and latent semantics from the temporary reports of enterprises, thereby providing a more reliable basis for further analysis and processing of the text data disclosed by enterprises.
The invention uses the BERT pre-training model to extract text semantic features from the temporary report and from the analysis report generated by the large language model, and then combines the BiGRU model with a global attention mechanism to adaptively assign different weights to the multi-dimensional text features, thereby extracting the important semantic features in the text and improving the precision of text semantic feature construction.
The invention establishes convergent and divergent mutual information criteria, and uses a neural-network-based mutual information estimator to calculate the mutual information between the text semantic features of the processed temporary report and those of the analysis report generated by the large language model, so that the mutual information computation is compatible with deep learning methods based on gradient descent.
It should be noted that, from the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by means of software plus necessary general hardware platform. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments. In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
In the context of this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A temporary report semantic analysis method, executed by a computer, characterized by comprising the steps of:
acquiring temporary report sample data, and performing content analysis on the temporary report sample data to acquire temporary report analysis data;
acquiring temporary report semantic features of the temporary report sample data and acquiring analysis data semantic features of the temporary report analysis data;
acquiring a first text semantic feature and a second text semantic feature based on the temporary report semantic feature and the analysis data semantic feature; the first text semantic features are used for representing first semantic information of the temporary report sample data, and the second text semantic features are used for representing second semantic information of the temporary report sample data;
acquiring a target loss function according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
training a pre-constructed semantic prediction model based on the target loss function so as to carry out semantic prediction on the temporary report to be analyzed according to the trained semantic prediction model.
2. The temporary report semantic analysis method according to claim 1, wherein the acquiring the temporary report semantic features of the temporary report sample data includes:
acquiring a first document embedding vector of the temporary report sample data based on a BERT pre-training model;
processing the first document embedding vector based on a BiGRU model to generate the temporary report semantic features.
3. The temporary report semantic analysis method according to claim 2, wherein the acquiring the analysis data semantic features of the temporary report analysis data includes:
acquiring a second document embedding vector of the temporary report analysis data based on the BERT pre-training model;
processing the second document embedding vector based on the BiGRU model to generate the analysis data semantic features.
4. The temporary report semantic analysis method according to claim 1, wherein the acquiring a first text semantic feature and a second text semantic feature based on the temporary report semantic features and the analysis data semantic features comprises:
inputting the temporary report semantic features and the analysis data semantic features into a preset first function, and acquiring first text semantic features according to the first function;
the first function is:
$I(c^{(CR)}, c^{(Emph)*}) + I(c^{(AI)}, c^{(Emph)*}) \geq I(c^{(CR)}, c^{(Emph)}) + I(c^{(AI)}, c^{(Emph)})$
wherein,
$I(\cdot)$ represents mutual information;
$c^{(CR)}$ represents the temporary report semantic features, and $c^{(AI)}$ represents the analysis data semantic features;
$c^{(Emph)}$ represents the first text semantic feature, and $c^{(Emph)*}$ represents the optimal solution of the first text semantic feature;
inputting the temporary report semantic features and the analysis data semantic features into a preset second function, and acquiring second text semantic features according to the second function;
the second function is:
$I(c^{(AI)}, c^{(Insight)*}) - I(c^{(CR)}, c^{(Insight)*}) \geq I(c^{(AI)}, c^{(Insight)}) - I(c^{(CR)}, c^{(Insight)})$
wherein,
$c^{(Insight)}$ represents the second text semantic feature, and $c^{(Insight)*}$ represents the optimal solution of the second text semantic feature.
5. The temporary report semantic analysis method according to claim 1, wherein the acquiring a target loss function according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features comprises:
acquiring semantic feature mutual information according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
and acquiring a target loss function based on the semantic feature mutual information.
6. The temporary report semantic analysis method according to claim 5, wherein the acquiring semantic feature mutual information includes:
acquiring first mutual information based on the temporary report semantic features and the first text semantic features;
acquiring second mutual information based on the analysis data semantic features and the first text semantic features;
acquiring third mutual information based on the analysis data semantic features and the second text semantic features;
fourth mutual information is acquired based on the temporary report semantic features and the second text semantic features.
7. The temporary report semantic analysis method according to claim 6, wherein the acquiring a target loss function based on the semantic feature mutual information comprises:
acquiring convergent mutual information based on the first mutual information and the second mutual information, and acquiring divergent mutual information according to the third mutual information and the fourth mutual information;
$MI_{Convergent} = I_1 + I_2$
$MI_{Divergent} = I_3 - I_4$
wherein,
$MI_{Convergent}$ represents the convergent mutual information, and $MI_{Divergent}$ represents the divergent mutual information;
$I_1$ represents the first mutual information, $I_2$ represents the second mutual information, $I_3$ represents the third mutual information, and $I_4$ represents the fourth mutual information;
acquiring a target loss function based on the divergent mutual information and the convergent mutual information; the target loss function is:
$Loss_{MI} = -(MI_{Convergent} + MI_{Divergent})$
wherein,
$Loss_{MI}$ represents the target loss function.
8. The temporary report semantic analysis method according to claim 6, wherein the training a pre-constructed semantic prediction model based on the target loss function comprises:
acquiring an initial loss function of a pre-constructed semantic prediction model;
integrating the initial loss function and the target loss function to obtain a model loss function;
training a pre-constructed semantic prediction model according to the model loss function.
9. The temporary report semantic analysis method according to claim 3, wherein the pre-constructed semantic prediction model includes a large language model, a BERT pre-training model, a BiGRU model, a text semantic feature generation unit, and a semantic prediction unit;
the large language model is used for carrying out content analysis on the temporary report to be analyzed so as to generate analysis data;
the BERT pre-training model is used for acquiring document embedding vectors of the temporary report to be analyzed and of the analysis data;
the BiGRU model is used for processing the document embedding vectors to generate the temporary report semantic features and the analysis data semantic features;
the text semantic feature generation unit is used for processing the temporary report semantic features and the analysis data semantic features to generate text semantic features;
the semantic prediction unit comprises three fully connected layers which are connected in sequence and is used for carrying out semantic prediction on the text semantic features.
10. A temporary report semantic analysis system, the system comprising:
the data acquisition module is configured to acquire temporary report sample data and conduct content analysis on the temporary report sample data so as to acquire temporary report analysis data;
a semantic feature acquisition module configured to acquire temporary report semantic features of the temporary report sample data and to acquire analysis data semantic features of the temporary report analysis data;
a text semantic feature acquisition module configured to acquire a first text semantic feature and a second text semantic feature based on the temporary report semantic features and the analysis data semantic features; the first text semantic features are used for representing first semantic information of the temporary report sample data, and the second text semantic features are used for representing second semantic information of the temporary report sample data;
the loss function acquisition module is configured to acquire a target loss function according to the temporary report semantic features, the analysis data semantic features, the first text semantic features and the second text semantic features;
the model training module is configured to train a pre-constructed semantic prediction model based on the target loss function so as to carry out semantic prediction on the temporary report to be analyzed according to the trained semantic prediction model.
CN202311706316.8A 2023-12-12 2023-12-12 Temporary report semantic analysis method and system Active CN117574916B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311706316.8A CN117574916B (en) 2023-12-12 2023-12-12 Temporary report semantic analysis method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311706316.8A CN117574916B (en) 2023-12-12 2023-12-12 Temporary report semantic analysis method and system

Publications (2)

Publication Number Publication Date
CN117574916A true CN117574916A (en) 2024-02-20
CN117574916B CN117574916B (en) 2024-05-10

Family

ID=89890046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311706316.8A Active CN117574916B (en) 2023-12-12 2023-12-12 Temporary report semantic analysis method and system

Country Status (1)

Country Link
CN (1) CN117574916B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717334A (en) * 2019-09-10 2020-01-21 上海理工大学 Text emotion analysis method based on BERT model and double-channel attention
CN112528668A (en) * 2020-11-27 2021-03-19 湖北大学 Deep emotion semantic recognition method, system, medium, computer equipment and terminal
CN112597761A (en) * 2020-12-07 2021-04-02 合肥工业大学 Temporary report semantic information mining method and device, storage medium and electronic equipment
CN114757182A (en) * 2022-04-06 2022-07-15 西安电子科技大学 BERT short text sentiment analysis method for improving training mode
CN116579347A (en) * 2023-03-07 2023-08-11 西安电子科技大学 Comment text emotion analysis method, system, equipment and medium based on dynamic semantic feature fusion

Also Published As

Publication number Publication date
CN117574916B (en) 2024-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant