CN115618102A - Case similarity prediction method fusing knowledge representation model and application thereof - Google Patents


Info

Publication number
CN115618102A
CN115618102A
Authority
CN
China
Prior art keywords
case
consultation
vector
similarity
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211200120.7A
Other languages
Chinese (zh)
Inventor
陈正光
武新超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202211200120.7A priority Critical patent/CN115618102A/en
Publication of CN115618102A publication Critical patent/CN115618102A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The invention discloses a case similarity prediction method fusing a knowledge representation model and an application thereof. The method comprises the following steps: 1. acquiring and preprocessing government affair consultation cases; 2. constructing a case similarity prediction model; 3. iteratively training the model; 4. completing case similarity prediction with the trained model. The invention comprehensively considers the characteristics of existing consultation cases, extracts subject-word features with a knowledge representation model, and fuses the topic similarity and the case feature similarity of cases, thereby finding cases similar to a new case among existing cases more accurately and responding to the public quickly and accurately.

Description

Case similarity prediction method fusing knowledge representation model and application thereof
Technical Field
The invention belongs to the field of intelligent inquiry and answer of government affairs consultation, and particularly relates to a case similarity prediction method fusing a knowledge representation model and application thereof.
Background
Two problems exist in the current government affair consultation field. On one hand, consultation response depends on manpower, while the consultation workload is large, the fields involved are wide, and the repetition rate is high. On the other hand, consultations arrive through a variety of public-facing government consultation platforms and channels, and the diversified channel sources of consultation cases are not fully considered. In addition, the topic of a government affair consultation, an important influencing factor of the consultation case, is not reflected.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a case similarity prediction method fusing a knowledge representation model and an application thereof, so that similar cases can be obtained quickly, the efficiency of intelligent question answering is improved, and consultations can be responded to quickly and accurately.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention discloses a case similarity prediction method fusing a knowledge representation model, which is characterized by being applied to a government affair consultation knowledge base and comprising the following steps:
step 1, acquiring and preprocessing a government affair consultation case:
obtaining any ith government affair consultation case R from government affair consultation knowledge base i =(q i ,T i ,b i ) (ii) a Wherein q is i Representing the ith Chinese consultation problem; t is i Representing the ith Chinese consultation question q i The subject term set of (2); b i Represents the ith Chinese consultation question q i A corresponding channel;
consulting case R of the ith government affair i Chinese consultation question q i Translating to English first and then to Chinese, thereby obtaining similar question q' i (ii) a From similar problems q' i And its corresponding channel b i And a topic word set T' i Constitute a similar government consultation case sequence R' i
Step 2, construct a case similarity prediction model comprising: a case feature extraction module, a topic representation module, and a similarity calculation module;

Step 3, the case feature extraction module comprises: a text semantic representation unit, a channel representation unit, and a case fusion representation unit. Input the case sequences R_i and R'_i into the case feature extraction module to correspondingly obtain a case vector Case_i and the corresponding similar case vector Case'_i.
Step 3.1, the text semantic representation unit uses a BERT model to perform word embedding on the Chinese consultation question q_i of the ith government affair consultation case R_i, obtaining all character vectors of the Chinese consultation question; a mean operation over these character vectors yields a mean vector, which serves as the sentence vector O_i of the Chinese consultation question q_i.
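The mean pooling of step 3.1 can be sketched with NumPy; the 4-dimensional per-character embeddings below are made up for illustration, where a real system would take them from BERT's last hidden layer:

```python
import numpy as np

# Fake per-character BERT outputs for a 3-character question (dim 4).
char_vectors = np.array([
    [0.2, 0.4, 0.0, 0.6],
    [0.4, 0.0, 0.2, 0.2],
    [0.0, 0.2, 0.4, 0.4],
])

# Sentence vector O_i = mean of the character vectors (step 3.1).
O_i = char_vectors.mean(axis=0)  # ≈ [0.2, 0.2, 0.2, 0.4]
```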
Step 3.2, the channel representation unit comprises a one-hot-encoding embedding layer, which performs one-hot vector encoding on the consultation channel b_i of the ith government affair consultation case R_i to obtain a one-hot vector, and then performs word embedding to obtain the vector representation B_i of the consultation channel b_i.
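One-hot encoding followed by an embedding lookup, as in step 3.2, amounts to selecting one row of an embedding matrix. A NumPy sketch with an assumed channel vocabulary and a random (in practice, trainable) embedding table:

```python
import numpy as np

channels = ["热线电话", "网站留言", "移动客户端"]  # assumed channel vocabulary
rng = np.random.default_rng(0)
embedding = rng.normal(size=(len(channels), 8))  # trainable in a real model

def channel_vector(b: str) -> np.ndarray:
    """One-hot encode channel b, then embed it (step 3.2)."""
    one_hot = np.zeros(len(channels))
    one_hot[channels.index(b)] = 1.0
    return one_hot @ embedding  # identical to embedding[channels.index(b)]

B_i = channel_vector("网站留言")
```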
Step 3.3, the case fusion representation unit concatenates the sentence vector O_i of the Chinese consultation question and the vector B_i of the consultation channel of the ith government affair consultation case R_i, obtaining the vector Case_i of the ith government affair consultation case R_i.

Step 3.4, input the similar government affair consultation case sequence R'_i into the feature extraction module and process it according to steps 3.1 to 3.3, obtaining the similar case vector Case'_i of the similar government affair consultation case sequence R'_i.
Step 4, the topic representation module comprises a knowledge representation layer and an attention layer.

Each subject word and its relation representation code are obtained from the government affair consultation knowledge base and input into the knowledge representation layer, which obtains the representation vector corresponding to each subject word by using a TransE model.

The attention layer first aligns the subject-word set T_i of the Chinese consultation question q_i of the ith government affair consultation case R_i to the length L of the representation vectors of the subject words, obtaining L subject-word vectors for the subject-word set T_i. Under the supervision of the sentence vector O_i, the L subject-word vectors are processed with an attention mechanism to obtain their attention weights; finally, the L subject-word vectors are weighted by the corresponding attention weights to obtain the topic representation vector C_i of the ith government affair consultation case R_i.

The attention layer processes the subject-word set in the similar government affair consultation case sequence R'_i in the same way, obtaining the topic representation vector C'_i of the similar government affair consultation case sequence R'_i.
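The attention step can be sketched as below. Reading the "supervision" of the sentence vector as dot-product attention with O_i as the query is an assumption on my part (the patent does not specify the attention form), and the dimensions are illustrative:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

def topic_vector(topic_vecs: np.ndarray, O_i: np.ndarray) -> np.ndarray:
    """Attention over L subject-word vectors with the sentence vector O_i
    as query -- one plausible form of the 'supervision' in step 4."""
    scores = topic_vecs @ O_i   # (L,) dot-product relevance scores
    weights = softmax(scores)   # attention weights, sum to 1
    return weights @ topic_vecs # C_i = sum_l w_l * t_l

rng = np.random.default_rng(1)
T = rng.normal(size=(3, 4))  # L=3 fake TransE subject-word vectors
O = rng.normal(size=4)       # fake sentence vector
C_i = topic_vector(T, O)
```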
Construct a positive sample pair consisting of the ith government affair consultation case R_i and the similar government affair consultation case sequence R'_i.

Construct a negative sample pair consisting of the ith government affair consultation case R_i and the jth government affair consultation case sequence R_j.
Step 5, the similarity prediction module comprises two fully-connected layers, a fusion layer, and a Sigmoid function module:

Step 5.1, concatenate the vectors Case_i and Case'_i to obtain the similarity representation vector D_1i of the positive sample pair.

Concatenate the vector Case_i with the vector Case_j of another, jth government affair consultation case to obtain the dissimilarity representation vector D_2i of the negative sample pair.

After D_1i and D_2i pass through the dimensionality-reduction processing of the two fully-connected layers respectively, the scalars x_1i and x_2i are correspondingly obtained; the scalars x_1i and x_2i are then normalized to obtain the text similarity representation X_1i of the positive sample pair and the text similarity representation X_2i of the negative sample pair.

Step 5.2, the fusion layer calculates the Euclidean distance d(C_i, C'_i) of the positive sample pair and the Euclidean distance d(C_i, C_j) of the negative sample pair; it then calculates the topic similarity of the positive sample pair Z_1i = 1/(1 + d(C_i, C'_i)) and the topic similarity of the negative sample pair Z_2i = 1/(1 + d(C_i, C_j)), and thereby calculates the fused similarity representation of the positive sample pair Y_1i = α·Z_1i + (1-α)·X_1i and the fused similarity representation of the negative sample pair Y_2i = α·Z_2i + (1-α)·X_2i, wherein α represents a weight and α ∈ [0,1].

Step 5.3, input the fused similarity representations Y_1i and Y_2i into the Sigmoid function module respectively for processing, obtaining the corresponding similarity prediction results σ(Y_1i) and σ(Y_2i).
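The fusion Y = α·Z + (1-α)·X of step 5.2 can be written out directly. Note that Z = 1/(1 + d) is a reconstruction of the patent's formula images (a standard distance-to-similarity mapping), and the vectors and α below are illustrative:

```python
import numpy as np

def fused_similarity(C_a, C_b, X: float, alpha: float = 0.5) -> float:
    """Y = alpha*Z + (1-alpha)*X with topic similarity
    Z = 1/(1 + ||C_a - C_b||); the 1/(1+d) form is a reconstruction
    of the patent's formula images, not quoted from it."""
    d = np.linalg.norm(np.asarray(C_a) - np.asarray(C_b))  # Euclidean distance
    Z = 1.0 / (1.0 + d)
    return alpha * Z + (1 - alpha) * X

# Identical topic vectors -> d = 0 -> Z = 1, so Y = 0.5*1 + 0.5*0.8 = 0.9.
Y = fused_similarity([1.0, 0.0], [1.0, 0.0], X=0.8, alpha=0.5)
```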
Step 6, train the case similarity prediction model:

Step 6.1, construct the loss function L(W) using equation (1):

L(W) = L_1(W) + L_2(W)   (1)

In equation (1), L_1(W) represents the loss function of the positive sample pairs and is obtained from equation (2); L_2(W) represents the loss function of the negative sample pairs and is obtained from equation (3):

L_1(W) = -Σ_i ŷ_i · log σ(Y_1i)   (2)

L_2(W) = -Σ_i (1 - ŷ_i) · log(1 - σ(Y_2i))   (3)

In equations (2) and (3), σ(x) is the Sigmoid function and ŷ is the true label value of a sample pair: ŷ is 1 for a positive sample pair and 0 for a negative sample pair.

Step 6.2, train the case similarity prediction model by a gradient descent method and calculate the loss function L(W); stop training when the loss function L(W) converges or the maximum number of iterations is reached, thereby obtaining an optimal case similarity prediction model fusing the knowledge representation.
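Reading the per-pair losses as the standard binary cross-entropy split — which matches the labels defined in step 6.1 (ŷ = 1 for a positive pair, ŷ = 0 for a negative pair) but is an assumption, since the original formula images are not legible — the loss can be sketched as:

```python
import math

def pair_loss(Y: float, y_true: int) -> float:
    """Binary cross-entropy for one sample pair:
    -[y*log(sigma(Y)) + (1-y)*log(1-sigma(Y))].
    A positive pair (y=1) contributes only the first term and a negative
    pair (y=0) only the second, matching the split into L1(W) and L2(W).
    This BCE reading is an assumption consistent with the labels."""
    sigma = 1.0 / (1.0 + math.exp(-Y))
    return -(y_true * math.log(sigma) + (1 - y_true) * math.log(1 - sigma))

# L(W) = L1(W) + L2(W): sum the loss over positive and negative pairs.
total = pair_loss(2.0, 1) + pair_loss(-1.0, 0)
```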
Step 6.3, when a new government affair consultation case R* is acquired, first input it into the trained case feature extraction module and topic representation module for processing to obtain the vector Case* of the case R* and its topic representation vector C*; then input these, together with Case_i and C_i, into the trained similarity prediction module for processing and output the similarity prediction result σ(Y*). If σ(Y*) = 1, the fused similarity representation Y* of the new government affair consultation case R* is recorded, and the cases with the first s maximum values are pushed as similar cases.
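The retrieval in step 6.3 (score the new case against every base case, keep pairs the Sigmoid judges similar, push the s highest) can be sketched as below. Using a 0.5 threshold on the Sigmoid output as a stand-in for the patent's "σ(Y*) = 1" condition is an assumption:

```python
import math

def push_similar(scores: dict, s: int = 3, threshold: float = 0.5) -> list:
    """scores: case id -> fused similarity Y* against the new case R*.
    Keep cases whose Sigmoid output exceeds the threshold (a stand-in
    for the patent's sigma(Y*) = 1 condition), then push the top s
    cases ranked by Y* in descending order."""
    kept = {cid: y for cid, y in scores.items()
            if 1.0 / (1.0 + math.exp(-y)) > threshold}
    return sorted(kept, key=kept.get, reverse=True)[:s]

scores = {"R1": 2.3, "R2": -0.7, "R3": 1.1, "R4": 0.4}
top = push_similar(scores, s=2)  # R2 filtered out; top-2 of the rest
```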
The invention also provides an electronic device comprising a memory and a processor, characterized in that the memory is used for storing a program that enables the processor to execute the case similarity prediction method fusing a knowledge representation model, and the processor is configured to execute the program stored in the memory.

The invention also provides a computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, performs the steps of the case similarity prediction method fusing a knowledge representation model.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention provides a method for predicting case similarity by combining knowledge representation learning and text similarity calculation, which can effectively complete case similarity prediction, push similar cases to users for government affair consultation, and reduce repetitive work. Meanwhile, because the consultation topic plays a key role in a government affair consultation case, the topic similarity of cases is fused into case similarity prediction, which effectively improves prediction accuracy and makes the method more convenient and valuable in practical application.
2. The invention adopts the idea of contrastive learning: existing cases undergo similarity preprocessing, and the preprocessed cases together with the existing cases constitute positive sample pairs for similarity prediction. This saves a large amount of manual labeling cost, makes full use of existing government affair consultation cases, and improves the accuracy of case feature extraction.
3. Through the knowledge representation learning model, the invention obtains representation vectors carrying the semantic-relation characteristics of the subject words in government affair consultation cases, and uses an attention mechanism with the sentence vector as supervision to obtain a topic representation vector that fits the case topic more closely, improving the representation of the consultation-case topic. In addition, the channel is included in the scope of case feature extraction, and the case topic similarity and the case feature similarity are fused interactively, so that a more effective government affair consultation case prediction method is obtained.
Drawings
Fig. 1 is a diagram of the optimal case similarity prediction model fusing the knowledge representation of the present invention.
Detailed Description
In this embodiment, a case similarity prediction method fusing a knowledge representation model is applied to a government affair consultation knowledge base. Using a deep learning model, the characteristics of existing consultation cases are considered comprehensively: the knowledge representation model yields subject-word representation vectors with semantic-relation characteristics, and attention is used to extract the case topic, improving the interpretability of the model. Meanwhile, accurate case feature representations are learned by fusing channel and question features through a deep neural network. Finally, the topic similarity and the case feature similarity of cases are fused, so that cases similar to a new case can be found more accurately among existing cases, similar cases are pushed to users to realize intelligent question answering, and the public can be responded to quickly and accurately. Specifically, as shown in fig. 1, the method includes the following steps:
step 1, acquiring and preprocessing a government affair consultation case:
Obtain any ith government affair consultation case R_i = (q_i, T_i, b_i) from the government affair consultation knowledge base, wherein q_i represents the ith Chinese consultation question; T_i represents the subject-word set of the ith Chinese consultation question q_i; b_i represents the channel corresponding to the ith Chinese consultation question q_i.
First delete cases with incomplete key information; then perform word segmentation, stop-word removal, common-name-word standardization, and information desensitization on the text data. Next, translate the Chinese consultation question q_i of the ith government affair consultation case R_i to English and then back to Chinese to obtain a similar question q'_i. The similar question q'_i, its corresponding channel b_i, and the subject-word set T'_i constitute a similar government affair consultation case sequence R'_i. The similar government affair consultation case sequence R'_i then undergoes common-name-word standardization again to improve the standardization of the cases.
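A minimal sketch of two of the cleaning operations in this embodiment. The stop-word list and the desensitization pattern are illustrative stand-ins, and word segmentation is assumed to have been done upstream (e.g. by a Chinese segmenter such as jieba, which the patent does not name):

```python
import re

STOPWORDS = {"的", "了", "请问"}   # illustrative stop-word list
PHONE = re.compile(r"\d{11}")      # illustrative desensitization rule

def preprocess(tokens: list[str]) -> list[str]:
    """Drop stop words and mask 11-digit phone numbers; segmentation
    into tokens is assumed to have happened before this step."""
    out = []
    for tok in tokens:
        if tok in STOPWORDS:
            continue
        out.append(PHONE.sub("<电话>", tok))
    return out

cleaned = preprocess(["请问", "护照", "的", "办理", "13800000000"])
```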
Step 2, constructing a case similarity prediction model, which comprises the following steps: the system comprises a case feature extraction module, a theme representation module and a similarity calculation module;
step 3, the case feature extraction module comprises: the system comprises a text semantic representation unit, a channel representation unit and a case fusion representation unit; and the case sequence R i And R' i An input Case feature extraction module for obtaining Case vector Case i And corresponding similar Case vector Case' i
Step 3.1, the text semantic representation unit utilizes a bert model and uses a pre-training model to consult the ith government affair case R i Chinese consultation question q i Performing word embedding processing to obtain all character vectors in the corresponding Chinese consultation problem, performing mean value operation on the character vectors to obtain a mean value vector which is used as the corresponding Chinese consultation problem q i Sentence vector O i
Step 3.2, the channel expression unit comprises a one-hot coding embedded layer, and the ith one is embedded in the channel expression unitGovernment affairs consulting case R i Consultation channel b i Carrying out single hot vector coding to obtain a single hot vector, and then carrying out word embedding processing to obtain the ith government affair consultation case R i Consultation channel b i Is represented by the vector of (A) i
Step 3.3, the case fusion representation unit consults the ith government affair case R i Sentence vector O of Chinese consultation question i Vector B of consulting channel i Vector splicing is carried out to obtain the ith government affair consultation case R i Vector of (1) i Because the structured degrees of the government affair consultation cases of different channels are different and the theme trends are different, the vector representation of the channel factors can be considered to more comprehensively represent the characteristics of the government affair consultation cases;
step 3.4. Sequence of similar government counseling cases R' i Inputting the result into a feature extractor, and processing the result according to the processes of the step 3.1 to the step 3.3 to obtain a similar government affair consulting case sequence R' i Of similarity vector Case' i
Step 4, the theme representation module comprises a knowledge representation layer and an attention layer;
acquiring each subject term and a relation representation code thereof from a government affair consultation knowledge base (the subject terms and the relation can be constructed by referring to an electronic government subject word list, a field professional term standard and the like in China), inputting the subject terms and the relation into a knowledge representation layer, obtaining a representation vector corresponding to each subject term by the knowledge representation layer by using a trans-E model, obtaining semantic relation characteristics among the subject terms by using the knowledge representation model, and representing the relation among the subject terms more suitably than using a term embedding technology;
attention tier consults case R for ith government affairs first i Chinese consultation question q i Subject word set T i Aligning the length L of the expression vector corresponding to each subject term to obtain a subject term set T i L subject word vectors; in sentence vector O i Under the supervision of (1), processing the L subject term vectors by using an Atttion mechanism to obtain the attention weights of the L subject term vectors, wherein obviously, the weights are more suitable for the case theme under the supervision of the consultation problem; finally, the L mastersWeighting the thematic word vector and the corresponding attention weight to obtain the ith government affair consultation case R i Is a topic representation vector C i
Notice layer to similar government counseling case sequence R' i The subject term set in (1) is processed in the same way to obtain a similar government affair consultation case sequence R' i Is vector C' i
Constructing consultation case R of ith government affair i And similar government counseling case sequence R' i A positive sample pair of constituents;
constructing consultation case R of ith government affair i And the jth government counseling case sequence R j Forming a negative sample pair, and forming the negative sample by adopting a case without the same subject term in the subject term set in order to avoid the situation that a positive sample possibly appears in the negative sample pair;
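The constraint in this embodiment — a negative partner must share no subject word with R_i — is a simple set-disjointness test:

```python
def valid_negatives(T_i: set, candidates: dict) -> list:
    """Return ids of cases whose subject-word set is disjoint from T_i,
    so a disguised positive cannot slip into the negative pairs."""
    return [cid for cid, T_j in candidates.items() if not (T_i & T_j)]

T_i = {"护照", "出入境"}
candidates = {"R2": {"护照", "签证"}, "R3": {"社保", "医保"}}
negatives = valid_negatives(T_i, candidates)  # R2 shares "护照", so only R3
```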
Step 5, the similarity prediction module comprises two fully-connected layers, a fusion layer, and a Sigmoid function module:

Step 5.1, concatenate the vectors Case_i and Case'_i to obtain the similarity representation vector D_1i of the positive sample pair.

Concatenate the vector Case_i with the vector Case_j of another, jth government affair consultation case to obtain the dissimilarity representation vector D_2i of the negative sample pair.

After D_1i and D_2i pass through the dimensionality-reduction processing of the two fully-connected layers respectively, the scalars x_1i and x_2i are correspondingly obtained; the scalars x_1i and x_2i are then normalized to obtain the text similarity representation X_1i of the positive sample pair and the text similarity representation X_2i of the negative sample pair.

Step 5.2, the fusion layer calculates the Euclidean distance d(C_i, C'_i) of the positive sample pair and the Euclidean distance d(C_i, C_j) of the negative sample pair; it then calculates the topic similarity of the positive sample pair Z_1i = 1/(1 + d(C_i, C'_i)) and the topic similarity of the negative sample pair Z_2i = 1/(1 + d(C_i, C_j)), and thereby calculates the fused similarity representation of the positive sample pair Y_1i = α·Z_1i + (1-α)·X_1i and the fused similarity representation of the negative sample pair Y_2i = α·Z_2i + (1-α)·X_2i, wherein α represents a weight and α ∈ [0,1].

Step 5.3, input the fused similarity representations Y_1i and Y_2i into the Sigmoid function module respectively for processing, obtaining the corresponding similarity prediction results σ(Y_1i) and σ(Y_2i).
Step 6, training a case similarity prediction model:
step 6.1 constructs the loss function L (W) using equation (1):
L(W)=L 1 (W)+L 2 (W) (1)
in the formula (1), L 1 (W) represents a loss function of the positive sample pair and is obtained from equation (2); l is 2 (W) represents a loss function of the negative sample pair and is obtained from equation (3);
Figure BDA0003871774020000064
Figure BDA0003871774020000065
in the equations (2) and (3), σ (x) is a Sigmoid function,
Figure BDA0003871774020000071
is the true label value of the sample pair, if it is a positive sample pair
Figure BDA0003871774020000072
Is 1, if it is a negative sample pair, then
Figure BDA0003871774020000073
Is 0;
Step 6.2, in this embodiment, the positive- and negative-sample-pair data set is divided into a training set, a validation set, and a test set. With the maximum number of iterations set to 100, the case similarity prediction model is trained by a gradient descent method and the loss function L(W) is calculated; training stops when the loss function L(W) converges or the maximum number of iterations is reached, thereby obtaining an optimal case similarity prediction model fusing the knowledge representation.
Step 6.3, when a new government affair consultation case R* is acquired, first input it into the trained case feature extraction module and topic representation module for processing to obtain the vector Case* of the case R* and its topic representation vector C*; then input these, together with Case_i and C_i, into the trained similarity prediction module for processing and output the similarity prediction result σ(Y*). If σ(Y*) = 1, the fused similarity representation Y* of the new government affair consultation case R* is recorded, and the cases with the first s maximum values are pushed as similar cases.
In this embodiment, an electronic device comprises a memory and a processor; the memory stores a program that enables the processor to execute the case similarity prediction method fusing the knowledge representation model, and the processor is configured to execute the program stored in the memory.

In this embodiment, a computer-readable storage medium stores a computer program which, when executed by a processor, performs the steps of the case similarity prediction method fusing the knowledge representation model.

Claims (3)

1. A case similarity prediction method fusing a knowledge representation model, characterized by being applied to a government affair consultation knowledge base and comprising the following steps:

Step 1, acquiring and preprocessing a government affair consultation case:

obtain any ith government affair consultation case R_i = (q_i, T_i, b_i) from the government affair consultation knowledge base, wherein q_i represents the ith Chinese consultation question; T_i represents the subject-word set of the ith Chinese consultation question q_i; b_i represents the channel corresponding to the ith Chinese consultation question q_i;

translate the Chinese consultation question q_i of the ith government affair consultation case R_i to English and then back to Chinese, thereby obtaining a similar question q'_i; the similar question q'_i, its corresponding channel b_i, and the subject-word set T'_i constitute a similar government affair consultation case sequence R'_i;
Step 2, construct a case similarity prediction model comprising: a case feature extraction module, a topic representation module, and a similarity calculation module;

Step 3, the case feature extraction module comprises: a text semantic representation unit, a channel representation unit, and a case fusion representation unit; input the case sequences R_i and R'_i into the case feature extraction module to correspondingly obtain a case vector Case_i and the corresponding similar case vector Case'_i;

Step 3.1, the text semantic representation unit uses a BERT model to perform word embedding on the Chinese consultation question q_i of the ith government affair consultation case R_i, obtaining all character vectors of the Chinese consultation question; a mean operation over these character vectors yields a mean vector, which serves as the sentence vector O_i of the Chinese consultation question q_i;

Step 3.2, the channel representation unit comprises a one-hot-encoding embedding layer, which performs one-hot vector encoding on the consultation channel b_i of the ith government affair consultation case R_i to obtain a one-hot vector, and then performs word embedding to obtain the vector representation B_i of the consultation channel b_i;

Step 3.3, the case fusion representation unit concatenates the sentence vector O_i of the Chinese consultation question and the vector B_i of the consultation channel of the ith government affair consultation case R_i, obtaining the vector Case_i of the ith government affair consultation case R_i;

Step 3.4, input the similar government affair consultation case sequence R'_i into the feature extraction module and process it according to steps 3.1 to 3.3, obtaining the similar case vector Case'_i of the similar government affair consultation case sequence R'_i;
Step 4, the topic representation module comprises a knowledge representation layer and an attention layer;

obtain each subject word and its relation representation code from the government affair consultation knowledge base and input them into the knowledge representation layer, which obtains the representation vector corresponding to each subject word by using a TransE model;

the attention layer first aligns the subject-word set T_i of the Chinese consultation question q_i of the ith government affair consultation case R_i to the length L of the representation vectors of the subject words, obtaining L subject-word vectors for the subject-word set T_i; under the supervision of the sentence vector O_i, the L subject-word vectors are processed with an attention mechanism to obtain their attention weights, and finally the L subject-word vectors are weighted by the corresponding attention weights to obtain the topic representation vector C_i of the ith government affair consultation case R_i;

the attention layer processes the subject-word set in the similar government affair consultation case sequence R'_i in the same way, obtaining the topic representation vector C'_i of the similar government affair consultation case sequence R'_i;

construct a positive sample pair consisting of the ith government affair consultation case R_i and the similar government affair consultation case sequence R'_i;

construct a negative sample pair consisting of the ith government affair consultation case R_i and the jth government affair consultation case sequence R_j;
and 5, the similarity prediction module comprises two fully-connected layers, a fusion layer and a Sigmod function module:
step 5.1. Vector Case i And Case' i Obtaining a similarity expression vector D of the positive sample pair after splicing 1i
Will vector Case i Vector Case with other jth government consulting cases j Obtaining a non-similarity expression vector D of the negative sample pair after splicing 2i
Will D 1i And D 2i After the dimensionality reduction processing of two full connection layers is respectively carried out, scalar x is correspondingly obtained 1i ,x 2i For scalar x again 1i ,x 2i Obtaining the text similarity expression X of the positive sample pair after normalization processing 1i Text similarity representation X with negative exemplar pairs 2i
step 5.2, the fusion layer calculates the Euclidean distance d(C_i, C'_i) of the positive sample pair and the Euclidean distance d(C_i, C_j) of the negative sample pair, and then calculates from these distances the topic similarity Z_1i of the positive sample pair and the topic similarity Z_2i of the negative sample pair;

[the formulas for Z_1i and Z_2i are given as equation images in the original filing]
thereby calculating the fusion similarity representation Y_1i = α·Z_1i + (1-α)·X_1i of the positive sample pair and the fusion similarity representation Y_2i = α·Z_2i + (1-α)·X_2i of the negative sample pair, wherein α represents a weight and α ∈ [0, 1];
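The fusion of step 5.2 can be sketched as below. The claim's topic-similarity formulas appear only as images in the filing, so the distance-to-similarity mapping 1 / (1 + d) is an assumption; the weighted fusion Y = α·Z + (1-α)·X follows the claim directly.

```python
import numpy as np

# Sketch of the fusion layer: Euclidean distance between topic vectors,
# an assumed conversion to a topic similarity Z, then a weighted blend
# with the text similarity X.
def fuse(C_a, C_b, X_text, alpha=0.5):
    d = np.linalg.norm(C_a - C_b)        # Euclidean distance of the pair
    Z = 1.0 / (1.0 + d)                  # assumed topic similarity in (0, 1]
    return alpha * Z + (1 - alpha) * X_text  # fusion similarity Y

Y = fuse(np.ones(4), np.zeros(4), X_text=0.8)
print(round(Y, 4))  # 0.5667  (d = 2, Z = 1/3, Y = 0.5/3 + 0.4)
```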
step 5.3, inputting the fusion similarity representations Y_1i and Y_2i respectively into the Sigmoid function module for processing to obtain the corresponding similarity prediction results σ(Y_1i) and σ(Y_2i);
Step 6, training a case similarity prediction model:
step 6.1, constructing the loss function L(W) using equation (1):
L(W) = L_1(W) + L_2(W)   (1)
in equation (1), L_1(W) represents the loss function of the positive sample pairs and is obtained from equation (2); L_2(W) represents the loss function of the negative sample pairs and is obtained from equation (3);
[equations (2) and (3) are given as equation images in the original filing]
in equations (2) and (3), σ(x) is the Sigmoid function and ŷ is the true label value of the sample pair: ŷ is 1 for a positive sample pair and 0 for a negative sample pair;
step 6.2, training the case similarity prediction model by a gradient descent method and calculating the loss function L(W), and stopping training when the loss function L(W) converges or the maximum number of iterations is reached, thereby obtaining the optimal case similarity prediction model fusing the knowledge representation;
step 6.3, when a new government affairs consultation case R* is acquired, first inputting it into the trained case feature extraction module and topic representation module for processing to obtain the vector Case* and the topic representation vector C* of the case R*; then inputting these, together with Case_i and C_i, into the trained similarity prediction module for processing and outputting the similarity prediction result σ(Y*); if σ(Y*) is 1, the new government affairs consultation case R* is judged similar, and the cases with the s largest fusion similarity representations Y* are pushed as similar cases.
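The final pushing step of 6.3 amounts to a top-s selection over fusion similarity scores; a minimal sketch, with hypothetical case identifiers and scores:

```python
# Sketch of the retrieval in step 6.3: given fusion similarity scores Y*
# of the new case against every stored case, push the s highest-scoring
# stored cases as similar cases.
def top_s_similar(scores, s=3):
    """scores: {case_id: fusion similarity Y}; returns s best case ids."""
    return sorted(scores, key=scores.get, reverse=True)[:s]

print(top_s_similar({"R1": 0.9, "R2": 0.4, "R3": 0.7, "R4": 0.95}, s=2))
# ['R4', 'R1']
```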
2. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program that enables the processor to perform the method of case similarity prediction for a fused knowledge representation model of claim 1, and the processor is configured to execute the program stored in the memory.
3. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the case similarity prediction method for a fused knowledge representation model according to claim 1.
CN202211200120.7A 2022-09-29 2022-09-29 Case similarity prediction method fusing knowledge representation model and application thereof Pending CN115618102A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211200120.7A CN115618102A (en) 2022-09-29 2022-09-29 Case similarity prediction method fusing knowledge representation model and application thereof

Publications (1)

Publication Number Publication Date
CN115618102A true CN115618102A (en) 2023-01-17

Family

ID=84861242

Country Status (1)

Country Link
CN (1) CN115618102A (en)

Similar Documents

Publication Publication Date Title
CN111737496A (en) Power equipment fault knowledge map construction method
CN111274394A (en) Method, device and equipment for extracting entity relationship and storage medium
CN113011533A (en) Text classification method and device, computer equipment and storage medium
WO2021051871A1 (en) Text extraction method, apparatus, and device, and storage medium
CN111368049A (en) Information acquisition method and device, electronic equipment and computer readable storage medium
CN116775847B (en) Question answering method and system based on knowledge graph and large language model
CN111274829B (en) Sequence labeling method utilizing cross-language information
CN113377897B (en) Multi-language medical term standard standardization system and method based on deep confrontation learning
CN111930942A (en) Text classification method, language model training method, device and equipment
CN110347802B (en) Text analysis method and device
CN110895559A (en) Model training method, text processing method, device and equipment
CN112183102A (en) Named entity identification method based on attention mechanism and graph attention network
EP3629316A1 (en) Method, apparatus, device and computer readable medium for generating vqa training data
CN114020906A (en) Chinese medical text information matching method and system based on twin neural network
CN114298035A (en) Text recognition desensitization method and system thereof
CN110717021A (en) Input text and related device for obtaining artificial intelligence interview
CN113742733A (en) Reading understanding vulnerability event trigger word extraction and vulnerability type identification method and device
CN113076744A (en) Cultural relic knowledge relation extraction method based on convolutional neural network
CN116069916A (en) Tourist attraction question-answering system
CN115618102A (en) Case similarity prediction method fusing knowledge representation model and application thereof
CN111401069A (en) Intention recognition method and intention recognition device for conversation text and terminal
CN115965003A (en) Event information extraction method and event information extraction device
CN115858733A (en) Cross-language entity word retrieval method, device, equipment and storage medium
CN115062123A (en) Knowledge base question-answer pair generation method of conversation generation system
CN114186020A (en) Semantic association method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination