CN113064967B - Complaint reporting credibility analysis method based on deep migration network - Google Patents
Complaint reporting credibility analysis method based on deep migration network
- Publication number
- CN113064967B
- Authority
- CN
- China
- Prior art keywords
- domain
- text
- feature
- features
- complaint
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a complaint reporting credibility analysis method based on a deep migration network, belonging to the technical field of artificial intelligence. The method comprises the following steps: first, the microblog text, the complaint report text, and the mixed microblog and complaint report text are each represented as a matrix by a Word2vec text vectorization model; next, the vectorized text is fed into three groups of bidirectional LSTM networks, which extract the source domain private feature vector, the feature vector shared by the source and target domains, and the target domain private feature vector, respectively; then, a self-attention mechanism fuses the shared features with the private features of the source domain and of the target domain to obtain the final source domain features and target domain features; finally, the source domain features and the target domain features are fed into a multi-layer perceptron, which outputs the final classification result. The method addresses the difficulty of manual analysis and the lack of effective data labels in complaint reporting credibility analysis, and offers an approach to credibility analysis of environmental complaint reports.
Description
Technical Field
The invention relates to an environment complaint reporting credibility analysis method, in particular to an environment complaint reporting credibility analysis method based on a deep migration network.
Background
An environmental complaint report is a complaint made by citizens about environmental pollution phenomena or events that affect their production and daily life or violate relevant national regulations. Complaint reports are typically submitted in text form. Among the many complaint reports are untrustworthy ones that tamper with, exaggerate, or misattribute facts. Such reports directly increase the difficulty faced by the management department in handling water pollution events and reduce administrative efficiency. To improve administrative efficiency and avoid wasting management resources, the administrative department urgently needs to analyze the credibility of complaint reports submitted by netizens.
At present, work on credibility analysis of complaint reporting events is rare in the field of water environment complaint reporting, and work on credibility analysis based on the complaint report text is rarer still. In other fields, however, there is similar work on credibility analysis based on text content. Since the emergence of deep learning, various deep-learning-based methods have been proposed and have achieved very good results in content-based credibility analysis tasks such as false news detection and rumor detection. Machine learning and deep learning methods mostly rely on large amounts of data with credibility labels. Complaint report text data in environmental complaint reporting credibility analysis often lacks credibility labels, and manual credibility analysis of complaint reports is very difficult.
To solve these problems, microblog text is used to assist complaint reporting credibility analysis. Both microblog text and complaint report text express the author's emotion and attitude, and both microblog rumors and false complaint reports often tamper with and distort facts, so the two kinds of text have a certain semantic similarity. Combined with a semi-supervised transfer learning method, techniques such as feature transfer and domain adaptation are used to transfer knowledge from microblog text to the credibility analysis of complaint report text, improving the performance of complaint reporting credibility analysis.
In conclusion, the credibility analysis of environmental complaint reports based on a deep migration network is an innovative research problem with important research significance and application value.
Disclosure of Invention
The invention aims to solve the problem that, in the credibility analysis of environmental complaint reports, manual analysis is difficult and effective credibility labels are lacking, so that an effective credibility analysis model cannot be trained. A deep migration network is proposed to solve this problem. The method takes microblog text as the source domain and complaint report text as the target domain, designs effective feature extraction, feature migration, and domain adaptation methods, and uses the microblog text to assist complaint reporting credibility analysis.
The environment complaint reporting credibility analysis method based on the deep migration network comprises the following steps:
S1, data collection;
S2, preprocessing the microblog text data (source domain) and the complaint report text data (target domain);
S3, inputting the preprocessed text into a Word2vec model for word vector training, and generating word vectors;
S4, encoding the microblog text word vectors and the complaint report text word vectors: a source domain feature encoder, a domain sharing feature encoder, and a target domain feature encoder are designed to extract the source domain private features, the domain sharing features, and the target domain private features, respectively;
S5, domain feature fusion: the source domain private features and the domain sharing features are fused by a self-attention method to obtain the source domain features, and the target domain private features and the domain sharing features are fused by a self-attention method to obtain the target domain features;
S6, calculating the MK-MMD distance between the source domain features and the target domain features and performing feature transformation on both, thereby completing domain adaptation;
S7, passing the source domain features and the target domain features through a multi-layer perceptron network to obtain the classification results.
Drawings
FIG. 1 is a detailed schematic diagram of the complaint reporting credibility analysis method based on a deep migration network.
FIG. 2 is a schematic diagram of the bidirectional LSTM encoding process.
FIG. 3 is a flow chart of the complaint reporting credibility analysis method based on a deep migration network.
Detailed Description
The invention provides a method for analyzing the credibility of environmental complaint reports based on a deep migration network. The embodiment is described in detail below with reference to FIG. 1 and mainly comprises the following steps:
Step S1, obtaining microblog source text extracted from social media and complaint report text data extracted from a large water environment big data management platform, and constructing the data sets. The microblog text data set $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{N_s}$ represents the source domain (microblog text), where $N_s$ is the number of samples, $x_i^s$ is a microblog text sample, and $y_i^s$ is the credibility label of that microblog text. The complaint report text data set $D_t$ represents the target domain (complaint report text), where $N_t^{train}$ is the number of training samples, $N_t^{test}$ is the number of test samples, $x_i^t$ is a complaint report text sample, and $y_i^t$ is the credibility label of that complaint report text.
Step S2, preprocessing the microblog text data (source domain) and the complaint report text data (target domain): preprocessing includes data cleaning and word segmentation and does not include stop-word removal. A text $x_i^o$ after word segmentation is expressed as a word sequence:
$$x_i^o = \{w_{i,1}^o, w_{i,2}^o, \ldots, w_{i,T_i}^o\}$$
where $o \in \{s, t\}$, $s$ denotes the source domain and $t$ denotes the target domain; $w_{i,j}^o$ are the words contained in sentence $x_i^o$; and $T_i$ is the sentence length.
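Step S2 can be sketched as follows. This is only an illustration under stated assumptions: the source does not specify the cleaning rules or the word segmenter, so the URL and whitespace rules in `clean` and the default whitespace `segment` function are hypothetical stand-ins (a Chinese word segmenter would be passed in for real microblog and complaint text). The one detail taken from the text is that stop words are kept.

```python
import re
from typing import Callable, List

URL_RE = re.compile(r"https?://\S+")
SPACE_RE = re.compile(r"\s+")


def clean(text: str) -> str:
    """Hypothetical cleaning rules: drop URLs and collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def preprocess(
    texts: List[str],
    segment: Callable[[str], List[str]] = str.split,  # placeholder segmenter
) -> List[List[str]]:
    """Clean and segment each text; stop words are deliberately NOT removed."""
    return [segment(clean(t)) for t in texts]


corpus = preprocess([
    "the river  smells bad http://example.com/x",
    "factory dumps waste at night",
])
```

Each element of `corpus` is one word sequence of length $T_i$, which is the form consumed by the vectorization step.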
Step S3, text vectorization:
The preprocessed, word-segmented text is input into a Word2vec model for word vector training. The text is then vectorized: each text sequence $x_i^o$ is input in turn into the Word2vec model to obtain the matrix representation of $x_i^o$, $X_i^o \in \mathbb{R}^{n \times d}$, where n is the number of texts and d is the dimension of the word vector; the generated word vectors are 300-dimensional.
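The lookup half of step S3 can be illustrated with a small sketch. It does not train Word2vec: `build_toy_embeddings` is a hypothetical stand-in that assigns a random 300-dimensional vector to every vocabulary word, purely so that the mapping from a word sequence to a matrix can be shown. Mapping out-of-vocabulary words to a zero vector is also an assumption, not something the source states.

```python
import random
from typing import Dict, List

DIM = 300  # word-vector dimension stated in the text

Vector = List[float]


def build_toy_embeddings(corpus: List[List[str]], dim: int = DIM, seed: int = 0) -> Dict[str, Vector]:
    """Stand-in for a trained Word2vec table: one random vector per word."""
    rng = random.Random(seed)
    vocab = sorted({word for sentence in corpus for word in sentence})
    return {word: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for word in vocab}


def vectorize(sentence: List[str], table: Dict[str, Vector], dim: int = DIM) -> List[Vector]:
    """Map a word sequence to a (sentence length x dim) matrix of word vectors."""
    zero = [0.0] * dim  # assumption: unknown words become zero vectors
    return [table.get(word, zero) for word in sentence]


corpus = [["river", "smells", "bad"], ["factory", "dumps", "waste"]]
table = build_toy_embeddings(corpus)
matrix = vectorize(["river", "smells", "unknown"], table)
```

In a real pipeline the table would come from the trained Word2vec model, and the resulting matrix is what the encoders of step S4 receive.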
Step S4, encoding the vectorized text. Encoding refers to feeding the vectorized text into a neural network for feature extraction. Three encoders are designed: the source domain private feature encoder ($E_p^s$) extracts the private features of the source domain (microblog text), the target domain private feature encoder ($E_p^t$) extracts the private features of the target domain (complaint report text), and the domain sharing feature encoder ($E_c$) extracts the features shared by the complaint report text and the microblog text. The three encoders have identical network structures and are all based on a bidirectional LSTM network. As shown in FIG. 2, the specific encoding process is as follows:
Step S401, for the vectorized text $x_i^o$, the outputs $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$ of the forward and backward LSTM models are concatenated as the output of the Bi-LSTM at time t:
$$h_t = [\overrightarrow{h_t};\ \overleftarrow{h_t}]$$
where $x_t$ is the input at the t-th of the $T_i$ time steps, $c_t$ is the cell state of the LSTM at time t, and $h_t$ is the output at the t-th time step, calculated by formula (2):
$$\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned} \quad (2)$$
where $W_f, W_i, W_o, W_c$ are weight matrices and $b_f, b_i, b_o, b_c$ are bias vectors; $\sigma$ is the sigmoid function and $\odot$ denotes element-wise multiplication; $f_t$ is the forget gate, $i_t$ is the input gate, and $o_t$ is the output gate. In the whole process, the forget gate $f_t$ first selectively filters out some information from the previous state. The input gate $i_t$ then decides which data are updated; the LSTM cell state $c_t$ is obtained by forgetting history information and adding the new information $\tilde{c}_t$, so that the old state is overwritten by the new state value and the state update is complete. Finally, the output gate $o_t$ determines the output information: the output $h_t$ at the current time step is obtained by filtering the information through $o_t$.
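Step S402, taking the outputs of the last time step, $\overrightarrow{h}_{T_i}$ and $\overleftarrow{h}_{T_i}$, as the encoding output result of the i-th sentence:
$$e_i = [\overrightarrow{h}_{T_i};\ \overleftarrow{h}_{T_i}]$$
where $\overrightarrow{h}_{T_i}$ is the forward hidden layer output for the text sequence $x_i^o$ and $\overleftarrow{h}_{T_i}$ is the backward hidden layer output for the text sequence $x_i^o$; $e_i$ is the LSTM network's encoding of the text $x_i^o$, i.e. the output of the encoder.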
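Steps S401 and S402 can be sketched in plain Python as below. This is a minimal illustration, not the patented implementation: the weight initialisation, the toy dimensions, and the class and function names (`LSTMCell`, `bilstm_encode`) are assumptions, and a real system would use a tensor library. The gate equations follow the standard LSTM form that formula (2) refers to, and the sentence encoding is the concatenation of the final forward and final backward hidden states.

```python
import math
import random
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[Vector]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: Matrix, b: Vector, v: Vector) -> Vector:
    """Compute W v + b for a row-major matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


class LSTMCell:
    """One LSTM direction with forget (f), input (i), output (o) gates and candidate (c)."""

    def __init__(self, input_dim: int, hidden_dim: int, rng: random.Random) -> None:
        self.hidden_dim = hidden_dim
        cols = hidden_dim + input_dim  # width of the concatenation [h_{t-1}, x_t]
        self.W: Dict[str, Matrix] = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(hidden_dim)]
            for gate in "fioc"
        }
        self.b: Dict[str, Vector] = {gate: [0.0] * hidden_dim for gate in "fioc"}

    def step(self, x: Vector, h_prev: Vector, c_prev: Vector) -> Tuple[Vector, Vector]:
        z = h_prev + x  # list concatenation [h_{t-1}, x_t]
        f = [sigmoid(v) for v in affine(self.W["f"], self.b["f"], z)]
        i = [sigmoid(v) for v in affine(self.W["i"], self.b["i"], z)]
        o = [sigmoid(v) for v in affine(self.W["o"], self.b["o"], z)]
        c_tilde = [math.tanh(v) for v in affine(self.W["c"], self.b["c"], z)]
        c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
        h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
        return h, c

    def last_hidden(self, xs: List[Vector]) -> Vector:
        """Run over the whole sequence and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in xs:
            h, c = self.step(x, h, c)
        return h


def bilstm_encode(xs: List[Vector], forward: LSTMCell, backward: LSTMCell) -> Vector:
    """Concatenate the final forward state with the final state of the reversed pass."""
    return forward.last_hidden(xs) + backward.last_hidden(xs[::-1])


rng = random.Random(0)
forward, backward = LSTMCell(4, 3, rng), LSTMCell(4, 3, rng)
sentence = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(5)]
encoding = bilstm_encode(sentence, forward, backward)  # length 2 * hidden_dim
```

Three independent pairs of such cells would play the roles of the source private, target private, and shared encoders, since the text says the three encoders share one network structure.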
Step S403, the three groups of encoders extract their respective features. The domain sharing feature is $e_c = [e_1, e_2, \ldots, e_{n1}] \in \mathbb{R}^{n1 \times m}$; the outputs of the source domain private feature encoder and the target domain private feature encoder are $e_p^s \in \mathbb{R}^{n2 \times m}$ and $e_p^t \in \mathbb{R}^{n3 \times m}$, respectively, where m is the dimension of the Bi-LSTM output vector and n1, n2 = $N_s$, and n3 are the numbers of texts processed by the respective encoders.
Step S5, domain feature fusion: the domain sharing feature encoder extracts the features shared by the source domain and the target domain. The domain private feature encoders extract domain private features, which compensates for the shared feature extractor's inability to capture domain-specific information. To obtain the information shared by the source and target domains while retaining more complete domain-specific information, the domain-specific information $e_p^o$ and the shared domain information $e_c$ must be fused.
Step S501, the input vector is multiplied by the value matrix $W_V$, the key matrix $W_K$, and the query matrix $W_Q$, and the result is scored:
$$Q_b = e_b W_Q,\quad K_b = e_b W_K,\quad V_b = e_b W_V,\quad \mathrm{score}_b = \frac{Q_b K_b^{\top}}{\sqrt{d}}$$
where $b \in \{c, p\}$, c denotes domain shared and p denotes domain private; the product is a scaled dot product; d is a constant, usually the dimension of the input word vector, set to prevent the values after the dot product from becoming too large;
Step S502, a Softmax normalization is applied to the scores to obtain the attention weights:
$$\alpha_b = \mathrm{softmax}(\mathrm{score}_b)$$
Step S503, the self-attention weights are multiplied by the value vectors to obtain the final source domain feature (or target domain feature) $e_o$:
$$e_o = \sum_{b \in \{c, p\}} \alpha_b V_b$$
where $o \in \{s, t\}$, s denotes the source domain, t denotes the target domain, and $e_o$ is the fused feature.
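The fusion in steps S501 to S503 can be sketched as scaled dot-product self-attention over two "tokens", the shared feature and the private feature of one text. Several details here are assumptions rather than statements from the source: stacking the two features as a two-row input, using square projection matrices, and averaging the two attended rows into a single fused vector. The helper names (`matmul`, `fuse`) are likewise hypothetical.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(row: Vector) -> Vector:
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(e_shared: Vector, e_private: Vector,
         W_Q: Matrix, W_K: Matrix, W_V: Matrix) -> Tuple[Vector, Matrix]:
    """Fuse shared and private features with scaled dot-product self-attention."""
    tokens = [e_shared, e_private]                      # 2 x m input
    Q, K, V = matmul(tokens, W_Q), matmul(tokens, W_K), matmul(tokens, W_V)
    d = len(Q[0])
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d) for k_row in K]
              for q_row in Q]                           # S501: scaled dot product
    weights = [softmax(row) for row in scores]          # S502: attention weights
    attended = matmul(weights, V)                       # S503: weights times values
    fused = [sum(col) / len(col) for col in zip(*attended)]  # assumption: mean pooling
    return fused, weights


rng = random.Random(1)
m = 4
W_Q, W_K, W_V = ([[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(m)] for _ in range(3))
e_c = [rng.uniform(-1.0, 1.0) for _ in range(m)]   # shared feature of one text
e_p = [rng.uniform(-1.0, 1.0) for _ in range(m)]   # private feature of the same text
e_fused, attention = fuse(e_c, e_p, W_Q, W_K, W_V)
```

Calling `fuse` once with the source private feature and once with the target private feature would yield the source domain feature and the target domain feature, respectively.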
Step S6, domain adaptation: after domain feature fusion, the distributions of the source domain feature $e_s$ and the target domain feature $e_t$ differ, so domain adaptation is performed on $e_s$ and $e_t$. Domain adaptation aims to make the data distributions of the two domains converge. It is carried out by feature alignment, i.e. feature transformation brings the data distributions of the source domain and the target domain closer together. The distance between the source domain and target domain data is calculated by the MK-MMD method and added to the loss function, and the network weights are updated together with the label loss to realize domain adaptation. The MK-MMD distance between the source domain and the target domain is:
$$\mathrm{MMD}^2(e_s, e_t) = \left\| \frac{1}{n_s}\sum_{i=1}^{n_s}\phi(e_i^s) - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi(e_j^t) \right\|_{\mathcal{H}}^2$$
where $\phi(\cdot)$ is a mapping that maps the original variables into the reproducing kernel Hilbert space (RKHS) $\mathcal{H}$, $n_s$ and $n_t$ are the numbers of source domain and target domain features, and $\mathrm{MMD}^2(e_s, e_t)$ is the distance between the source domain feature and the target domain feature.
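A multi-kernel MMD estimate can be sketched as follows. The source does not list the kernels or their weights, so the equally weighted Gaussian kernels and the bandwidth values below are assumptions, and the simple biased estimator (mean kernel values within and across the two sets) is one common choice rather than the patent's confirmed formula.

```python
import math
from typing import List, Sequence

Vector = List[float]


def gaussian_kernel(x: Vector, y: Vector, bandwidth: float) -> float:
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def multi_kernel(x: Vector, y: Vector, bandwidths: Sequence[float]) -> float:
    """Equally weighted mixture of Gaussian kernels (weights are an assumption)."""
    return sum(gaussian_kernel(x, y, bw) for bw in bandwidths) / len(bandwidths)


def mk_mmd2(source: List[Vector], target: List[Vector],
            bandwidths: Sequence[float] = (0.5, 1.0, 2.0, 4.0)) -> float:
    """Biased estimate of MMD^2 between two feature sets under the kernel mixture."""
    def mean_kernel(A: List[Vector], B: List[Vector]) -> float:
        return sum(multi_kernel(a, b, bandwidths) for a in A for b in B) / (len(A) * len(B))

    return mean_kernel(source, source) + mean_kernel(target, target) - 2.0 * mean_kernel(source, target)


source_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near_target = [[x + 0.1, y + 0.1] for x, y in source_features]
far_target = [[x + 3.0, y + 3.0] for x, y in source_features]

distance_same = mk_mmd2(source_features, source_features)
distance_near = mk_mmd2(source_features, near_target)
distance_far = mk_mmd2(source_features, far_target)
```

The estimate is zero for identical sets and grows as the target features move away from the source features, which is the quantity the network is trained to shrink.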
Step S7, credibility classification: the source domain features and the target domain features are fed into the MLP network to output the classification results, and the network parameters are updated according to the classification loss and the domain adaptation loss.
Step S701, the source domain feature $e_s$ and the target domain feature $e_t$ obtained from domain feature fusion are fed into the MLP:
$$\hat{y}_s = \mathrm{sigmoid}(\mathrm{MLP}(e_s)),\quad \hat{y}_t = \mathrm{sigmoid}(\mathrm{MLP}(e_t))$$
where $\hat{y}$ is the prediction vector, i.e. the prediction result; MLP denotes the multi-layer perceptron; $\hat{y}_s$ and $\hat{y}_t$ are the predicted probabilities; and sigmoid is the activation function.
Step S702, a loss function is calculated from the classification results to update the network parameters. The deep migration network learns the data difference between the source domain and the target domain to realize domain adaptation, and also learns the label loss. The final objective function (the loss function of the entire network) combines the MK-MMD statistic, which represents the domain difference, with the label loss, so the loss function of the entire migration network is formula (9):
$$L = L_{cls} + \lambda L_{da} \quad (9)$$
where $\lambda$ is the adjustment parameter; $L_{da}$ is the domain adaptation loss, i.e. $\mathrm{MMD}^2(e_s, e_t)$; and $L_{cls}$ is the label loss, including the source domain label loss $L_{cls}^s$ and the target domain label loss $L_{cls}^t$. The cross-entropy criterion is used as the loss to be minimized in this classification task:
$$L_{cls}(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right]$$
where $y \in \{0, 1\}$ is the credibility label, $\hat{y}_i$ is the predicted probability for the i-th of n samples, and $\theta$ denotes the parameters to be optimized.
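Formula (9) can be illustrated with a short sketch. The cross-entropy is the standard binary form; adding the source and target label losses with equal weight, the clipping constant `eps`, and the example value passed as the domain adaptation loss are assumptions made only for this illustration.

```python
import math
from typing import Sequence


def binary_cross_entropy(y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def total_loss(y_source: Sequence[int], p_source: Sequence[float],
               y_target: Sequence[int], p_target: Sequence[float],
               domain_loss: float, lam: float = 1.0) -> float:
    """L = L_cls + lambda * L_da, with L_cls taken as source loss plus target loss."""
    label_loss = binary_cross_entropy(y_source, p_source) + binary_cross_entropy(y_target, p_target)
    return label_loss + lam * domain_loss


source_loss = binary_cross_entropy([1, 0], [0.9, 0.1])
loss = total_loss([1, 0], [0.9, 0.1], [1, 0], [0.6, 0.4], domain_loss=0.2, lam=0.5)
```

Here `domain_loss` stands for the MK-MMD value $\mathrm{MMD}^2(e_s, e_t)$, and `lam` is the adjustment parameter $\lambda$ that balances classification against domain adaptation.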
The accuracy index for the model's credibility analysis is the standardized AUC. In the task of classifying the credibility of water environment complaint reports, more attention should be paid to avoiding cases where pollution incidents are not handled in time because credible complaint reports are misjudged; that is, the true positive rate (TPR) should be improved while the false positive rate (FPR) is kept low (low-credibility text is the positive sample and high-credibility text is the negative sample). The task should therefore focus on the partial area under the ROC curve for FPR ≤ maxfpr ($\mathrm{AUC}_{FPR \le maxfpr}$). When maxfpr is particularly small, the range of this partial AUC is small and does not compare model performance well, so the standardized partial AUC ($\mathrm{SPAUC}_{FPR \le maxfpr}$) is used:
$$\mathrm{SPAUC}_{FPR \le maxfpr} = \frac{1}{2}\left(1 + \frac{\mathrm{AUC}_{FPR \le maxfpr} - s_{min}}{s_{max} - s_{min}}\right)$$
where $s_{max} = maxfpr$ is the largest attainable partial area and $s_{min} = maxfpr^2 / 2$ is the partial area of a random classifier. In the experiments, maxfpr was set to 0.05, so $\mathrm{SPAUC}_{FPR \le maxfpr}$ varies between 0.5 and 1. Experimental results show that the LSTM-based coding network can analyze the credibility of environmental complaint reports well.
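The metric can be sketched as below. The standardization shown is the McClish form, chosen here because it matches the stated 0.5 to 1 range; whether the patent uses exactly this form is an assumption. The function names are hypothetical, label 1 marks the positive (low-credibility) class, and the sketch assumes both classes are present in `y_true`.

```python
from typing import List, Sequence, Tuple


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """ROC curve points (FPR, TPR), sweeping the threshold from high to low scores."""
    pairs = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    positives = sum(y_true)
    negatives = len(y_true) - positives
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    index = 0
    while index < len(pairs):
        threshold = pairs[index][0]
        # Consume all samples tied at this score before emitting a point.
        while index < len(pairs) and pairs[index][0] == threshold:
            if pairs[index][1] == 1:
                tp += 1
            else:
                fp += 1
            index += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def standardized_partial_auc(y_true: Sequence[int], scores: Sequence[float], max_fpr: float = 0.05) -> float:
    """Partial AUC on FPR <= max_fpr, rescaled so random = 0.5 and perfect = 1."""
    fpr, tpr = roc_points(y_true, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(zip(fpr, tpr), zip(fpr[1:], tpr[1:])):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:  # clip the last segment at max_fpr by linear interpolation
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid rule
    s_min = 0.5 * max_fpr ** 2  # partial area of the chance diagonal
    s_max = max_fpr             # partial area of a perfect classifier
    return 0.5 * (1.0 + (area - s_min) / (s_max - s_min))


perfect = standardized_partial_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
chance = standardized_partial_auc([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5])
```

A classifier that ranks every low-credibility text above every high-credibility text scores 1, while one that assigns the same score to everything lands on the chance diagonal and scores 0.5.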
The data comprise microblog source text extracted from social media (133,346 texts in total, of which 66,131 are high-credibility and 67,215 are low-credibility) and complaint report text data extracted from a large water environment big data management platform (200K complaint report texts in total). Of the complaint reports, 1,482 carry credibility labels: 889 are high-credibility and 593 are low-credibility.
As shown in Table 1, the experiments use CNN, Transformer, GRU-2, RNN, LSTM_Attention, and LSTM as feature extractors, respectively. "Attention" means that the private features of both the source domain and the target domain are fused with the shared features; "Source_Attention" means that only the source domain private features and the domain sharing features are fused; "Target_Attention" means that only the target domain private features and the domain sharing features are fused; "No_Attention" means that no feature fusion is performed and only the domain sharing features are used. The deep migration network based on the bidirectional LSTM performs best in this task, which also demonstrates the advantage of the deep migration network architecture and the feasibility of using microblog text to assist complaint reporting credibility analysis. Ablation experiments were performed according to whether feature fusion with the attention mechanism was used. As shown in Table 1, within the deep migration network, every feature extractor performs better after feature fusion with the attention mechanism than when only the domain sharing features are used, and fusing the source domain private features with the shared features works better than fusing the target domain private features with the shared features.
Table 1 results of complaint reporting credibility classification experiments
In conclusion, the method can well utilize knowledge in the microblog text field to assist in complaint reporting reliability analysis, and can well complete a complaint reporting reliability analysis task.
Claims (9)
1. The environment complaint reporting credibility analysis method based on the deep migration network comprises the following specific steps:
S1, data collection;
S2, preprocessing a source domain and a target domain;
S3, inputting the preprocessed text into a Word2vec model for word vector training, and generating word vectors;
S4, encoding the microblog text and the complaint report text after text vectorization, and extracting high-level features;
S5, fusing the domain private features and the domain sharing features by using a self-attention method;
S6, calculating the MK-MMD distance between the source domain features and the target domain features, performing feature transformation on the source domain features and the target domain features, and performing domain adaptation;
S7, obtaining a classification result by passing the source domain features and the target domain features through the multi-layer perceptron network;
the source domain is microblog text data, and the target domain is complaint report text data.
2. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 1, wherein the method is characterized by comprising the following steps:
in step S1, microblog source text extracted from social media and complaint report text data extracted from a large water environment big data management platform are obtained, and the data sets are constructed: the microblog text data set $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{N_s}$ represents the source domain, where $N_s$ is the number of samples, $x_i^s$ is a microblog text sample, and $y_i^s$ is the credibility label of that microblog text; the complaint report text data set $D_t$ represents the target domain, where $N_t^{train}$ is the number of training samples, $N_t^{test}$ is the number of test samples, $x_i^t$ is a complaint report text sample, and $y_i^t$ is the credibility label of that complaint report text.
3. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 1, wherein the method is characterized by comprising the following steps:
in step S2, the preprocessing includes data cleaning and word segmentation, without stop-word removal, and a word-segmented text $x_i^o$ is represented as a word sequence: $x_i^o = \{w_{i,1}^o, w_{i,2}^o, \ldots, w_{i,T_i}^o\}$, where $o \in \{s, t\}$, s denotes the source domain and t denotes the target domain; $w_{i,j}^o$ are the words contained in sentence $x_i^o$; and $T_i$ is the sentence length.
4. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 1, wherein the method is characterized by comprising the following steps:
in step S3, a Word2vec model is used to implement text vectorization, and the word-segmented text $x_i^o$ is represented as a matrix $X_i^o \in \mathbb{R}^{n \times d}$, where n is the number of texts and d is the word vector dimension.
5. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 3, wherein the method comprises the following steps: text vectorization is achieved by using a Word2vec model, and the dimension of the generated Word vector d is 300 dimensions.
6. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 1, wherein the method is characterized by comprising the following steps:
the encoder adopted in step S4 is a bidirectional long short-term memory network (Bi-LSTM), and three groups of encoders with identical network structures are used to extract the private features and shared features of the source domain and the target domain;
the specific encoding process is as follows:
step S401 uses a bidirectional LSTM as the core module of the encoder: for the text sequence $x_i^o$, the outputs $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$ of the forward and backward LSTM models are concatenated as the output of the Bi-LSTM at time t:
$$h_t = [\overrightarrow{h_t};\ \overleftarrow{h_t}]$$
where $o \in \{s, t\}$, s denotes the source domain, t denotes the target domain, $x_t$ is the input at the t-th of the $T_i$ time steps, $c_t$ is the cell state of the LSTM at time t, and $h_t$ is the output at the t-th time step, calculated by formula (2):
$$\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned} \quad (2)$$
where $W_f, W_i, W_o, W_c$ are weight matrices and $b_f, b_i, b_o, b_c$ are bias vectors; $\sigma$ is the sigmoid function and $\odot$ denotes element-wise multiplication; $f_t$ is the forget gate, $i_t$ is the input gate, and $o_t$ is the output gate; in the whole process, the forget gate $f_t$ first selectively filters out some information from the previous state; the input gate $i_t$ then decides which data are updated; the LSTM cell state $c_t$ is obtained by forgetting history information and adding the new information $\tilde{c}_t$, so that the old state is overwritten by the new state value and the state update is complete; finally, the output gate $o_t$ determines the output information, and the output $h_t$ at the current time step is obtained by filtering the information through $o_t$;
step S402 takes the outputs of the last time step, $\overrightarrow{h}_{T_i}$ and $\overleftarrow{h}_{T_i}$, as the encoding output result of $x_i^o$:
$$e_i = [\overrightarrow{h}_{T_i};\ \overleftarrow{h}_{T_i}]$$
where $\overrightarrow{h}_{T_i}$ is the forward hidden layer output for the text sequence $x_i^o$ and $\overleftarrow{h}_{T_i}$ is the backward hidden layer output for the text sequence $x_i^o$; $e_i$ is the LSTM network's encoding of the text $x_i^o$, i.e. the output of the encoder;
step S403, the three groups of encoders extract their respective features: the domain sharing feature is $e_c = [e_1, e_2, \ldots, e_{n1}] \in \mathbb{R}^{n1 \times m}$; the outputs of the source domain private feature encoder and the target domain private feature encoder are $e_p^s \in \mathbb{R}^{n2 \times m}$ and $e_p^t \in \mathbb{R}^{n3 \times m}$, respectively, where m is the dimension of the Bi-LSTM output vector and n1, n2 = $N_s$, and n3 are the numbers of texts.
7. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 1, wherein the method is characterized by comprising the following steps:
in step S5, the domain feature fusion fuses the source domain private feature and the target domain private feature with the domain sharing feature through a self-attention mechanism, and the specific feature fusion method is as follows:
step S501, the input vector is multiplied by the value matrix $W_V$, the key matrix $W_K$, and the query matrix $W_Q$, and the result is scored:
$$Q_b = e_b W_Q,\quad K_b = e_b W_K,\quad V_b = e_b W_V,\quad \mathrm{score}_b = \frac{Q_b K_b^{\top}}{\sqrt{d}}$$
where $b \in \{c, p\}$, c denotes domain shared and p denotes domain private; the product is a scaled dot product; d is a constant, usually the dimension of the input word vector, set to prevent the values after the dot product from becoming too large;
step S502 applies a Softmax normalization to the scores to obtain the attention weights:
$$\alpha_b = \mathrm{softmax}(\mathrm{score}_b)$$
step S503, the weights are multiplied by the value vectors to obtain the final source domain feature and target domain feature:
$$e_o = \sum_{b \in \{c, p\}} \alpha_b V_b$$
where $o \in \{s, t\}$, s denotes the source domain, t denotes the target domain, and $e_o$ is the fused feature.
8. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 1, wherein the method is characterized by comprising the following steps:
the domain adaptation described in step S6 refers to computing the distance between the source domain feature $e_s$ and the target domain feature $e_t$ using the multi-kernel maximum mean discrepancy (MK-MMD), adding the distance to the loss function, and performing feature transformation along with the iterative process to accomplish domain adaptation:
$$\mathrm{MMD}^2(e_s, e_t) = \left\| \frac{1}{n_s}\sum_{i=1}^{n_s}\phi(e_i^s) - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi(e_j^t) \right\|_{\mathcal{H}}^2$$
where $\phi(\cdot)$ is a mapping that maps the original variables into the reproducing kernel Hilbert space (RKHS) $\mathcal{H}$, and $n_s$ and $n_t$ are the numbers of source domain and target domain features.
9. The method for analyzing the reliability of environmental complaint reporting based on the deep migration network according to claim 1, wherein the method is characterized by comprising the following steps:
in step S7, the source domain feature e after the fusion of the domain features s And target domain feature e t And respectively sending the classified results to the MLP network to output the classified results:
is a predictive vector, i.e., a predictive result; MLP represents a multi-layer perceptron; />And->Representing a predicted probability; sigmoid is an activation function;
meanwhile, according to the classification result, a loss function is calculated to update network parameters, and the deep migration network learns the data difference between the source field and the target field to realize field adaptation, and learns label loss; and the final objective function, namely the loss function of the whole network, is lost by MK-MMD statistic source domain labels representing domain differences, so that the loss function of the whole migration network is as follows:
$$L = L_{cls} + \lambda L_{da} \tag{9}$$
wherein $\lambda$ is the adjustment parameter; $L_{da}$ is the domain adaptation loss, i.e., $\mathrm{MMD}^2(e_s, e_t)$; $L_{cls}$ is the label loss, including the source domain label loss and the target domain label loss; the cross-entropy criterion is used in this classification task to reduce the loss function:
wherein $y \in \{0, 1\}$ is the credibility label; $\theta$ denotes the parameters to be optimized.
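The loss of equation (9) can be sketched as follows. This is a minimal illustration, assuming that the label loss is the plain sum of the source domain and target domain binary cross-entropy terms (the claims only say it includes both) and that the predicted probabilities come from a sigmoid output. The function names and the numeric values are hypothetical.

```python
import math

def sigmoid(z):
    """Sigmoid activation mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy for labels y in {0, 1}."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def total_loss(l_cls_source, l_cls_target, l_da, lam):
    """L = L_cls + lambda * L_da, with L_cls assumed to be the sum
    of the source domain and target domain label losses."""
    return (l_cls_source + l_cls_target) + lam * l_da

# Hypothetical logits and credibility labels for each domain.
source_probs = [sigmoid(z) for z in (2.0, -1.0)]
target_probs = [sigmoid(z) for z in (0.5, -0.5)]
l_src = binary_cross_entropy([1, 0], source_probs)
l_tgt = binary_cross_entropy([1, 0], target_probs)
loss = total_loss(l_src, l_tgt, l_da=0.3, lam=0.5)
```

A larger `lam` puts more weight on aligning the two domains, while a smaller value favors fitting the labels; the value 0.5 here is only a placeholder.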
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110310932.6A (CN113064967B) | 2021-03-23 | 2021-03-23 | Complaint reporting credibility analysis method based on deep migration network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113064967A (en) | 2021-07-02 |
CN113064967B (en) | 2024-03-22 |
Family
ID=76563241
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110310932.6A (Active; CN113064967B) | 2021-03-23 | 2021-03-23 | Complaint reporting credibility analysis method based on deep migration network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113064967B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114969321B (en) * | 2022-03-14 | 2024-03-22 | 北京工业大学 | Environmental complaint reporting text classification method based on multi-weight self-training |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111522965A (en) * | 2020-04-22 | 2020-08-11 | 重庆邮电大学 | Question-answering method and system for entity relationship extraction based on transfer learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11194962B2 (en) * | 2019-06-05 | 2021-12-07 | Fmr Llc | Automated identification and classification of complaint-specific user interactions using a multilayer neural network |
Non-Patent Citations (1)
Title |
---|
Research on Twitter rumor detection based on a deep transfer network; Liu Kan (刘勘) et al.; Data Analysis and Knowledge Discovery (No. 10); 47-55 * |
Also Published As
Publication number | Publication date |
---|---|
CN113064967A (en) | 2021-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108897857B (en) | Chinese text subject sentence generating method facing field | |
CN111709241B (en) | Named entity identification method oriented to network security field | |
CN110598005B (en) | Public safety event-oriented multi-source heterogeneous data knowledge graph construction method | |
Wei et al. | A target-guided neural memory model for stance detection in twitter | |
CN109918505B (en) | Network security event visualization method based on text processing | |
CN113673254B (en) | Knowledge distillation position detection method based on similarity maintenance | |
CN113051927B (en) | Social network emergency detection method based on multi-modal graph convolutional neural network | |
CN111026880B (en) | Joint learning-based judicial knowledge graph construction method | |
CN111753058A (en) | Text viewpoint mining method and system | |
Chen et al. | A deep learning method for judicial decision support | |
CN110909542A (en) | Intelligent semantic series-parallel analysis method and system | |
CN114462420A (en) | False news detection method based on feature fusion model | |
Yu et al. | Policy text classification algorithm based on BERT | |
CN113064967B (en) | Complaint reporting credibility analysis method based on deep migration network | |
CN116186350B (en) | Power transmission line engineering searching method and device based on knowledge graph and topic text | |
CN116843175A (en) | Contract term risk checking method, system, equipment and storage medium | |
CN113326371B (en) | Event extraction method integrating pre-training language model and anti-noise interference remote supervision information | |
CN113435190B (en) | Chapter relation extraction method integrating multilevel information extraction and noise reduction | |
CN110968795B (en) | Data association matching system of company image lifting system | |
CN113378571A (en) | Entity data relation extraction method of text data | |
CN115510245B (en) | Unstructured data-oriented domain knowledge extraction method | |
Chen et al. | A robust graph convolutional network for relation extraction by combining edge information | |
CN117807999B (en) | Domain self-adaptive named entity recognition method based on countermeasure learning | |
CN117573865A (en) | Rumor fuzzy detection method based on interpretable adaptive learning | |
Pan et al. | A Mix-model based Deep Learning for Text Sentiment Analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||