CN116415005B - Relationship extraction method for academic network construction of scholars - Google Patents

Relationship extraction method for academic network construction of scholars Download PDF

Info

Publication number
CN116415005B
CN116415005B CN202310684297.7A CN202310684297A CN116415005B CN 116415005 B CN116415005 B CN 116415005B CN 202310684297 A CN202310684297 A CN 202310684297A CN 116415005 B CN116415005 B CN 116415005B
Authority
CN
China
Prior art keywords
model
teacher
loss
representing
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310684297.7A
Other languages
Chinese (zh)
Other versions
CN116415005A (en
Inventor
费洪晓
谭杨盈
杨柳
龙军
王子冬
黄文体
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202310684297.7A priority Critical patent/CN116415005B/en
Publication of CN116415005A publication Critical patent/CN116415005A/en
Application granted granted Critical
Publication of CN116415005B publication Critical patent/CN116415005B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The application discloses a relationship extraction method for academic network construction of scholars, which comprises the following steps: step S1: constructing a multi-teacher model comprising at least two teacher models, and calculating the weights of the teacher models; step S2: constructing a student model, calculating distillation target distribution according to flexible temperature, calculating knowledge distillation loss, and calculating total loss by combining the knowledge distillation loss and remote supervision loss; and S3, performing relation extraction, training the student model to obtain a relation extraction model, and performing relation extraction on the input data set by using the relation extraction model. The method has the advantages that the method sets flexible temperature in consideration of the difference of samples, uses the relative difference value of the information entropy of the samples in the hidden layer and the classifying layer, retains the effective information of the samples as far as possible, and reduces the softening uncertainty of the sample labels; in addition, the application introduces the thought of multiple teachers into relation extraction, and propagates richer and effective knowledge by constructing global relations of different teachers.

Description

Relationship extraction method for academic network construction of scholars
Technical Field
The application relates to the technical field of deep learning, in particular to a relationship extraction method for academic network construction of scholars.
Background
With iterative updates of academic research, the amount of academic text data is also growing explosively, and the way of manually labeling text data requires a significant time cost. Therefore, the automatic relation extraction of the multi-source academic text and the scientific construction of the academic network of the scholars are the problems to be solved urgently.
The concept of knowledge graph is proposed by google in 2012, and the effective data organization mode is widely applied to various knowledge fields such as finance, information, medical treatment and the like. Meanwhile, in order to integrate massive scholars information and academic resources, some academic knowledge maps are proposed, such as Aminer, aceMap, etc., and the academic knowledge maps integrate unstructured multisource information into a structured scholars academic network, so as to help to mine and integrate academic knowledge from massive academic texts, including papers, projects, subject information, etc. of scientific researchers; meanwhile, scientific and technological entities and semantic relations among data are mined from multi-source academic texts, and unified data structure organization forms among different data forms are constructed.
Constructing an academic network of the scholars requires organizing academic text information. The academic information of the scholars is widely existed in the real world, and is often obtained from homepages of the scholars, wikipedia knowledge base and various reports on the Internet, and most of data of the academic text is unstructured and cannot be subjected to data mining by using a unified template. The data has the characteristics of different data structures, various data types, data burst type iterative update and the like. The method comprises the steps of establishing a academic network of a scholars, wherein the problems of difficult relation extraction and sparse map information exist, a model for processing a natural language task at present is a single target task, available knowledge is limited, and the method has no good generalization capability in practical use. The problems of label noise and low data utilization rate mainly exist in the method.
In view of the foregoing, there is a great need for a relationship extraction method for academic network construction of scholars to solve the problems in the prior art.
Disclosure of Invention
The application aims to provide a relationship extraction method for academic network construction of scholars, which comprises the following specific technical scheme:
a relationship extraction method for academic network construction of scholars comprises the following steps:
step S1: constructing a multi-teacher model, wherein the multi-teacher model comprises at least two teacher models, the teacher models are trained through cross entropy loss, and the weight of the teacher models is calculated through the cross entropy loss and F1 score;
step S2: constructing a student model, calculating distillation target distribution of the student model according to the flexible temperature, calculating knowledge distillation loss by combining the distillation target distribution and the teacher model weight in the step S1, and calculating total loss of the student model by using the knowledge distillation loss and the remote supervision loss of the student model;
and S3, performing relation extraction, training the student model based on the total loss of the student model in the step S2 to obtain a relation extraction model, and performing relation extraction on the input data set by using the relation extraction model.
Preferably, in step S1, the cross entropy loss expression of the teacher model is as follows:
wherein ,cross entropy loss representing teacher model, +.>Representing the number of sentences>Representing the number of sentence categories>Representing remote supervision comments,/->Representing predictions of the teacher model.
Preferably, in step S1, the expression of the teacher model weight is as follows:
wherein ,is->Weights of bit teacher model, +.>Representing the number of teacher models, +.>Representing the super-parameters; />Indicate->Cross entropy loss of bit teacher model, +.>Indicate->F1 score for the bit teacher model.
Preferably, in step S2, the distillation target distribution includes sentence representation and prediction thereof by the teacher model, and the expression is as follows:
wherein ,representing a distillation target distribution of the student model; />Is a superparameter and->;/>Is->Sentence representations of the individual sentences; />Represent teacher model pair->Prediction of individual sentences.
Preferably, in step S2, a flexible temperature-based is usedThe prediction function computes the prediction of the sentence, expressed as follows:
wherein ,representation->A predictive function; />Representing a flexible temperature; />Representation->Vector;indicating teacher model predicts +.>Flexible temperature of individual sentences; />Indicating teacher model predicts +.>+.>Vector.
Preferably, in step S2, the flexible temperature is combined by the difference between the hidden layer information entropy and the classified layer information entropyThe function calculation and flexible temperature expression are as follows:
wherein , and />All represent superparameters, ">,/>;/>Indicate->Hidden layer information entropy of each sentence; />Indicate->Classification layer information entropy of each sentence.
Preferably, in step S2, the information entropy is calculated as follows:
wherein ,representing information entropy; />Is indicated at->Back->Predicting vector obtaining; />Representation->Vector.
Preferably, in step S2, the expression of the knowledge distillation loss is as follows:
wherein ,representing knowledge distillation loss; />Representing a distillation target distribution of the student model;representing student model vs. sentence->Is predicted by the computer; />Representing student model predictive sentences +.>Is->Vector; />Representing student model predictive sentences +.>Is provided.
Preferably, in step S2, the remote supervision loss calculation mode of the student model is the same as the cross entropy loss of the teacher model.
Preferably, in step S2, the total loss expression of the student model is as follows:
wherein ,representing the total loss of the student model; />Representing superparameters, wherein->;/>Representing the remote supervision loss of the student model.
The technical scheme of the application has the following beneficial effects:
(1) The application provides a multi-view flexible temperature calculation mode by utilizing the relative difference value of information of a hidden layer and classified layer. According to the application, the elastic temperature is set in consideration of different information amounts of each sample, so that the uncertainty of the traditional temperature on softening of the sample label is reduced while the effective information of the sample is reserved to the greatest extent.
(2) The application builds a multi-teacher model, and builds a global relation between teachers by adopting F1 score and cross entropy loss as indexes in the training process of the teacher model, so as to avoid misleading student models due to inaccurate knowledge; according to the application, a knowledge distillation model based on an attention mechanism is carried out by adopting a multi-teacher model and a student model, so that effective knowledge in the teacher model is extracted, the training speed of the model is improved, and the characteristic learning capacity of the model is improved.
(3) The application provides a machine learning architecture for parallel training of a relation extraction model and a relation extraction framework for remote supervision, solves the problem of efficient extraction of multi-source data on a pre-training language model, and reduces the computational power requirement of the model.
In addition to the objects, features and advantages described above, the present application has other objects, features and advantages. The present application will be described in further detail with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a flow chart of the steps of a relationship extraction method in a preferred embodiment of the present application;
FIG. 2 is a schematic diagram of remote supervision relation extraction in accordance with a preferred embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Examples:
referring to fig. 1, a relationship extraction method for academic network construction of a learner includes the following steps:
step S1: constructing a multi-teacher model, wherein the multi-teacher model comprises at least two teacher models, the teacher models are trained through cross entropy loss, and the weight of the teacher models is calculated through the cross entropy loss and F1 score;
step S2: constructing a student model, calculating distillation target distribution of the student model according to the flexible temperature, calculating knowledge distillation loss by combining the distillation target distribution and the teacher model weight in the step S1, and calculating total loss of the student model by using the knowledge distillation loss and the remote supervision loss of the student model;
and S3, performing relation extraction, training the student model based on the total loss of the student model in the step S2 to obtain a relation extraction model, and performing relation extraction on the input data set by using the relation extraction model.
The concrete explanation is as follows:
in this embodiment, the loss value and the F1 score of each teacher model need to be considered when constructing the multi-teacher model, and these two values reflect the degree of approach of each teacher model to the predicted value and the actual value of the sentence, so that the multi-teacher model of this embodiment can make more accurate judgment in the process of transmitting knowledge to the student model on the noise problem in the academic text facing massive remote supervision and labeling.
The teacher model in the present embodiment is extracted based on the sentence-level relationship, specifically, in step S1, the cross entropy loss expression of the teacher model is as follows:
wherein ,cross entropy loss representing teacher model, +.>Representing the number of sentences>Representing the number of sentence categories>Representing remote supervision comments,/->Representing predictions of the teacher model.
Further, in order to maximize the knowledge of different teacher models, the present embodiment assigns weights of different teacher models in the multi-viewpoint calculation by cross entropy loss and F1 score of the teacher models, specifically, in step S1, the expression of the teacher model weights is as follows:
wherein ,is->Weights of bit teacher model, +.>Representing the number of teacher models, +.>Representing the super-parameters; />Indicate->Position teachingCross entropy loss of the engineer model, < >>Indicate->F1 score for the bit teacher model.
As can be seen from the above expression, largerCorresponding->The smaller and the larger +.>Corresponding->The larger the cross entropy loss and F1 fraction pair are, the more the cross entropy loss and F1 fraction pair can be flexibly controlled according to training results>Is a contribution of (a).
It should be noted that, in this embodiment, the calculation of the F1 score belongs to a conventional technical means, which is not described herein.
Further, the present embodiment predicts sentences in the package using the teacher model, and aggregates the sentence representations in the package, specifically, in step S2, the distillation target distribution includes the sentence representations and the prediction thereof by the teacher model, and the expression of the distillation target distribution is as follows:
wherein ,representing a distillation target distribution of the student model; />Is super-parameterAnd->;/>Is->Sentence representations of the individual sentences; />Represent teacher model pair->Prediction of individual sentences.
Further, the use of fixed temperature settings for different instances may result in predictions that are too gentle while feature information is lost, and may result in difficult predictions that are not sufficiently softened. In this embodiment, it is preferable to use a flexible temperature-based in step S2The prediction function computes the prediction of the sentence, expressed as follows:
wherein ,representation->A predictive function; />Representing a flexible temperature; />Representation->Vector;indicating teacher model predicts +.>Flexible temperature of individual sentences; />Indicating teacher model predicts +.>+.>Vector.
From the saidThe predictive function shows that the higher the flexible temperature, the flatter the predictive result.
Further, in step S2, the flexible temperature is combined by the difference between the hidden layer information entropy and the classified layer information entropyThe function calculation and flexible temperature expression are as follows:
wherein , and />All represent superparameters, ">,/>;/>Indicate->Hidden layer information entropy of each sentence; />Indicate->Classification layer information entropy of each sentence.
Preferably, in step S2, the information entropy is calculated as follows:
wherein ,representing information entropy; />Is indicated at->Back->Predicting vector obtaining; />Representation->Vector.
In order that the prediction of the student model can approach the soft distribution of the teacher model, the present embodiment adopts Kullback-Leibler (KL) divergence as the knowledge distillation loss, and the present embodiment calculates the knowledge distillation loss in combination with the multi-teacher weight, specifically, in step S2, the expression of the knowledge distillation loss is as follows:
wherein ,representing knowledge distillation loss; />Representing a distillation target distribution of the student model;representing student model vs. sentence->Is predicted by the computer; />Representing student model predictive sentences +.>Is->Vector (S)>,/>Representation->Real set of dimensions, +.>Representing the number of sentence categories; />Representing student model predictive sentences +.>Flexible temperature->,/>Representing a set of real numbers.
In step S2, the remote supervision loss calculation mode of the student model is the same as the cross entropy loss of the teacher model, which is not described in detail in this embodiment.
Further, in step S2, the total loss expression of the student model is as follows:
wherein ,representing the total loss of the student model; />Representing superparameters, wherein->;/>Representing the remote supervision loss of the student model.
As shown in fig. 2, the embodiment realizes knowledge distillation through a multi-teacher model, learns different characteristic information of academic samples through different teacher models, obtains more accurate knowledge from knowledge of a plurality of teacher sources, and calculates distillation loss in the training process according to prediction values imitating the teacher.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (7)

1. The relationship extraction method for academic network construction of the scholars is characterized by comprising the following steps:
step S1: constructing a multi-teacher model, wherein the multi-teacher model comprises at least two teacher models, the teacher models are trained through cross entropy loss, and the weight of the teacher models is calculated through the cross entropy loss and F1 score;
step S2: constructing a student model, calculating distillation target distribution of the student model according to the flexible temperature, calculating knowledge distillation loss by combining the distillation target distribution and the teacher model weight in the step S1, and calculating total loss of the student model by using the knowledge distillation loss and the remote supervision loss of the student model;
step S3, performing relation extraction, training the student model based on the total loss of the student model in the step S2 to obtain a relation extraction model, and performing relation extraction on the input data set by using the relation extraction model;
in step S2, the distillation target distribution includes sentence representation and prediction thereof by the teacher model, and the expression is as follows:
wherein ,representing a distillation target distribution of the student model; beta is super parameter and beta is E [0,1 ]];/>Sentence representation for the kth sentence; />Representation ofPredicting a kth sentence by the teacher model;
the prediction of sentences is calculated using a flexible temperature-based Softmax prediction function, expressed as follows:
wherein ,representing a Softmax prediction function; />Representing a flexible temperature; z k Representing a logic vector; />The flexible temperature of the kth sentence predicted by the teacher model is represented; />A logic vector representing a kth sentence predicted by the teacher model;
the flexible temperature is calculated by combining the difference between the information entropy of the hidden layer and the information entropy of the classified layer and a sigmoid function, and the expression of the flexible temperature is as follows:
wherein, eta and mu both represent super parameters, eta>0,μ∈(0,1);Hidden layer information representing kth sentenceEntropy; e, e k And (5) representing the information entropy of the classification layer of the kth sentence.
2. The relationship extraction method according to claim 1, wherein in step S1, a cross entropy loss expression of the teacher model is as follows:
wherein ,LT Represents the cross entropy loss of the teacher model, N represents the number of sentences, O represents the number of sentence categories,representing remote supervision comments,/->Representing predictions of the teacher model.
3. The relationship extraction method according to claim 2, wherein in step S1, the expression of the teacher model weight is as follows:
wherein ,wq The weight of the Q-th teacher model is Q, the number of the teacher models is Q, and alpha represents the super parameter;represents cross entropy loss of the qth teacher model, f q The F1 score for the qth teacher model is represented.
4. A relation extracting method according to claim 3, wherein in step S2, the information entropy is calculated as follows:
e=-y·log(y);
wherein e represents information entropy; y represents the prediction of the logit vector after softmax; z represents the logic vector.
5. The relationship extraction method according to claim 4, wherein in step S2, the expression of the knowledge distillation loss is as follows:
wherein ,LKD Representing knowledge distillation loss;representing a distillation target distribution of the student model; /> Representing a student model's prediction of sentence k; />A logic vector representing a student model predicted sentence k; />Representing the flexible temperature of the student model predictive sentence k.
6. The method according to claim 5, wherein in step S2, the remote supervision loss calculation method of the student model is the same as the cross entropy loss of the teacher model.
7. The relationship extraction method according to claim 6, wherein in step S2, the total loss expression of the student model is as follows:
L=λL S +(1-λ)L KD
wherein L represents the total loss of the student model; lambda represents a superparameter where lambda e 0,1];L S Representing the remote supervision loss of the student model.
CN202310684297.7A 2023-06-12 2023-06-12 Relationship extraction method for academic network construction of scholars Active CN116415005B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310684297.7A CN116415005B (en) 2023-06-12 2023-06-12 Relationship extraction method for academic network construction of scholars

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310684297.7A CN116415005B (en) 2023-06-12 2023-06-12 Relationship extraction method for academic network construction of scholars

Publications (2)

Publication Number Publication Date
CN116415005A CN116415005A (en) 2023-07-11
CN116415005B true CN116415005B (en) 2023-08-18

Family

ID=87056362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310684297.7A Active CN116415005B (en) 2023-06-12 2023-06-12 Relationship extraction method for academic network construction of scholars

Country Status (1)

Country Link
CN (1) CN116415005B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116894097B (en) * 2023-09-04 2023-12-22 中南大学 Knowledge graph label prediction method based on hypergraph modeling
CN117116408B (en) * 2023-10-25 2024-01-26 湖南科技大学 Relation extraction method for electronic medical record analysis

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488663A (en) * 2012-06-11 2014-01-01 国际商业机器公司 System and method for automatically detecting and interactively displaying information about entities, activities, and events from multiple-modality natural language sources
CN110910328A (en) * 2019-11-26 2020-03-24 电子科技大学 Defense method based on antagonism sample classification grade
CN111291185A (en) * 2020-01-21 2020-06-16 京东方科技集团股份有限公司 Information extraction method and device, electronic equipment and storage medium
CN113869512A (en) * 2021-10-09 2021-12-31 北京中科智眼科技有限公司 Supplementary label learning method based on self-supervision and self-distillation
CN114547300A (en) * 2022-02-18 2022-05-27 南京大学 Relationship classification method combining remote supervision and supervised
CN115495571A (en) * 2022-07-28 2022-12-20 南京航空航天大学 Method and device for evaluating influence of knowledge distillation on model backdoor attack
CN115544277A (en) * 2022-12-02 2022-12-30 东南大学 Rapid knowledge graph embedded model compression method based on iterative distillation
CN115618022A (en) * 2022-12-19 2023-01-17 中国科学技术大学 Low-resource relation extraction method based on data synthesis and two-stage self-training
WO2023017568A1 (en) * 2021-08-10 2023-02-16 日本電信電話株式会社 Learning device, inference device, learning method, and program
CN115907001A (en) * 2022-11-11 2023-04-04 中南大学 Knowledge distillation-based federal diagram learning method and automatic driving method
CN115995018A (en) * 2022-12-09 2023-04-21 厦门大学 Long tail distribution visual classification method based on sample perception distillation
JP2023523644A (en) * 2020-09-02 2023-06-06 之江実験室 A Compression Method and Platform for Pre-trained Language Models Based on Knowledge Distillation

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2595541A1 (en) * 2007-07-26 2009-01-26 Hamid Htami-Hanza Assisted knowledge discovery and publication system and method
US20160335343A1 (en) * 2015-05-12 2016-11-17 Culios Holding B.V. Method and apparatus for utilizing agro-food product hierarchical taxonomy
US11636337B2 (en) * 2019-03-22 2023-04-25 Royal Bank Of Canada System and method for knowledge distillation between neural networks
US20210158156A1 (en) * 2019-11-21 2021-05-27 Google Llc Distilling from Ensembles to Improve Reproducibility of Neural Networks
JP2023531263A (en) * 2020-06-29 2023-07-21 ロレアル Semantic Relations Maintaining Knowledge Distillation for Image-to-Image Transformation
US11907845B2 (en) * 2020-08-17 2024-02-20 International Business Machines Corporation Training teacher machine learning models using lossless and lossy branches
KR102406540B1 (en) * 2020-11-25 2022-06-08 인하대학교 산학협력단 A method of splitting and re-connecting neural networks for adaptive continual learning in dynamic environments

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488663A (en) * 2012-06-11 2014-01-01 国际商业机器公司 System and method for automatically detecting and interactively displaying information about entities, activities, and events from multiple-modality natural language sources
CN110910328A (en) * 2019-11-26 2020-03-24 电子科技大学 Defense method based on antagonism sample classification grade
CN111291185A (en) * 2020-01-21 2020-06-16 京东方科技集团股份有限公司 Information extraction method and device, electronic equipment and storage medium
JP2023523644A (en) * 2020-09-02 2023-06-06 之江実験室 A Compression Method and Platform for Pre-trained Language Models Based on Knowledge Distillation
WO2023017568A1 (en) * 2021-08-10 2023-02-16 日本電信電話株式会社 Learning device, inference device, learning method, and program
CN113869512A (en) * 2021-10-09 2021-12-31 北京中科智眼科技有限公司 Supplementary label learning method based on self-supervision and self-distillation
CN114547300A (en) * 2022-02-18 2022-05-27 南京大学 Relationship classification method combining remote supervision and supervised
CN115495571A (en) * 2022-07-28 2022-12-20 南京航空航天大学 Method and device for evaluating influence of knowledge distillation on model backdoor attack
CN115907001A (en) * 2022-11-11 2023-04-04 中南大学 Knowledge distillation-based federal diagram learning method and automatic driving method
CN115544277A (en) * 2022-12-02 2022-12-30 东南大学 Rapid knowledge graph embedded model compression method based on iterative distillation
CN115995018A (en) * 2022-12-09 2023-04-21 厦门大学 Long tail distribution visual classification method based on sample perception distillation
CN115618022A (en) * 2022-12-19 2023-01-17 中国科学技术大学 Low-resource relation extraction method based on data synthesis and two-stage self-training

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于知识蒸馏方法的行人属性识别研究;凌弘毅;《计算机应用与软件》;第181-184、193页 *

Also Published As

Publication number Publication date
CN116415005A (en) 2023-07-11

Similar Documents

Publication Publication Date Title
CN116415005B (en) Relationship extraction method for academic network construction of scholars
US11500905B2 (en) Probability mapping model for location of natural resources
CN106033462B (en) A kind of new word discovery method and system
CN104298651B (en) Biomedicine named entity recognition and protein interactive relationship extracting on-line method based on deep learning
US20160350288A1 (en) Multilingual embeddings for natural language processing
CN110516245A (en) Fine granularity sentiment analysis method, apparatus, computer equipment and storage medium
CN108846017A (en) The end-to-end classification method of extensive newsletter archive based on Bi-GRU and word vector
CN102411611B (en) Instant interactive text oriented event identifying and tracking method
CN109308323A (en) A kind of construction method, device and the equipment of causality knowledge base
CN110263165A (en) A kind of user comment sentiment analysis method based on semi-supervised learning
CN111339765A (en) Text quality evaluation method, text recommendation method and device, medium and equipment
CN111710428A (en) Biomedical text representation method for modeling global and local context interaction
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
CN116578705A (en) Microblog emotion classification method based on pre-training language model and integrated neural network
CN115374789A (en) Multi-granularity fusion aspect-level emotion analysis method based on pre-training model BERT
CN116860978B (en) Primary school Chinese personalized learning system based on knowledge graph and large model
CN107844474A (en) Disease data name entity recognition method and system based on stacking condition random field
Shan Social network text sentiment analysis method based on CNN-BiGRU in big data environment
CN109885827B (en) Deep learning-based named entity identification method and system
Al-Baity et al. Computational linguistics based emotion detection and classification model on social networking data
Charnine et al. Optimal automated method for collaborative development of universiry curricula
Fang et al. Self-adaptive topic model: A solution to the problem of “rich topics get richer”
CN113537372B (en) Address recognition method, device, equipment and storage medium
CN114417846B (en) Entity relation extraction method based on attention contribution degree
CN117632098B (en) AIGC-based intelligent building design system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant