CN112001171A - Case-related property knowledge base entity identification method based on ensemble learning - Google Patents

Case-related property knowledge base entity identification method based on ensemble learning

Info

Publication number
CN112001171A
CN112001171A CN202010825763.5A CN202010825763A CN112001171A CN 112001171 A CN112001171 A CN 112001171A CN 202010825763 A CN202010825763 A CN 202010825763A CN 112001171 A CN112001171 A CN 112001171A
Authority
CN
China
Prior art keywords
property
learner
training
entity
knowledge base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010825763.5A
Other languages
Chinese (zh)
Inventor
林锋
蒋宗神
李攀峰
李元豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-08-17
Filing date
2020-08-17
Publication date
2020-11-27
Application filed by Sichuan University
Priority to CN202010825763.5A
Publication of CN112001171A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/18: Legal services; Handling legal documents

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Technology Law (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses an ensemble-learning-based entity identification method for a case-related property knowledge base, comprising the following steps: preprocessing, according to entity category, a number of randomly selected legal documents concerning case-related property to build a training set; training T learners on the resulting training set; randomly selecting two legal documents concerning case-related property that are not in the test set and constructing a development set; computing the classification accuracy of each trained learner on the corpus in the development set, and constructing each learner's weight from its classification accuracy; segmenting the legal documents concerning case-related property into words to construct a test set, and having each learner classify the samples in the test set; and combining the classification results of all learners by weighted voting to obtain the final entity identification result. The method addresses entity identification where the corpus is small and the accuracy requirement is high, and can automatically complete knowledge fusion for the disposal of case-related property according to existing legal provisions.

Description

Case-related property knowledge base entity identification method based on ensemble learning
Technical Field
The invention belongs to the technical field of entity identification for case-related property knowledge bases, and in particular relates to an ensemble-learning-based method for identifying entities of a case-related property knowledge base.
Background
A knowledge base describes the concepts and entities of the objective world, and the relations among them, in structured form, so that massive amounts of information can be organized, managed and understood effectively. The potential of knowledge base systems in applications such as knowledge fusion, intelligent question answering and big-data decision making has attracted wide attention. A knowledge base is a large network whose nodes are entities; it comprises entities, entity attributes and the relations among entities. Entity identification is a core technology for knowledge base construction.
Entity identification means identifying entities with specific meanings in text and assigning each a category. It plays an important role in many natural language processing applications, such as information extraction, information retrieval, automatic text summarization, machine translation and knowledge bases. Entity identification has been studied extensively in China and abroad, and the methods fall roughly into three types: rule-based methods, methods based on traditional machine learning, and methods based on deep learning. Rule-based methods rely on a large number of hand-written rules and need no labeled corpus. However, writing the rules is time-consuming and labor-intensive, and in specialized fields it requires professional knowledge. Rule-based methods also port poorly: to perform well on text from a new domain, the rules must be updated. These methods are therefore used less and less. With the development of traditional machine learning, many such methods have been applied successfully to entity identification, for example hidden Markov models, maximum entropy models and conditional random fields. Besides using a single machine learning algorithm, several methods can be combined to complete the task. Deep learning methods, such as bidirectional long short-term memory (BiLSTM) neural network models, have also been applied successfully. Compared with traditional machine learning, deep learning methods need no elaborate feature engineering, capture contextual dependencies in the input text automatically, and perform well.
However, entity identification during construction of a case-related property knowledge base faces challenges that general entity identification does not, which makes it a distinct and difficult problem. Because of the particular nature of this domain knowledge base, entity identification faces the following challenges: (1) The training corpus is small and homogeneous. The knowledge base is intended to extract knowledge automatically from legal provisions, and its corpus comes mainly from the provisions on the disposal of case-related property in formally enacted laws and regulations, so the training corpus for entity identification is far smaller than that of a general knowledge base and also smaller than that of a general legal knowledge base. (2) The accuracy requirement is high. The knowledge base is intended to support front-line case handlers in judicial practice, which places very high demands on the correctness and accuracy of its knowledge. To ensure the correctness of the knowledge base and reduce follow-up work, the entity identification algorithm must be much more accurate than one used for a general knowledge base.
Disclosure of Invention
To solve the above problems, the invention provides an ensemble-learning-based entity identification method for a case-related property knowledge base. It addresses entity identification where the corpus is small and the accuracy requirement is high, and can automatically complete knowledge fusion for the disposal of case-related property in criminal cases according to existing legal provisions.
To achieve this purpose, the invention adopts the following technical scheme: an ensemble-learning-based entity identification method for a case-related property knowledge base, comprising obtaining a set of legal documents concerning case-related property, constructing a corpus from those documents, and dividing the corpus into a training set, a development set and a test set, wherein the entity identification process comprises the following steps:
Step 1: learner training: preprocess, according to entity category, a number of legal documents randomly selected from the set of legal documents concerning case-related property to build the training set; train T learners on the resulting training set to obtain learners $h_i$, $i = 1, \dots, T$;
Step 2: learner weight determination: randomly select two legal documents concerning case-related property that are not in the test set and construct a development set; use the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set and compute each learner's classification accuracy; construct each learner's weight from its classification accuracy;
Step 3: entity identification: segment the legal documents concerning case-related property into words to construct a test set, and have each learner classify the samples in the test set; combine the classification results of all learners and obtain the final entity identification result by weighted voting.
Further, the entity categories comprise: disposal unit, staff of the disposal unit, case-related person, document, case-related property, disposal action, and provision or title of a legal document.
Further, preprocessing the selected legal documents concerning case-related property into a training set comprises: taking those legal documents as the training set, segmenting them into words with a Chinese word segmentation tool, and manually labeling the segmented words according to entity category to construct the corpus.
Further, in the learner training process: T learners are selected; for a given training data set of m samples, bootstrap sampling is used to obtain T sampling sets of m training samples each; one learner $h_i$, $i = 1, \dots, T$, is then trained on each sampling set.
Further, 4 learners are used in the learner training process: a hidden Markov model, a conditional random field, a maximum entropy model, and a bidirectional long short-term memory (BiLSTM) neural network model.
Further, in the development set construction process: two legal documents concerning case-related property that are not in the test set are randomly selected, the selected documents are segmented into words, and the segmentation results are manually labeled to construct the development set.
Further, the learner weight determination process comprises the following steps:
2.1. randomly selecting two legal documents concerning case-related property that are not in the test set, and constructing a development set;
2.2. using the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set;
2.3. computing, from the classification results and the manual labels, the classification accuracy of each learner $h_i$, $i = 1, \dots, T$, on the development set, the classification accuracy being:
$$p_i = \frac{N - M_i}{N}$$
where $N$ is the total number of samples in the development set and $M_i$ is the number of samples that learner $h_i$ classifies incorrectly;
2.4. constructing each learner's weight $w_i$ from its classification accuracy $p_i$, the weight calculation formula being:
[formula given as an image in the original and not reproduced in this text: Figure RE-GDA0002673044180000032]
Further, in step 3, the final entity identification result is obtained by weighted voting, computed as:
$$H(x) = c_{\arg\max_j \sum_{i=1}^{T} w_i h_i^j(x)}$$
where $h_i$ is a learner, $x$ is the test sample, $c_j$ is an output label, $h_i^j(x)$ is the output of $h_i$ on label $c_j$, and $w_i$ is the weight of learner $h_i$.
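As a small worked illustration of the weighted vote, assume three learners whose outputs are crisp (each learner outputs 1 on the label it predicts and 0 on every other label). The weights and votes below are hypothetical values chosen only for the arithmetic; they do not come from the patent.

```latex
% Hypothetical weights and votes, for illustration only
w = (0.40,\ 0.35,\ 0.25), \qquad
h_1(x) = c_1, \quad h_2(x) = c_2, \quad h_3(x) = c_2
\\
\sum_{i} w_i h_i^{1}(x) = 0.40, \qquad
\sum_{i} w_i h_i^{2}(x) = 0.35 + 0.25 = 0.60
\\
\Rightarrow \text{the combined label is } c_2
```

Here the two lower-weighted learners together outvote the single highest-weighted learner, which is the behaviour that distinguishes weighted voting from simply trusting the best learner.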
The beneficial effects of the above technical scheme are as follows:
The method addresses entity identification where the corpus is small and the accuracy requirement is high. Learners obtained by training several existing entity identification algorithms on the training set vote with weights: each existing algorithm completes the entity identification task on its own, and the results are then integrated in parallel to improve identification accuracy. The result is better than that of a single entity identification method used alone. The invention can automatically complete knowledge fusion for the disposal of case-related property in criminal cases according to existing legal provisions.
Drawings
FIG. 1 is a schematic flow chart of the ensemble-learning-based entity identification method for a case-related property knowledge base;
FIG. 2 is a flow chart of the learner weight determination process in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described with reference to the accompanying drawings.
In this embodiment, referring to FIG. 1, the invention provides an ensemble-learning-based entity identification method for a case-related property knowledge base, comprising obtaining a set of legal documents concerning case-related property, constructing a corpus from those documents, and dividing the corpus into a training set, a development set and a test set, wherein the entity identification process comprises the following steps:
Step 1: learner training: preprocess, according to entity category, a number of legal documents randomly selected from the set of legal documents concerning case-related property to build the training set; train T learners on the resulting training set to obtain learners $h_i$, $i = 1, \dots, T$;
Step 2: learner weight determination: randomly select two legal documents concerning case-related property that are not in the test set and construct a development set; use the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set and compute each learner's classification accuracy; construct each learner's weight from its classification accuracy;
Step 3: entity identification: segment the legal documents concerning case-related property into words to construct a test set, and have each learner classify the samples in the test set; combine the classification results of all learners and obtain the final entity identification result by weighted voting.
The acquired set of legal documents comprises 14 policy documents, laws and regulations related to the management and disposal of case-related property.
In these legal documents, the entities to be identified fall into the following categories: disposal unit, staff of the disposal unit, case-related person, document, case-related property, disposal action, and provision or title of a legal document.
(1) Disposal unit: including but not limited to: the People's Court, the People's Procuratorate, the public security organization, the ministry of public security, the department of justice, the ministry of finance, the department of finance, the Supreme People's Court, the basic-level People's Court, the national security organization, the procuratorate, the national library, the central national library, the case handling department, the custody department, the case handling unit, the higher government and law organization, the People's Bank of China, the court, the council, the procuratorial committee of the People's Procuratorate, the committee of trial, the legal assistance mechanism, the prison, the custody, the community correction mechanism, the guard house, the customs and the like.
(2) Staff of the disposal unit: including but not limited to: case handling personnel, custodian personnel, supervisor personnel, procuratorial personnel, judicial staff, trial personnel, investigators, people's assessors, court president, chief procurator, public security organization responsible personnel, court clerk, presiding judge and the like.
(3) Case-related person: including but not limited to: parties, defendants, defendees, offsite, relatives, victims, litigants, prosecutes, criminal suspects, crimes, plaints, attorneys, legal agents, litigant participants, disputes, testifiers, appraisers, translators, conspires, stakeholders, attorneys on duty, guardians, reporters, referees, reporters, critiques, current criminals, major suspects, and the like.
(4) Document: including but not limited to: a decision, a notice, a certificate, a police officer's approval of an arrest, a national institute of quarantine prosecution, a national court decision, a detainment, a release certificate, an arrest, a promissory note, a notice, a decision, a case decision, a search certificate, a wanted statue, a legal document, etc.
(5) Case-related property: including but not limited to: property, property-related, document, mail, telegraph, deposit, remittance, bond, stock share, fund share, property, contraband, legal property, article, automobile, boat, money, case-related, deposit voucher, power certificate, payment voucher, money order, book order, cheque, gold and silver, jewelry, famous calligraphy and painting, valuables, real estate, equipment, precious animals and their products, precious plants and their products, drugs and the like.
(6) Disposal action: including but not limited to: checking, detaining, freezing, keeping, paying, liability refunding, paying, impound, and returning.
(7) Provision or legal document title: including but not limited to: criminal litigation law, twenty-fourth and thirty-sixth criminal litigation law, etc.
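The seven categories above could be encoded as a label set for the manually annotated corpus. The following is a minimal sketch under stated assumptions: the enum names, the extra `OTHER` tag for non-entity words, and the English glosses in the example are all hypothetical and are not taken from the patent.

```python
from enum import Enum
from typing import Dict, List, Tuple


class EntityCategory(Enum):
    """The seven entity categories listed above, plus an assumed non-entity tag."""
    DISPOSAL_UNIT = "disposal_unit"
    DISPOSAL_STAFF = "disposal_staff"
    CASE_PERSON = "case_person"
    DOCUMENT = "document"
    CASE_PROPERTY = "case_property"
    DISPOSAL_ACTION = "disposal_action"
    LEGAL_PROVISION = "legal_provision"
    OTHER = "other"  # assumption: words that belong to no entity category


# One labeled sample: a segmented word paired with its manually assigned category.
LabeledWord = Tuple[str, EntityCategory]


def category_counts(sentence: List[LabeledWord]) -> Dict[EntityCategory, int]:
    """Count how many words in a labeled sentence fall into each category."""
    counts: Dict[EntityCategory, int] = {}
    for _word, category in sentence:
        counts[category] = counts.get(category, 0) + 1
    return counts


# Hypothetical example; English glosses stand in for segmented Chinese words.
example: List[LabeledWord] = [
    ("court", EntityCategory.DISPOSAL_UNIT),
    ("shall", EntityCategory.OTHER),
    ("return", EntityCategory.DISPOSAL_ACTION),
    ("property", EntityCategory.CASE_PROPERTY),
]
```

Counting labels per category in this way is one quick check on how unevenly a small corpus covers the categories.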
As optimization scheme 1 of the embodiment, preprocessing the selected legal documents concerning case-related property into a training set comprises: taking those legal documents as the training set, segmenting them into words with a Chinese word segmentation tool, and manually labeling the segmented words according to entity category to construct the corpus.
In the learner training process: T learners are selected; for a given training data set of m samples, bootstrap sampling is used to obtain T sampling sets of m training samples each; one learner $h_i$, $i = 1, \dots, T$, is then trained on each sampling set.
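The sampling step can be sketched as follows, assuming the sampling method named in the text is ordinary bootstrap sampling (drawing m samples with replacement from m). The function names and the fixed seed are illustrative choices, not part of the patent.

```python
import random
from typing import List, Sequence, TypeVar

S = TypeVar("S")


def bootstrap_sample(data: Sequence[S], rng: random.Random) -> List[S]:
    """Draw len(data) samples from data uniformly at random, with replacement."""
    m = len(data)
    return [data[rng.randrange(m)] for _ in range(m)]


def make_sampling_sets(data: Sequence[S], T: int, seed: int = 0) -> List[List[S]]:
    """Build T sampling sets, one per learner, each the same size as data."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return [bootstrap_sample(data, rng) for _ in range(T)]
```

Because each set is drawn with replacement, some training samples appear several times in a given set and others not at all, so the T learners see different views of the same small corpus.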
Preferably, 4 learners are used in the learner training process: a hidden Markov model, a conditional random field, a maximum entropy model, and a bidirectional long short-term memory (BiLSTM) neural network model.
As optimization scheme 2 of the above embodiment, in the development set construction process: two legal documents concerning case-related property that are not in the test set are randomly selected, the selected documents are segmented into words, and the segmentation results are manually labeled to construct the development set.
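One way to realize the document-level split is sketched below. The text fixes only the development set size (two documents) and the requirement that they are not in the test set; the test set size `n_test` and the choice to keep the development documents out of the training set as well are assumptions made for this sketch.

```python
import random
from typing import Iterable, List, Tuple


def split_documents(
    doc_ids: Iterable[int], n_test: int, n_dev: int = 2, seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Randomly partition document ids into (train, dev, test) lists."""
    shuffled = list(doc_ids)
    if n_test + n_dev >= len(shuffled):
        raise ValueError("not enough documents left for a training set")
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]   # never overlaps the test set
    train = shuffled[n_test + n_dev:]       # assumption: also disjoint from dev
    return train, dev, test
```

With the 14 documents mentioned in the embodiment, `split_documents(range(14), n_test=3)` would leave 9 documents for training; the value 3 is purely hypothetical.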
The learner weight determination process, as shown in FIG. 2, comprises the following steps:
2.1. randomly selecting two legal documents concerning case-related property that are not in the test set, and constructing a development set;
2.2. using the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set;
2.3. computing, from the classification results and the manual labels, the classification accuracy of each learner $h_i$, $i = 1, \dots, T$, on the development set, the classification accuracy being:
$$p_i = \frac{N - M_i}{N}$$
where $N$ is the total number of samples in the development set and $M_i$ is the number of samples that learner $h_i$ classifies incorrectly;
2.4. constructing each learner's weight $w_i$ from its classification accuracy $p_i$, the weight calculation formula being:
[formula given as an image in the original and not reproduced in this text: Figure RE-GDA0002673044180000062]
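Steps 2.3 and 2.4 can be sketched in a few lines. The accuracy function follows directly from the definitions of N and the number of misclassified samples. The weight formula itself is not reproduced in this text, so the normalization used below (each accuracy divided by the sum of all accuracies) is an assumption chosen for illustration and may differ from the patent's formula.

```python
from typing import List, Sequence


def classification_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of development-set samples a learner labels correctly: (N - M_i) / N."""
    if len(predicted) != len(gold) or len(gold) == 0:
        raise ValueError("predicted and gold must be non-empty and equally long")
    n = len(gold)
    m_i = sum(1 for p, g in zip(predicted, gold) if p != g)  # misclassified count
    return (n - m_i) / n


def normalized_weights(accuracies: Sequence[float]) -> List[float]:
    """ASSUMED weighting: w_i = p_i / sum of all p_k, so the weights sum to 1."""
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one learner must have non-zero accuracy")
    return [p / total for p in accuracies]
```

Under this assumed scheme, a learner that is more accurate on the development set receives a proportionally larger say in the final vote.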
As optimization scheme 3 of the above embodiment, in step 3 the final entity identification result is obtained by weighted voting, computed as:
$$H(x) = c_{\arg\max_j \sum_{i=1}^{T} w_i h_i^j(x)}$$
where $h_i$ is a learner, $x$ is the test sample, $c_j$ is an output label, $h_i^j(x)$ is the output of $h_i$ on label $c_j$, and $w_i$ is the weight of learner $h_i$.
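The combination step can be sketched as below, assuming each learner outputs a single label per sample (a crisp vote), so that a learner contributes its whole weight to the label it predicts. The function names are hypothetical, and ties are broken in favour of the label that was voted for first, which is a choice of this sketch rather than something the text specifies.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_vote(predictions: Sequence[str], weights: Sequence[float]) -> str:
    """Return the label whose supporting learners have the largest total weight."""
    if len(predictions) != len(weights) or len(predictions) == 0:
        raise ValueError("need one weight per learner prediction")
    scores: Dict[str, float] = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # max() keeps the first maximal key, so ties go to the earliest-voted label.
    return max(scores, key=lambda label: scores[label])


def vote_sequence(
    per_learner_tags: Sequence[Sequence[str]], weights: Sequence[float]
) -> List[str]:
    """Apply weighted_vote position by position over the learners' tag sequences."""
    return [weighted_vote(column, weights) for column in zip(*per_learner_tags)]
```

For example, with weights `[0.4, 0.35, 0.25]` and votes `["A", "B", "B"]`, label `"B"` collects more total weight than `"A"` and is returned.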
The foregoing shows and describes the general principles and broad features of the present invention and advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (8)

1. An ensemble-learning-based entity identification method for a case-related property knowledge base, characterized by obtaining a set of legal documents concerning case-related property, constructing a corpus from those documents, and dividing the corpus into a training set, a development set and a test set, wherein the entity identification process comprises the following steps:
Step 1: learner training: preprocess, according to entity category, a number of legal documents randomly selected from the set of legal documents concerning case-related property to build the training set; train T learners on the resulting training set to obtain learners $h_i$, $i = 1, \dots, T$;
Step 2: learner weight determination: randomly select two legal documents concerning case-related property that are not in the test set and construct a development set; use the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set and compute each learner's classification accuracy; construct each learner's weight from its classification accuracy;
Step 3: entity identification: segment the legal documents concerning case-related property into words to construct a test set, and have each learner classify the samples in the test set; combine the classification results of all learners and obtain the final entity identification result by weighted voting.
2. The ensemble-learning-based entity identification method for a case-related property knowledge base as claimed in claim 1, wherein the entity categories comprise: disposal unit, staff of the disposal unit, case-related person, document, case-related property, disposal action, and provision or title of a legal document.
3. The ensemble-learning-based entity identification method for a case-related property knowledge base as claimed in claim 1, wherein preprocessing the selected legal documents concerning case-related property into a training set comprises: taking those legal documents as the training set, segmenting them into words with a Chinese word segmentation tool, and manually labeling the segmented words according to entity category to construct the corpus.
4. The ensemble-learning-based entity identification method for a case-related property knowledge base as claimed in claim 3, wherein in the learner training process: T learners are selected; for a given training data set of m samples, bootstrap sampling is used to obtain T sampling sets of m training samples each; one learner $h_i$, $i = 1, \dots, T$, is then trained on each sampling set.
5. The ensemble-learning-based entity identification method for a case-related property knowledge base as claimed in claim 4, wherein 4 learners are used in the learner training process: a hidden Markov model, a conditional random field, a maximum entropy model, and a bidirectional long short-term memory (BiLSTM) neural network model.
6. The ensemble-learning-based entity identification method for a case-related property knowledge base as claimed in claim 1, wherein in the development set construction process: two legal documents concerning case-related property that are not in the test set are randomly selected, the selected documents are segmented into words, and the segmentation results are manually labeled to construct the development set.
7. The ensemble-learning-based entity identification method for a case-related property knowledge base as claimed in claim 6, wherein the learner weight determination process comprises the following steps:
2.1. randomly selecting two legal documents concerning case-related property that are not in the test set, and constructing a development set;
2.2. using the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set;
2.3. computing, from the classification results and the manual labels, the classification accuracy of each learner $h_i$, $i = 1, \dots, T$, on the development set, the classification accuracy being:
$$p_i = \frac{N - M_i}{N}$$
where $N$ is the total number of samples in the development set and $M_i$ is the number of samples that learner $h_i$ classifies incorrectly;
2.4. constructing each learner's weight $w_i$ from its classification accuracy $p_i$, the weight calculation formula being:
[formula given as an image in the original and not reproduced in this text: Figure FDA0002636101260000022]
8. The ensemble-learning-based entity identification method for a case-related property knowledge base as claimed in claim 1, wherein in step 3 the final entity identification result is obtained by weighted voting, computed as:
$$H(x) = c_{\arg\max_j \sum_{i=1}^{T} w_i h_i^j(x)}$$
where $h_i$ is a learner, $x$ is the test sample, $c_j$ is an output label, $h_i^j(x)$ is the output of $h_i$ on label $c_j$, and $w_i$ is the weight of learner $h_i$.
CN202010825763.5A 2020-08-17 2020-08-17 Case-related property knowledge base entity identification method based on ensemble learning Pending CN112001171A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010825763.5A CN112001171A (en) 2020-08-17 2020-08-17 Case-related property knowledge base entity identification method based on ensemble learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010825763.5A CN112001171A (en) 2020-08-17 2020-08-17 Case-related property knowledge base entity identification method based on ensemble learning

Publications (1)

Publication Number Publication Date
CN112001171A true CN112001171A (en) 2020-11-27

Family

ID=73472513

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010825763.5A Pending CN112001171A (en) 2020-08-17 2020-08-17 Case-related property knowledge base entity identification method based on ensemble learning

Country Status (1)

Country Link
CN (1) CN112001171A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158659A (en) * 2021-02-08 2021-07-23 银江股份有限公司 Case-related property calculation method based on judicial text
CN113886602A (en) * 2021-10-19 2022-01-04 四川大学 Multi-granularity cognition-based domain knowledge base entity identification method
CN113918682A (en) * 2021-10-19 2022-01-11 四川大学 Knowledge extraction method of case-related property knowledge base
CN113919355A (en) * 2021-10-19 2022-01-11 四川大学 Semi-supervised named entity recognition method suitable for less-training corpus scene
CN114491039A (en) * 2022-01-27 2022-05-13 四川大学 Meta-learning few-sample text classification method based on gradient improvement

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635289A (en) * 2018-11-30 2019-04-16 上海智臻智能网络科技股份有限公司 Entry classification method and audit information abstracting method
CN110807328A (en) * 2019-10-25 2020-02-18 华南师范大学 Named entity identification method and system oriented to multi-strategy fusion of legal documents
CN111191029A (en) * 2019-12-19 2020-05-22 南京理工大学 AC construction method based on supervised learning and text classification

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
蔡月红; 朱倩; 程显毅: "Chinese organization name recognition based on Tri-training semi-supervised learning" *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158659A (en) * 2021-02-08 2021-07-23 银江股份有限公司 Case-related property calculation method based on judicial text
CN113158659B (en) * 2021-02-08 2024-03-08 银江技术股份有限公司 Case-related property calculation method based on judicial text
CN113886602A (en) * 2021-10-19 2022-01-04 四川大学 Multi-granularity cognition-based domain knowledge base entity identification method
CN113918682A (en) * 2021-10-19 2022-01-11 四川大学 Knowledge extraction method of case-related property knowledge base
CN113919355A (en) * 2021-10-19 2022-01-11 四川大学 Semi-supervised named entity recognition method suitable for less-training corpus scene
CN113919355B (en) * 2021-10-19 2023-11-07 四川大学 Semi-supervised named entity recognition method suitable for small training corpus scene
CN114491039A (en) * 2022-01-27 2022-05-13 四川大学 Meta-learning few-sample text classification method based on gradient improvement
CN114491039B (en) * 2022-01-27 2023-10-03 四川大学 Primitive learning few-sample text classification method based on gradient improvement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201127