CN112001171A - Case-related property knowledge base entity identification method based on ensemble learning
- Publication number: CN112001171A
- Authority: CN (China)
- Prior art keywords: property, learner, training, entity, knowledge base
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/18—Legal services; Handling legal documents
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Tourism & Hospitality (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Technology Law (AREA)
- Artificial Intelligence (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
The invention discloses an ensemble-learning-based entity identification method for a case-related property knowledge base, comprising the following steps: preprocessing, according to entity categories, a plurality of randomly selected legal documents relating to case-related property to obtain a training set; training T learners on the training set; randomly selecting two legal documents relating to case-related property that are not in the test set, and constructing a development set; classifying the corpus in the development set with the trained learners, computing each learner's classification accuracy, and constructing each learner's weight from that accuracy; segmenting legal documents relating to case-related property into words to construct a test set, and classifying the samples in the test set with each learner; and combining the classification results of all learners by weighted voting to obtain the final entity identification result. The method addresses entity identification where the corpus is small and the accuracy requirement is high, and can automatically complete the fusion of knowledge relating to the disposal of case-related property according to existing legal provisions.
Description
Technical Field
The invention belongs to the technical field of entity identification for case-related property knowledge bases, and particularly relates to an ensemble-learning-based entity identification method for a case-related property knowledge base.
Background
A knowledge base describes concepts, entities, and their relations in the objective world in structured form, enabling effective organization, management, and understanding of massive amounts of information. The potential of knowledge base systems in applications such as knowledge fusion, intelligent question answering, and big-data decision making has attracted wide attention. A knowledge base is a large network with entities as nodes, comprising entities, entity attributes, and relationships among entities. Entity identification is a core technology for knowledge base construction.
Entity identification refers to identifying entities with specific meanings in text and determining their categories. It plays an important role in many natural language processing applications, such as information extraction, information retrieval, automatic text summarization, machine translation, and knowledge bases. Considerable research on entity identification has been conducted at home and abroad, and the methods fall roughly into three types: rule-based methods, traditional machine learning methods, and deep learning methods. Rule-based methods rely on a large number of hand-written rules and do not require corpus labeling. However, writing rules is time-consuming and labor-intensive, and in specialized fields it requires professional knowledge. Rule-based methods also port poorly: the rules must be updated for text from each new domain before good performance is achieved. As a result, rule-based methods are gradually falling out of use. With the development of traditional machine learning, many of its methods have been successfully applied to entity recognition, such as hidden Markov models, maximum entropy models, and conditional random fields. Besides using a single machine learning algorithm, multiple methods may be combined to accomplish the task. Deep learning methods, such as the bidirectional long short-term memory (BiLSTM) neural network model, have also been successfully applied to entity recognition. Compared with traditional machine learning methods, deep learning methods do not need elaborate feature engineering, can automatically capture context dependencies in the input text, and perform well.
However, entity identification during construction of a case-related property knowledge base faces challenges that differ from those of general entity identification, making it a distinct and difficult problem. Owing to the particularity of this domain knowledge base, the following challenges arise: (1) The training corpus is small and homogeneous. The construction goal of the case-related property knowledge base is to complete knowledge extraction automatically from legal provisions, and the corpus comes mainly from those provisions, among formally implemented laws and regulations, that concern the disposal of case-related property. The training corpus available for entity identification is therefore far smaller than that of a general knowledge base, and also smaller than that of a general legal knowledge base. (2) The requirement on identification accuracy is high. The knowledge base is intended to support front-line case-handling personnel in judicial practice, which places extremely high demands on the correctness and accuracy of its knowledge. To ensure the correctness of the knowledge base and reduce subsequent work, the entity identification algorithm must be much more accurate than one used for a general knowledge base.
Disclosure of Invention
To solve the above problems, the invention provides an ensemble-learning-based entity identification method for a case-related property knowledge base. The method addresses entity identification with a small-scale corpus and a high accuracy requirement, and can automatically complete the fusion of knowledge relating to the disposal of case-related property in criminal cases according to existing legal provisions.
To achieve this purpose, the invention adopts the following technical scheme: an ensemble-learning-based entity identification method for a case-related property knowledge base, comprising obtaining a set of legal documents relating to case-related property, constructing a corpus from those documents, and dividing the corpus into a training set, a development set, and a test set, wherein the entity identification process comprises the following steps:
Step 1: Learner training: preprocess, according to entity categories, a plurality of legal documents randomly selected from the document set to obtain a training set; train T learners on the training set to obtain learners $h_i$, $i = 1, \dots, T$;
Step 2: Learner weight determination: randomly select two legal documents relating to case-related property that are not in the test set, and construct a development set; use the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set, compute each learner's classification accuracy, and construct each learner's weight from its accuracy;
Step 3: Entity identification: segment the legal documents relating to case-related property into words to construct a test set, and classify the samples in the test set with each learner; combine the classification results of all learners by weighted voting to obtain the final entity identification result.
Further, the entity categories include a disposal unit, a staff member of the disposal unit, a case-related person, a document, case-related property, a disposal action, and a clause or title of a legal document.
Further, preprocessing the selected legal documents relating to case-related property into a training set comprises: taking the legal documents as the training set, segmenting them into words with a Chinese word segmentation tool, and manually labeling the segmented words according to the entity categories to construct the corpus.
Further, in the learner training process: T learners are selected; for a given training data set containing m samples, bootstrap sampling (sampling with replacement) is used to obtain T sampling sets each containing m training samples; a learner $h_i$, $i = 1, \dots, T$, is then trained on each sampling set.
Further, 4 learners are adopted in the learner training process: a hidden Markov model, a conditional random field, a maximum entropy model, and a bidirectional long short-term memory (BiLSTM) neural network model.
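The sampling step can be illustrated with a minimal Python sketch. It assumes that the "self-service sampling method" of the translated text is ordinary bootstrap sampling (drawing m items with replacement from a set of m); the function name `bootstrap_samples`, the seed, and the placeholder sentences are hypothetical and do not come from the patent.

```python
import random


def bootstrap_samples(dataset, num_learners, seed=0):
    """Return `num_learners` sampling sets, each drawn with replacement
    from `dataset` and each the same size as `dataset`."""
    rng = random.Random(seed)
    m = len(dataset)
    return [
        [dataset[rng.randrange(m)] for _ in range(m)]
        for _ in range(num_learners)
    ]


# Placeholder training set with m = 5 labeled sentences (not real corpus data).
training_set = [f"sentence_{k}" for k in range(5)]

# One sampling set per learner (T = 4, matching the four learners named above).
sampling_sets = bootstrap_samples(training_set, num_learners=4)
for index, sampling_set in enumerate(sampling_sets, start=1):
    print(f"h_{index} trains on {sampling_set}")
```

Because sampling is with replacement, a sampling set usually repeats some sentences and omits others, which is what gives each learner a slightly different view of a small corpus.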
Further, in the development set construction process: two legal documents relating to case-related property that are not in the test set are randomly selected, the selected documents are segmented into words, and the segmentation results are manually labeled to construct the development set.
Further, the learner weight determination process comprises the following steps:
2.1. Randomly select two legal documents relating to case-related property that are not in the test set, and construct a development set;
2.2. Use each trained learner $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set;
2.3. Compute the classification accuracy of each learner $h_i$, $i = 1, \dots, T$, on the development set by comparing its classification results with the manual labels; the classification accuracy is calculated as:
$$p_i = \frac{N - M_i}{N}$$
where $N$ is the total number of samples in the development set and $M_i$ is the number of samples misclassified by learner $h_i$;
2.4. Construct the weight $w_i$ of each learner from its classification accuracy $p_i$ according to the weight calculation formula.
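Steps 2.3 and 2.4 can be sketched in Python as follows. The accuracy function follows the definitions of $N$ and $M_i$ given above. The weight formula is not reproduced in this text, so `normalized_weights` is only an assumption: it makes each weight proportional to the learner's accuracy and normalizes the weights to sum to 1. The learner names, labels, and predictions are hypothetical toy values.

```python
def classification_accuracy(predicted, gold):
    """Accuracy p_i = (N - M_i) / N, where M_i counts misclassified samples."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and equally long")
    total = len(gold)
    errors = sum(1 for p, g in zip(predicted, gold) if p != g)
    return (total - errors) / total


def normalized_weights(accuracies):
    """ASSUMPTION, not the patent's formula: w_i = p_i / sum_j p_j."""
    total = sum(accuracies.values())
    if total == 0:
        raise ValueError("at least one learner must have non-zero accuracy")
    return {name: p / total for name, p in accuracies.items()}


# Toy development set: manual labels and two learners' predictions.
gold_labels = ["DISPOSAL_UNIT", "O", "PROPERTY", "DISPOSAL_ACTION"]
dev_predictions = {
    "hmm": ["DISPOSAL_UNIT", "O", "O", "DISPOSAL_ACTION"],
    "crf": ["DISPOSAL_UNIT", "O", "PROPERTY", "DISPOSAL_ACTION"],
}

accuracies = {
    name: classification_accuracy(predicted, gold_labels)
    for name, predicted in dev_predictions.items()
}
weights = normalized_weights(accuracies)
print(accuracies)
print(weights)
```

Other accuracy-to-weight mappings, such as a log-odds weight, would fit the same interface; the only property relied on here is that a more accurate learner receives a larger weight.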
Further, in Step 3, the final entity identification result is obtained by weighted voting, calculated as:
$$H(x) = c_{\arg\max_j \sum_{i=1}^{T} w_i h_i^j(x)}$$
where $h_i$ is a learner, $x$ is the test sample, $c_j$ is an output label, $h_i^j(x)$ is the output of $h_i$ on label $c_j$, and $w_i$ is the weight of learner $h_i$.
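A minimal Python sketch of weighted voting for a single test sample follows. It assumes hard voting, meaning a learner's output on a label is 1 if the learner predicts that label and 0 otherwise; the text does not say whether the outputs are hard votes or class probabilities. The weights and predictions are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest sum of weights of the learners voting for it.

    predictions: learner name -> predicted label for one sample
    weights:     learner name -> weight w_i
    Ties are broken in favor of the label that was voted for first.
    """
    scores = defaultdict(float)
    for learner, label in predictions.items():
        scores[label] += weights[learner]
    return max(scores, key=scores.get)


# Hypothetical weights for the four learners named in the text.
weights = {"hmm": 0.20, "crf": 0.30, "maxent": 0.15, "bilstm": 0.35}

# Two learners vote for each label, so an unweighted vote would tie.
predictions = {
    "hmm": "PROPERTY",
    "crf": "DOCUMENT",
    "maxent": "PROPERTY",
    "bilstm": "DOCUMENT",
}

final_label = weighted_vote(predictions, weights)
print(final_label)  # DOCUMENT (0.65 against 0.35)
```

The example is chosen so that the weights decide the outcome: the two learners with higher development-set weight outvote the other two.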
The beneficial effects of the technical scheme are as follows:
The method addresses entity identification where the corpus is small and the accuracy requirement is high. Several existing entity recognition algorithms are each trained on the training set, and the resulting learners are combined by weighted voting. Each algorithm completes the entity identification task independently, and the results are then integrated in parallel to improve identification accuracy, so the combined result is better than that of a single entity identification method used alone. The invention can automatically complete the fusion of knowledge relating to the disposal of case-related property in criminal cases according to existing legal provisions.
Drawings
FIG. 1 is a schematic flow chart of the ensemble-learning-based entity identification method for a case-related property knowledge base;
FIG. 2 is a flow chart of the learner weight determination process according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described with reference to the accompanying drawings.
In this embodiment, referring to FIG. 1, the present invention provides an ensemble-learning-based entity identification method for a case-related property knowledge base, comprising obtaining a set of legal documents relating to case-related property, constructing a corpus from those documents, and dividing the corpus into a training set, a development set, and a test set, wherein the entity identification process comprises the following steps:
Step 1: Learner training: preprocess, according to entity categories, a plurality of legal documents randomly selected from the document set to obtain a training set; train T learners on the training set to obtain learners $h_i$, $i = 1, \dots, T$;
Step 2: Learner weight determination: randomly select two legal documents relating to case-related property that are not in the test set, and construct a development set; use the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set, compute each learner's classification accuracy, and construct each learner's weight from its accuracy;
Step 3: Entity identification: segment the legal documents relating to case-related property into words to construct a test set, and classify the samples in the test set with each learner; combine the classification results of all learners by weighted voting to obtain the final entity identification result.
The acquired legal document set related to the property involved in the case comprises 14 policy documents and laws and regulations related to the management and disposal of the property involved in the case.
In the above legal documents, the entities to be identified fall into the following categories: a disposal unit, a staff member of the disposal unit, a case-related person, a document, case-related property, a disposal action, and a clause or title of a legal document.
(1) Disposal unit: including but not limited to: the People's Court, the People's Procuratorate, the public security organ, the Ministry of Public Security, the department of justice, the Ministry of Finance, the department of finance, the Supreme People's Court, the basic-level People's Court, the state security organ, the procuratorate, the state treasury, the central state treasury, the case-handling department, the custody department, the case-handling unit, the higher government and law organization, the People's Bank of China, the court, the council, the procuratorial committee of the People's Procuratorate, the adjudication committee, the legal aid agency, the prison, the custody, the community correction agency, the guard house, the customs and the like.
(2) Staff/affiliated staff of the disposal unit: including but not limited to: case-handling personnel, custodian personnel, supervisor personnel, procuratorial personnel, judicial staff, trial personnel, investigators, people's assessors, the court president, the chief procurator, the person in charge of the public security organ, court clerks, the presiding judge and the like.
(3) Case-related personnel: including but not limited to: parties, defendants, defendees, offsite, relatives, victims, litigants, prosecutes, criminal suspects, crimes, plaints, attorneys, legal agents, litigant participants, disputes, witnesses, appraisers, translators, conspires, stakeholders, attorneys on duty, guardians, reporters, referees, reporters, critiques, current criminals, major suspects, and the like.
(4) Document: including but not limited to: a decision, a notice, a certificate, a police officer's approval of an arrest, a national institute of quarantine prosecution, a national court decision, a detainment, a release certificate, an arrest, a promissory note, a notice, a decision, a case decision, a search certificate, a wanted statue, a legal document, etc.
(5) Case-related property: including but not limited to: property, property-related, document, mail, telegraph, deposit, remittance, bond, stock share, fund share, property, contraband, legal property, article, automobile, boat, money, case-related, deposit voucher, power certificate, payment voucher, money order, book order, cheque, gold and silver, jewelry, famous calligraphy and painting, valuables, real estate, equipment, precious animals and their products, precious plants and their products, drugs and the like.
(6) Disposal action: including but not limited to: checking, detaining, freezing, keeping, paying, liability refunding, paying, impound, and returning.
(7) Clause or title of a legal document: including but not limited to: criminal litigation law, twenty-fourth and thirty-sixth criminal litigation law, etc.
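The seven categories above can be represented as a small label set for per-word annotation. The following Python sketch is only an illustration: the enum member names, the extra `"O"` label for words outside every category, and the sample sentence are assumptions, since the text does not specify how the labels are encoded.

```python
from enum import Enum


class EntityCategory(Enum):
    """The seven entity categories listed above (member names are hypothetical)."""
    DISPOSAL_UNIT = "disposal unit"
    UNIT_STAFF = "staff of the disposal unit"
    CASE_PERSON = "case-related person"
    DOCUMENT = "document"
    PROPERTY = "case-related property"
    DISPOSAL_ACTION = "disposal action"
    LEGAL_PROVISION = "clause or title of a legal document"


# ASSUMPTION: words that belong to no category receive the label "O".
OUTSIDE = "O"
LABELS = [category.name for category in EntityCategory] + [OUTSIDE]


def validate(labeled_words):
    """Check that every (word, label) pair uses a known label."""
    for word, label in labeled_words:
        if label not in LABELS:
            raise ValueError(f"unknown label {label!r} for word {word!r}")
    return labeled_words


# A segmented sentence with one manual label per word (toy example).
labeled_sentence = validate([
    ("court", "DISPOSAL_UNIT"),
    ("may", OUTSIDE),
    ("freezing", "DISPOSAL_ACTION"),
    ("deposit", "PROPERTY"),
])
print(labeled_sentence)
```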
As optimization scheme 1 of this embodiment, preprocessing the selected legal documents relating to case-related property into a training set comprises: taking the legal documents as the training set, segmenting them into words with a Chinese word segmentation tool, and manually labeling the segmented words according to the entity categories to construct the corpus.
In the learner training process: T learners are selected; for a given training data set containing m samples, bootstrap sampling (sampling with replacement) is used to obtain T sampling sets each containing m training samples; a learner $h_i$, $i = 1, \dots, T$, is then trained on each sampling set.
Preferably, 4 learners are used in the learner training process: a hidden Markov model, a conditional random field, a maximum entropy model, and a bidirectional long short-term memory (BiLSTM) neural network model.
As optimization scheme 2 of the above embodiment, in the development set construction process: two legal documents relating to case-related property that are not in the test set are randomly selected, the selected documents are segmented into words, and the segmentation results are manually labeled to construct the development set.
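The document-level split can be sketched in Python as follows. Only two facts come from the text: the document set contains 14 documents, and the development set consists of two documents that are not in the test set. The number of test documents (`num_test=3`), the choice to keep the training documents disjoint from both other sets, the document identifiers, and the function name `split_documents` are all assumptions made for illustration.

```python
import random


def split_documents(doc_ids, num_test, num_dev=2, seed=0):
    """Randomly split documents into disjoint train, development, and test lists."""
    if num_test + num_dev >= len(doc_ids):
        raise ValueError("not enough documents left for training")
    rng = random.Random(seed)
    shuffled = list(doc_ids)
    rng.shuffle(shuffled)
    test = shuffled[:num_test]
    dev = shuffled[num_test:num_test + num_dev]   # never overlaps the test set
    train = shuffled[num_test + num_dev:]
    return train, dev, test


# Placeholder identifiers for the 14 legal documents.
doc_ids = [f"doc_{k:02d}" for k in range(1, 15)]
train_docs, dev_docs, test_docs = split_documents(doc_ids, num_test=3)
print("train:", train_docs)
print("dev:  ", dev_docs)
print("test: ", test_docs)
```

Splitting by whole documents rather than by sentences keeps text from the same document out of both the development and test sets.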
The learner weight determination process, as shown in FIG. 2, comprises the following steps:
2.1. Randomly select two legal documents relating to case-related property that are not in the test set, and construct a development set;
2.2. Use each trained learner $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set;
2.3. Compute the classification accuracy of each learner $h_i$, $i = 1, \dots, T$, on the development set by comparing its classification results with the manual labels; the classification accuracy is calculated as:
$$p_i = \frac{N - M_i}{N}$$
where $N$ is the total number of samples in the development set and $M_i$ is the number of samples misclassified by learner $h_i$;
2.4. Construct the weight $w_i$ of each learner from its classification accuracy $p_i$ according to the weight calculation formula.
As optimization scheme 3 of the above embodiment, in Step 3 the final entity identification result is obtained by weighted voting, calculated as:
$$H(x) = c_{\arg\max_j \sum_{i=1}^{T} w_i h_i^j(x)}$$
where $h_i$ is a learner, $x$ is the test sample, $c_j$ is an output label, $h_i^j(x)$ is the output of $h_i$ on label $c_j$, and $w_i$ is the weight of learner $h_i$.
The foregoing shows and describes the general principles and broad features of the present invention and advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (8)
1. An ensemble-learning-based entity identification method for a case-related property knowledge base, characterized by obtaining a set of legal documents relating to case-related property, constructing a corpus from those documents, and dividing the corpus into a training set, a development set, and a test set, wherein the entity identification process comprises the following steps:
Step 1: Learner training: preprocess, according to entity categories, a plurality of legal documents randomly selected from the document set to obtain a training set; train T learners on the training set to obtain learners $h_i$, $i = 1, \dots, T$;
Step 2: Learner weight determination: randomly select two legal documents relating to case-related property that are not in the test set, and construct a development set; use the trained learners $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set and compute each learner's classification accuracy; construct each learner's weight from its accuracy;
Step 3: Entity identification: segment the legal documents relating to case-related property into words to construct a test set, and classify the samples in the test set with each learner; combine the classification results of all learners by weighted voting to obtain the final entity identification result.
2. The ensemble-learning-based entity identification method for a case-related property knowledge base according to claim 1, wherein the entity categories include a disposal unit, a staff member of the disposal unit, a case-related person, a document, case-related property, a disposal action, and a clause or title of a legal document.
3. The ensemble-learning-based entity identification method for a case-related property knowledge base according to claim 1, wherein preprocessing the selected legal documents relating to case-related property into a training set comprises: taking the legal documents as the training set, segmenting them into words with a Chinese word segmentation tool, and manually labeling the segmented words according to the entity categories to construct the corpus.
4. The ensemble-learning-based entity identification method for a case-related property knowledge base according to claim 3, wherein in the learner training process: T learners are selected; for a given training data set containing m samples, bootstrap sampling (sampling with replacement) is used to obtain T sampling sets each containing m training samples; a learner $h_i$, $i = 1, \dots, T$, is then trained on each sampling set.
5. The ensemble-learning-based entity identification method for a case-related property knowledge base according to claim 4, wherein 4 learners are adopted in the learner training process: a hidden Markov model, a conditional random field, a maximum entropy model, and a bidirectional long short-term memory (BiLSTM) neural network model.
6. The ensemble-learning-based entity identification method for a case-related property knowledge base according to claim 1, wherein in the development set construction process: two legal documents relating to case-related property that are not in the test set are randomly selected, the selected documents are segmented into words, and the segmentation results are manually labeled to construct the development set.
7. The ensemble-learning-based entity identification method for a case-related property knowledge base according to claim 6, wherein the learner weight determination process comprises the following steps:
2.1. Randomly select two legal documents relating to case-related property that are not in the test set, and construct a development set;
2.2. Use each trained learner $h_i$, $i = 1, \dots, T$, to classify the corpus in the development set;
2.3. Compute the classification accuracy of each learner $h_i$, $i = 1, \dots, T$, on the development set by comparing its classification results with the manual labels; the classification accuracy is calculated as:
$$p_i = \frac{N - M_i}{N}$$
where $N$ is the total number of samples in the development set and $M_i$ is the number of samples misclassified by learner $h_i$;
2.4. Construct the weight $w_i$ of each learner from its classification accuracy $p_i$ according to the weight calculation formula.
8. The ensemble-learning-based entity identification method for a case-related property knowledge base according to claim 1, wherein in Step 3 the final entity identification result is obtained by weighted voting, calculated as:
$$H(x) = c_{\arg\max_j \sum_{i=1}^{T} w_i h_i^j(x)}$$
where $h_i$ is a learner, $x$ is the test sample, $c_j$ is an output label, $h_i^j(x)$ is the output of $h_i$ on label $c_j$, and $w_i$ is the weight of learner $h_i$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010825763.5A CN112001171A (en) | 2020-08-17 | 2020-08-17 | Case-related property knowledge base entity identification method based on ensemble learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010825763.5A CN112001171A (en) | 2020-08-17 | 2020-08-17 | Case-related property knowledge base entity identification method based on ensemble learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112001171A (en) | 2020-11-27 |
Family
ID=73472513
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010825763.5A Pending CN112001171A (en) | 2020-08-17 | 2020-08-17 | Case-related property knowledge base entity identification method based on ensemble learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112001171A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109635289A (en) * | 2018-11-30 | 2019-04-16 | 上海智臻智能网络科技股份有限公司 | Entry classification method and audit information abstracting method |
CN110807328A (en) * | 2019-10-25 | 2020-02-18 | 华南师范大学 | Named entity identification method and system oriented to multi-strategy fusion of legal documents |
CN111191029A (en) * | 2019-12-19 | 2020-05-22 | 南京理工大学 | AC construction method based on supervised learning and text classification |
- 2020-08-17: CN application CN202010825763.5A (patent/CN112001171A/en), active, Pending
Non-Patent Citations (2)
Title |
---|
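Cai Yuehong et al.: "Chinese organization name recognition based on Tri-training semi-supervised learning" * |
Cai Yuehong; Zhu Qian; Cheng Xianyi: "Chinese organization name recognition based on Tri-training semi-supervised learning" * |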
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113158659A (en) * | 2021-02-08 | 2021-07-23 | 银江股份有限公司 | Case-related property calculation method based on judicial text |
CN113158659B (en) * | 2021-02-08 | 2024-03-08 | 银江技术股份有限公司 | Case-related property calculation method based on judicial text |
CN113886602A (en) * | 2021-10-19 | 2022-01-04 | 四川大学 | Multi-granularity cognition-based domain knowledge base entity identification method |
CN113918682A (en) * | 2021-10-19 | 2022-01-11 | 四川大学 | Knowledge extraction method of case-related property knowledge base |
CN113919355A (en) * | 2021-10-19 | 2022-01-11 | 四川大学 | Semi-supervised named entity recognition method suitable for less-training corpus scene |
CN113919355B (en) * | 2021-10-19 | 2023-11-07 | 四川大学 | Semi-supervised named entity recognition method suitable for small training corpus scene |
CN114491039A (en) * | 2022-01-27 | 2022-05-13 | 四川大学 | Meta-learning few-sample text classification method based on gradient improvement |
CN114491039B (en) * | 2022-01-27 | 2023-10-03 | 四川大学 | Primitive learning few-sample text classification method based on gradient improvement |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112001171A (en) | Case-related property knowledge base entity identification method based on ensemble learning | |
CN109684440B (en) | Address similarity measurement method based on hierarchical annotation | |
Arras et al. | Explaining recurrent neural network predictions in sentiment analysis | |
Er et al. | Attention pooling-based convolutional neural network for sentence modelling | |
CN112215004A (en) | Application method in extraction of text entities of military equipment based on transfer learning | |
CN111079985A (en) | Criminal case criminal period prediction method based on BERT and fused with distinguishable attribute features | |
CN113011185A (en) | Legal field text analysis and identification method, system, storage medium and terminal | |
Liu et al. | Image retrieval using fused deep convolutional features | |
CN110889786A (en) | Legal action insured advocate security use judging service method based on LSTM technology | |
CN110826316A (en) | Method for identifying sensitive information applied to referee document | |
CN108549723A (en) | A kind of text concept sorting technique, device and server | |
Dong et al. | The detection of fraudulent financial statements: an integrated language model | |
CN109871449A (en) | A kind of zero sample learning method end to end based on semantic description | |
Dorle et al. | Political sentiment analysis through social media | |
Luong et al. | Intent extraction from social media texts using sequential segmentation and deep learning models | |
CN116304035B (en) | Multi-notice multi-crime name relation extraction method and device in complex case | |
Usmani et al. | News headlines categorization scheme for unlabelled data | |
Cao et al. | Skill requirements analysis for data analysts based on named entities recognition | |
CN115422920A (en) | Referee document dispute focus identification method based on BERT and GAT | |
Hamed et al. | DISINFORMATION DETECTION ABOUT ISLAMIC ISSUES ON SOCIAL MEDIA USING DEEP LEARNING TECHNIQUES | |
Ahbali et al. | Identifying corporate credit risk sentiments from financial news | |
Li et al. | Attention-based LSTM-CNNs for uncertainty identification on Chinese social media texts | |
CN113051903A (en) | Method for comparing consistency of sentences, case passes, sentencing plots and judicial documents | |
Plachouras et al. | Information extraction of regulatory enforcement actions: From anti-money laundering compliance to countering terrorism finance | |
Roslan et al. | Stock prediction using sentiment analysis in twitter for day trader |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20201127 |