CN110909547A - Judicial entity identification method based on improved deep learning - Google Patents

Judicial entity identification method based on improved deep learning

Info

Publication number
CN110909547A
CN110909547A
Authority
CN
China
Prior art keywords
sequence
short term
term memory
judicial
module
Prior art date
Legal status
Pending
Application number
CN201911156444.3A
Other languages
Chinese (zh)
Inventor
王艳
杨品莉
林锋
邹奕
周激流
Current Assignee
Sichuan University
Original Assignee
Sichuan University
Priority date
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN201911156444.3A priority Critical patent/CN110909547A/en
Publication of CN110909547A publication Critical patent/CN110909547A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/18Legal services; Handling legal documents

Abstract

The invention discloses a judicial entity recognition method based on improved deep learning, which comprises: obtaining judicial texts, normalizing their format and labeling them to obtain a data set comprising training samples and test samples; inputting the training samples into a judicial entity recognition model for training; and inputting test samples of the text to be recognized into the trained judicial entity recognition model to obtain recognition results. The invention can obtain long-distance context features and more information, improving recognition precision and coverage; it solves the problem of invalid predicted tag sequences that deep learning methods suffer from in judicial recognition, ensuring the validity and reliability of recognition.

Description

Judicial entity identification method based on improved deep learning
Technical Field
The invention belongs to the technical field of judicial entity identification, and particularly relates to a judicial entity identification method based on improved deep learning.
Background
In the judicial field, judicial documents involve large data volumes and many document types, so information automation is a necessary trend in the field's development. Information automation in the judicial field can reduce the workload of judicial personnel, help improve the work efficiency of the judicial industry, and facilitate information sharing across the field.
In recent years, with the continual emergence of new natural language processing techniques and the urgent need for judicial information automation, more and more natural language processing techniques, such as entity recognition and relation extraction, have been applied to the judicial field. Legal case texts contain a large number of judicial-domain entities, and recognizing them is the basis for judicial information automation and the prerequisite for subsequent technologies such as judicial information extraction and judicial-domain knowledge graph construction. The research of judicial entity recognition is therefore very important for the development of the judicial field.
At present, named entity recognition, as a fundamental research topic of natural language processing, has achieved considerable success in many fields. However, Chinese differs from English: individual Chinese characters are often ambiguous and words are written without explicit boundaries, so entity-recognition results for Chinese are still relatively few. The earliest named entity approaches were dictionary-based and rule-based methods, which required experts to manually create rule templates and identified named entities by pattern and string matching. Both methods place high demands on the corpus and have poor portability. With the wider application of deep learning in natural language processing and the introduction of distributed word representations, deep-learning-based named entity recognition has achieved some success. However, deep-learning-based methods predict each character independently from a given set of features without considering the tags already predicted before it, which may render the predicted tag sequence invalid. The recurrent neural network (RNN) is a typical deep learning model for processing serialized sentences, but practice shows that if a sequence is too long the gradient vanishes and optimization cannot continue; RNNs therefore have a length-dependency problem and cannot acquire context feature information of arbitrary length.
Disclosure of Invention
In order to solve these problems, the invention provides a judicial entity identification method based on improved deep learning, which can obtain long-distance context features and more information, improving recognition precision and coverage; it solves the problem of invalid predicted tag sequences that deep learning methods suffer from in judicial recognition, ensuring the validity and reliability of recognition.
In order to achieve this purpose, the invention adopts the following technical scheme: a judicial entity recognition method based on improved deep learning, comprising the following steps:
acquiring a judicial text, carrying out standard processing on the format of the text and marking the text, and acquiring a data set comprising a training sample and a test sample;
inputting the training sample into a judicial entity recognition model for training;
and inputting the test sample of the text to be recognized into the trained judicial entity recognition model to obtain a recognition result.
Further, in the process of carrying out standard processing and marking on the text format, space removal processing is carried out on the text first, and then the text is marked to obtain a text sequence.
Further, the judicial entity recognition model is a bidirectional long and short term memory model with a conditional random field, the bidirectional long and short term memory model with the conditional random field comprises a sequence input module, a forward long and short term memory model module, a backward long and short term memory model module and a conditional random field module, and the sequence input module, the forward long and short term memory model module, the backward long and short term memory model module and the conditional random field module are sequentially connected.
Further, the forward long-short term memory model module extracts past features and the backward long-short term memory model module extracts future features: long-short term memory feature extraction is performed on the same sequence from left to right and from right to left, yielding a tag sequence carrying bidirectional semantic information. This solves the length-dependency problem of traditional deep learning methods and can obtain context feature information of arbitrary length; the information written to the cell state through the gate mechanism maintains the persistence of information transmission, so long-distance context features can be learned; information can be encoded both from front to back and from back to front, obtaining bidirectional semantic information and improving recognition validity.
And the conditional random field module is connected to the hidden layer output of the backward long-short term memory model module, and jointly decodes the tag sequence output by the backward long-short term memory model module to perform sentence-level sequence marking. In order to solve the problem that the label sequence output from the bidirectional long and short term memory model is possibly invalid, a conditional random field module is connected to the hidden layer output of the bidirectional long and short term memory model, the label sequence output by the bidirectional long and short term memory model is jointly decoded, and sentence-level sequence labeling is carried out instead of decoding each label independently.
Further, the processing in the judicial entity recognition model comprises the steps of:
searching a character vector corresponding to each character in an input text sequence by a sequence input module, and inputting the searched character vector sequence into a forward long and short term memory model module and a backward long and short term memory model module;
respectively obtaining hidden layer coding representation of the character vector through a forward long-short term memory model module and a backward long-short term memory model module;
distributing marks for each character through a conditional random field module, and calculating two types of scores;
the output marker sequence is the highest-scoring sequence.
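The steps above can be sketched as a minimal sequence-input and bidirectional-encoding pass. The character vocabulary, the dimensions and the toy recurrence used here are illustrative assumptions, not the patent's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical character vocabulary and embedding table (dimensions chosen for illustration).
vocab = {"判": 0, "决": 1, "书": 2}
emb_dim, hidden_dim = 4, 3
embeddings = rng.standard_normal((len(vocab), emb_dim))

def lookup(text):
    """Sequence input module: map each character to its character vector."""
    return np.stack([embeddings[vocab[ch]] for ch in text])

def run_lstm(xs, reverse=False):
    """Stand-in for one LSTM direction: returns one hidden state per character."""
    order = reversed(range(len(xs))) if reverse else range(len(xs))
    h = np.zeros(hidden_dim)
    out = {}
    for t in order:
        h = np.tanh(xs[t][:hidden_dim] + h)  # toy recurrence, not a full LSTM cell
        out[t] = h
    return [out[t] for t in range(len(xs))]

x = lookup("判决书")
fwd = run_lstm(x)                 # forward module: past features
bwd = run_lstm(x, reverse=True)   # backward module: future features
bi = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]  # bidirectional encoding per character
```

The concatenated forward/backward states are what the conditional random field module would then score.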
Further, the forward long-short term memory model module and the backward long-short term memory model module have the same structure, comprising a cell state unit and three gate structures that use sigmoid as the activation function; the three gates are an input gate, a forgetting gate and an output gate. The working process comprises the following steps:
f_t = σ(W_f[h_{t-1}, x_t] + b_f);
i_t = σ(W_i[h_{t-1}, x_t] + b_i);
O_t = σ(W_o[h_{t-1}, x_t] + b_o);
C̃_t = tanh(W_C[h_{t-1}, x_t] + b_C);
C_t = f_t * C_{t-1} + i_t * C̃_t;
h_t = O_t * tanh(C_t);
wherein x_t is the input at the current time; h_{t-1} is the hidden-layer state at the previous time; h_t is the hidden-layer state at the current time; C̃_t is the temporary cell state; C_t is the cell state at the current time; and C_{t-1} is the cell state at the previous time.
The forgetting gate selects the information to be forgotten: its inputs are h_{t-1} and x_t, and its output is the forgetting-gate value f_t. The cell state at the current time is then computed: the inputs are i_t, f_t, C̃_t and C_{t-1}, and the output is the current cell state C_t. Finally the output gate and the hidden-layer state at the current time are computed: the inputs are h_{t-1}, x_t and C_t, and the outputs are the output-gate value O_t and the hidden-layer state h_t.
Finally, a hidden-layer state sequence {h_0, h_1, …, h_{t-1}} with the same length as the sentence is obtained.
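The six gate equations above can be implemented directly in NumPy as a single time step; the dimensions and random parameters below are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of the LSTM cell following the gate equations above.
    W and b hold parameters for the forget (f), input (i), output (o)
    and candidate (c) transforms; each W_* acts on [h_prev, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])       # forgetting gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # temporary cell state
    c_t = f_t * c_prev + i_t * c_tilde       # cell state at the current time
    h_t = o_t * np.tanh(c_t)                 # hidden-layer state at the current time
    return h_t, c_t

rng = np.random.default_rng(1)
H, D = 3, 4  # hidden size and input size, illustrative
W = {k: rng.standard_normal((H, H + D)) for k in "fioc"}
b = {k: np.zeros(H) for k in "fioc"}
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.standard_normal(D), h, c, W, b)
```

Because h_t = O_t · tanh(C_t) with O_t in (0, 1), every hidden-state component stays strictly inside (-1, 1).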
Further, the conditional random field module is configured to calculate a joint probability for the entire sequence; the parameterized form of the conditional random field module is defined as follows:
P(y|x) = (1/Z(x)) exp(Σ_{i,k} λ_k t_k(y_{i-1}, y_i, x, i) + Σ_{i,l} μ_l δ_l(y_i, x, i));
in the formula, t_k and δ_l are feature functions, λ_k and μ_l are the corresponding weights, and Z(x) is a normalization factor;
wherein: Z(x) = Σ_y exp(Σ_{i,k} λ_k t_k(y_{i-1}, y_i, x, i) + Σ_{i,l} μ_l δ_l(y_i, x, i));
This formula gives the conditional probability of the output sequence y given the input sequence x. t_k is a feature function defined on the edges, called a transition feature; whether it fires depends on the current word and the previous word, so it is determined by the current position and the previous position. δ_l is a feature function defined on the nodes, called a state feature, determined by the current position. Typically a feature function takes the value 1 or 0: 1 when its condition is met and 0 otherwise. The output of the conditional random field module is completely determined by the feature functions t_k, δ_l and the weights λ_k, μ_l.
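A brute-force sketch of this conditional probability, assuming a toy two-tag, length-three sequence with the feature-function weights folded into transition and emission score tables (all values illustrative):

```python
import itertools
import numpy as np

# Toy linear-chain CRF: 2 tags, sequence of length 3.
# trans[i, j] plays the role of the weighted transition features (lambda_k * t_k),
# emit[t, j] the weighted state features (mu_l * delta_l).
n_tags, T = 2, 3
rng = np.random.default_rng(2)
trans = rng.standard_normal((n_tags, n_tags))
emit = rng.standard_normal((T, n_tags))

def score(y):
    """Unnormalized score of tag sequence y: state features plus transition features."""
    s = sum(emit[t, y[t]] for t in range(T))
    s += sum(trans[y[t - 1], y[t]] for t in range(1, T))
    return s

# Z(x): normalization over every possible tag sequence.
Z = sum(np.exp(score(y)) for y in itertools.product(range(n_tags), repeat=T))

def prob(y):
    """Conditional probability P(y | x) = exp(score(y)) / Z(x)."""
    return np.exp(score(y)) / Z

# The probabilities over all tag sequences must sum to one.
total = sum(prob(y) for y in itertools.product(range(n_tags), repeat=T))
```

The exhaustive sum over tag sequences is only feasible for toy sizes; real implementations compute Z(x) with the forward algorithm.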
Further, the conditional random field module solves the problem that a tag sequence predicted by a neural-network method may be invalid: it learns constraints from the training samples that ensure the finally predicted entity tag sequence is valid. In the loss function of the conditional random field module, the sequence with the largest output score is the predicted tag sequence. Given a sequence X whose labeling result is y, the score is defined as:
s(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i};
wherein P is the initial score matrix obtained by a linear transform of the hidden-layer output of the bidirectional long-short term memory model and A is the transition score matrix; A_{i,j} is the score of tag i being followed by tag j, and P_{i,j} is the score of the word W_i being mapped to tag j.
The score of the output tag sequence y corresponding to the input sequence X is computed, and the final predicted tag sequence is the sequence with the highest score.
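In practice the highest-scoring sequence is found with Viterbi dynamic programming rather than by enumerating all tag sequences; the sketch below assumes small illustrative P (emission) and A (transition) matrices in the notation above:

```python
import numpy as np

def viterbi(P, A):
    """Find the tag sequence y maximizing sum(A[y_i, y_{i+1}]) + sum(P[i, y_i]).
    P: (T, n_tags) emission scores; A: (n_tags, n_tags) transition scores."""
    T, n = P.shape
    dp = P[0].copy()                    # best score ending in each tag at step 0
    back = np.zeros((T, n), dtype=int)  # backpointers
    for t in range(1, T):
        cand = dp[:, None] + A + P[t][None, :]  # previous tag x next tag
        back[t] = cand.argmax(axis=0)
        dp = cand.max(axis=0)
    y = [int(dp.argmax())]
    for t in range(T - 1, 0, -1):       # follow backpointers
        y.append(int(back[t, y[-1]]))
    return y[::-1], float(dp.max())

# Illustrative scores: 3 positions, 2 tags.
P = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])
A = np.array([[0.5, -1.0], [-1.0, 0.5]])
path, best = viterbi(P, A)  # → path [0, 0, 0] with score 5.0
```

Here the transition bonus for staying on the same tag outweighs position 1's preference for tag 1, so the decoder picks the all-zero path, exactly the kind of sequence-level trade-off the conditional random field module adds over per-character decoding.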
Further, the data set includes training samples, validation samples, and test samples; verifying the accuracy of the model by the verification sample;
The format of the judicial text is standardized and spaces are removed; the text is then labeled into BIO word-tag form with a corpus labeling tool and used as the model input. The tag form comprises 3 entity categories and 7 word tags, the 3 entity categories being criminal name, place and judicial unit.
Further, the judicial entity recognition model is optimized through an optimizer that uses adaptive moment estimation. Adaptive moment estimation computes adaptive learning rates for different parameters, has low memory requirements and high computational efficiency, and is suitable for larger data sets.
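For reference, a single adaptive-moment-estimation (Adam) parameter update can be sketched as follows; the learning rate and decay constants are the algorithm's common defaults, not values stated in the patent:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One adaptive-moment-estimation (Adam) update, per the standard algorithm."""
    m = b1 * m + (1 - b1) * grad       # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2  # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)          # bias correction for the running means
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.array([1.0, -1.0])
m = v = np.zeros(2)
theta, m, v = adam_step(theta, grad=np.array([0.5, -0.5]), m=m, v=v, t=1)
```

Because the update is normalized by the per-parameter second moment, each parameter effectively gets its own learning rate, which is the property the passage above refers to.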
The beneficial effects of the technical scheme are as follows:
the recognition model established in the judicial entity recognition method provided by the invention consists of a bidirectional long-short term memory model and a conditional random field module, and the characteristics of the deep learning-based method can be reserved based on the recognition network model. Long-distance context characteristics can be obtained, more information can be obtained, and the identification precision and range are improved; the problem that the prediction label sequence is invalid in judicial identification by the deep learning method is solved, and the effectiveness and reliability of identification are ensured.
Drawings
FIG. 1 is a schematic flow chart of a judicial entity recognition method based on improved deep learning according to the present invention;
FIG. 2 is a topology structure diagram of a judicial entity recognition model in an embodiment of the present invention;
FIG. 3 is a diagram of a topology structure of a long-term and short-term memory model module according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described with reference to the accompanying drawings.
In this embodiment, referring to fig. 1, the present invention provides a judicial entity recognition method based on improved deep learning, including:
acquiring a judicial text, carrying out standard processing on the format of the text and marking the text, and acquiring a data set comprising a training sample and a test sample;
inputting the training sample into a judicial entity recognition model for training;
and inputting the test sample of the text to be recognized into the trained judicial entity recognition model to obtain a recognition result.
As an optimization scheme of the above embodiment, in the process of performing specification processing and marking on a text format, a space removal processing is performed on a text first, and then the text is marked to obtain a text sequence.
As an optimization scheme of the above embodiment, as shown in fig. 2, the judicial entity recognition model is a bidirectional long-short term memory model with a conditional random field, and the bidirectional long-short term memory model with the conditional random field includes a sequence input module, a forward long-short term memory model module, a backward long-short term memory model module and a conditional random field module, and the sequence input module, the forward long-short term memory model module, the backward long-short term memory model module and the conditional random field module are connected in sequence.
The forward long-short term memory model module extracts past features and the backward long-short term memory model module extracts future features: long-short term memory feature extraction is performed on the same sequence from left to right and from right to left, yielding a tag sequence carrying bidirectional semantic information. This solves the length-dependency problem of traditional deep learning methods and can obtain context feature information of arbitrary length; the information written to the cell state through the gate mechanism maintains the persistence of information transmission, so long-distance context features can be learned; information can be encoded both from front to back and from back to front, obtaining bidirectional semantic information and improving recognition validity.
And the conditional random field module is connected to the hidden layer output of the backward long-short term memory model module, and jointly decodes the tag sequence output by the backward long-short term memory model module to perform sentence-level sequence marking. In order to solve the problem that the label sequence output from the bidirectional long and short term memory model is possibly invalid, a conditional random field module is connected to the hidden layer output of the bidirectional long and short term memory model, the label sequence output by the bidirectional long and short term memory model is jointly decoded, and sentence-level sequence labeling is carried out instead of decoding each label independently.
Wherein the processing in the judicial entity recognition model comprises the steps of:
searching a character vector corresponding to each character in an input text sequence by a sequence input module, and inputting the searched character vector sequence into a forward long and short term memory model module and a backward long and short term memory model module;
respectively obtaining hidden layer coding representation of the character vector through a forward long-short term memory model module and a backward long-short term memory model module;
distributing marks for each character through a conditional random field module, and calculating two types of scores;
the output marker sequence is the highest-scoring sequence.
As shown in fig. 3, the forward long-short term memory model module and the backward long-short term memory model module have the same structure, comprising a cell state unit and three gate structures that use sigmoid as the activation function; the three gates are an input gate, a forgetting gate and an output gate. The working process comprises the following steps:
f_t = σ(W_f[h_{t-1}, x_t] + b_f);
i_t = σ(W_i[h_{t-1}, x_t] + b_i);
O_t = σ(W_o[h_{t-1}, x_t] + b_o);
C̃_t = tanh(W_C[h_{t-1}, x_t] + b_C);
C_t = f_t * C_{t-1} + i_t * C̃_t;
h_t = O_t * tanh(C_t);
wherein x_t is the input at the current time; h_{t-1} is the hidden-layer state at the previous time; h_t is the hidden-layer state at the current time; C̃_t is the temporary cell state; C_t is the cell state at the current time; and C_{t-1} is the cell state at the previous time.
The forgetting gate selects the information to be forgotten: its inputs are h_{t-1} and x_t, and its output is the forgetting-gate value f_t. The cell state at the current time is then computed: the inputs are i_t, f_t, C̃_t and C_{t-1}, and the output is the current cell state C_t. Finally the output gate and the hidden-layer state at the current time are computed: the inputs are h_{t-1}, x_t and C_t, and the outputs are the output-gate value O_t and the hidden-layer state h_t.
Finally, a hidden-layer state sequence {h_0, h_1, …, h_{t-1}} with the same length as the sentence is obtained.
The conditional random field module is used for calculating the joint probability of the entire sequence; the parameterized form of the conditional random field module is defined as follows:
P(y|x) = (1/Z(x)) exp(Σ_{i,k} λ_k t_k(y_{i-1}, y_i, x, i) + Σ_{i,l} μ_l δ_l(y_i, x, i));
in the formula, t_k and δ_l are feature functions, λ_k and μ_l are the corresponding weights, and Z(x) is a normalization factor;
wherein: Z(x) = Σ_y exp(Σ_{i,k} λ_k t_k(y_{i-1}, y_i, x, i) + Σ_{i,l} μ_l δ_l(y_i, x, i));
This formula gives the conditional probability of the output sequence y given the input sequence x. t_k is a feature function defined on the edges, called a transition feature; whether it fires depends on the current word and the previous word, so it is determined by the current position and the previous position. δ_l is a feature function defined on the nodes, called a state feature, determined by the current position. Typically a feature function takes the value 1 or 0: 1 when its condition is met and 0 otherwise. The output of the conditional random field module is completely determined by the feature functions t_k, δ_l and the weights λ_k, μ_l.
The conditional random field module solves the problem that a tag sequence predicted by a neural-network method may be invalid: it learns constraints from the training samples that ensure the finally predicted entity tag sequence is valid. In the loss function of the conditional random field module, the sequence with the largest output score is the predicted tag sequence. Given a sequence X whose labeling result is y, the score is defined as:
s(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i};
wherein P is the initial score matrix obtained by a linear transform of the hidden-layer output of the bidirectional long-short term memory model and A is the transition score matrix; A_{i,j} is the score of tag i being followed by tag j, and P_{i,j} is the score of the word W_i being mapped to tag j.
The score of the output tag sequence y corresponding to the input sequence X is computed, and the final predicted tag sequence is the sequence with the highest score.
As an optimization scheme of the above embodiment, the data set includes a training sample, a verification sample and a test sample, and the accuracy of the model is verified by the verification sample;
The format of the judicial text is standardized and spaces are removed; the text is then labeled into BIO word-tag form with a corpus labeling tool and used as the model input. The tag form comprises 3 entity categories and 7 word tags, the 3 entity categories being criminal name, place and judicial unit.
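The BIO character-tag form described above can be sketched as follows; the tag strings NAME, LOC and ORG for the three entity categories are assumed for illustration, since the exact label strings are not given here:

```python
def bio_tags(chars, spans):
    """Assign BIO tags to a character sequence.
    chars: list of characters; spans: list of (start, end, category) with end exclusive."""
    tags = ["O"] * len(chars)
    for start, end, cat in spans:
        tags[start] = "B-" + cat            # beginning of an entity
        for i in range(start + 1, end):
            tags[i] = "I-" + cat            # inside of an entity
    return tags

# Illustrative sentence with a person-name span and a place span.
chars = list("王某在成都市被捕")
tags = bio_tags(chars, [(0, 2, "NAME"), (3, 6, "LOC")])
# B/I for each of the 3 categories (NAME, LOC, ORG) plus "O" gives the 7 word-tag classes.
```

Each character gets exactly one tag, which is why a character-level model with a CRF layer can be trained directly on these sequences.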
As an optimization scheme of the above embodiment, the judicial entity recognition model is optimized by an optimizer that uses adaptive moment estimation; it computes adaptive learning rates for different parameters, has low memory requirements and high computational efficiency, and is suitable for larger data sets.
In the specific implementation process, the following embodiments are used for illustration:
the experimental data set of the invention is from 500 referee documents downloaded from a referee document network, mainly comprising referee documents of three cases of a prisoner-reducing case, a parole case and a temporary supervision case, wherein 300 referee documents are taken as training samples, 100 are taken as verification samples and 100 are taken as test samples. Firstly, 500 referee documents are normalized in format, blank spaces are removed, and then the referee documents are marked into a BIO character label form by using a corpus marking tool YDEEA with the help of a legal expert as the input of a model so as to reduce manual participation. Herein, 3 types of entity categories (criminal name, place, judicial unit) and 7 types of word labels are defined, as shown in table 1.
TABLE 1 BIO word tag categories
(Table 1 is presented as an image in the original document.)
For the purpose of evaluating the model, precision, recall and the F1 value (F-measure) are used as evaluation indexes. Denoting true positives, false positives and false negatives by TP, FP and FN, the standard calculation formulas are:
precision = TP / (TP + FP);
recall = TP / (TP + FN);
F1 = 2 × precision × recall / (precision + recall).
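The three evaluation indexes can be computed from entity-level counts of true positives (TP), false positives (FP) and false negatives (FN); the counts below are purely illustrative:

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 value (F-measure) from entity-level counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return p, r, f1

# Illustrative counts, not the patent's data: 8 correct entities,
# 2 spurious predictions, 2 missed entities.
p, r, f1 = prf(tp=8, fp=2, fn=2)
```

With equal precision and recall, the F1 value coincides with both, as in this example.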
in the model provided by the invention, a data set is trained, and a plurality of evaluation indexes obtain better results, wherein the accuracy rate is 0.863, the recall rate is 0.837, and the F1 value is 0.848.
As shown in Table 2, the recognition results obtained when the optimizer uses adaptive moment estimation are better than those of the other optimizers: the precision, recall and F1 value are all clearly higher.
TABLE 2 comparison of evaluation indices of different optimizers under a dataset
(Table 2 is presented as an image in the original document.)
The foregoing shows and describes the basic principles and main features of the present invention and its advantages. Those skilled in the art will understand that the invention is not limited to the embodiments described above, which, together with the description, only illustrate the principle of the invention; various changes and modifications may be made without departing from the spirit and scope of the invention, and all such changes fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (10)

1. A judicial entity recognition method based on improved deep learning, characterized by comprising the following steps:
acquiring a judicial text, carrying out standard processing on the format of the text and marking the text, and acquiring a data set comprising a training sample and a test sample;
inputting the training sample into a judicial entity recognition model for training;
and inputting the test sample of the text to be recognized into the trained judicial entity recognition model to obtain a recognition result.
2. The judicial entity recognition method based on improved deep learning of claim 1, wherein in the process of carrying out standard processing and marking on the text format, the text is firstly subjected to de-space processing, and then is marked to obtain a text sequence.
3. The judicial entity recognition method based on the improved deep learning of claim 2, wherein the judicial entity recognition model is a bidirectional long-short term memory model with conditional random fields, and the bidirectional long-short term memory model with conditional random fields comprises a sequence input module, a forward long-short term memory model module, a backward long-short term memory model module and a conditional random field module, and the sequence input module, the forward long-short term memory model module, the backward long-short term memory model module and the conditional random field module are connected in sequence.
4. The judicial entity recognition method based on improved deep learning of claim 3, wherein the forward long-short term memory model module extracts past features and the backward long-short term memory model module extracts future features; performing long-short term memory feature extraction on the same sequence from left to right, and performing long-short term memory feature extraction from right to left to obtain a tag sequence of the bidirectional semantic information;
and the conditional random field module is connected to the hidden layer output of the backward long-short term memory model module, and jointly decodes the tag sequence output by the backward long-short term memory model module to perform sentence-level sequence marking.
5. The judicial entity recognition method based on improved deep learning of claim 4, wherein the processing procedure in the judicial entity recognition model comprises the steps of:
searching a character vector corresponding to each character in an input text sequence by a sequence input module, and inputting the searched character vector sequence into a forward long and short term memory model module and a backward long and short term memory model module;
respectively obtaining hidden layer coding representation of the character vector through a forward long-short term memory model module and a backward long-short term memory model module;
distributing marks for each character through a conditional random field module, and calculating two types of scores;
the output marker sequence is the highest-scoring sequence.
6. The judicial entity recognition method based on improved deep learning of any one of claims 2-5, wherein the forward long-short term memory model module and the backward long-short term memory model module have the same structure, comprising a cell state unit and three gate structures that use sigmoid as the activation function, the three gates being an input gate, a forgetting gate and an output gate; the working process comprises the following steps:
f_t = σ(W_f[h_{t-1}, x_t] + b_f);
i_t = σ(W_i[h_{t-1}, x_t] + b_i);
O_t = σ(W_O[h_{t-1}, x_t] + b_O);
C̃_t = tanh(W_C[h_{t-1}, x_t] + b_C);
C_t = f_t * C_{t-1} + i_t * C̃_t;
h_t = O_t * tanh(C_t);
wherein x_t is the input at the current time; h_{t-1} is the hidden layer state at the previous time; h_t is the hidden layer state at the current time; C̃_t is the temporary cell state; C_t is the cell state at the current time; C_{t-1} is the cell state at the previous time;
the forgetting gate selects the information to be forgotten: its inputs are h_{t-1} and x_t, and its output is the forgetting gate value f_t; the cell state at the current time is then calculated, with inputs i_t, f_t, C̃_t and C_{t-1} and output the current cell state C_t; the output gate and the hidden layer state at the current time are calculated, with inputs h_{t-1}, x_t and C_t and outputs the output gate value O_t and the hidden layer state h_t;
finally, a hidden layer state sequence {h_0, h_1, …, h_{t-1}} of the same length as the sentence is obtained.
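The gate equations above can be sketched as a single recurrence step (an illustrative sketch only, not part of the claims; the weight and bias names follow the equations, and each W acts on the concatenation [h_{t-1}, x_t]):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_o, b_o, W_c, b_c):
    """One long-short term memory step following the claimed equations."""
    z = np.concatenate([h_prev, x_t])     # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)          # forgetting gate
    i_t = sigmoid(W_i @ z + b_i)          # input gate
    o_t = sigmoid(W_o @ z + b_o)          # output gate
    C_tilde = np.tanh(W_c @ z + b_c)      # temporary cell state
    C_t = f_t * C_prev + i_t * C_tilde    # current cell state
    h_t = o_t * np.tanh(C_t)              # current hidden layer state
    return h_t, C_t
```

Iterating this step over the sentence, in both directions, yields the hidden layer state sequences consumed by the conditional random field module.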
7. The method of claim 5, wherein the conditional random field module is configured to calculate the joint probability of the entire sequence, and the parameterized form of the conditional random field module is defined as follows:
P(y|x) = (1/Z(x)) exp(Σ_{i,k} λ_k t_k(y_{i-1}, y_i, x, i) + Σ_{i,l} μ_l δ_l(y_i, x, i));
in the formula, t_k and δ_l are feature functions, λ_k and μ_l are the corresponding weights, and Z(x) is a normalization factor, wherein: Z(x) = Σ_y exp(Σ_{i,k} λ_k t_k(y_{i-1}, y_i, x, i) + Σ_{i,l} μ_l δ_l(y_i, x, i));
through the above formula, the conditional probability of an output sequence y given an input sequence x is obtained; t_k is a feature function defined on the edges, called a transition feature, whose matching depends on the current word and the previous word and which is determined by the current position and the previous position; δ_l is a feature function defined on the nodes, called a state feature, which is determined by the current position; typically, a feature function takes the value 1 or 0: 1 when its condition is met and 0 otherwise; the output of the conditional random field module is entirely determined by the feature functions t_k, δ_l and the weights λ_k, μ_l.
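The parameterized form and the normalization factor Z(x) can be sketched by brute-force enumeration (an illustrative sketch only, not part of the claims; the 0/1 feature functions and weights used in testing are toy assumptions):

```python
import math
from itertools import product

def crf_prob(y, x, trans_feats, state_feats, lam, mu, labels):
    """P(y|x) = exp(Σ λ_k t_k + Σ μ_l δ_l) / Z(x), with Z(x) computed by
    summing the exponentiated score over every possible label sequence."""
    def score(seq):
        s = 0.0
        for i in range(len(x)):
            if i > 0:  # transition features t_k live on edges (y_{i-1}, y_i)
                s += sum(l * t(seq[i - 1], seq[i], x, i)
                         for l, t in zip(lam, trans_feats))
            # state features δ_l live on nodes y_i
            s += sum(m * d(seq[i], x, i) for m, d in zip(mu, state_feats))
        return s
    Z = sum(math.exp(score(seq)) for seq in product(labels, repeat=len(x)))
    return math.exp(score(y)) / Z
```

The brute-force Z(x) is exponential in sentence length; a practical implementation would use the forward algorithm instead.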
8. The method as claimed in claim 6, wherein in the loss function of the conditional random field module, the sequence with the largest output score is the predicted tag sequence; assuming that the input sequence is X and the sequence labeling result is y, the score is defined as:
s(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i};
wherein P is the initial score matrix obtained by a linear transformation of the hidden layer output of the bidirectional long-short term memory model, and A is the transition score matrix; A_{i,j} is the probability that tag j follows tag i, and P_{i,j} is the probability that the word W_i maps to tag j;
the score of each output tag sequence y corresponding to the input sequence X is calculated, and the final predicted tag sequence is the sequence with the highest score.
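The score s(X, y) can be sketched as follows (an illustrative sketch only, not part of the claims; the start/stop transitions are omitted for brevity, and the brute-force search stands in for the Viterbi decoding a practical implementation would use):

```python
import numpy as np
from itertools import product

def sequence_score(A, P, y):
    """s(X, y) = sum of transition scores A[y_{i-1}, y_i]
               + sum of emission scores P[i, y_i]."""
    trans = sum(A[y[i - 1], y[i]] for i in range(1, len(y)))
    emit = sum(P[i, y[i]] for i in range(len(y)))
    return trans + emit

def best_sequence(A, P):
    """The predicted labeling is the highest-scoring tag sequence."""
    n, k = P.shape
    return max(product(range(k), repeat=n),
               key=lambda y: sequence_score(A, P, y))
```

With an emission matrix P of shape (sentence length, number of tags) and a transition matrix A of shape (number of tags, number of tags), `best_sequence` returns the argmax labeling.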
9. The judicial entity recognition method based on improved deep learning of claim 2, wherein the data set comprises training samples, validation samples and test samples;
the format of the judicial text is standardized and spaces are removed; the text is then annotated with a corpus labeling tool into BIO word-tag form as the input of the model, comprising 3 entity categories and 7 word tags, the 3 entity categories being crime names, places and judicial units.
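The BIO scheme of claim 9 (3 entity categories yield 2 × 3 + 1 = 7 word tags) can be sketched as follows (an illustrative sketch only, not part of the claims; the label names CRIME, LOC and ORG are hypothetical stand-ins for crime name, place and judicial unit):

```python
# Hypothetical tag inventory: B-/I- for each of the 3 entity classes, plus O.
TAGS = ["B-CRIME", "I-CRIME", "B-LOC", "I-LOC", "B-ORG", "I-ORG", "O"]

def spans_to_bio(chars, spans):
    """Convert character-level entity spans [(start, end, label), ...]
    (end exclusive) into one BIO tag per character."""
    tags = ["O"] * len(chars)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags
```

Each (character, tag) pair produced this way is one line of the model's training input.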
10. The judicial entity recognition method based on improved deep learning of claim 1, wherein the judicial entity recognition model is optimized by an optimizer, and the optimizer uses adaptive moment estimation (Adam) to optimize the recognition result.
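The adaptive moment estimation (Adam) update of claim 10 can be sketched as follows (an illustrative sketch only, not part of the claims; the default hyperparameters follow common practice rather than the patent):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (first
    moment) and squared gradient (second moment), with bias correction."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Applied to every trainable parameter of the bidirectional long-short term memory and conditional random field modules, this drives the model toward the minimum of the sequence-labeling loss.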
CN201911156444.3A 2019-11-22 2019-11-22 Judicial entity identification method based on improved deep learning Pending CN110909547A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911156444.3A CN110909547A (en) 2019-11-22 2019-11-22 Judicial entity identification method based on improved deep learning

Publications (1)

Publication Number Publication Date
CN110909547A true CN110909547A (en) 2020-03-24

Family

ID=69818786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911156444.3A Pending CN110909547A (en) 2019-11-22 2019-11-22 Judicial entity identification method based on improved deep learning

Country Status (1)

Country Link
CN (1) CN110909547A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106980608A (en) * 2017-03-16 2017-07-25 四川大学 A kind of Chinese electronic health record participle and name entity recognition method and system
CN109241285A (en) * 2018-08-29 2019-01-18 东南大学 A kind of device of the judicial decision in a case of auxiliary based on machine learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
(India) Santanu Pattanayak: "Python Artificial Intelligence Projects" (Chinese edition), 31 October 2019, China Machine Press *
He Yunqi et al.: "Disease name recognition based on syntactic and semantic features", SCIENTIA SINICA Informationis *
Xie Yun: "Research on named entity recognition for Chinese legal texts", China Doctoral and Master's Theses Full-text Database (Master), Information Science and Technology *
Gu Sunyan: "Research on Chinese named entity recognition based on deep neural networks", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347780A (en) * 2020-11-27 2021-02-09 浙江大学 Judicial fact finding generation method, device and medium based on deep neural network
CN112347780B (en) * 2020-11-27 2023-09-12 浙江大学 Judicial fact finding generation method, device and medium based on deep neural network

Similar Documents

Publication Publication Date Title
CN110083831B (en) Chinese named entity identification method based on BERT-BiGRU-CRF
CN108984526B (en) Document theme vector extraction method based on deep learning
CN111444726B (en) Chinese semantic information extraction method and device based on long-short-term memory network of bidirectional lattice structure
CN111639171B (en) Knowledge graph question-answering method and device
CN111310471B (en) Travel named entity identification method based on BBLC model
CN110083682A (en) It is a kind of to understand answer acquisition methods based on the machine readings for taking turns attention mechanism more
CN110427623A (en) Semi-structured document Knowledge Extraction Method, device, electronic equipment and storage medium
CN111738007B (en) Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
CN113642330A (en) Rail transit standard entity identification method based on catalog topic classification
CN112115238A (en) Question-answering method and system based on BERT and knowledge base
CN112487820B (en) Chinese medical named entity recognition method
CN112836046A (en) Four-risk one-gold-field policy and regulation text entity identification method
Ahmad et al. Bengali word embeddings and it's application in solving document classification problem
CN116151256A (en) Small sample named entity recognition method based on multitasking and prompt learning
CN114911892A (en) Interaction layer neural network for search, retrieval and ranking
CN113191148A (en) Rail transit entity identification method based on semi-supervised learning and clustering
CN117076653B (en) Knowledge base question-answering method based on thinking chain and visual lifting context learning
CN112800184B (en) Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction
CN111274829A (en) Sequence labeling method using cross-language information
CN111881256B (en) Text entity relation extraction method and device and computer readable storage medium equipment
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN113360667B (en) Biomedical trigger word detection and named entity identification method based on multi-task learning
CN116522165B (en) Public opinion text matching system and method based on twin structure
CN110909547A (en) Judicial entity identification method based on improved deep learning
CN115169349A (en) Chinese electronic resume named entity recognition method based on ALBERT

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200324