CN115470354A

CN115470354A - Method and system for identifying nested and overlapped risk points based on multi-label classification

Info

Publication number: CN115470354A
Application number: CN202211366277.7A
Authority: CN
Inventors: 郭建威; 孙林君; 高扬
Original assignee: Hangzhou Real Intelligence Technology Co ltd
Current assignee: Hangzhou Real Intelligence Technology Co ltd
Priority date: 2022-11-03
Filing date: 2022-11-03
Publication date: 2022-12-13
Anticipated expiration: 2042-11-03
Also published as: CN115470354B

Abstract

The invention belongs to the technical field of contract document identification, and particularly relates to a method and a system for identifying nested and overlapped risk points based on multi-label classification. The method comprises the steps that S1, after sentence segmentation is carried out on contract documents marked with risk points and keywords, the contract documents are input into a BERT model to be pre-trained, and the contract pre-training BERT model is obtained; s2, segmenting the contract document into sentences, and inputting the sentences into a contract pre-training BERT model to extract word expressions and bottom layer characteristics; s3, merging the position information and the label information of the sentence sequence into the BilSTM to obtain the characteristics of the merged position and label information and performing characteristic compression; and S4, inputting the compressed features into a double affine network, performing range enumeration and entity label classification, and performing parameter learning. The method and the device have the characteristics of automatically identifying the risk points in the contract document and auditing the risk points.

Description

Method and system for identifying nested and overlapped risk points based on multi-label classification

Technical Field

The invention belongs to the technical field of contract document identification, and particularly relates to a method and a system for identifying nested and overlapped risk points based on multi-label classification.

Background

The identification of risk points in the contract document is a method for identifying various types of key information defined by industry-related experts from the contract document, is a process for extracting structured information from an unstructured text of the contract document, and is widely applied to the examination and verification of the risk points of the same type, such as labor contracts, buying and selling contracts, purchasing contracts and the like of various industries at present. Some approval processes of various industries need to audit and approve contract documents, the key point of audit is often some key information and risk points in the contract documents, traditional contract document audit depends on manual identification and audit of various risk points, along with rapid development of internet technology and information office, the audit of a large number of contract documents is electronic, a convenient scene is provided for improving audit efficiency through manual intelligent technology intervention, but the risk points of electronic paid documents need manual positioning, the efficiency is low, and omission is easy. The artificial intelligence technology is utilized to help practitioners in various industries to carry out risk point auditing work on electronic contract documents, so that the risk points are automatically identified, the efficiency of auditing the contracts in various industries is improved, and the omission of the risk points is avoided.

Common risk point identification scenes in contracts are generally divided into common scenes, nested scenes and overlapped scenes.

The common scene means that risk points are not associated with each other in the contract text, such as: [ first name: company limited, telephone number: year 2022, month 01, day 01, wherein the risk point "first party name" [ finity company ] is not associated with the risk point "telephone number" [ year 2022, month 01, day 01 ].

The nested scene refers to the fact that entity texts are nested with each other, for example [ from the date of the same life, the first party pays the second party in a bank transfer mode ] and the text belongs to a risk point 'payment mode', but the text [ bank transfer ] also belongs to a risk point 'payment tool'.

The risk points overlap means that, for text [ contract time: 2022, 01 month, 01 day, wherein [ 2022, 01 month, 01 day ] is both the risk point "first party contract time" and the risk point "second party contract time" in the contracting parties, and belongs to both risk points.

Methods of named entity identification can be generally divided into sequence annotation-based and pointer-based methods. The method of sequence labeling is to give a label to each word of the text, and the extraction of the entity is completed through the correspondence between the label and the category. Pointer-based methods typically predict head and tail pointers and then assign the text wrapped by the head and tail pointers to corresponding entity classes.

In the contract, for the entity identification of the common scene, no matter a method based on sequence labeling or a method based on a pointer can well complete the task, the common mode is to use a machine learning model to model the text, then extract entity information by adopting a method based on sequence labeling, and the common model has a Long Short-Term Memory network (LSTM) such as a bidirectional LSTM named entity identification method based on prediction position attention described in the patent application No. CN201910225622.7 or a mixed language material named entity identification method based on Bi-LSTM-CNN mixed with a Convolutional Neural network model (CNN) to capture characteristics, such as described in the patent application No. CN 201710946532.8.

For the nested scenario, a general sequence labeling-based method fails because the general sequence labeling method can only assign one category information to each word, and the overlapping part of the entities corresponds to multiple tags in the nested case. At this time, the problem of entity nesting can be solved by improving a method based on sequence labeling, for example, a multi-label classification problem can be converted into a multi-classification problem through combining labels, or a method of hierarchical identification is proposed, and nested entities are hierarchically identified from an internal entity to an external entity and from the external entity to the internal entity. As described in patent publication No. CN114281937a, each sub-entity of a nested entity can be identified by predicting a first nested entity and then identifying a second nested entity based on the prediction information of the first nested entity. For the nested entity identification, by means of determining the start index and the end index and the entity class label, for a non-overlapping nested entity, as long as the start index and the end index of the entity are unique, it indicates that the entity can be uniquely identified, as described in patent application CN 202011522097.4. The document with patent publication number CN114386417a proposes a named entity recognition method of incorporated word boundary information, which utilizes information of incorporated word level in an external word list, utilizes a pre-training model to extract vector representation of semantic information, and performs head-tail range judgment on a sequence input into a bidirectional affine network based on a bidirectional affine network method.

For the overlapped scene, because the entity belongs to the particularity of the entity overlapped scene, namely, the entities are completely consistent, the current method generally considers that different entities have different head and tail parts, in the scene of extracting the risk point identification information in the contract, a range formed by the head and the tail possibly belongs to two or more risk points at the same time, so that the method for processing the common scene and the nested scene is invalid, and the general method is to model the entity independently or extract the entity independently firstly and then distribute the entity to different categories by utilizing a multi-classification model.

However, the above prior art has the following disadvantages:

1. nested and overlapping scenarios where risk points in the treaty document exist in large numbers. For a nested scene, a general model is difficult to capture nested entity information of a contract text, such as LSTM and CNN, because there is no interaction of information between head and tail entities, the recognition effect is not good. For the overlapping scene, a method of identifying different risk points by using separate models is adopted, so that the overlapping problem caused by identifying the risk points can be avoided, but a large amount of resource overhead is also caused, and a large amount of waste is caused. For the situation that error accumulation inevitably occurs when a model which firstly identifies risk points and then distributes corresponding labels to the risk points through classification is adopted, firstly, the learning process of two stages has a dependency relationship, the entity identification result of the first stage influences the classification result of the second stage, secondly, the weight of loss functions of each stage is introduced as a hyper-parameter, and finally, the network structure and the loss functions need to be designed manually in each stage, wherein the quality of enumeration candidate range needs to be improved by designing the network structure and the loss functions. The two-stage learning increases the difficulty of model design and the cost of parameter tuning.

2. The contract documents also have a large number of unbalanced categories, whether the method is based on sequence labeling or a pointer-based method, wherein the identification result of most candidates is not a certain entity, the number of the entities is far less than that of the non-entity candidate results, and the problem of sample unbalance exists, and the phenomenon of category unbalance in the multi-classification problem or the multi-label classification problem is not considered based on Binary cross entropy Loss (BCE Loss) or cross entropy Loss at present. The existing nested named entity recognition method uses cross entropy loss during training as described in the document with patent publication number CN112989835a to minimize the difference between the predicted distribution and the standard distribution, so that the problem of category imbalance is not concerned.

Therefore, it is very important to design a method and a system for identifying nested and overlapped risk points based on multi-label classification, which can automatically identify the risk points in the contract document and help practitioners in various industries to perform risk point review.

Disclosure of Invention

The invention provides a method and a system for identifying nested and overlapped risk points based on multi-label classification, which can automatically identify the risk points in a contract document and help practitioners in various industries to carry out risk point audit, in order to solve the problems that a great number of nested and overlapped scenes of the risk points in the conventional contract document and a great number of unbalanced categories of the contract document exist in the prior art.

In order to achieve the purpose, the invention adopts the following technical scheme:

a method for identifying nested and overlapping risk points based on multi-label classification, comprising the steps of;

s1, after segmenting a contract document labeled with risk points and keywords into sentences, inputting the segmented contract document into a BERT model for pre-training to obtain a contract pre-training BERT model;

s2, segmenting a sentence of the contract document to be identified, inputting the sentence into a contract pre-training BERT model to extract word representation and bottom layer characteristics, and simultaneously obtaining position information and label information of a sentence sequence;

s3, merging the position information and the label information of the sentence sequence into a bidirectional long-short term memory network (BilSTM), obtaining the characteristics of the merged position and label information and performing characteristic compression;

and S4, inputting the features compressed in the step S3 into a double affine network, performing range enumeration and entity label classification, and performing parameter learning through a label matrix and an introduced ASL loss function.

Preferably, the pre-training in step S1 includes the following steps:

s11, shielding risk points and keywords in the contract in a random shielding mode, and predicting the shielded risk points and keywords by using an unmasked context through a BERT model;

and S12, randomly putting the two sentences together, and judging whether the two sentences are in the same paragraph or not through a BERT model.

Preferably, step S2 includes the steps of:

s21, representing the entity labels of the input sentences by adopting a three-dimensional tensor; the three-dimensional tensor expression is set as any label type

The initial position is

Of a matrix

Setting as 1;

s22, the sequence of the input sentence is represented by characters; input sentence

In which

Is a word in a sentence;

s23, inputting the character sequence of the sentence into a contract pre-training BERT model obtained by pre-training; obtaining a vector representation of a sentence as shown in equation (1)

Wherein

Pre-training the output of the last hidden layer in the BERT model for the contract, wherein n is the length of a sentence;

（1）。

preferably, step S3 includes the steps of:

s31, using the BilSTM to extract features, and fusing position information and label information of the sentence sequence into the BilSTM; setting the position vector of the contract document in the contract pre-training BERT model as

Initializing a vector matrix of class information as

Wherein m is the number of categories; derived from two weight matrices

And

is marked as

And

and obtaining the fusion information of the category and the position by using the formula (2)

；

（2）；

The formula of the BilSTM feature extraction is formula (3), wherein X is a sentence vector output by the contract pre-training BERT model,

a feature corresponding to each token; using fused information pairs of derived categories and locations

Weighting is carried out, as shown in formula (4), and finally the obtained product

For fusing categories and positions token characteristics of the information;

（3）

（4）；

s32, inputting the token features fused with the category and the position information into two feed-forward neural networks FFNN for feature compression, wherein the specific process is shown as a formula (5) and a formula (6):

(5)

(6)

wherein,

,

are respectively candidate entity ranges

The start and end positions of the optical fiber,

and

is the sentence in the position

And

fusing token characteristics of category and position information,

，

are respectively as

，

Entity after feature compression

Cephalad and caudal features.

Preferably, step S4 includes the steps of:

s41, inputting the compressed features into a bidirectional affine network classifier for classification, as shown in the following formula (7)

And

the characteristics of the characters of the beginning and end of the ith entity respectively;

（7）

wherein

A tensor of (NUM _ DIMENSION, NUM _ LABEL, NUM _ DIMENSION), NUM _ DIMENSION being a DIMENSION after feature compression, NUM _ LABEL being a number of entity classes;

is composed of

The tensor of (a);

for offset, the dimension is (

Tensor of 1);

is a classification score of entity i;

s42, classifying the result

And performing loss calculation on the input label in the step S21, and for any entity i, when the entity type is the same

Is taken as

When the temperature of the water is higher than the set temperature,

represent the lifeThe name entity type is

The probability of (a) of (b) being,

is entity i in type

The classification score of (a) is calculated,

has a value range of [1,C]C is the number of categories NUM _ LABEL; when in use

When the probability is not an entity, the probability after the deviation is calculated by adopting a probability driving formula (10) calculated by adopting a formula (8), wherein p is a result calculated by adopting the formula (8)

m is a set hyperparameter, is a shift parameter of the probability, and will

The following formula of belt-type (11)

(ii) a If it is an entity, directly substitute the formula (11)

In a

In the calculation formula (2), p is the calculation result of the formula (8)

；

And

respectively positive and negative attention parameters, representing the probability deviation width as a set hyper-parameter, and finally calculating the ASL loss value in the training stage according to a formula (10) and a formula (11);

（8）

（10）

（11）

and carrying out iterative tuning on the weights of all layers in the neural network model by utilizing a back propagation algorithm according to the calculated ASL loss value.

The invention also provides a system for identifying nested and overlapping risk points based on multi-label classification, comprising;

the pre-training module is used for segmenting the contract document labeled with the risk points and the keywords into sentences and inputting the sentence into the BERT model for pre-training to obtain a contract pre-training BERT model;

the system comprises a feature extraction module, a sentence segmentation module, a contract pre-training BERT module and a recognition module, wherein the feature extraction module is used for segmenting a sentence of a contract document to be recognized, inputting the sentence into the contract pre-training BERT module to extract word representation and bottom layer features, and simultaneously obtaining position information and label information of a sentence sequence;

the characteristic fusion module is used for fusing the position information and the label information of the sentence sequence into a bidirectional long-short term memory network (BilSTM) to obtain the characteristics fused with the position and the label information and carry out characteristic compression;

and the classification module is used for inputting the compressed features into the double affine network, performing range enumeration and entity label classification, and performing parameter learning through a label matrix and an introduced ASL loss function.

Preferably, the pre-training module is specifically as follows:

blocking the risk points and the keywords in the contract in a random covering mode, and predicting the blocked risk points and the blocked keywords by using the context which is not covered by a BERT model;

randomly putting two sentences together, and judging whether the two sentences are in the same paragraph or not through a BERT model.

Compared with the prior art, the invention has the beneficial effects that: (1) The invention is based on BERT (bidirectional coding representation conversion algorithm), adopts a new pre-training method, leads the model to be pre-trained in a mode of predicting risk points in the contract through context, and changes the commonly used pre-training method of adjacent sentences into the training mode of the same paragraph according to the characteristics of the contract text; (2) Aiming at the problem of entity nesting and overlapping existing in a named entity recognition task of a contract document, a pointer-based mode is adopted, a bidirectional affine network is used as a module for entity range enumeration and entity classification, a matrix containing a label type and an entity initial position subscript is used instead of a label sequence to represent the label type of training data, the condition that a BIO sequence cannot represent a nested entity and two head-tail pointer sequences cannot represent an overlapped entity is avoided, the representation is simple and visual, labels of the nested and overlapped entities can be directly represented, particularly labels of different entities with completely same head-tail positions are represented and input into a double affine network by utilizing token fusion of positions and labels, interactive information of the head and tail of the entities can be well captured, and the problem of entity nesting and overlapping is solved; (3) Aiming at the problem of class unbalance, the invention uses asymmetric loss (ASL loss), which is an improvement of Focal loss (Focal loss) on the multi-label classification problem, and solves the problem of unbalance of positive and negative samples in the multi-label classification task; (4) Aiming at the condition that the entity is too long, the invention fuses the position information and the label information of the sequence into the BilSTM, so that the BilSTM has stronger sequence modeling capability, thereby reducing the condition of entity fracture.

Drawings

FIG. 1 is a schematic diagram of a method for identifying nested and overlapping risk points based on multi-label classification in accordance with the present invention;

fig. 2 is a schematic diagram of an actual application of the method for identifying nested and overlapped risk points based on multi-label classification according to the embodiment of the present invention.

Detailed Description

In order to more clearly illustrate the embodiments of the present invention, the following description will explain the embodiments of the present invention with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, other drawings and embodiments can be derived from them without inventive effort.

Example 1:

the method for identifying nested and overlapping risk points based on multi-label classification as shown in fig. 1 comprises the following steps;

s1, segmenting a contract document marked with risk points and keywords into sentences, and inputting the sentences into a BERT model for pre-training to obtain a contract pre-training BERT model;

s2, sentence segmentation is carried out on the contract document needing to be recognized, the contract document is input into a contract pre-training BERT model, word expression and bottom layer characteristics are extracted, and meanwhile position information and label information of a sentence sequence are obtained;

s3, merging the position information and the label information of the sentence sequence into a bidirectional long-short term memory network BilSTM to obtain the characteristics of the merged position and label information and performing characteristic compression;

and S4, inputting the features compressed in the step S3 into a double affine network, performing range enumeration and entity label classification, and performing parameter learning by a label matrix and introducing an ASL loss function.

Further, the pre-training in step S1 includes the following steps:

s11, blocking the risk points and the keywords in the contract in a random covering mode, and predicting the blocked risk points and the blocked keywords by using the unmasked context through a BERT model;

Through the pre-training mode, the pre-trained BERT can more effectively acquire the characteristics of the contract text, and a better effect is generated on downstream tasks.

Further, step S2 includes the steps of:

The initial position is

Of a matrix

Setting as 1;

the input of the model training stage comprises two parts, namely a text needing document named entity extraction and a position and an entity type of a named entity in the text based on pointer labeling. The entity LABELs of the data in the data loading phase passing through the model are converted into tensor L of (NUM _ LABEL, NUM _ SEQ _ LEN, NUM _ SEQ _ LEN), wherein NUM _ LABEL is the category number of the entity, and NUM _ SEQ _ LEN is the maximum length of the input text. L is a three-dimensional tensor, the first dimension being a categorical category and the second, three dimensions being all possible combinations of a starting location to an ending location of an entity.

Wherein

Is a word in a sentence;

Wherein

（1）。

further, step S3 includes the steps of:

The vector matrix of the initialized category information is

Wherein m is the number of categories; derived from two weight matrices

And

is marked as

And

；

（2）；

Weighting is carried out, as shown in formula (4), and the obtained product is

A token feature fusing category and position information;

（3）

（4）；

s32, inputting token features fused with the category and the position information into two feed-forward neural networks FFNN for feature compression, wherein the specific process is shown as a formula (5) and a formula (6):

(5)

(6)

wherein,

,

are respectively candidate entity ranges

The start and end positions of the air flow,

and

is the sentence in the position

And

a token feature that fuses category and location information,

are respectively as

，

Entities after feature compression

Cephalad and caudal features.

Further, step S4 includes the steps of:

s41, inputting the compressed features into a bidirectional affine network classifier for classification, wherein the classification is shown in the following formula (7)

And

（7）

wherein

is composed of

The tensor of (a);

for offset, the dimension is (

Tensor of 1);

is a classification score of entity i;

s42, classifying the results

Is taken as

When the temperature of the water is higher than the set temperature,

represents a named entity type of

The probability of (a) of (b) being,

is entity i in type

The classification score of (a) is determined,

m is a set hyperparameter, is a shift parameter of the probability, and will

The following formula of belt-type (11)

(ii) a If it is a solid, it is directly substituted for the formula (11) above

In the calculation formula (2), p is the calculation result of the formula (8)

；

And

（8）

（10）

（11）

The invention also provides a system for identifying nested and overlapped risk points based on multi-label classification, which comprises the following steps of;

The pre-training module is specifically as follows:

blocking the risk points and the keywords in the contract in a random covering mode, and predicting the blocked risk points and the keywords by using an unmasked context through a BERT model;

two sentences are randomly put together, and whether the two sentences are in the same paragraph is judged through a BERT model.

Based on the technical scheme of the invention, the steps of the specific implementation and operation process are as follows:

pre-training:

1. and (4) masking the risk points and the keywords in the contract text, and sending the masked risk points and the keywords into a BERT model, wherein the BERT predicts the masked risk points and the keywords according to the unmasked context.

2. The sentences in the contract are randomly combined into sentence pairs and sent into a BERT model, and the BERT predicts whether the two sentences are in the same paragraph (same paragraph).

And (4) supervision training:

the training phase is shown in fig. 2 and has the following steps:

3. and dividing documents to be subjected to entity extraction and input into the system into a training set document set and a verification set.

4. Each document needs to contain two parts, one part is the text of the document, and the other part is the entity position and the label corresponding to the text. And the entity set of the document is subdivided according to the segmentation position of the sentence to obtain a list consisting of the sentence and the entities of the sentence.

5. Each sentence of the training set

And sentence entity

Respectively input into BERT to obtain vector representation of sentence token

；

6. Token vector for final output of BERT

Inputting into BilSTM to extract context information of sentence

Then, the position information and the context information are fused according to the formula (4) to obtain the context expression of the fused position information

；

7. Respectively inputting the sentence context information extracted by the BilSTM into two FFNN feed-forward neural networks with 1 layer, applying the formulas (5) and (6) to obtain the characteristics of the entity head and tail of the sentence,

and

；

8. inputting the characteristics of the head and the tail of the entity of the sentence into a bidirectional affine network, and applying the formula (7) and the formula (8) to output the range enumeration result of the entity and the probability corresponding to the range

；

9. Calculating an ASL loss value of the affine-two network output probability by applying the formula (10) and the formula (11) according to an ASL loss function;

10. adjusting the weight of each layer of the neural network model according to the ASL loss value;

11. after sentences of the training set are trained for one round, calculating the accuracy of the current model parameter prediction verification set, and storing the model weight;

12. repeating the steps 1 to 9 until the preset epoch value is reached;

13. the value stored in the epoch and having the maximum accuracy in the verification set is the learned optimal model.

In addition, the prediction phase of the model comprises the following steps:

1. and (3) segmenting the text of the document by using periods to obtain a sentence list according to the entity positions and labels which are not required to be contained in the document to be predicted.

2. Inputting each sentence to be predicted into BERT to obtain vector representation of the sentence token;

3. inputting the token vector finally output by the BERT into the BilSTM to extract the context information of the sentence;

4. then fusing the position information and the context information according to a formula (4) to obtain a context expression of the fused position information;

5. respectively inputting the context expression of the fusion position information into two FFNN feedforward neural networks of 1 layer to obtain the characteristics of the entity head and the tail of the sentence;

6. inputting the characteristics of the head and the tail of the entity of the sentence into a bidirectional affine network, and outputting a range enumeration result and the probability corresponding to the range of the entity;

7. outputting entities with the probability larger than a set threshold value in the enumeration range result, namely the entities extracted from the sentences;

8. combining the extraction results of all sentences in the document to serve as an entity extraction result of the document;

the entity extraction model is trained from the training data set through the training stage, and the entity contained in the document is predicted through the prediction result after the document to be extracted is input, so that the training and prediction process of a complete entity extraction model is completed.

The invention adopts a method based on a bidirectional affine network to identify entities in contract documents to solve the problems of entity overlapping and entity nesting in the task of extracting the entities in the contract documents, improves the universality of the algorithm by adopting a mode of cutting documents by periods, can be simply popularized in the task of extracting other scenes, adopts BERT to carry out token representation and BilSTM to carry out feature extraction, adopts a label matrix and the bidirectional affine network to carry out entity range enumeration and entity category prediction, can solve the problem of entity nesting in the contract document extraction, and adopts ALS loss to optimize network parameters to solve the problem of complete overlapping of the initial positions of the entities.

The invention discloses a pre-training method based on the characteristics of contract texts, which is characterized in that pre-training is carried out by a training method which enables a model to predict hidden risk points and keywords and predict whether two sentences are in the same paragraph, and BERT pre-trained according to the method has more contract text information, can better acquire the contract text information during training prediction, and greatly reduces the requirements of training resources.

The invention creatively provides a method for enumerating and classifying entity ranges in contract documents based on a bidirectional affine network, which is different from a method for enumerating the entity ranges and then classifying the entities.

The invention creatively blends the position information and the category information of the bottom layer into the middle layer BilSTM network, so that the capacity of capturing the long text information by the middle layer BilSTM network is enhanced, the information characteristics of the risk points can be better extracted, and the possibility of entity fracture is greatly reduced.

The invention innovatively designs the entity range in the contract document expressed by using the label matrix, combines the multi-label classification loss function ASL loss, is suitable for the problem of network parameter optimization under the condition of entity nesting and overlapping in the contract document, solves the problem of category imbalance in range-based named entity extraction, and can be popularized to other scenes.

In conclusion, the method has the characteristics of reducing manual design, having strong universality and solving the problems of entity nesting and overlapping in entity extraction in the contract document.

The foregoing has outlined, rather broadly, the preferred embodiment and principles of the present invention in order that those skilled in the art may better understand the detailed description of the invention without departing from its broader aspects.

Claims

1. A method for identifying nested and overlapping risk points based on multi-label classification, comprising the steps of;

2. The method for identifying nested and overlapping risk points based on multi-label classification as claimed in claim 1, wherein the pre-training in step S1 comprises the steps of:

3. The method for identifying nested and overlapping risk points based on multi-label classification as claimed in claim 2, wherein step S2 comprises the steps of:

s21, representing the entity labels of the input sentences by adopting a three-dimensional tensor; the three-dimensional tensor is expressed asSet for any one of the label categories as

The initial position is

Of a matrix

Setting as 1;

Wherein

Is a word in a sentence;

Wherein

（1）。

4. the method for identifying nested and overlapping risk points based on multi-label classification as claimed in claim 3, wherein step S3 comprises the steps of:

s31, using BilSTM to extract features, and extracting position information and label information of sentence sequenceMelting the information into the BilSTM; setting the position vector of the contract document in the contract pre-training BERT model as

The vector matrix of the initialized category information is

Wherein m is the number of categories; derived from two weight matrices

And

is marked as

And

；

（2）；

a feature corresponding to each token; using fused pairs of information derived for category and location

Weighting is carried out as shown in equation (4) and finallyObtained

A token feature fusing category and position information;

（3）

（4）；

(5)

(6)

wherein,

,

are respectively candidate entity ranges

The start and end positions of the air flow,

and

is the sentence in the position

And

fusing token characteristics of category and position information,

，

are respectively as

，

Entity after feature compression

Cephalad and caudal features.

5. The method for identifying nested and overlapping risk points based on multi-label classification as claimed in claim 4, wherein step S4 comprises the steps of:

And features of characters that are the beginning and end, respectively, of the ith entity;

（7）

wherein

(NUM _ DIMENSION, NUM _ LABEL, NUM _ DIMENSION) tensor, NUM _ DIMENSION being the DIMENSION after feature compression, NUM _ LABELIs the number of entity classes;

is composed of

The tensor of (a);

for offset, the dimension is (

Tensor of 1);

is a classification score of entity i;

s42, classifying the result

Is taken as

When the temperature of the water is higher than the set temperature,

represents a named entity type of

The probability of (a) of (b) being,

is entity i in type

The classification score of (a) is calculated,

m is a set hyperparameter, is a shift parameter of the probability, and will

The following formula of belt-type (11)

(ii) a If it is an entity, directly substitute the formula (11)

In a

In the calculation formula (2), p is the calculation result of the formula (8)

；

And

respectively positive and negative attention parameters, and set hyperparameters representing probability deviationWidth, and finally calculating an ASL loss value in a training stage according to a formula (10) and a formula (11);

（8）

（10）

（11）

6. A system for identifying nested and overlapping risk points based on multi-label classification, comprising;

and the classification module is used for inputting the compressed features into a double affine network, performing range enumeration and entity label classification, and performing parameter learning by a label matrix and introducing an ASL loss function.

7. The system for identifying nested and overlapping risk points based on multi-label classification as claimed in claim 6, wherein the pre-training module is specifically as follows: