CN109918635A - Contract text risk detection method, apparatus, device and storage medium - Google Patents
- Publication number
- CN109918635A (application number CN201711320389.8A)
- Authority
- CN
- China
- Prior art keywords
- clause
- text
- risk
- contract
- type
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a contract text risk detection method, apparatus, device and storage medium, relating to fields such as natural language processing and semantic computation. The method comprises: obtaining, according to the business domain to which a contract text to be detected belongs, the clause classification model corresponding to that business domain; classifying the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types; and performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type. Using a classification module and a deep semantic matching model trained on a large number of contract texts, embodiments of the invention detect risk in the contract text to be detected and issue risk reminders to the client, without extracting specific words or phrases and without manually setting rules, which improves the accuracy of contract text parsing and risk judgment.
Description
Technical field
The present invention relates to fields such as natural language processing and semantic computation, and in particular to a contract text risk detection method, apparatus, device and storage medium.
Background
In recent years, with the standardization of contract texts and advances in natural language processing, some companies have begun to parse commercial contracts automatically using natural language technology. Part of the basic work on a commercial contract can be preprocessed by machine, which reduces manual effort and improves efficiency.
The prior art mainly selects specific words or phrases as features, manually or semi-automatically, and then parses the contract text with preset rules or machine learning algorithms to complete the parsing and evaluation of the document.
Such schemes require feature words to be extracted manually and legal documents to be checked against preset rules, or compute the similarity between two legal documents from the similarity of their keywords.
Because Chinese expressions are highly varied, these schemes cannot parse contract texts or judge their risk accurately.
Summary of the invention
Embodiments of the present invention provide a contract text risk detection method, apparatus, device and storage medium that address the problem of inaccurate contract text parsing and risk judgment.
A contract text risk detection method provided according to an embodiment of the present invention comprises:
obtaining, according to the business domain to which the contract text to be detected belongs, the clause classification model corresponding to that business domain;
classifying the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types;
performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
Preferably, the method further comprises:
before obtaining the clause classification model corresponding to the business domain of the contract text to be detected, constructing a clause classification model for classifying the clauses of contract texts;
training the constructed clause classification model with training contract texts of the business domain, to obtain a performance-optimized clause classification model.
Preferably, training the constructed clause classification model with the training contract texts of the business domain to obtain the performance-optimized clause classification model comprises:
classifying the clauses of the training contract text, to obtain the clause texts of the training contract text and their corresponding clause types;
performing word segmentation on the clause texts of the training contract text, to obtain the words that make up those clause texts;
adjusting the parameters of the clause classification model using the word vectors of the words and the corresponding clause types, to obtain the performance-optimized clause classification model.
Preferably, the method further comprises:
after classifying the clauses of the contract text using the clause classification model, determining that the contract text is complete if every preset clause type has a corresponding clause text.
Preferably, performing risk assessment on the clause text of each clause type to determine its degree of risk comprises:
using a semantic matching model to compare the clause text of each clause type with the clause samples of that clause type, to obtain a clause text similarity;
performing risk assessment on the contract text according to the clause text similarity and a preset risk threshold, to obtain the degree of risk of the clause text of each clause type.
Preferably, using the semantic matching model to compare the clause text of each clause type with the clause samples of that clause type to obtain the clause text similarity comprises:
obtaining multiple clause samples corresponding to the clause type from a sample database;
using the semantic matching model to compare the word vectors of the words making up the clause text with the word vectors of the words making up each clause sample, to obtain the similarity between the clause text and each clause sample, and taking the maximum similarity as the clause text similarity of the clause type.
Preferably, the method further comprises:
after determining the degree of risk of the clause text of each clause type, saving the clause text of each clause type to the sample database as a new sample;
updating the clause classification module and the semantic matching model using the new samples in the sample database.
A contract text risk detection apparatus provided according to an embodiment of the present invention comprises:
a model obtaining module, for obtaining, according to the business domain to which the contract text to be detected belongs, the clause classification model corresponding to that business domain;
a clause classification module, for classifying the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types;
a risk assessment module, for performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
A contract text risk detection device provided according to an embodiment of the present invention comprises a processor and a memory coupled to the processor. The memory stores a contract text risk detection program that can run on the processor, and the program, when executed by the processor, implements the steps of the contract text risk detection method described above.
A storage medium provided according to an embodiment of the present invention stores a contract text risk detection program which, when executed by a processor, implements the steps of the contract text risk detection method described above.
The technical solution provided by embodiments of the present invention has the following beneficial effects:
Using a classification module and a deep semantic matching model trained on a large number of contract texts, embodiments of the invention detect risk in the contract text to be detected and issue risk reminders to the client, without extracting specific words or phrases and without manually setting rules, which improves the accuracy of contract text parsing and risk judgment.
Brief description of the drawings
Fig. 1 is a flow chart of contract text risk detection provided by an embodiment of the present invention;
Fig. 2 is a block diagram of the contract text risk detection apparatus provided by an embodiment of the present invention;
Fig. 3 is an architecture diagram of the contract text risk detection system provided by an embodiment of the present invention;
Fig. 4 is a flow chart of the completeness detection module provided by an embodiment of the present invention;
Fig. 5 is a flow chart of the risk detection module provided by an embodiment of the present invention;
Fig. 6 is a flow chart of the self-learning module provided by an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in detail below with reference to the drawings. It should be understood that the preferred embodiments described below serve only to illustrate and explain the invention and are not intended to limit it.
Fig. 1 is a flow chart of contract text risk detection provided by an embodiment of the present invention. As shown in Fig. 1, the steps include:
Step S101: according to the business domain to which the contract text to be detected belongs, obtain the clause classification model corresponding to that business domain.
Before step S101, the method further includes: constructing a clause classification model for classifying the clauses of contract texts, and training the constructed model with training contract texts of the business domain to obtain a performance-optimized clause classification model. During training, the clauses of the training contract text are classified to obtain its clause texts and their corresponding clause types; word segmentation is performed on those clause texts to obtain the words that make them up; and the parameters of the clause classification model are adjusted using the word vectors of the words and the corresponding clause types, yielding the performance-optimized clause classification model.
Step S102: classify the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types.
The classification results of step S102 can be used to check the completeness of the contract text. Specifically, if every preset clause type has a corresponding clause text, the contract text is determined to be complete.
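The completeness check described above amounts to a set comparison between the preset clause types and the clause types the classifier actually found. The following is a minimal sketch under stated assumptions: the function name `check_completeness`, the dictionary layout of the classification result, and the clause type labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def check_completeness(classified: Dict[str, List[str]],
                       required_types: Set[str]) -> Set[str]:
    """Return the preset clause types that have no matching clause text."""
    return {t for t in required_types if not classified.get(t)}


# Hypothetical preset clause types for one business domain.
REQUIRED = {"party_information", "property_attributes",
            "payment_method", "dispute_resolution"}

# Hypothetical output of the clause classification step: type -> clause texts.
classified_clauses = {
    "party_information": ["Party A: ...; Party B: ..."],
    "property_attributes": ["The property has a floor area of 90 square metres."],
    "payment_method": ["Party B pays in two instalments."],
}

missing = check_completeness(classified_clauses, REQUIRED)
is_complete = not missing  # complete only if every preset type is matched
```

Returning the missing types rather than a bare boolean lets a report name the clause types that the contract lacks.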
Step S103: perform risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
Step S103 includes: using a semantic matching model to compare the clause text of each clause type with the clause samples of that clause type, to obtain a clause text similarity; and performing risk assessment on the contract text according to the clause text similarity and a preset risk threshold, to obtain the degree of risk of the clause text of each clause type. More specifically, multiple clause samples corresponding to the clause type can be obtained from the sample database. The semantic matching model then compares the word vectors of the words making up the clause text with the word vectors of the words making up each clause sample, giving the similarity between the clause text and each clause sample, and the maximum similarity is taken as the clause text similarity of the clause type.
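To make the "maximum similarity over clause samples" step concrete, here is a small sketch. It is not the trained semantic matching model: as a stand-in assumption, each sentence is represented by the average of its word vectors and compared by cosine similarity. The names `sentence_vector`, `cosine` and `clause_similarity` and the toy embeddings are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def sentence_vector(words: List[str], embeddings: Dict[str, Vector]) -> Vector:
    """Average the vectors of known words (stand-in for a learned encoder)."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        raise ValueError("no known words in clause")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def clause_similarity(clause_words: List[str],
                      samples: List[List[str]],
                      embeddings: Dict[str, Vector]) -> float:
    """Compare the clause with every sample and keep the maximum similarity."""
    target = sentence_vector(clause_words, embeddings)
    return max(cosine(target, sentence_vector(s, embeddings)) for s in samples)


# Toy 2-dimensional embeddings, for illustration only.
toy_embeddings = {"pay": [1.0, 0.0], "deliver": [0.0, 1.0]}
best = clause_similarity(["pay"], [["pay"], ["deliver"]], toy_embeddings)
```

In a real system the averaged vectors would be replaced by the output of the trained matching model; only the final `max` over samples is taken directly from the description.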
After step S103, the method further includes: saving the clause text of each clause type to the sample database as a new sample, so that the clause classification module and the semantic matching model can be updated using the new samples in the sample database.
Further, before the samples are saved, a risk report can first be generated from the result of step S103 and sent to the client, so that a legal professional on the client side can review it. The samples are then saved according to the review result confirmed by the client.
During manual review, the reviewer confirms whether the clause text of the contract text really has a high or a low degree of risk. If the clause text is confirmed to be high-risk or to carry risk, it can be stored in the sample database as a negative sample of its clause type. If the clause text is confirmed to be low-risk or free of risk, it can be stored in the sample database as a positive sample of its clause type. In general, the clause samples selected in step S103 may be positive samples or negative samples. When positive samples are selected, the higher the similarity between the clause text of the contract and the positive samples, the lower the risk. Conversely, when negative samples are selected, the higher the similarity between the clause text of the contract and the negative samples, the higher the risk.
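The way the interpretation of similarity flips between positive and negative samples can be sketched as follows. The positive-sample rule (similarity above the threshold means low risk) follows the description; applying the mirrored threshold rule to negative samples is an assumption made here for illustration, as is the function name `risk_level`.

```python
def risk_level(similarity: float, threshold: float,
               sample_is_positive: bool) -> str:
    """Map a clause text similarity to a coarse risk level.

    Positive samples: high similarity to a known-good clause means low risk.
    Negative samples: high similarity to a known-risky clause means high risk
    (assumed mirror of the positive-sample rule).
    """
    above = similarity > threshold
    if sample_is_positive:
        return "low" if above else "high"
    return "high" if above else "low"


# With a threshold of 0.9 and positive samples, 0.932 is treated as low risk.
example = risk_level(0.932, 0.9, sample_is_positive=True)
```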
Embodiments of the present invention use big data analysis to detect risk in the content of a drafted contract text. Specifically, the content of a drafted contract text in a given business domain can be checked to determine whether the contract is complete and to detect common loopholes, while the display and verification module presents the results and issues reminders, thereby improving the enforceability of the contract.
Those of ordinary skill in the art will appreciate that the method of the above embodiments can be carried out by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium. Further, the present invention can also provide a storage medium storing a contract text risk detection program which, when executed by a processor, implements the steps of the contract text risk detection method described above. The storage medium may include ROM/RAM, a magnetic disk, an optical disc, or a USB flash drive.
Fig. 2 is a block diagram of the contract text risk detection apparatus provided by an embodiment of the present invention. As shown in Fig. 2, it comprises:
Model obtaining module 21, for obtaining, according to the business domain to which the contract text to be detected belongs, the clause classification model corresponding to that business domain. The clause classification model is generated in advance: a clause classification model for classifying the clauses of contract texts is first constructed, and the constructed model is then trained with training contract texts of the business domain to obtain a performance-optimized clause classification model.
Clause classification module 22, for classifying the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types. After classification, the completeness of the contract text is checked: if every preset clause type has a corresponding clause text, the contract text is determined to be complete.
Risk assessment module 23, for performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type. Specifically, a semantic matching model is used to compare the clause text of each clause type with the clause samples of that clause type to obtain a clause text similarity, and risk assessment is performed on the contract text according to the clause text similarity and a preset risk threshold, giving the degree of risk of the clause text of each clause type. After the degree of risk has been obtained, the clause text of each clause type can also be used as a new sample to optimize the clause classification model and the semantic matching model.
An embodiment of the present invention provides a contract text risk detection device comprising a processor and a memory coupled to the processor. The memory stores a contract text risk detection program that can run on the processor, and the program, when executed by the processor, implements the steps of the contract text risk detection method described above.
Fig. 3 is an architecture diagram of the contract text risk detection system provided by an embodiment of the present invention. The embodiment provides a method and system for detecting content risk in domain-specific contract texts, which determines whether a contract text is complete, detects common loopholes, and issues the corresponding risk reminders through a display module. The system comprises:
Text collection module 31, for collecting the text of legal documents. The system supports paper documents as well as files imported from an external storage device.
Text preprocessing module 32, for processing the collected text: format normalization, conversion between traditional and simplified characters, case conversion, symbol removal, and merging and splitting of words.
Completeness detection module 33, for detecting whether the clauses of the legal document are complete for its domain.
Risk detection module 34, for detecting whether a specific clause poses a risk of infringing a party's interests, and for computing its degree of risk.
Display and verification module 35, for displaying the risk report and for manual assessment of the risk confidence.
Sample database 36, mainly for storing the original training data and the manually verified data, which are used for the automatic update learning of the completeness detection model (or clause classification model) and the risk detection model (or semantic matching model).
The system further comprises:
a self-learning module (not shown in the figure), for retraining the original models so that they generalize better.
The system is in essence a legal document detection system based on server-side big data analysis and assisted by client-side review. It works as a pipeline. Text collection module 31 collects legal texts: paper legal documents are captured with the system's built-in OCR (Optical Character Recognition) text scanning device, and documents held on a storage device are read directly. Text preprocessing module 32 preprocesses the collected text, including format normalization, conversion between traditional and simplified characters, case conversion, removal of non-semantic symbols, and merging and splitting of words. Completeness detection module 33 detects whether the input domain contract text is complete, that is, whether the contract lacks clauses that are necessary in the domain. Risk detection module 34 computes the similarity between the clause text under the current category and the texts to be compared, and assesses risk according to a preset threshold. Display and verification module 35, besides displaying the risk report, also supports manual checking and verification, and automatically uploads the verification results to sample database 36.
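Some of the preprocessing operations named above can be sketched with the Python standard library. This is an illustrative assumption, not the patent's implementation: the function name `preprocess_clause` and the exact character whitelist are hypothetical, and conversion between traditional and simplified characters is left out because it needs a mapping table or a dedicated conversion library.

```python
import re
import unicodedata

# Characters kept after cleaning: word characters (including CJK), whitespace
# and a small set of punctuation. Everything else is treated as non-semantic.
_NON_SEMANTIC = re.compile(r"[^\w\s,.;:?!()%\u3001\u3002]")
_WHITESPACE = re.compile(r"\s+")


def preprocess_clause(text: str) -> str:
    """Normalize format and case, then strip non-semantic symbols."""
    # Format normalization: fold full-width letters, digits and spaces.
    text = unicodedata.normalize("NFKC", text)
    # Case conversion.
    text = text.lower()
    # Symbol removal.
    text = _NON_SEMANTIC.sub("", text)
    # Collapse runs of whitespace left behind by the removals.
    return _WHITESPACE.sub(" ", text).strip()


cleaned = preprocess_clause("  Party\u3000\uff22 agrees \u2605 to BUY.  ")
```

Here the ideographic space and the full-width letter are folded by NFKC normalization, and the decorative star is dropped by the symbol filter.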
The method includes two parts:
1. Server side
Preprocessing module 32 receives the text collected by the client, preprocesses it, and forwards the processed contract text to completeness detection module 33. That module classifies the clauses of the contract text using a model, where the categories are preset for each business domain. If every basic category of the domain is matched by a clause text, the current contract text is considered to satisfy the completeness requirement for contracts in that domain; otherwise it does not. All clauses and their category labels are then cached and forwarded to risk detection module 34. Risk detection module 34 uses the semantic similarity model to compute the similarity between each clause and the corresponding clause texts under its category, judges whether risk exists according to a preset risk threshold, generates a risk assessment report, and pushes it to display and verification module 35 on the client. After manual verification, the text is forwarded to the self-learning module, which performs a self-learning update with the new samples.
2. Client side
Text collection module 31 collects text in one of two ways: paper legal documents are captured with the OCR text scanning device, and text files held on a storage device are read directly. The collected legal document text is then forwarded to preprocessing module 32 on the server side.
Display and verification module 35 receives the report generated by risk detection module 34 and shows it to the parties and other relevant personnel. It also passes the users' feedback to the self-learning module for updating the models.
Embodiments of the present invention use deep learning, trained on a large number of contract texts from a given domain, to parse and check the content of contract texts. Commercial contract texts from common specific domains can be parsed and checked, for example house sale contracts. A house sale contract generally contains the following clause content: the personal information of the buyer and the seller, the attributes of the house, the method of performance, liability for breach of contract, dispute resolution, and so on. Based on this clause content, a classification model is trained with deep learning and each clause is assigned to its clause category, which makes it possible to assess the completeness of the clause categories in a house sale contract. On that basis, a deep semantic matching model computes the similarity between the contract clause text to be detected and the related clause texts of its category, and the similarity is compared with a preset risk threshold to judge whether the clause is risky. Finally, the system's self-learning module updates the models automatically. As the system accumulates more samples, it becomes adaptive and needs less manual intervention. The embodiment is therefore a practical, adaptive system for parsing and checking contract text content that meets the needs of engineering applications.
Fig. 4 is a flow chart of completeness detection module 33 provided by an embodiment of the present invention. As shown in Fig. 4, the process is as follows: when completeness detection module 33 receives the preprocessed text, it classifies each clause text with the trained classification model and, from the classification result, judges whether every category is matched by a clause text. If so, the contract text for the current domain is considered complete; otherwise it is incomplete. The clause texts and their categories are cached and forwarded to risk detection module 34 for further processing.
Word vector training:
Word vectors are a required input to the model, so they need to be trained in advance; the word vectors of a domain can be reused. Word vectors can be generated automatically with an open-source word vector training tool and capture deeper semantics of words.
First, the input contract text is preprocessed, mainly by removing illegal characters, replacing digits, replacing numeric names that carry no significant meaning, and so on, with each clause placed on its own line.
Second, the sentences are segmented into words separated by spaces, with each clause on its own line. Segmentation can be done with open-source word segmentation tools such as ansj, jieba, or HIT LTP (Harbin Institute of Technology).
Finally, suitable parameters are selected to train the word vector model. In general, 100-dimensional word vectors are already enough to express word meaning well in classification tasks. They take the following form:
China=[0.00570705,0.4275226, -0.62307459,0.01425633,0.02571641,
0.85126471,-0.4231756,0.031421404,...0.21345081]
TextCNN is chosen as the algorithm for the classification model, because contract clauses are generally not very long and text classification generally does not need to consider long-sequence semantics. The algorithm is based on deep learning, takes word vectors as input, requires no manual feature extraction, and generalizes well.
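To illustrate what a TextCNN-style classifier computes from word vectors, the sketch below implements only the forward pass in plain Python: each convolution kernel slides over consecutive word windows, a ReLU is applied, the maximum over all positions is kept (max-over-time pooling), and a linear layer with softmax produces class probabilities. Training is omitted, and all names, shapes and weights are hypothetical; a practical model would be built and trained with a deep learning framework.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def conv_max_pool(words: List[Vector], kernel: List[Vector], bias: float) -> float:
    """Slide one kernel over word windows, apply ReLU, keep the maximum."""
    width = len(kernel)
    features = []
    for start in range(len(words) - width + 1):
        window = words[start:start + width]
        activation = sum(dot(w, k) for w, k in zip(window, kernel)) + bias
        features.append(max(0.0, activation))
    # A clause shorter than the kernel yields no window; use 0.0 in that case.
    return max(features) if features else 0.0


def softmax(scores: Vector) -> Vector:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def textcnn_forward(words: List[Vector], kernels: List[List[Vector]],
                    biases: Vector, out_weights: List[Vector],
                    out_bias: Vector) -> Vector:
    """Return class probabilities for one clause given its word vectors."""
    pooled = [conv_max_pool(words, k, b) for k, b in zip(kernels, biases)]
    logits = [dot(row, pooled) + b for row, b in zip(out_weights, out_bias)]
    return softmax(logits)


# Toy example: three 2-dimensional word vectors, one kernel of width 2,
# two output classes. All numbers are made up for illustration.
clause = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]]]
probs = textcnn_forward(clause, kernels, [0.0], [[1.0], [-1.0]], [0.0, 0.0])
```

Because pooling keeps one value per kernel regardless of clause length, clauses of different lengths map to a fixed-size feature vector, which suits short contract clauses.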
Through the vectorized representation of words, the embodiment of the present invention trains models for measuring completeness and similarity.
The specific steps of completeness detection are:
Step S401: prepare the training samples.
1. Specify in advance the clause categories that are required in contracts of the current domain, and assign the contract clause texts of the domain to the corresponding categories, forming a set of text strings for each category.
For example, in the house sale domain, the necessary categories of a contract generally include: information on the buyer and the seller, house attribute information, house transaction information, payment method, method of performance, liability for breach of contract, dispute resolution, and so on. Take the clause "Party B agrees to purchase the property owned by Party A located at Yuhua Street, Yuhua District, Nanjing (villa, office building, apartment, residence, factory building, shop), with a floor area of 90 square metres. (See Land and House Ownership Certificate No. 21070021)": this clause clearly belongs to house attribute information.
2. Automatically preprocess the text obtained in step 1, including: format normalization, conversion between traditional and simplified characters, case conversion, removal of non-semantic symbols, and merging and splitting of words. The result is as follows:
Party B agrees to purchase the property owned by Party A located at Yuhua Street, Yuhua District, Nanjing villa, office building, apartment, residence, factory building, shop, with a floor area of 90 square metres. See Land and House Ownership Certificate No. 21070021
3. Perform word segmentation on the preprocessed text.
For example, the segmentation result is:
Party B/agrees/purchase/Party A/owns//located at//Jiangsu Province/Nanjing/Yuhua District/Yuhua Street/owns//property ...
Step S402: train the classification model.
The input is the word vector of each word in the clause text together with the category label. The TextCNN classification model is built with TensorFlow, then trained and its parameters tuned for the best performance, producing the final classification model.
Step S403: check a newly input contract text with the model. If every required category of the current domain is matched by a text, the current contract text is considered complete; otherwise it is incomplete.
Step S404: save the newly input contract text and its corresponding categories.
Fig. 5 is a flow chart of risk detection module 34 provided by an embodiment of the present invention. As shown in Fig. 5, the process is as follows: risk detection module 34 receives the text passed on by completeness detection module 33 together with its category label and, using a deep text similarity algorithm, computes the similarity between the text and several standard texts of the current category. If the highest similarity is greater than the preset threshold, the clause is considered low-risk; otherwise the clause is considered high-risk, a warning is issued, and the result is shown in the risk report.
Step S501: prepare the training samples.
From the segmented contract clause texts and corresponding category labels passed on by the completeness detection module, a number of clause texts are selected at random from each category, in proportion and without replacement, as the standard sentences of that category.
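Selecting standard sentences per category, in proportion and without replacement, can be sketched with `random.sample`, which draws distinct elements. The function name `pick_standard_sentences`, the default ratio of 0.2 and the rule of keeping at least one sentence per non-empty category are assumptions made for illustration; the description does not fix the proportion.

```python
import random
from typing import Dict, List, Optional


def pick_standard_sentences(clauses_by_type: Dict[str, List[str]],
                            ratio: float = 0.2,
                            seed: Optional[int] = None) -> Dict[str, List[str]]:
    """Draw a proportional random subset of clause texts for each category."""
    rng = random.Random(seed)
    picked = {}
    for clause_type, clauses in clauses_by_type.items():
        # Keep at least one standard sentence for every non-empty category.
        count = max(1, round(len(clauses) * ratio)) if clauses else 0
        # random.sample draws without replacement.
        picked[clause_type] = rng.sample(clauses, count)
    return picked


# Hypothetical segmented clause texts for one category.
pool = {"payment_method": [f"clause {i}" for i in range(10)]}
standards = pick_standard_sentences(pool, ratio=0.2, seed=0)
```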
Step S502 to step S504: a deep semantic matching model (or text similarity model) computes the similarity.
The model here may be trained with an improved DSSM algorithm: a generalized semantic matching model is built on a multilayer neural network, and similarity is computed as cosine similarity. The inputs are the word vectors of the two sentences to be compared; the model is trained and outputs a similarity. For a test sample, its similarity to each of the standard sentences of its category is computed, and the highest similarity is compared with the preset threshold. If the highest similarity is greater than the preset threshold, the clause is considered low risk; otherwise the clause is considered high risk, and a risk report is generated.
For example, with the system threshold preset to 0.9, a clause text whose maximum similarity is 0.932 is considered low risk.
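The threshold decision described above can be sketched as follows. This is a simplified illustration under stated assumptions: each sentence is assumed to be already encoded as a fixed-length vector (in the source this is the job of the trained semantic matching model, which is not reproduced here), and the names `cosine_similarity` and `assess_risk` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def assess_risk(clause_vec, standard_vecs, threshold=0.9):
    """Compare the highest similarity to the standard sentences with a threshold.

    Returns ("low", best) when the highest similarity is strictly greater
    than the threshold, and ("high", best) otherwise.
    """
    best = max(cosine_similarity(clause_vec, s) for s in standard_vecs)
    return ("low" if best > threshold else "high", best)
```

The comparison is strict, following the wording "greater than the preset threshold"; a similarity exactly equal to the threshold is therefore treated as high risk in this sketch.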
Fig. 6 is a flow chart of the self-learning module provided by an embodiment of the present invention. As shown in Fig. 6, the main steps of the process are:
Step S601 to step S603: the display and verification module 35 presents the generated risk report to a party with legal experience, or to a legal advisory body, for confirmation.
Step S605, step S606: the system automatically adds samples whose attributes have been confirmed to the sample database 36.
For example, after a legal expert in a given field has reviewed a generated report, the system automatically adds the clause text corresponding to the report to the sample database 36. Specifically, if the clause text carries risk, it is added as a negative sample of its category; if it carries no risk, it is added as a positive sample of that category. As the samples grow in number and diversity, the performance of the system also improves.
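The sample update in steps S605 and S606 can be sketched with an in-memory store. The class name `SampleDatabase` and its methods are hypothetical stand-ins for the sample database 36; the source does not describe its storage format.

```python
from collections import defaultdict


class SampleDatabase:
    """In-memory stand-in for the sample database (hypothetical interface)."""

    def __init__(self):
        # clause type -> {"positive": [...], "negative": [...]}
        self._samples = defaultdict(lambda: {"positive": [], "negative": []})

    def add_confirmed(self, clause_text, clause_type, has_risk):
        """Store an expert-confirmed clause.

        A risky clause becomes a negative sample of its category;
        a clause without risk becomes a positive sample.
        """
        polarity = "negative" if has_risk else "positive"
        self._samples[clause_type][polarity].append(clause_text)

    def samples(self, clause_type, polarity):
        """Return a copy of the stored samples for one category and polarity."""
        return list(self._samples[clause_type][polarity])


db = SampleDatabase()
db.add_confirmed("Party A may terminate at any time without notice.", "termination", has_risk=True)
db.add_confirmed("Either party may terminate with 30 days' written notice.", "termination", has_risk=False)
```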
In conclusion, based on the preset type of the contract, the embodiments of the present invention use big data analysis to determine whether the clauses of the contract being drafted carry risk and what the corresponding clause risk is. This helps non-legal persons without the corresponding legal background determine whether a drafted contract carries risk, avoiding corresponding losses to the obligee. Meanwhile, because the contract risk is presented to the user intuitively through the client, the friendliness of the service is improved.
Although the present invention has been described in detail above, it is not limited thereto, and those skilled in the art can make various modifications according to the principles of the present invention. Therefore, all modifications made according to the principles of the present invention should be understood as falling within the protection scope of the present invention.
Claims (10)
1. A contract text risk detection method, characterized by comprising:
obtaining, according to the commercial field to which a contract text to be detected belongs, the clause classification model corresponding to the commercial field;
classifying the clauses of the contract text using the clause classification model to obtain the clause texts of the contract text and the corresponding clause types;
performing risk assessment on the clause text of each clause type to determine the degree of risk of the clause text of each clause type.
2. The method according to claim 1, characterized by further comprising:
before obtaining the clause classification model corresponding to the commercial field according to the commercial field to which the contract text to be detected belongs, constructing a clause classification model for classifying the clauses of contract texts;
training the constructed clause classification model using training contract texts of the commercial field to obtain a performance-optimized clause classification model.
3. The method according to claim 2, characterized in that training the constructed clause classification model using the training contract texts of the commercial field to obtain the performance-optimized clause classification model comprises:
classifying the clauses of the training contract text to obtain the clause texts of the training contract text and the corresponding clause types;
performing word segmentation on the clause texts of the training contract text to obtain the words that make up the clause texts of the training contract text;
adjusting the parameters of the clause classification model using the word vectors of the words and the corresponding clause types to obtain the performance-optimized clause classification model.
4. The method according to claim 3, characterized by further comprising:
after classifying the clauses of the contract text using the clause classification model, determining that the contract text is complete if every preset clause type has a corresponding clause text.
5. The method according to claim 3 or 4, characterized in that performing risk assessment on the clause text of each clause type to determine the degree of risk of the clause text of each clause type comprises:
comparing, using a semantic matching model, the clause text of each clause type with the clause samples of that clause type for similarity to obtain a clause text similarity;
performing risk assessment on the contract text according to the clause text similarity and a preset risk threshold to obtain the degree of risk of the clause text of each clause type.
6. The method according to claim 5, characterized in that comparing, using the semantic matching model, the clause text of each clause type with the clause samples of that clause type for similarity to obtain the clause text similarity comprises:
obtaining multiple clause samples corresponding to the clause type from a sample database;
comparing, using the semantic matching model, the word vectors of the words that make up the clause text with the word vectors of the words that make up each clause sample to obtain the similarity between the clause text and each clause sample, and determining the maximum similarity as the clause text similarity of the clause type.
7. The method according to claim 6, characterized by further comprising:
after determining the degree of risk of the clause text of each clause type, saving the clause text of each clause type to the sample database as a new sample;
updating the clause classification model and the semantic matching model using the new samples of the sample database.
8. A contract text risk detection device, characterized by comprising:
a model obtaining module, configured to obtain, according to the commercial field to which a contract text to be detected belongs, the clause classification model corresponding to the commercial field;
a clause classification module, configured to classify the clauses of the contract text using the clause classification model to obtain the clause texts of the contract text and the corresponding clause types;
a risk assessment module, configured to perform risk assessment on the clause text of each clause type to determine the degree of risk of the clause text of each clause type.
9. A contract text risk detection apparatus, characterized by comprising: a processor and a memory coupled to the processor; the memory stores a contract text risk detection program executable on the processor, and the contract text risk detection program, when executed by the processor, implements the steps of the contract text risk detection method according to any one of claims 1 to 7.
10. A storage medium, characterized in that a contract text risk detection program is stored thereon, and the contract text risk detection program, when executed by a processor, implements the steps of the contract text risk detection method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711320389.8A CN109918635A (en) | 2017-12-12 | 2017-12-12 | A kind of contract text risk checking method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711320389.8A CN109918635A (en) | 2017-12-12 | 2017-12-12 | A kind of contract text risk checking method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109918635A (en) | 2019-06-21 |
Family
ID=66956837
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711320389.8A Pending CN109918635A (en) | A kind of contract text risk checking method, device, equipment and storage medium | 2017-12-12 | 2017-12-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109918635A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080306784A1 (en) * | 2007-06-05 | 2008-12-11 | Vijay Rajkumar | Computer-implemented methods and systems for analyzing clauses of contracts and other business documents |
CN103366231A (en) * | 2012-03-29 | 2013-10-23 | 上海天闻律师事务所 | Contract risk information automatic processing method and device |
CN106844544A (en) * | 2016-12-30 | 2017-06-13 | 全民互联科技(天津)有限公司 | A kind of contract terms Risk Identification Method and system |
-
2017
- 2017-12-12 CN CN201711320389.8A patent/CN109918635A/en active Pending
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110457659A (en) * | 2019-07-05 | 2019-11-15 | 中国平安人寿保险股份有限公司 | Clause document structure tree method and terminal device |
CN110457659B (en) * | 2019-07-05 | 2023-07-25 | 中国平安人寿保险股份有限公司 | Clause document generation method and terminal equipment |
CN110502745A (en) * | 2019-07-18 | 2019-11-26 | 平安科技(深圳)有限公司 | Text information evaluation method, device, computer equipment and storage medium |
CN110502745B (en) * | 2019-07-18 | 2023-04-07 | 平安科技(深圳)有限公司 | Text information evaluation method and device, computer equipment and storage medium |
CN110502632A (en) * | 2019-07-19 | 2019-11-26 | 平安科技(深圳)有限公司 | Contract terms reviewing method, device, computer equipment and storage medium based on clustering algorithm |
CN110705955A (en) * | 2019-08-22 | 2020-01-17 | 阿里巴巴集团控股有限公司 | Contract detection method and device |
CN110705955B (en) * | 2019-08-22 | 2023-03-07 | 创新先进技术有限公司 | Contract detection method and device |
CN110705265A (en) * | 2019-08-27 | 2020-01-17 | 阿里巴巴集团控股有限公司 | Contract clause risk identification method and device |
WO2021051934A1 (en) * | 2019-09-16 | 2021-03-25 | 平安科技(深圳)有限公司 | Method and apparatus for extracting key contract term on basis of artificial intelligence, and storage medium |
CN110765765A (en) * | 2019-09-16 | 2020-02-07 | 平安科技(深圳)有限公司 | Contract key clause extraction method and device based on artificial intelligence and storage medium |
CN110765765B (en) * | 2019-09-16 | 2023-10-20 | 平安科技(深圳)有限公司 | Contract key term extraction method, device and storage medium based on artificial intelligence |
CN111783781A (en) * | 2020-05-22 | 2020-10-16 | 平安国际智慧城市科技股份有限公司 | Malicious clause identification method, device and equipment based on product agreement character identification |
CN111783781B (en) * | 2020-05-22 | 2024-04-05 | 深圳赛安特技术服务有限公司 | Malicious term recognition method, device and equipment based on product agreement character recognition |
CN111666408A (en) * | 2020-05-26 | 2020-09-15 | 中国工商银行股份有限公司 | Method and device for screening and displaying important clauses |
CN112184498A (en) * | 2020-09-29 | 2021-01-05 | 中国平安财产保险股份有限公司 | Contract scoring method and device, computer equipment and storage medium |
CN112183424A (en) * | 2020-10-12 | 2021-01-05 | 北京华严互娱科技有限公司 | Real-time hand tracking method and system based on video |
CN112232088A (en) * | 2020-11-19 | 2021-01-15 | 京北方信息技术股份有限公司 | Contract clause risk intelligent identification method and device, electronic equipment and storage medium |
CN112464660B (en) * | 2020-11-25 | 2023-02-07 | 深圳平安医疗健康科技服务有限公司 | Text classification model construction method and text data processing method |
CN112464660A (en) * | 2020-11-25 | 2021-03-09 | 平安医疗健康管理股份有限公司 | Text classification model construction method and text data processing method |
WO2022111548A1 (en) * | 2020-11-26 | 2022-06-02 | 杭州睿胜软件有限公司 | Contract review method and apparatus, and readable storage medium |
CN112330214A (en) * | 2020-11-26 | 2021-02-05 | 杭州睿胜软件有限公司 | Contract review method and device and readable storage medium |
CN112668899A (en) * | 2020-12-31 | 2021-04-16 | 无锡软美信息科技有限公司 | Contract risk identification method and device based on artificial intelligence |
CN113051897A (en) * | 2021-05-25 | 2021-06-29 | 中国电子科技集团公司第三十研究所 | GPT2 text automatic generation method based on Performer structure |
CN113779640A (en) * | 2021-09-01 | 2021-12-10 | 北京橙色云科技有限公司 | Contract signing method, contract signing device and storage medium |
CN115392805A (en) * | 2022-10-28 | 2022-11-25 | 国能大渡河大数据服务有限公司 | Transaction type contract compliance risk diagnosis method and system |
CN116089614A (en) * | 2023-01-12 | 2023-05-09 | 杭州瓴羊智能服务有限公司 | Text marking method and device |
CN116089614B (en) * | 2023-01-12 | 2023-11-21 | 瓴羊智能科技有限公司 | Text marking method and device |
CN117151096A (en) * | 2023-09-05 | 2023-12-01 | 江苏群杰物联科技有限公司 | Intelligent contract checking method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109918635A (en) | A kind of contract text risk checking method, device, equipment and storage medium | |
CN110298032B (en) | Text classification corpus labeling training system | |
CN108304372B (en) | Entity extraction method and device, computer equipment and storage medium | |
WO2018028077A1 (en) | Deep learning based method and device for chinese semantics analysis | |
US20210064821A1 (en) | System and method to extract customized information in natural language text | |
CN106257455B (en) | A kind of Bootstrapping method extracting viewpoint evaluation object based on dependence template | |
US10733675B2 (en) | Accuracy and speed of automatically processing records in an automated environment | |
CN104820629A (en) | Intelligent system and method for emergently processing public sentiment emergency | |
CN104199965A (en) | Semantic information retrieval method | |
CN113392209B (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN111639183B (en) | Financial co-industry public opinion analysis method and system based on deep learning algorithm | |
CN103092975A (en) | Detection and filter method of network community garbage information based on topic consensus coverage rate | |
CN110377731A (en) | Complain text handling method, device, computer equipment and storage medium | |
CN110196977A (en) | A kind of intelligence alert inspection processing system and method | |
CN107943514A (en) | The method for digging and system of core code element in a kind of software document | |
CN116992005B (en) | Intelligent dialogue method, system and equipment based on large model and local knowledge base | |
CN107958068B (en) | Language model smoothing method based on entity knowledge base | |
CN113919366A (en) | Semantic matching method and device for power transformer knowledge question answering | |
CN116089873A (en) | Model training method, data classification and classification method, device, equipment and medium | |
CN107766560B (en) | Method and system for evaluating customer service flow | |
CN112966682A (en) | File classification method and system based on semantic analysis | |
CN111782793A (en) | Intelligent customer service processing method, system and equipment | |
CN114239579A (en) | Electric power searchable document extraction method and device based on regular expression and CRF model | |
CN113971210A (en) | Data dictionary generation method and device, electronic equipment and storage medium | |
CN116522912B (en) | Training method, device, medium and equipment for package design language model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190621 |