CN109918635A - Contract text risk detection method, apparatus, device and storage medium

Contract text risk detection method, apparatus, device and storage medium

Info

Publication number
CN109918635A
Authority
CN
China
Prior art keywords
clause
text
risk
contract
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711320389.8A
Other languages
Chinese (zh)
Inventor
许慢
牛国扬
陈虹
温海娇
邓钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201711320389.8A
Publication of CN109918635A
Legal status: Pending


Abstract

The invention discloses a contract text risk detection method, apparatus, device and storage medium, relating to fields such as natural language processing and semantic computing. The method includes: obtaining, according to the commercial field to which a contract text to be detected belongs, the clause classification model corresponding to that commercial field; classifying the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types; and performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type. Using a classification model and a deep semantic matching model trained on a large number of contract texts, embodiments of the invention detect risk in the contract text to be detected and alert the customer to it, without extracting specific words or phrases and without manually set rules, which improves the accuracy of contract text parsing and risk judgment.

Description

Contract text risk detection method, apparatus, device and storage medium
Technical field
The present invention relates to fields such as natural language processing and semantic computing, and in particular to a contract text risk detection method, apparatus, device and storage medium.
Background
In recent years, with the standardization of contract texts and advances in natural language processing, some companies have used natural language technology to parse commercial contracts automatically, so that part of the basic work on a commercial contract can be preprocessed by machine, reducing manual effort and improving efficiency.
The prior art mainly selects specific words or phrases as features, manually or semi-automatically, and then parses the contract text with preset rules or machine learning algorithms to complete the document parsing and judgment work.
Such schemes require feature words to be extracted manually and legal documents to be checked against preset rules, or else compute the similarity between two legal documents from the similarity of their keywords.
Because Chinese expression is highly varied, these schemes cannot parse contract texts or judge risk accurately.
Summary of the invention
The contract text risk detection method, apparatus, device and storage medium provided by embodiments of the present invention address the problem of inaccurate contract text parsing and risk judgment.
A contract text risk detection method provided according to an embodiment of the present invention includes:
obtaining, according to the commercial field to which a contract text to be detected belongs, the clause classification model corresponding to that commercial field;
classifying the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types;
performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
Preferably, the method further includes:
before obtaining the clause classification model corresponding to the commercial field according to the commercial field to which the contract text to be detected belongs, constructing a clause classification model for classifying the clauses of contract texts;
training the constructed clause classification model using training contract texts of the commercial field, to obtain a clause classification model with optimized performance.
Preferably, training the constructed clause classification model using the training contract texts of the commercial field to obtain a clause classification model with optimized performance includes:
classifying the clauses of the training contract text, to obtain the clause texts of the training contract text and their corresponding clause types;
performing word segmentation on the clause texts of the training contract text, to obtain the words that make up those clause texts;
adjusting the parameters of the clause classification model using the word vectors of the words and the corresponding clause types, to obtain the clause classification model with optimized performance.
Preferably, the method further includes:
after classifying the clauses of the contract text using the clause classification model, determining that the contract text is complete if every preset clause type has a corresponding clause text.
Preferably, performing risk assessment on the clause text of each clause type to determine the degree of risk of the clause text of each clause type includes:
comparing, using a semantic matching model, the clause text of each clause type with clause samples of that clause type for similarity, to obtain a clause text similarity;
performing risk assessment on the contract text according to the clause text similarity and a preset risk threshold, to obtain the degree of risk of the clause text of each clause type.
Preferably, comparing the clause text of each clause type with the clause samples of that clause type using the semantic matching model to obtain the clause text similarity includes:
obtaining multiple clause samples corresponding to the clause type from a sample database;
comparing, using the semantic matching model, the word vectors of the words that make up the clause text with the word vectors of the words that make up each clause sample, to obtain the similarity between the clause text and each clause sample, and taking the maximum similarity as the clause text similarity for that clause type.
Preferably, the method further includes:
after determining the degree of risk of the clause text of each clause type, saving the clause text of each clause type to the sample database as a new sample;
updating the clause classification model and the semantic matching model using the new samples in the sample database.
A contract text risk detection apparatus provided according to an embodiment of the present invention includes:
a model obtaining module, configured to obtain, according to the commercial field to which a contract text to be detected belongs, the clause classification model corresponding to that commercial field;
a clause classification module, configured to classify the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types;
a risk assessment module, configured to perform risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
A contract text risk detection device provided according to an embodiment of the present invention includes a processor and a memory coupled to the processor. The memory stores a contract text risk detection program that can run on the processor, and the program, when executed by the processor, implements the steps of the contract text risk detection method described above.
A storage medium provided according to an embodiment of the present invention stores a contract text risk detection program which, when executed by a processor, implements the steps of the contract text risk detection method described above.
The technical solution provided by embodiments of the present invention has the following beneficial effects:
Using a classification model and a deep semantic matching model trained on a large number of contract texts, embodiments of the present invention detect risk in the contract text to be detected and alert the customer to it, without extracting specific words or phrases and without manually set rules, which improves the accuracy of contract text parsing and risk judgment.
Brief description of the drawings
Fig. 1 is a flow chart of contract text risk detection provided by an embodiment of the present invention;
Fig. 2 is a block diagram of the contract text risk detection apparatus provided by an embodiment of the present invention;
Fig. 3 is an architecture diagram of the contract text risk detection system provided by an embodiment of the present invention;
Fig. 4 is a flow chart of the completeness detection module provided by an embodiment of the present invention;
Fig. 5 is a flow chart of the risk detection module provided by an embodiment of the present invention;
Fig. 6 is a flow chart of the self-learning module provided by an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in detail below with reference to the drawings. It should be understood that the preferred embodiments described below only illustrate and explain the present invention and do not limit it.
Fig. 1 is a flow chart of contract text risk detection provided by an embodiment of the present invention. As shown in Fig. 1, the steps include:
Step S101: according to the commercial field to which the contract text to be detected belongs, obtain the clause classification model corresponding to that commercial field.
Before step S101, the method further includes: constructing a clause classification model for classifying the clauses of contract texts, and training the constructed clause classification model using training contract texts of the commercial field, to obtain a clause classification model with optimized performance. In the training itself, the clauses of the training contract text are classified to obtain its clause texts and their corresponding clause types; word segmentation is performed on those clause texts to obtain the words that make them up; and the parameters of the clause classification model are adjusted using the word vectors of the words and the corresponding clause types, to obtain the clause classification model with optimized performance.
Step S102: classify the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types.
Using the classification results of step S102, completeness detection can be performed on the contract text. Specifically, if every preset clause type has a corresponding clause text, the contract text is determined to be complete.
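The completeness rule above amounts to a set-coverage check. The following is a minimal sketch, assuming the classifier output is available as (clause text, clause type) pairs; the function name `check_completeness` and the example type names are hypothetical and do not come from the patent.

```python
def check_completeness(preset_types, classified_clauses):
    """Return (is_complete, missing_types).

    preset_types:       clause types required in this commercial field
    classified_clauses: (clause_text, clause_type) pairs from the classifier
    """
    found = {clause_type for _, clause_type in classified_clauses}
    # Preserve the order of the preset list when reporting what is missing.
    missing = [t for t in preset_types if t not in found]
    return (not missing, missing)


# Hypothetical clause types for a house sale contract.
PRESET_TYPES = ["party_information", "house_attributes", "liability_for_breach"]

classified = [
    ("Party A: ... Party B: ...", "party_information"),
    ("Party B agrees to purchase the property ...", "house_attributes"),
]

is_complete, missing = check_completeness(PRESET_TYPES, classified)
print(is_complete, missing)
```

Returning the missing types as well as the boolean makes it possible to tell the user which clause is absent, which the risk report could then surface.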
Step S103: perform risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
Step S103 includes: using a semantic matching model, comparing the clause text of each clause type with clause samples of that clause type for similarity, to obtain a clause text similarity; and performing risk assessment on the contract text according to the clause text similarity and a preset risk threshold, to obtain the degree of risk of the clause text of each clause type. More specifically, multiple clause samples corresponding to the clause type can be obtained from the sample database; the semantic matching model is then used to compare the word vectors of the words that make up the clause text with the word vectors of the words that make up each clause sample, giving the similarity between the clause text and each clause sample, and the maximum similarity is taken as the clause text similarity for that clause type.
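The "maximum similarity over the samples" step can be illustrated with a small sketch. It does not implement the patent's semantic matching model: as a stand-in, each text is represented by the average of its word vectors and compared by cosine similarity. The names `mean_vector` and `clause_text_similarity` and the toy embeddings are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(words, embeddings, dim):
    """Average the word vectors of the known words (stand-in text encoder)."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def clause_text_similarity(clause_words, sample_word_lists, embeddings, dim):
    """Highest similarity between the clause and any sample of its type."""
    query = mean_vector(clause_words, embeddings, dim)
    return max(
        (cosine(query, mean_vector(s, embeddings, dim)) for s in sample_word_lists),
        default=0.0,  # no samples stored for this clause type yet
    )


# Toy 2-dimensional embeddings, for illustration only.
toy_embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
print(clause_text_similarity(["a"], [["b"], ["a", "b"]], toy_embeddings, 2))
```

In a real system the encoder would be the trained matching model; only the final `max` over the per-sample similarities corresponds directly to the text above.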
After step S103, the method further includes: saving the clause text of each clause type to the sample database as a new sample, so that the clause classification model and the semantic matching model can be updated using the new samples in the sample database.
Further, before the samples are saved, a risk report can first be generated from the processing result of step S103 and sent to the client, so that legal professionals on the client side can review it; the samples are then saved according to the review result confirmed by the client.
During manual review, the reviewer confirms whether the clause text of the contract text really has a high or a low degree of risk. If the clause text is confirmed to have a high degree of risk, or to carry risk, it can be stored in the sample database as a negative sample of that clause type. If the clause text is confirmed to have a low degree of risk, or to carry no risk, it can be stored in the sample database as a positive sample of that clause type. In general, the clause samples chosen in step S103 can be either positive samples or negative samples. When positive samples are chosen, the higher the similarity between the clause text of the contract text and a positive sample, the lower the risk; conversely, when negative samples are chosen, the higher the similarity between the clause text of the contract text and a negative sample, the higher the risk.
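How the sample polarity flips the reading of a similarity score can be sketched as follows. The positive-sample branch follows the threshold rule given for the risk detection module; applying the same threshold in mirrored form to negative samples is an assumption of this sketch, and the function name `risk_level` is hypothetical.

```python
def risk_level(similarity, threshold, sample_is_positive):
    """Map a clause text similarity to "low" or "high" risk.

    sample_is_positive: True if the compared samples are risk-free (positive)
                        samples, False if they are risky (negative) samples.
    """
    is_close = similarity > threshold
    if sample_is_positive:
        # Close to a known-safe clause means low risk.
        return "low" if is_close else "high"
    # Assumed mirror rule: close to a known-risky clause means high risk.
    return "high" if is_close else "low"


print(risk_level(0.932, 0.9, sample_is_positive=True))
print(risk_level(0.932, 0.9, sample_is_positive=False))
```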
Embodiments of the present invention use big data analysis to detect risk in the content of a drafted contract text. Specifically, the content of a drafted contract text in a given commercial field can be checked, to detect whether the drafted contract text is complete and whether it contains routine loopholes, while the results are shown and flagged through the display and verification module, improving the enforceability of the contract.
A person of ordinary skill in the art will understand that the methods of the above embodiments can be implemented by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium. Further, the present invention can also provide a storage medium storing a contract text risk detection program which, when executed by a processor, implements the steps of the contract text risk detection method described above. The storage medium may include ROM/RAM, a magnetic disk, an optical disc or a USB flash drive.
Fig. 2 is a block diagram of the contract text risk detection apparatus provided by an embodiment of the present invention. As shown in Fig. 2, it includes:
A model obtaining module 21, configured to obtain, according to the commercial field to which the contract text to be detected belongs, the clause classification model corresponding to that commercial field. The clause classification model is generated in advance: a clause classification model for classifying the clauses of contract texts is first constructed, and the constructed model is then trained using training contract texts of the commercial field, to obtain a clause classification model with optimized performance.
A clause classification module 22, configured to classify the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and their corresponding clause types. After classification, completeness detection is performed on the contract text: if every preset clause type has a corresponding clause text, the contract text is determined to be complete.
A risk assessment module 23, configured to perform risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type. Specifically, a semantic matching model is used to compare the clause text of each clause type with clause samples of that clause type for similarity, giving a clause text similarity; risk assessment is then performed on the contract text according to the clause text similarity and a preset risk threshold, giving the degree of risk of the clause text of each clause type. After the degree of risk of the clause text of each clause type has been obtained, the clause text of each clause type can also be used as a new sample to optimize the clause classification model and the semantic matching model.
An embodiment of the present invention provides a contract text risk detection device, including a processor and a memory coupled to the processor. The memory stores a contract text risk detection program that can run on the processor, and the program, when executed by the processor, implements the steps of the contract text risk detection method described above.
Fig. 3 is an architecture diagram of the contract text risk detection system provided by an embodiment of the present invention. The embodiment provides a method and system for detecting content risk in field-specific contract texts, which judges whether a contract text is complete, detects routine loopholes, and gives corresponding risk alerts through a display module. The system includes:
A text collection module 31, used to collect the text of legal documents. The system supports paper documents as well as file import from external storage devices.
A text preprocessing module 32, used to process the collected text: format standardization, simplified/traditional Chinese conversion, upper/lower case conversion, symbol removal, merging and splitting of words, and so on.
A completeness detection module 33, used to detect whether the clauses of a legal document are complete for its field.
A risk detection module 34, used to detect whether a specific clause poses a risk of harm to a party, and to calculate its degree of risk.
A display and verification module 35, used to show the risk report and to let a person assess the credibility of the reported risk.
A sample database 36, mainly used to store the original training data and the manually verified data, and used for the automatic update learning of the completeness detection model (the clause classification model) and the risk detection model (the semantic matching model).
The system further includes:
A self-learning module (not shown in the figure), used to update and retrain the original models so that they generalize better.
The system is in essence a legal document detection system based on server-side big data analysis and supplemented by client-side review. It works as a streaming pipeline, as follows. The text collection module 31 collects legal texts: paper legal documents are acquired through the OCR (Optical Character Recognition) text scanning device that comes with the system, and documents held on a storage device are read directly by the system. The text preprocessing module 32 preprocesses the collected text, which includes format standardization, simplified/traditional Chinese conversion, upper/lower case conversion, removal of non-semantic symbols, and merging and splitting of words. The completeness detection module 33 detects whether the field contract text currently input is complete, that is, whether the contract lacks any clause that is necessary in that field. The risk detection module 34 calculates the similarity between the clause text under the current category and the texts to be compared, and assesses risk according to a preset threshold. Besides showing the risk report, the display and verification module 35 also supports manual checking and verification, and the verification results are uploaded automatically to the sample database 36.
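Part of the preprocessing described for module 32 can be sketched with the Python standard library alone. The sketch below covers width normalization, case folding, symbol removal and whitespace cleanup; simplified/traditional Chinese conversion needs a character mapping table and is deliberately left out. The function name `preprocess` and the particular set of symbols treated as non-semantic are assumptions of the example.

```python
import re
import unicodedata

# Assumed set of "non-semantic" symbols; a real system would tune this list.
_NON_SEMANTIC = re.compile(r"[#*@^~|<>=+\[\]{}\\_]")


def preprocess(text):
    """Normalize one clause of contract text.

    Steps: NFKC normalization (full-width forms become half-width),
    lower-casing, removal of non-semantic symbols, whitespace collapsing.
    """
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    text = _NON_SEMANTIC.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


print(preprocess("Party B  AGREES ## to pay *100*"))
```

NFKC normalization is used here because scanned or typed Chinese contracts often mix full-width and half-width letters and digits, which would otherwise be treated as different tokens.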
The method includes two parts:
1. Server side
The preprocessing module 32 receives the text collected by the client, preprocesses it, and forwards the processed contract text to the completeness detection module 33. That module uses a model to classify the clauses in the contract text, where the categories are preset according to the commercial field. If every basic category of the field is matched by a corresponding clause text, the clauses of the current contract text are considered to satisfy the completeness requirement for contracts of that field; otherwise they do not. All clauses and the class labels they belong to are then cached and forwarded to the risk detection module 34. The risk detection module 34 uses the semantic similarity model to calculate similarity against the corresponding clause texts under the category, judges against a preset risk threshold whether a risk exists, generates a risk assessment report, and pushes it to the display and verification module 35 on the client. After manual verification, the text is forwarded to the self-learning module, which performs a self-learning update using the new samples.
2. Client side
The text collection module 31 collects text in one of two ways: paper legal documents are acquired through the OCR text scanning device, and text files held on a storage device are read directly. The collected legal document text is then forwarded to the preprocessing module 32 on the server side.
The display and verification module 35 receives the report generated by the risk detection module 34 and shows it to the parties and other relevant persons, and passes the user's feedback to the self-learning module for updating the models.
Embodiments of the present invention are based on a large number of contract texts from a particular field, and use deep learning to parse and check contract text content. Common field-specific commercial contract texts can be parsed and checked, for example property sale contracts. A house sale contract generally contains the following clause content: the personal information of the buyer and seller, the attributes of the house, the manner of performance, liability for breach of contract, dispute resolution, and so on. Based on this clause content, a classification model is trained with deep learning and each clause is classified into its corresponding clause category, which makes it possible to check the completeness of the clause categories in the property sale contract field. On that basis, a deep semantic matching model is then used to calculate the similarity between the contract clause text to be detected and the relevant clause texts of its category, and the similarity is compared with a preset risk threshold to judge whether the clause carries risk. Finally, the self-learning module of the system updates the models automatically; as the system accumulates more and more samples it becomes adaptive, reducing manual intervention. The embodiment is therefore a practical, adaptive system for parsing and checking contract text content that meets the needs of engineering applications.
Fig. 4 is a flow chart of the completeness detection module 33 provided by an embodiment of the present invention. As shown in Fig. 4, the process is as follows: when the completeness detection module 33 receives the preprocessed text, it classifies each clause text using the trained classification model and judges from the classification result whether every category is matched by a corresponding clause text. If so, the clauses of the current field contract text are considered complete; otherwise they are incomplete. The clause texts and their corresponding categories are cached and forwarded to the risk detection module 34 for further processing.
Word vector training:
Since word vectors are a required input to the model, they need to be trained in advance; word vectors for a given field can be reused. Word vectors can be generated automatically with an open-source word vector training tool, and give a deeper semantic representation of words.
First, the input contract text is preprocessed, mainly by removing illegal characters and replacing digits and number-bearing names that carry no significant meaning, and each clause is placed on its own line.
Second, the sentences are segmented into words separated by spaces, again with one clause per line. Segmentation can be done with an open-source word segmentation tool such as ansj, jieba, or HIT LTP (the Language Technology Platform of the Harbin Institute of Technology).
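To show what the space-separated training lines look like without depending on any of the tools named above, the sketch below uses forward maximum matching, a simple dictionary-based segmentation technique. It is a stand-in chosen for self-containment, not the algorithm used by ansj, jieba or LTP, and the function name and the tiny vocabulary are hypothetical.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Segment text greedily, always taking the longest dictionary word.

    Characters that start no dictionary word are emitted as single tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical mini-dictionary: Party B, agree, purchase, Party A, property.
vocab = {"乙方", "同意", "购买", "甲方", "房产"}
tokens = forward_max_match("乙方同意购买甲方房产", vocab)
print(" ".join(tokens))  # one clause per line, words separated by spaces
```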
Finally, suitable parameters are chosen to train the word vector model. In general, 100-dimensional word vectors are already enough to express word meaning well in a classification task. They take the following form:
China = [0.00570705, 0.4275226, -0.62307459, 0.01425633, 0.02571641, 0.85126471, -0.4231756, 0.031421404, ..., 0.21345081]
TextCNN is chosen as the algorithm for the classification model, because contract clauses are generally not very long and text classification generally does not need to take long-range sequence semantics into account. The algorithm is based on deep learning and takes word vectors as input; it needs no manual feature extraction and generalizes well.
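The core of a TextCNN classifier is a set of convolution filters slid over windows of consecutive word vectors, followed by max-over-time pooling and a softmax layer. The following is a minimal forward-pass sketch in plain Python, intended only to make that structure concrete. It is not the trained TensorFlow model the text refers to: the weights are hand-set toy values, there is no training loop, and all names are hypothetical.

```python
import math


def conv_max_pool(sentence, kernel, bias):
    """Apply one filter to every window of h words, then ReLU and max-pool.

    sentence: list of word vectors (each a list of d floats)
    kernel:   list of h weight rows (each d floats); h is the window size
    """
    h = len(kernel)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(sentence) - h + 1):
        window = sentence[start:start + h]
        activation = bias + sum(
            w * x
            for kernel_row, word_vec in zip(kernel, window)
            for w, x in zip(kernel_row, word_vec)
        )
        best = max(best, activation)
    return best


def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def textcnn_forward(sentence, filters, dense_weights, dense_bias):
    """Return class probabilities for one clause (list of word vectors)."""
    features = [conv_max_pool(sentence, kernel, bias) for kernel, bias in filters]
    scores = [
        bias + sum(w * f for w, f in zip(row, features))
        for row, bias in zip(dense_weights, dense_bias)
    ]
    return softmax(scores)


# Toy input: three words with 2-dimensional vectors, two filters, two classes.
toy_sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_filters = [
    ([[1.0, 0.0], [0.0, 1.0]], 0.0),  # window size 2
    ([[-1.0, -1.0]], 0.0),            # window size 1
]
probs = textcnn_forward(toy_sentence, toy_filters, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(probs)
```

Using several window sizes side by side is what lets the model pick up short phrases of different lengths, which suits clauses that are short and locally phrased.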
Embodiments of the present invention use the vectorized representation of words to train the models that measure completeness and similarity.
The specific steps of completeness detection are:
Step S401: prepare the training samples.
1. The clause categories that are indispensable in contracts of the current field are specified in advance, and the contract clause texts of that field are assigned to the corresponding categories, forming a set of text strings for each category.
For example, in the house sale field, the necessary contract categories generally include: the information of the buyer and seller, the attribute information of the house, the house transaction information, the payment method, the manner of performance, liability for breach of contract, dispute resolution, and so on. Take "Party B agrees to purchase the property owned by Party A located at Yuhua Street, Yuhua District, Nanjing (villa, office building, apartment, residence, factory building, shopfront), with a floor area of 90 square metres. (See Land and Housing Title Certificate No. 21070021.)": this clause clearly belongs to the house attribute information category.
2. The text obtained in the first step is preprocessed automatically, which includes format standardization, simplified/traditional Chinese conversion, upper/lower case conversion, removal of non-semantic symbols, merging and splitting of words, and so on. The result is as follows:
Party B agrees to purchase the property owned by Party A located at Yuhua Street, Yuhua District, Nanjing, villa, office building, apartment, residence, factory building, shopfront, with a floor area of 90 square metres. See Land and Housing Title Certificate No. 21070021
3. Word segmentation is performed on the preprocessed text.
For example, the segmentation result is:
Party B/agrees/purchase/Party A/owns//located in//Jiangsu Province/Nanjing/Yuhua District/Yuhua Street/owns//property ...
Step S402: train the classification model.
The input is the word vector of each word in the clause text together with the class label. The TextCNN classification model is built with TensorFlow, then trained and its parameters tuned for best performance, producing the final classification model.
Step S403: a newly input contract text is checked with the model. If every indispensable category of the current field is matched by a corresponding text, the current contract text is considered complete; otherwise it is incomplete.
Step S404: the newly input contract text and its corresponding categories are saved.
Fig. 5 is a flow chart of the risk detection module 34 provided by an embodiment of the present invention. As shown in Fig. 5, the process is as follows: the risk detection module 34 receives the text passed on by the completeness detection module 33 together with its class label, and uses a deep text similarity algorithm to calculate the similarity between the text and several standard texts of the current category. If the highest similarity is greater than a preset threshold, the clause is considered low risk; otherwise the clause is considered high risk and a warning is given, shown through the risk report.
Step S501: prepare the training samples.
Based on the word-segmented contract clause texts and corresponding class labels passed on by the completeness detection module, several clause texts are picked at random from each category, in proportion and without replacement, to serve as the standard sentences of that category.
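Proportional random selection without replacement within each category can be sketched with `random.sample`, which never returns the same element twice. The function name `pick_standard_sentences`, the rule of keeping at least one sentence per non-empty category, and the example data are assumptions of the sketch.

```python
import random


def pick_standard_sentences(clauses_by_type, ratio, seed=None):
    """Pick about `ratio` of the clauses of each type, without replacement.

    clauses_by_type: dict mapping clause type -> list of clause texts
    ratio:           fraction of each category to keep, between 0 and 1
    """
    rng = random.Random(seed)
    standards = {}
    for clause_type, clauses in clauses_by_type.items():
        if not clauses:
            standards[clause_type] = []
            continue
        # Assumption: keep at least one standard sentence per category.
        k = min(len(clauses), max(1, round(len(clauses) * ratio)))
        standards[clause_type] = rng.sample(clauses, k)
    return standards


corpus = {
    "house_attributes": [f"attribute clause {i}" for i in range(10)],
    "liability_for_breach": [f"breach clause {i}" for i in range(4)],
}
picked = pick_standard_sentences(corpus, ratio=0.3, seed=0)
print({clause_type: len(texts) for clause_type, texts in picked.items()})
```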
Step S502 to step S504: the deep semantic matching model (the text similarity model) calculates the similarity.
The model here can be trained with an improved DSSM algorithm: a generalized semantic matching model built on a multilayer neural network, with similarity computed as cosine similarity. The input is the word vectors of the two sentences to be compared; the model is trained on these, and its output is a similarity. For a test sample, its similarity with each of the standard sentences of its category is calculated, and the highest similarity is compared with a preset threshold. If the highest similarity is greater than the preset threshold, the clause is considered low risk; otherwise the clause is considered high risk, and a risk report is generated.
For example, with the system threshold preset to 0.9 and a maximum similarity of 0.932, the clause text is considered low risk.
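The decision rule can be written compactly as follows. The notation is introduced here for illustration and is not the patent's: q is the vector the model produces for the clause under test, s_1, ..., s_n are the vectors of the standard sentences of its category, and \tau is the preset threshold.

```latex
\operatorname{sim}(q, s_i) = \frac{q \cdot s_i}{\lVert q \rVert \, \lVert s_i \rVert},
\qquad
s^{*} = \max_{1 \le i \le n} \operatorname{sim}(q, s_i),
\qquad
\text{risk}(q) =
\begin{cases}
\text{low},  & s^{*} > \tau \\
\text{high}, & s^{*} \le \tau
\end{cases}
```

With the figures from the example above, s^{*} = 0.932 and \tau = 0.9, so s^{*} > \tau and the clause is classed as low risk.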
Fig. 6 is a flow chart of the self-learning module provided by an embodiment of the present invention. As shown in Fig. 6, the main steps of the process are:
Step S601 to step S603: the display and verification module 35 shows the generated risk report to a party with legal experience, or to a legal consultancy, for confirmation.
Step S605, step S606: the system automatically adds samples whose attributes have been confirmed to the sample database 36. For example, once a legal expert in the field has reviewed the generated report, the system automatically adds the clause texts covered by the report to the sample database 36. Specifically, if a clause text carries risk, it is added as a negative sample of its category; if a clause text carries no risk, it is added as a positive sample of that category. As the samples grow in number and become more diverse, the performance of the system improves.
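The bookkeeping described in these steps can be sketched as a small in-memory store. The class `SampleDatabase` and its methods are hypothetical; an actual deployment would use persistent storage and would trigger model retraining from the stored samples.

```python
from collections import defaultdict


class SampleDatabase:
    """In-memory stand-in for sample database 36."""

    def __init__(self):
        self._samples = defaultdict(lambda: {"positive": [], "negative": []})

    def add_confirmed(self, clause_type, clause_text, has_risk):
        """Store a reviewed clause: risky clauses become negative samples."""
        polarity = "negative" if has_risk else "positive"
        self._samples[clause_type][polarity].append(clause_text)

    def samples(self, clause_type, polarity):
        """Return a copy of the stored samples for one type and polarity."""
        return list(self._samples[clause_type][polarity])


db = SampleDatabase()
db.add_confirmed("liability_for_breach", "Party A bears no liability ...", has_risk=True)
db.add_confirmed("liability_for_breach", "The breaching party shall pay ...", has_risk=False)
print(db.samples("liability_for_breach", "negative"))
```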
In conclusion the preset kind of the embodiment of the present invention based on contract, the contract currently drafted based on big data analysis Clause whether there is risk and corresponding clause risk, facilitate nonlegal personage in not corresponding Fundamentals of Law, determine The contract drafted whether there is risk, to avoid bringing corresponding loss to obligee, simultaneously because signing with risk contract User is intuitively showed by client, improves the friendly of service.
Although describing the invention in detail above, but the invention is not restricted to this, those skilled in the art of the present technique It can be carry out various modifications with principle according to the present invention.Therefore, all to be modified according to made by the principle of the invention, all it should be understood as Fall into protection scope of the present invention.

Claims (10)

1. A contract text risk checking method, characterized by comprising:
obtaining, according to the business field to which a contract text to be detected belongs, the clause classification model corresponding to the business field;
classifying the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and the corresponding clause types;
performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
2. The method according to claim 1, characterized by further comprising:
before obtaining, according to the business field to which the contract text to be detected belongs, the clause classification model corresponding to the business field, constructing a clause classification model for classifying the clauses of contract texts;
training the constructed clause classification model using training contract texts of the business field, to obtain a performance-optimized clause classification model.
3. The method according to claim 2, characterized in that training the constructed clause classification model using the training contract texts of the business field, to obtain the performance-optimized clause classification model, comprises:
classifying the clauses of the training contract text, to obtain the clause texts of the training contract text and the corresponding clause types;
performing word segmentation on the clause texts of the training contract text, to obtain the words that make up the clause texts of the training contract text;
adjusting the parameters of the clause classification model using the word vectors of the words and the corresponding clause types, to obtain the performance-optimized clause classification model.
4. The method according to claim 3, characterized by further comprising:
after classifying the clauses of the contract text using the clause classification model, determining that the contract text is complete if every preset clause type has a corresponding clause text.
5. The method according to claim 3 or 4, characterized in that performing risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type, comprises:
comparing, using a semantic matching model, the clause text of each clause type with the clause samples of that clause type for similarity, to obtain a clause text similarity;
performing risk assessment on the contract text according to the clause text similarity and a preset risk threshold, to obtain the degree of risk of the clause text of each clause type.
6. The method according to claim 5, characterized in that comparing, using the semantic matching model, the clause text of each clause type with the clause samples of that clause type for similarity, to obtain the clause text similarity, comprises:
obtaining a plurality of clause samples corresponding to the clause type from a sample database;
comparing, using the semantic matching model, the word vectors of the words that make up the clause text with the word vectors of the words that make up each clause sample, to obtain the similarity between the clause text and each clause sample, and determining the maximum similarity as the clause text similarity of the clause type.
7. The method according to claim 6, characterized by further comprising:
after determining the degree of risk of the clause text of each clause type, saving the clause text of each clause type to the sample database as a new sample;
updating the clause classification model and the semantic matching model using the new samples in the sample database.
8. A contract text risk checking device, characterized by comprising:
a model obtaining module, configured to obtain, according to the business field to which a contract text to be detected belongs, the clause classification model corresponding to the business field;
a clause classification module, configured to classify the clauses of the contract text using the clause classification model, to obtain the clause texts of the contract text and the corresponding clause types;
a risk assessment module, configured to perform risk assessment on the clause text of each clause type, to determine the degree of risk of the clause text of each clause type.
9. Contract text risk checking equipment, characterized by comprising: a processor and a memory coupled to the processor; a contract text risk checking program executable on the processor is stored on the memory, and the contract text risk checking program, when executed by the processor, implements the steps of the contract text risk checking method according to any one of claims 1 to 7.
10. A storage medium, characterized in that a contract text risk checking program is stored thereon, and the contract text risk checking program, when executed by a processor, implements the steps of the contract text risk checking method according to any one of claims 1 to 7.
CN201711320389.8A 2017-12-12 2017-12-12 A kind of contract text risk checking method, device, equipment and storage medium Pending CN109918635A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711320389.8A CN109918635A (en) 2017-12-12 2017-12-12 A kind of contract text risk checking method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109918635A (en) 2019-06-21

Family

ID=66956837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711320389.8A Pending CN109918635A (en) 2017-12-12 2017-12-12 A kind of contract text risk checking method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109918635A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080306784A1 (en) * 2007-06-05 2008-12-11 Vijay Rajkumar Computer-implemented methods and systems for analyzing clauses of contracts and other business documents
CN103366231A (en) * 2012-03-29 2013-10-23 上海天闻律师事务所 Contract risk information automatic processing method and device
CN106844544A (en) * 2016-12-30 2017-06-13 全民互联科技(天津)有限公司 A kind of contract terms Risk Identification Method and system

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110457659A (en) * 2019-07-05 2019-11-15 中国平安人寿保险股份有限公司 Clause document structure tree method and terminal device
CN110457659B (en) * 2019-07-05 2023-07-25 中国平安人寿保险股份有限公司 Clause document generation method and terminal equipment
CN110502745A (en) * 2019-07-18 2019-11-26 平安科技(深圳)有限公司 Text information evaluation method, device, computer equipment and storage medium
CN110502745B (en) * 2019-07-18 2023-04-07 平安科技(深圳)有限公司 Text information evaluation method and device, computer equipment and storage medium
CN110502632A (en) * 2019-07-19 2019-11-26 平安科技(深圳)有限公司 Contract terms reviewing method, device, computer equipment and storage medium based on clustering algorithm
CN110705955A (en) * 2019-08-22 2020-01-17 阿里巴巴集团控股有限公司 Contract detection method and device
CN110705955B (en) * 2019-08-22 2023-03-07 创新先进技术有限公司 Contract detection method and device
CN110705265A (en) * 2019-08-27 2020-01-17 阿里巴巴集团控股有限公司 Contract clause risk identification method and device
WO2021051934A1 (en) * 2019-09-16 2021-03-25 平安科技(深圳)有限公司 Method and apparatus for extracting key contract term on basis of artificial intelligence, and storage medium
CN110765765A (en) * 2019-09-16 2020-02-07 平安科技(深圳)有限公司 Contract key clause extraction method and device based on artificial intelligence and storage medium
CN110765765B (en) * 2019-09-16 2023-10-20 平安科技(深圳)有限公司 Contract key term extraction method, device and storage medium based on artificial intelligence
CN111783781A (en) * 2020-05-22 2020-10-16 平安国际智慧城市科技股份有限公司 Malicious clause identification method, device and equipment based on product agreement character identification
CN111783781B (en) * 2020-05-22 2024-04-05 深圳赛安特技术服务有限公司 Malicious term recognition method, device and equipment based on product agreement character recognition
CN111666408A (en) * 2020-05-26 2020-09-15 中国工商银行股份有限公司 Method and device for screening and displaying important clauses
CN112184498A (en) * 2020-09-29 2021-01-05 中国平安财产保险股份有限公司 Contract scoring method and device, computer equipment and storage medium
CN112183424A (en) * 2020-10-12 2021-01-05 北京华严互娱科技有限公司 Real-time hand tracking method and system based on video
CN112232088A (en) * 2020-11-19 2021-01-15 京北方信息技术股份有限公司 Contract clause risk intelligent identification method and device, electronic equipment and storage medium
CN112464660B (en) * 2020-11-25 2023-02-07 深圳平安医疗健康科技服务有限公司 Text classification model construction method and text data processing method
CN112464660A (en) * 2020-11-25 2021-03-09 平安医疗健康管理股份有限公司 Text classification model construction method and text data processing method
WO2022111548A1 (en) * 2020-11-26 2022-06-02 杭州睿胜软件有限公司 Contract review method and apparatus, and readable storage medium
CN112330214A (en) * 2020-11-26 2021-02-05 杭州睿胜软件有限公司 Contract review method and device and readable storage medium
CN112668899A (en) * 2020-12-31 2021-04-16 无锡软美信息科技有限公司 Contract risk identification method and device based on artificial intelligence
CN113051897A (en) * 2021-05-25 2021-06-29 中国电子科技集团公司第三十研究所 GPT2 text automatic generation method based on Performer structure
CN113779640A (en) * 2021-09-01 2021-12-10 北京橙色云科技有限公司 Contract signing method, contract signing device and storage medium
CN115392805A (en) * 2022-10-28 2022-11-25 国能大渡河大数据服务有限公司 Transaction type contract compliance risk diagnosis method and system
CN116089614A (en) * 2023-01-12 2023-05-09 杭州瓴羊智能服务有限公司 Text marking method and device
CN116089614B (en) * 2023-01-12 2023-11-21 瓴羊智能科技有限公司 Text marking method and device
CN117151096A (en) * 2023-09-05 2023-12-01 江苏群杰物联科技有限公司 Intelligent contract checking method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190621