WO2022111548A1 - Contract review method and apparatus, and readable storage medium - Google Patents


Info

Publication number
WO2022111548A1
Authority
WO
WIPO (PCT)
Prior art keywords
clause
contract
type
risk
text
Prior art date
Application number
PCT/CN2021/132929
Other languages
French (fr)
Chinese (zh)
Inventor
徐青松
李青
Original Assignee
杭州睿胜软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州睿胜软件有限公司
Publication of WO2022111548A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services; Handling legal documents

Definitions

  • the present invention relates to the technical field of artificial intelligence, and in particular, to a contract review method, device and readable storage medium.
  • the purpose of the present invention is to provide a contract review method to automatically review and proofread contracts.
  • the present invention provides a contract review method, comprising:
  • based on the text information of the target contract, a text classification model is used to identify the contract type of the target contract and the clause type of each clause of the target contract; according to the contract type of the target contract and the clause type of each clause, a matching clause template is found in the sample database to compare the clause content, and the risk level of each clause is output according to the comparison result.
  • the step of finding a matching clause template in the sample database to compare clause contents, and outputting the risk level of each clause according to the comparison result includes:
  • the risk level of each clause is output.
  • the sample database includes a plurality of clause templates that match the contract type and clause type of each of the clauses, and the step of finding a matching clause template in the sample database to compare clause content and outputting the risk level of each clause according to the comparison result includes:
  • the risk level of each item is output.
  • the clause templates in the sample database include: positive samples that carry no risk or whose risk degree is lower than a first threshold, and/or negative samples that carry risk or whose risk degree is higher than a second threshold.
  • the text classification model is obtained by training contract samples in advance.
  • the step of training contract samples to obtain the text classification model includes:
  • the text paragraphs marked with contract types, clause types and risk levels are trained by using a neural network model to obtain the text classification model.
  • the contract review method further includes: saving the clauses of the target contract to the sample database, so as to update the text classification model with the new samples in the sample database.
  • the contract review method further includes:
  • the outputted comparison results include:
  • the contract review method further includes:
  • the present invention also provides a contract review device, comprising:
  • Text recognition module used to recognize the text information of the target contract
  • the comparison module is configured to use a text classification model, based on the text information of the target contract, to identify the contract type of the target contract and the clause type of each clause of the target contract, to find, according to the contract type of the target contract and the clause type of each clause, a matching clause template in the sample database for comparing the clause content, and to output the risk level of each clause according to the comparison result.
  • the comparison module finds a matching clause template in a sample database to compare clause contents, and the step of outputting the risk level of each clause according to the comparison result includes:
  • the risk level of each clause is output.
  • the sample database includes a plurality of clause templates that match the contract type and clause type of each of the clauses, and the step in which the comparison module finds a matching clause template in the sample database to compare clause content and outputs the risk level of each clause according to the comparison result includes:
  • the risk level of each item is output.
  • the clause templates in the sample database include: positive samples that carry no risk or whose risk degree is lower than a first threshold, and/or negative samples that carry risk or whose risk degree is higher than a second threshold.
  • the text classification model is obtained by pre-training contract samples.
  • the step of training contract samples to obtain the text classification model includes:
  • the text paragraphs marked with contract types, clause types and risk levels are trained by using a neural network model to obtain the text classification model.
  • the contract review device further comprises: a storage module, configured to save the clauses of the target contract to the sample database, so as to update the text classification model with the new samples in the sample database.
  • the contract review device further includes an output module, and the output module is configured to output the comparison result by using a text clause output model;
  • the outputted comparison results include:
  • the comparison module is further configured to generate a risk report according to the risk level of each of the clauses and send it to the client for confirmation by the client.
  • the present invention also provides a readable storage medium, where a computer program is stored in the readable storage medium, and when the computer program is executed by a processor, the above-mentioned contract review method is implemented.
  • the contract review method, device and readable storage medium include: recognizing the text information of a target contract; based on the text information of the target contract, using a text classification model to identify the contract type of the target contract and the clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, finding a matching clause template in the sample database to compare the clause content and outputting the risk level of each clause according to the comparison result. That is, the target contract text is first recognized automatically, the text classification model is then used to search the sample database for the clause template matching the type of each clause of the target contract, automatic comparison and analysis are performed, and the risk points in the contract text are output. Automatically reviewing and proofreading the contract in this way improves the efficiency of contract review.
  • FIG. 2 is a block diagram of the composition of a contract review device provided by an embodiment of the present invention.
  • 11-text recognition module; 12-comparison module; 13-output module; 14-storage module.
  • this embodiment provides a contract review method, and the contract review method includes the following steps:
  • the contract review method provided in this embodiment firstly identifies the target contract text automatically, and then uses the text classification model to search the sample database for a clause template that matches the type of the clause of the target contract, and performs automatic comparison and analysis. Output the risk points in the contract text, thus improving the efficiency of contract review by automatically reviewing and proofreading the contract.
  • the text presentation of the target contract may be a Word version, a PDF version, a PPT version, a TXT version or a picture version.
  • the textual information of various versions of the contract can be recognized by the character recognition model.
  • the target contract can also be compared with contract texts of different historical versions; the comparison determines whether the content of the target contract is inconsistent with the content of the original contract (a contract of any historical version), so as to guard against tampering with the target contract text.
  • the target contract and the original contract can be in the same format or in different formats (eg, one or more of Word, PDF, PPT, TXT, etc.).
  • based on a pre-trained character recognition model, which is a neural-network-based model, recognize the character content in the character area of each line to obtain the recognized characters; obtain the position information of the recognized characters;
  • the difference point is located according to the position information of the difference point.
  • step S12 is performed.
  • step S12 when using the text classification model to find a matching clause template in the sample database to compare clause content, and output the risk level of each clause according to the comparison result, the following steps can be specifically adopted:
  • in step S12, when the text classification model is used to find a matching clause template in the sample database to compare clause content and the risk level of each clause is output according to the comparison result, the following steps can specifically be adopted:
  • the text classification model is obtained by training contract samples in advance.
  • the contract samples are contract samples in the sample database. Specifically, the following steps can be used to train contract samples to obtain the text classification model:
  • label the text paragraphs of each contract sample in the sample database with the contract type, clause type and risk level of each paragraph, as shown in Table 1; then train a neural network model on the labeled text paragraphs to obtain the text classification model.
  • the text classification model can be used to identify the contract type of the target contract, the clause type of each clause, and the risk level of each clause.
  • the risk levels can be presented in various ways; for example, identifiers such as a, b, c and d can distinguish the levels, or textual labels such as high and low can be used.
  • a neural network model can be trained on contract types to obtain a first classification model (also called the contract type identification model), and a neural network model can be trained on clause types to obtain a second classification model (also called the clause classification model). The first classification model then identifies the contract type of the target contract, and the second classification model identifies the clause type of each clause of the target contract. Next, a character matching recognition model is used to find, according to the contract type of the target contract and the clause type of each clause, a matching clause template in the sample database, compare the clause content, and output the risk level of each clause according to the comparison result.
  • the text classification model may be a general classification model, and may also include several sub-classification models.
  • the specific form of the text classification model does not limit the present application; it is sufficient that the model can identify the contract type of the target contract, the clause type of each of the clauses, and the risk level of each of the clauses.
  • the samples in the sample database can also be classified by the risk degree of the clause into positive samples and negative samples. If a clause text in a contract text carries no risk or its risk degree is lower than the first threshold, it can be stored in the sample database as a positive sample of its clause type; if a clause text carries risk or its risk degree is higher than the second threshold, it can be stored in the sample database as a negative sample of its clause type. In this way, during the comparison in step S12, the higher the similarity between a clause and a positive sample, the lower the risk level of the clause, and the higher the similarity between a clause and a negative sample, the higher the risk level of the clause.
  • the terms of the target contract can also be saved to the sample database, so as to update the text classification model with new samples of the sample database.
  • a risk report may also be generated according to the risk level of each of the clauses and sent to the client for confirmation; after the client confirms, the risk report is saved.
  • the contract review method provided in this embodiment may further include: outputting the comparison result by using a text clause output model; outputting the comparison result includes: displaying the differences between the target contract and the matched clause template; showing which party the target contract favors (whether a clause favors one's own side, favors the counterparty, or is neutral); and/or marking each clause at risk together with the standard template clause that matches it.
  • the comparison result in step S12 can be presented in various forms, and the above-mentioned examples do not constitute a limitation to the present application.
  • the comparison result can also annotate the clauses with risk, that is, adding a remark column, annotating suggestions and reasons for modifying the clauses, or outputting a revised version according to the clause template and the risk level.
  • the embodiment of the present invention also provides a contract review device, including:
  • Text recognition module 11 used to recognize the text information of the target contract
  • the comparison module 12 is configured to use a text classification model, based on the text information of the target contract, to identify the contract type of the target contract and the clause type of each clause of the target contract, to find, according to the contract type of the target contract and the clause type of each clause, a matching clause template in the sample database for comparing the clause content, and to output the risk level of each clause according to the comparison result.
  • the comparison module 12 finds a matching clause template in the sample database to compare clause contents, and outputs the risk level of each clause according to the comparison result.
  • the sample database includes a plurality of clause templates that match the contract type and clause type of each of the clauses, and the step in which the comparison module 12 finds a matching clause template in the sample database to compare clause content and outputs the risk level of each clause according to the comparison result includes: acquiring, from the sample database, a plurality of clause templates matching the contract type and clause type of each clause; comparing each clause with each of the matching clause templates by word feature vector similarity to determine the clause template with the greatest similarity to that clause; and outputting the risk level of each clause according to the greatest similarity obtained and the risk level of the clause template that has it.
  • the comparison module 12 is further configured to generate a risk report according to the risk level of each of the clauses and send it to the client for confirmation by the client.
  • the contract review apparatus further includes an output module 13, and the output module 13 is configured to output the comparison result by using a text clause output model;
  • the output comparison result includes: showing the differences between the target contract and the matching clause template; showing which party the target contract favors; and/or marking each clause at risk together with the standard template clause that matches it.
  • the contract review apparatus may further include: a storage module 14, configured to save the terms of the target contract to the sample database, so as to update the text classification model with new samples of the sample database.
  • client-confirmed risk reports can be stored.
  • each module in the contract review device provided by this embodiment implements the corresponding step of the contract review method provided by this embodiment. For a specific description of the functions each module can implement, refer to the description of the corresponding steps of the contract review method above; repeated content is not described again.
  • the contract reviewing device can achieve the same technical effect as the above-mentioned contract reviewing method, which will not be repeated here.
  • the text recognition module 11, the comparison module 12, the output module 13 and the storage module 14 may be combined and implemented in one device, or any one of them may be divided into multiple sub-modules; alternatively, at least some of the functions of one or more of these modules may be combined with at least some of the functions of other modules and implemented in one functional module.
  • at least one of the text recognition module 11, the comparison module 12, the output module 13 and the storage module 14 may be at least partially implemented as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on substrate, a system on package or an application specific integrated circuit (ASIC), or may be implemented in any other reasonable manner of integrating or packaging circuits, such as hardware or firmware, or in an appropriate combination of software, hardware and firmware.
  • the contract reviewing method of the embodiment of the present invention can be applied to the contract reviewing apparatus of the embodiment of the present invention.
  • the contract review apparatus can be configured on an electronic device, wherein the electronic device can be a personal computer, a mobile terminal, etc., and the mobile terminal can be a mobile phone, a tablet computer, or other hardware devices with various operating systems.
  • the electronic device includes a processor and a memory, and the memory is used to store a computer program; when the computer program is executed by the processor, the contract review method provided in this embodiment is implemented.
  • the memory may include random access memory (Random Access Memory, RAM), or may include non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory.
  • the memory may also be at least one storage device located away from the aforementioned processor.
  • the processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • This embodiment further provides a readable storage medium, where a computer program is stored in the readable storage medium, and when the computer program is executed by a processor, the contract review method provided in this embodiment is implemented.
  • the readable storage medium can be a tangible device that can hold and store instructions for use by an instruction execution device, such as, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or Flash memory), static random access memory (SRAM), portable compact disc read only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanical coding devices, and any suitable combination of the foregoing.
  • the computer programs described herein can be downloaded to various computing/processing devices from readable storage media, or to external computers or external storage devices over a network such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives the computer program from the network and forwards the computer program for storage in a readable storage medium in the respective computing/processing device.
  • the computer program for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or any other program in one or more programming languages.
  • the computer program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • electronic circuits, such as programmable logic circuits, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), are personalized by utilizing state information of the computer program, and these electronic circuits can execute the computer-readable program instructions to implement various aspects of the present invention.
  • the contract review method, device and readable storage medium provided by the present invention first recognize the text information of the target contract, then use the text classification model, based on that text information, to identify the contract type of the target contract and the clause type of each clause of the target contract, and, according to the contract type of the target contract and the clause type of each clause, find a matching clause template in the sample database to compare the clause content and output the risk level of each clause according to the comparison result. That is, the target contract text is first recognized automatically, the text classification model is then used to search the sample database for the clause template matching the type of each clause of the target contract, automatic comparison and analysis are performed, and the risk points in the contract text are output. Automatically reviewing and proofreading the contract in this way improves the efficiency of contract review.
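The version check mentioned in the list above (comparing the target contract with a historical version and locating each difference point by its position) can be illustrated with a short sketch. This is not the patent's method: the source works from the positions of recognized characters, while the sketch assumes both versions are already plain strings and uses character offsets as positions. The function name `difference_points` is hypothetical; `difflib.SequenceMatcher` is from the Python standard library.

```python
import difflib

def difference_points(original: str, target: str):
    """Return (tag, start, end, original_text, target_text) for every span
    where the target differs from the original. Offsets index `original`."""
    # autojunk=False disables the popularity heuristic so long texts are
    # compared character by character.
    matcher = difflib.SequenceMatcher(None, original, target, autojunk=False)
    return [
        (tag, i1, i2, original[i1:i2], target[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Hypothetical example: one digit changed between versions.
points = difference_points("Payment due in 30 days", "Payment due in 60 days")
for tag, start, end, old, new in points:
    print(f"{tag} at [{start}:{end}]: {old!r} -> {new!r}")
```

Each reported span gives the offsets in the original text, which is enough to highlight the changed passage for a reviewer.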

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Primary Health Care (AREA)
  • Technology Law (AREA)
  • General Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a contract review method and apparatus, and a readable storage medium. First, text information of a target contract is identified; then a contract type of the target contract and term types of terms of the target contract are identified by using a text classification model on the basis of the text information of the target contract; according to the contract type of the target contract and the term types of the terms, a matching term template is found in a sample database for term content comparison, and the risk level of each of the terms is output according to the comparison result. That is, by first automatically identifying text of a target contract, then using a text classification model to search a sample database for a term template matching the type of terms of the target contract, carrying out automatic comparative analysis, and outputting risk points in the contract text, a contract can be automatically reviewed and checked, and the contract review efficiency is improved.

Description

Contract review method, apparatus, and readable storage medium
Technical field
The present invention relates to the technical field of artificial intelligence, and in particular to a contract review method, apparatus, and readable storage medium.
Background
Legal professionals currently draft and revise large numbers of contracts. To reduce the risks in those contracts, each one must be reviewed manually, which consumes a great deal of labor and time. Automatic review and proofreading can improve the efficiency of contract review, especially for contract texts with many clauses.
Summary of the invention
The purpose of the present invention is to provide a contract review method that automatically reviews and proofreads contracts.
To achieve the above purpose, the present invention provides a contract review method, comprising:
recognizing the text information of a target contract;
based on the text information of the target contract, using a text classification model to identify the contract type of the target contract and the clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, finding a matching clause template in a sample database to compare the clause content, and outputting the risk level of each clause according to the comparison result.
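The flow of the method above can be sketched as a small pipeline. Everything named here is an assumption for illustration: `Template`, `ClauseRisk` and `review_contract` are hypothetical, the classifiers and the comparison rule are passed in as plain callables rather than trained models, and returning "unknown" when no template matches is a choice the source does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Template:
    text: str
    risk_level: str

@dataclass
class ClauseRisk:
    clause: str
    clause_type: str
    risk_level: str

def review_contract(
    clauses: List[str],
    contract_type_of: Callable[[List[str]], str],
    clause_type_of: Callable[[str], str],
    sample_db: Dict[Tuple[str, str], Template],
    risk_from_comparison: Callable[[str, Template], str],
) -> List[ClauseRisk]:
    """Classify, look up a template by (contract type, clause type), compare."""
    contract_type = contract_type_of(clauses)
    results = []
    for clause in clauses:
        clause_type = clause_type_of(clause)
        template = sample_db.get((contract_type, clause_type))
        if template is None:
            risk = "unknown"  # assumption: no matching template, no verdict
        else:
            risk = risk_from_comparison(clause, template)
        results.append(ClauseRisk(clause, clause_type, risk))
    return results

# Toy stand-ins for the text classification model and the comparison step.
db = {("sales", "payment"): Template("Payment is due within 30 days.", "low")}
out = review_contract(
    ["Payment is due within 30 days.", "Either party may terminate at will."],
    contract_type_of=lambda cs: "sales",
    clause_type_of=lambda c: "payment" if "Payment" in c else "termination",
    sample_db=db,
    risk_from_comparison=lambda c, t: t.risk_level if c == t.text else "high",
)
for r in out:
    print(r.clause_type, r.risk_level)
```

Passing the models in as callables keeps the sketch independent of any particular text recognition or classification library.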
Optionally, in the contract review method, the step of finding a matching clause template in the sample database to compare clause content and outputting the risk level of each clause according to the comparison result includes:
obtaining, from the sample database, a clause template that matches the contract type and clause type of each clause;
comparing each clause with its matching clause template by word feature vector similarity;
outputting the risk level of each clause according to the similarity obtained from the comparison and the risk level of the matching clause template.
Optionally, in the contract review method, the sample database includes a plurality of clause templates that match the contract type and clause type of each clause, and the step of finding a matching clause template in the sample database, comparing the clause content, and outputting the risk level of each clause according to the comparison result comprises:
obtaining, from the sample database, a plurality of clause templates that match the contract type and clause type of each clause;
comparing each clause with each of its matching clause templates in terms of word feature vector similarity, so as to determine the clause template having the greatest similarity to the clause; and
outputting the risk level of each clause according to the greatest similarity obtained from the comparison and the risk level of the clause template having the greatest similarity.
Optionally, in the contract review method, the clause templates in the sample database include: positive samples that carry no risk or whose risk degree is below a first threshold; and/or negative samples that carry risk or whose risk degree is above a second threshold;
the higher the similarity between a clause and a positive sample, the lower the risk level of the clause; the higher the similarity between a clause and a negative sample, the higher the risk level of the clause.
Optionally, in the contract review method, the text classification model is obtained by training on contract samples in advance.
Optionally, in the contract review method, the step of training on contract samples to obtain the text classification model comprises:
annotating the text paragraphs of each contract sample in the sample database with the contract type, clause type and risk level of each text paragraph; and
training a neural network model on the text paragraphs annotated with contract type, clause type and risk level to obtain the text classification model.
Optionally, in the contract review method, the contract review method further comprises: saving the clauses of the target contract to the sample database, so that the text classification model can be updated with the new samples in the sample database.
Optionally, in the contract review method, the contract review method further comprises:
outputting the comparison result by means of a text clause output model;
wherein the output comparison result includes:
displaying the differences between the target contract and the matching clause template;
displaying which party the target contract favors; and/or
marking the clauses that carry risk together with the standard template clauses that match them.
Optionally, in the contract review method, the contract review method further comprises:
generating a risk report according to the risk level of each clause and sending it to a client for confirmation by the client; and
saving the risk report after the client confirms it.
In another aspect, the present invention further provides a contract review apparatus, comprising:
a text recognition module, configured to identify text information of a target contract; and
a comparison module, configured to identify, based on the text information of the target contract and by means of a text classification model, the contract type of the target contract and the clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, to find a matching clause template in a sample database, compare the clause content against it, and output a risk level for each clause according to the comparison result.
Optionally, in the contract review apparatus, the step in which the comparison module finds a matching clause template in the sample database, compares the clause content, and outputs the risk level of each clause according to the comparison result comprises:
obtaining, from the sample database, a clause template that matches the contract type and clause type of each clause;
comparing each clause with its matching clause template in terms of word feature vector similarity; and
outputting the risk level of each clause according to the similarity obtained from the comparison and the risk level of the matching clause template.
Optionally, in the contract review apparatus, the sample database includes a plurality of clause templates that match the contract type and clause type of each clause, and the step in which the comparison module finds a matching clause template in the sample database, compares the clause content, and outputs the risk level of each clause according to the comparison result comprises:
obtaining, from the sample database, a plurality of clause templates that match the contract type and clause type of each clause;
comparing each clause with each of its matching clause templates in terms of word feature vector similarity, so as to determine the clause template having the greatest similarity to the clause; and
outputting the risk level of each clause according to the greatest similarity obtained from the comparison and the risk level of the clause template having the greatest similarity.
Optionally, in the contract review apparatus, the clause templates in the sample database include: positive samples that carry no risk or whose risk degree is below a first threshold; and/or negative samples that carry risk or whose risk degree is above a second threshold;
the higher the similarity between a clause and a positive sample, the lower the risk level of the clause; the higher the similarity between a clause and a negative sample, the higher the risk level of the clause.
Optionally, in the contract review apparatus, the text classification model is obtained by training on contract samples in advance.
Optionally, in the contract review apparatus, the step of training on contract samples to obtain the text classification model comprises:
annotating the text paragraphs of each contract sample in the sample database with the contract type, clause type and risk level of each text paragraph; and
training a neural network model on the text paragraphs annotated with contract type, clause type and risk level to obtain the text classification model.
Optionally, in the contract review apparatus, the contract review apparatus further comprises: a storage module, configured to save the clauses of the target contract to the sample database, so that the text classification model can be updated with the new samples in the sample database.
Optionally, in the contract review apparatus, the contract review apparatus further comprises an output module, the output module being configured to output the comparison result by means of a text clause output model;
wherein the output comparison result includes:
displaying the differences between the target contract and the matching clause template;
displaying which party the target contract favors; and/or
marking the clauses that carry risk together with the standard template clauses that match them.
Optionally, in the contract review apparatus, the comparison module is further configured to generate a risk report according to the risk level of each clause and send it to a client for confirmation by the client.
The present invention further provides a readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the contract review method described above is implemented.
In summary, the contract review method, apparatus and readable storage medium provided by the present invention comprise: identifying text information of a target contract; based on the text information of the target contract, identifying, by means of a text classification model, the contract type of the target contract and the clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, finding a matching clause template in a sample database, comparing the clause content against it, and outputting a risk level for each clause according to the comparison result. That is, the text of the target contract is first recognized automatically; the text classification model is then used to search the sample database for clause templates that match the types of the clauses of the target contract; an automatic comparison and analysis is performed; and the risk points in the contract text are output. Automatically reviewing and proofreading the contract in this way improves the efficiency of contract review.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart of a contract review method provided by an embodiment of the present invention;
FIG. 2 is a block diagram of a contract review apparatus provided by an embodiment of the present invention;
wherein the reference numerals are as follows:
11 - text recognition module; 12 - comparison module; 13 - output module; 14 - storage module.
DETAILED DESCRIPTION
The contract review method, apparatus and readable storage medium proposed by the present invention are described in further detail below with reference to the accompanying drawings and specific embodiments. The advantages and features of the present invention will become clearer from the following description. It should be noted that the drawings are in a highly simplified form and are not drawn to precise scale; they serve only to assist in explaining the embodiments of the present invention conveniently and clearly. Furthermore, the structures shown in the drawings are often only part of the actual structure. In particular, because each drawing emphasizes something different, different scales are sometimes used.
It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises the element.
As shown in FIG. 1, this embodiment provides a contract review method, which comprises the following steps:
S11, identifying text information of a target contract;
S12, based on the text information of the target contract, identifying, by means of a text classification model, the contract type of the target contract and the clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, finding a matching clause template in a sample database, comparing the clause content against it, and outputting a risk level for each clause according to the comparison result.
In the contract review method provided by this embodiment, the text of the target contract is first recognized automatically; the text classification model is then used to search the sample database for clause templates that match the types of the clauses of the target contract; an automatic comparison and analysis is performed; and the risk points in the contract text are output. Automatically reviewing and proofreading the contract in this way improves the efficiency of contract review.
The above steps are described in further detail below.
In step S11, the target contract may be presented as a Word, PDF, PPT, TXT or image file. The text information of a contract in any of these formats can be recognized by a character recognition model.
When identifying the text information of the target contract, the target contract may also be compared with the contract texts of different historical versions, to determine whether the content of the target contract is consistent with that of the original contract (any historical version of the contract) and thereby guard against tampering with the target contract text. The target contract and the original contract may be in the same format or in different formats (for example, one or more of Word, PDF, PPT and TXT).
Specifically, the target contract may be compared with the original contract as follows:
obtaining an image of the target contract and an electronic document of the original contract;
identifying each line of character regions in the image of the target contract based on a pre-trained region recognition model, the region recognition model being a neural-network-based model;
recognizing the character content in each line of character regions based on a pre-trained character recognition model to obtain recognized characters, the character recognition model being a neural-network-based model; obtaining position information of the recognized characters;
generating an electronic document of the target contract according to the position information and the recognized characters; comparing the content of the electronic document of the target contract with that of the electronic document of the original contract; and
determining, according to the comparison result, whether there are differences between the target contract and the original contract, and locating each difference according to its position information.
The above steps complete the comparison between the target contract and the original contract. If a difference exists, the target contract may have been tampered with; the located differences are verified first, and step S12 is then performed.
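The final comparison-and-locating step above can be sketched as follows. This is only an illustration under stated assumptions: both contracts are assumed to have already been converted to lists of text lines (the OCR stage is omitted), line indices stand in for the position information, and the function name `locate_differences` is hypothetical. The comparison itself uses `difflib.SequenceMatcher` from the Python standard library, which the source does not mention.

```python
import difflib
from typing import Dict, List


def locate_differences(original_lines: List[str], target_lines: List[str]) -> List[Dict]:
    """Return one record per differing span, with line positions in both documents."""
    matcher = difflib.SequenceMatcher(a=original_lines, b=target_lines, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical spans are not differences
        differences.append({
            "kind": tag,                       # 'replace', 'delete' or 'insert'
            "original_span": (i1, i2),         # half-open line range in the original
            "target_span": (j1, j2),           # half-open line range in the target
            "original_text": original_lines[i1:i2],
            "target_text": target_lines[j1:j2],
        })
    return differences


original = ["Party A pays 100.", "Term: 1 year."]
target = ["Party A pays 900.", "Term: 1 year."]
for diff in locate_differences(original, target):
    print(diff["kind"], diff["original_span"], diff["target_text"])
```

An empty result means no difference was found, in which case step S12 can proceed directly.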
In step S12, when the text classification model is used to find a matching clause template in the sample database, compare the clause content, and output the risk level of each clause according to the comparison result, the following steps may be adopted:
obtaining, from the sample database, a clause template that matches the contract type and clause type of each clause; comparing each clause with its matching clause template in terms of word feature vector similarity; and outputting the risk level of each clause according to the similarity obtained from the comparison and the risk level of the matching clause template.
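The word feature vector comparison can be sketched as follows. This is a minimal illustration, not the method as claimed: the source does not say how the word feature vector is built or which similarity measure is used, so a bag-of-words count vector and cosine similarity are assumed here, and all function names are hypothetical. Tokenizing with `\w+` assumes whitespace-delimited words; Chinese contract text would first need a word segmenter.

```python
import math
import re
from collections import Counter


def word_vector(text: str) -> Counter:
    """Bag-of-words feature vector: lowercase word -> occurrence count."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def clause_similarity(clause: str, template: str) -> float:
    """Similarity between a contract clause and a clause template."""
    return cosine_similarity(word_vector(clause), word_vector(template))


print(clause_similarity("The tenant shall pay rent monthly.",
                        "The tenant shall pay rent quarterly."))
```

A trained embedding could replace `word_vector` without changing the rest of the comparison.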
In addition, preferably, when the sample database is being built, a plurality of clause templates having the same contract type and clause type are collected and saved to the sample database, so that, when the text classification model trained on the sample database is used for clause comparison, the sample database includes a plurality of clause templates that match the contract type and clause type of each clause. On this basis, in step S12, when the text classification model is used to find a matching clause template in the sample database, compare the clause content, and output the risk level of each clause according to the comparison result, the following steps may be adopted:
obtaining, from the sample database, a plurality of clause templates that match the contract type and clause type of each clause; comparing each clause with each of its matching clause templates in terms of word feature vector similarity, so as to determine the clause template having the greatest similarity to the clause; and outputting the risk level of each clause according to the greatest similarity obtained from the comparison and the risk level of the clause template having the greatest similarity.
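Selecting the most similar template can be sketched as below. Several points are assumptions rather than anything stated in the source: the `ClauseTemplate` record, the word-set overlap (`jaccard`) used as a stand-in for the word feature vector similarity, and the rule that a clause inherits the best template's risk level only when the similarity reaches `min_similarity` (the source says only that both the greatest similarity and the template's risk level are taken into account).

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class ClauseTemplate:
    contract_type: str
    clause_type: str
    text: str
    risk_level: str


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity: overlap of lowercase word sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def best_template(clause: str, templates: Sequence[ClauseTemplate],
                  similarity: Callable[[str, str], float] = jaccard
                  ) -> Tuple[ClauseTemplate, float]:
    """Return the template with the greatest similarity, and that similarity."""
    if not templates:
        raise ValueError("no matching templates")
    scored = [(t, similarity(clause, t.text)) for t in templates]
    return max(scored, key=lambda pair: pair[1])


def clause_risk(clause: str, templates: Sequence[ClauseTemplate],
                min_similarity: float = 0.5) -> str:
    """Assumed rule: inherit the best template's risk level if similar enough."""
    template, score = best_template(clause, templates)
    return template.risk_level if score >= min_similarity else "unmatched"


templates = [
    ClauseTemplate("lease", "payment", "rent is due monthly in advance", "low"),
    ClauseTemplate("lease", "payment", "rent may be changed by landlord at any time", "high"),
]
print(clause_risk("rent is due monthly", templates))
```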
The text classification model is obtained by training on contract samples in advance. The contract samples are the contract samples in the sample database. Specifically, the following steps may be adopted to train on the contract samples and obtain the text classification model:
annotating the text paragraphs of each contract sample in the sample database with the contract type, clause type and risk level of each text paragraph, as shown in Table 1; and training a neural network model on the text paragraphs annotated with contract type, clause type and risk level to obtain the text classification model.
In this way, once the text information of the target contract has been identified, the text classification model can be used to identify the contract type of the target contract, the clause type of each clause, and the risk degree of each clause.
In this embodiment, the risk level can be presented in various ways: for example, identifiers such as a, b, c and d may distinguish the different levels, or textual labels such as high, relatively high, low and relatively low may be used.
Table 1
Contract type    Clause type    Content         Risk level
WWWWWW           wwwwww         AAAAAAAA        aaaaaa
XXXXXX           xxxxxx         BBBBBBBB        bbbbbb
YYYYYY           yyyyyy         CCCCCCCCCC      cccccc
ZZZZZZ           zzzzzz         DDDDDDDDDDDD    dddddd
...              ...            ...             ...
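The annotation scheme of Table 1 can be represented in code roughly as follows. This sketch covers only data preparation: each annotated paragraph becomes a record with the three labels, and each label column is mapped to integer class ids of the kind a classifier would be trained on. The record type, the function names and the sample values are all hypothetical, and the neural network itself is deliberately left out because the source does not specify its architecture.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class AnnotatedParagraph:
    """One row of Table 1: a text paragraph with its three annotations."""
    text: str
    contract_type: str
    clause_type: str
    risk_level: str


def build_label_index(samples: Sequence[AnnotatedParagraph], field: str) -> Dict[str, int]:
    """Map each distinct value of the given label field to an integer class id."""
    labels = sorted({getattr(s, field) for s in samples})
    return {label: i for i, label in enumerate(labels)}


def to_training_pairs(samples: Sequence[AnnotatedParagraph], field: str) -> List[Tuple[str, int]]:
    """Produce (text, class id) pairs for one label field."""
    index = build_label_index(samples, field)
    return [(s.text, index[getattr(s, field)]) for s in samples]


samples = [
    AnnotatedParagraph("Rent is due monthly.", "lease", "payment", "low"),
    AnnotatedParagraph("Seller may cancel at will.", "sale", "termination", "high"),
]
print(to_training_pairs(samples, "risk_level"))
```

Calling `to_training_pairs` once per label field yields three training sets, which would fit either a single multi-output model or separate models per label.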
In some other embodiments, after the contract type, clause type and risk level of each text paragraph have been annotated, a neural network model may be trained on the contract types to obtain a first classification model (also called a contract type recognition model), and a neural network model may be trained on the clause types to obtain a second classification model (also called a clause classification model). The first classification model is then used to identify the contract type of the target contract, and the second classification model is used to identify the clause type of each clause of the target contract. A character matching recognition model is then used, according to the contract type of the target contract and the clause type of each clause, to find a matching clause template in the sample database, compare the clause content, and output the risk level of each clause according to the comparison result.
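The two-model arrangement can be sketched as a small pipeline in which the classifiers are passed in as functions. This shows only the control flow. The keyword rules below merely stand in for the trained first and second classification models, the `(contract type, clause type)` dictionary stands in for the sample database, and every name is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Classifier = Callable[[str], str]
TemplateIndex = Dict[Tuple[str, str], List[str]]


def match_templates(contract_text: str, clauses: List[str],
                    classify_contract: Classifier, classify_clause: Classifier,
                    template_index: TemplateIndex) -> List[Tuple[str, str, List[str]]]:
    """For each clause, return (clause, clause type, candidate templates)."""
    contract_type = classify_contract(contract_text)   # first classification model
    matched = []
    for clause in clauses:
        clause_type = classify_clause(clause)           # second classification model
        templates = template_index.get((contract_type, clause_type), [])
        matched.append((clause, clause_type, templates))
    return matched


# Keyword rules standing in for trained models (illustration only).
def toy_contract_classifier(text: str) -> str:
    return "lease" if "rent" in text.lower() else "other"


def toy_clause_classifier(clause: str) -> str:
    return "payment" if "pay" in clause.lower() else "general"


index = {("lease", "payment"): ["The tenant shall pay rent monthly."]}
for row in match_templates("Rent agreement",
                           ["Tenant shall pay rent on the 1st.",
                            "This lease is governed by local law."],
                           toy_contract_classifier, toy_clause_classifier, index):
    print(row)
```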
As can be seen from the above description, the text classification model may be a single overall classification model or may comprise several sub-classification models. The specific form of the text classification model does not limit the present application, provided that the model can be used to identify the contract type of the target contract, the clause type of each clause, and the risk degree of each clause.
In addition, during sample training, the samples in the sample database may also be classified into positive samples and negative samples according to the risk degree of the clauses. If a clause text in a contract text carries no risk, or its risk degree is below a first threshold, the clause text may be stored in the sample database as a positive sample of its clause type; if a clause text in a contract text carries risk, or its risk degree is above a second threshold, the clause text may be stored in the sample database as a negative sample of its clause type. Accordingly, in the comparison of step S12, the higher the similarity between a clause and a positive sample, the lower the risk level of the clause, and the higher the similarity between a clause and a negative sample, the higher the risk level of the clause.
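One way to turn the two similarities into a risk level is sketched below. The source states only the direction of the relationship (more similar to a positive sample means lower risk, more similar to a negative sample means higher risk). The linear score, the cutoffs and the three level names used here are assumptions chosen for illustration.

```python
def risk_score(sim_positive: float, sim_negative: float) -> float:
    """Score in [0, 1] for similarities in [0, 1].

    Decreases as similarity to the positive sample grows and increases as
    similarity to the negative sample grows (assumed linear combination).
    """
    return (1.0 + sim_negative - sim_positive) / 2.0


def risk_level(score: float, low_cutoff: float = 0.4, high_cutoff: float = 0.6) -> str:
    """Map a score to a coarse level; the cutoffs are illustrative assumptions."""
    if score < low_cutoff:
        return "low"
    if score > high_cutoff:
        return "high"
    return "medium"


# A clause close to a positive sample and far from a negative sample.
print(risk_level(risk_score(sim_positive=0.9, sim_negative=0.1)))
```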
Further, the clauses of the target contract may also be saved to the sample database, so that the text classification model can be updated with the new samples in the sample database.
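Saving reviewed clauses for later retraining could look like the sketch below. The source does not describe how the sample database is stored, so an SQLite table is assumed, with columns mirroring Table 1 plus a hypothetical `used_for_training` flag that marks which samples the model has not yet seen.

```python
import sqlite3
from typing import List, Tuple


def open_sample_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the sample database; columns mirror Table 1."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "id INTEGER PRIMARY KEY, "
        "contract_type TEXT NOT NULL, "
        "clause_type TEXT NOT NULL, "
        "content TEXT NOT NULL, "
        "risk_level TEXT NOT NULL, "
        "used_for_training INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def save_clause(conn: sqlite3.Connection, contract_type: str, clause_type: str,
                content: str, risk_level: str) -> int:
    """Store one reviewed clause as a new sample and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO samples (contract_type, clause_type, content, risk_level) "
            "VALUES (?, ?, ?, ?)",
            (contract_type, clause_type, content, risk_level),
        )
    return cur.lastrowid


def pending_samples(conn: sqlite3.Connection) -> List[Tuple[str, str, str, str]]:
    """Samples not yet used to update the text classification model."""
    return conn.execute(
        "SELECT contract_type, clause_type, content, risk_level "
        "FROM samples WHERE used_for_training = 0 ORDER BY id"
    ).fetchall()


conn = open_sample_db()
save_clause(conn, "lease", "payment", "Rent is due monthly.", "low")
print(pending_samples(conn))
```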
Optionally, in step S12, in addition to outputting the risk level of each clause according to the comparison result, a risk report may be generated according to the risk level of each clause and sent to a client for confirmation by the client, and the risk report may be saved after the client confirms it.
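The report-then-confirm flow can be sketched as follows. The report layout, the field names and the use of a plain dictionary as the report store are assumptions. The source specifies only that a report is generated from the clause risk levels, sent to the client, and saved once the client confirms it.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def build_risk_report(contract_id: str,
                      clause_risks: List[Tuple[int, str, str]]) -> Dict:
    """clause_risks holds (clause number, clause type, risk level) triples."""
    return {
        "contract_id": contract_id,
        "clauses": [
            {"number": n, "clause_type": t, "risk_level": r}
            for n, t, r in clause_risks
        ],
        "summary": dict(Counter(r for _, _, r in clause_risks)),
        "confirmed": False,
    }


def confirm_and_save(report: Dict, confirmed_by_client: bool,
                     store: Dict[str, str]) -> bool:
    """Save the report as JSON only after the client has confirmed it."""
    if not confirmed_by_client:
        return False
    saved = dict(report, confirmed=True)  # copy; the caller's report is unchanged
    store[report["contract_id"]] = json.dumps(saved, ensure_ascii=False)
    return True


report = build_risk_report("C-001", [(1, "payment", "high"), (2, "term", "low")])
store: Dict[str, str] = {}
print(confirm_and_save(report, confirmed_by_client=True, store=store))
```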
In addition, the contract review method provided by this embodiment may further comprise: outputting the comparison result by means of a text clause output model. The output comparison result includes: displaying the differences between the target contract and the matching clause template; displaying which party the target contract favors (that is, whether a clause favors one's own side, favors the counterparty, or is neutral); and/or marking the clauses that carry risk together with the standard template clauses that match them.
That is to say, in this embodiment, the comparison result of step S12 can be presented in various forms, and the examples listed above do not limit the present application. For example, the comparison result may also annotate the clauses that carry risk, that is, add an annotation column containing suggested modifications to the clause and the reasons for them; or it may output a version revised according to the clause template and the risk degree.
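Displaying the differences between a clause and its matching template could be done along the following lines. This is a sketch only: the text clause output model is not specified in the source, and the word-level comparison with `difflib.SequenceMatcher` and the `[-removed-]` / `{+added+}` markup are assumptions made for illustration.

```python
import difflib


def mark_differences(template: str, clause: str) -> str:
    """Return the clause text with word-level differences from the template marked.

    Words present only in the template are wrapped as [-...-];
    words present only in the clause are wrapped as {+...+}.
    """
    a, b = template.split(), clause.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        if i1 < i2:  # template words missing from the clause
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if j1 < j2:  # clause words missing from the template
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


print(mark_differences("Payment is due within 30 days",
                       "Payment is due within 90 days"))
```

The marked string could be placed in the annotation column mentioned above, next to the suggested modification.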
An embodiment of the present invention further provides a contract review apparatus, comprising:
a text recognition module 11, configured to identify text information of a target contract; and
a comparison module 12, configured to identify, based on the text information of the target contract and by means of a text classification model, the contract type of the target contract and the clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, to find a matching clause template in a sample database, compare the clause content against it, and output a risk level for each clause according to the comparison result.
The step in which the comparison module 12 finds a matching clause template in the sample database, compares the clause content, and outputs the risk level of each clause according to the comparison result comprises: obtaining, from the sample database, a clause template that matches the contract type and clause type of each clause; comparing each clause with its matching clause template in terms of word feature vector similarity; and outputting the risk level of each clause according to the similarity obtained from the comparison and the risk level of the matching clause template.
Further, where the sample database includes a plurality of clause templates that match the contract type and clause type of each clause, the step in which the comparison module 12 finds a matching clause template in the sample database, compares the clause content, and outputs the risk level of each clause according to the comparison result comprises: obtaining, from the sample database, a plurality of clause templates that match the contract type and clause type of each clause; comparing each clause with each of its matching clause templates in terms of word feature vector similarity, so as to determine the clause template having the greatest similarity to the clause; and outputting the risk level of each clause according to the greatest similarity obtained from the comparison and the risk level of the clause template having the greatest similarity.
The clause templates and the text classification model have been described in detail above and are not described again here. In addition to outputting the risk level of each clause according to the comparison result, the comparison module 12 is optionally further configured to generate a risk report according to the risk level of each clause and send it to a client for confirmation by the client.
In addition, corresponding to the contract review method provided by this embodiment, the contract review apparatus further comprises an output module 13, configured to output the comparison result by means of a text clause output model. The output comparison result includes: displaying the differences between the target contract and the matching clause template; displaying which party the target contract favors; and/or marking the clauses that carry risk together with the standard template clauses that match them, and so on.
The contract review apparatus may further comprise a storage module 14, configured to save the clauses of the target contract to the sample database, so that the text classification model can be updated with the new samples in the sample database. The storage module may also store the risk reports confirmed by the client.
In short, the modules of the contract review apparatus provided by this embodiment respectively implement the steps of the contract review method provided by this embodiment. For a detailed description of the functions each module can implement, reference may therefore be made to the description of the corresponding steps of the contract review method above, which is not repeated here. Furthermore, the contract review apparatus achieves the same technical effects as the contract review method above, which are likewise not repeated here.
It can be understood that, in the contract review apparatus, the text recognition module 11, the comparison module 12, the output module 13 and the storage module 14 may be combined and implemented in a single device, or any one of them may be split into a plurality of sub-modules; alternatively, at least part of the functions of one or more of the text recognition module 11, the comparison module 12, the output module 13 and the storage module 14 may be combined with at least part of the functions of other modules and implemented in a single functional module. According to embodiments of the present invention, at least one of the text recognition module 11, the comparison module 12, the output module 13 and the storage module 14 may be implemented at least partially as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on substrate, a system in package or an application specific integrated circuit (ASIC); or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging circuits; or may be implemented in an appropriate combination of software, hardware and firmware.
As can be seen from the above description, the contract review method of the embodiment of the present invention can be applied to the contract review apparatus of the embodiment of the present invention. In addition, the contract review apparatus can be deployed on an electronic device. The electronic device may be a personal computer, a mobile terminal, or the like, and the mobile terminal may be a hardware device running any of various operating systems, such as a mobile phone or a tablet computer. The electronic device includes a processor and a memory, and the memory is used to store a computer program; when the computer program is executed by the processor, the contract review method provided by this embodiment is implemented.
In the electronic device, the memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk memory.
Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
This embodiment further provides a readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the contract review method provided by this embodiment is implemented.
The readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device, and any suitable combination of the foregoing. The computer program described here can be downloaded from the readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer program from the network and forwards it for storage in a readable storage medium within the respective computing/processing device. The computer program for carrying out the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using state information of the computer program, and the electronic circuit can then execute the computer-readable program instructions to implement the various aspects of the present invention.
In summary, the contract review method, apparatus, and readable storage medium provided by the present invention first recognize the text information of a target contract; then, based on that text information, use a text classification model to identify the contract type of the target contract and the clause type of each of its clauses; and, according to the contract type and the clause type of each clause, find a matching clause template in a sample database, compare the clause content against it, and output the risk level of each clause according to the comparison result. That is, the target contract text is recognized automatically, the text classification model is used to search the sample database for clause templates that match the types of the clauses of the target contract, and an automatic comparison and analysis is performed to output the locations of the risk points in the contract text. Automatically reviewing and proofreading the contract in this way improves the efficiency of contract review.
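The flow summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: bag-of-words cosine similarity stands in for the word feature vector similarity, the clause types are assumed to have already been produced by the text classification model, and the names `ClauseTemplate`, `risk_level`, `review` and the 0.34/0.67 thresholds are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ClauseTemplate:
    text: str
    is_positive: bool  # True: low-risk (positive) sample; False: risky (negative) sample


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of bag-of-words counts (stand-in for word feature vectors)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def risk_level(clause: str, templates: List[ClauseTemplate]) -> str:
    """Pick the most similar template, then map the similarity to a risk level."""
    best = max(templates, key=lambda t: cosine_similarity(clause, t.text))
    sim = cosine_similarity(clause, best.text)
    # Closer to a positive sample -> lower risk; closer to a negative sample -> higher risk.
    score = 1.0 - sim if best.is_positive else sim
    if score < 0.34:  # thresholds are illustrative assumptions
        return "low"
    return "medium" if score < 0.67 else "high"


def review(clauses: List[Tuple[str, str]], contract_type: str,
           database: Dict[Tuple[str, str], List[ClauseTemplate]]) -> Dict[str, str]:
    """clauses: (clause_type, clause_text) pairs, as a classifier would label them."""
    report = {}
    for clause_type, text in clauses:
        templates = database.get((contract_type, clause_type), [])
        report[text] = risk_level(text, templates) if templates else "unknown"
    return report
```

Keying the sample database by the pair (contract type, clause type) keeps the comparison restricted to templates of the same kind, which is the role the classification step plays in the description; a clause with no matching template is reported as "unknown" rather than given a guessed level.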
The above is only a description of preferred embodiments of the present invention and does not limit the scope of the present invention in any way. Any changes or modifications made by those of ordinary skill in the art on the basis of the above disclosure fall within the scope of protection of the claims.

Claims (19)

  1. A contract review method, characterized by comprising:
    recognizing text information of a target contract;
    based on the text information of the target contract, identifying, by using a text classification model, a contract type of the target contract and a clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, finding a matching clause template in a sample database, comparing clause content against it, and outputting a risk level of each clause according to the comparison result.
  2. The contract review method according to claim 1, characterized in that the step of finding a matching clause template in a sample database, comparing clause content, and outputting a risk level of each clause according to the comparison result comprises:
    obtaining, from the sample database, a clause template matching the contract type and clause type of each clause;
    comparing each clause with the matching clause template in terms of word feature vector similarity;
    outputting the risk level of each clause according to the similarity obtained by the comparison and the risk level of the matching clause template.
  3. The contract review method according to claim 1, characterized in that the sample database includes a plurality of clause templates matching the contract type and clause type of each clause, and the step of finding a matching clause template in a sample database, comparing clause content, and outputting a risk level of each clause according to the comparison result comprises:
    obtaining, from the sample database, a plurality of clause templates matching the contract type and clause type of each clause;
    comparing each clause with each of the plurality of matching clause templates in terms of word feature vector similarity, to determine the clause template having the greatest similarity to each clause;
    outputting the risk level of each clause according to the greatest similarity obtained by the comparison and the risk level of the clause template having the greatest similarity.
  4. The contract review method according to claim 2 or 3, characterized in that the clause templates in the sample database include: positive samples that carry no risk or whose risk degree is below a first threshold, and/or negative samples that carry risk or whose risk degree is above a second threshold;
    the higher the similarity between a clause and a positive sample, the lower the risk level of the clause; the higher the similarity between a clause and a negative sample, the higher the risk level of the clause.
  5. The contract review method according to claim 1, characterized in that the text classification model is obtained by training on contract samples in advance.
  6. The contract review method according to claim 5, characterized in that the step of training on contract samples to obtain the text classification model comprises:
    annotating the text paragraphs of each contract sample in the sample database, so as to mark the contract type, clause type, and risk level of each text paragraph;
    training on the text paragraphs annotated with contract type, clause type, and risk level by using a neural network model, to obtain the text classification model.
  7. The contract review method according to claim 6, characterized in that the contract review method further comprises: saving the clauses of the target contract to the sample database, so as to update the text classification model with the new samples in the sample database.
  8. The contract review method according to claim 1, characterized in that the contract review method further comprises:
    outputting the comparison result by using a text clause output model;
    wherein outputting the comparison result includes:
    displaying the differences between the target contract and the matching clause template;
    displaying the favored party of the target contract; and/or
    marking the clauses that carry risk and the standard template clauses that match the clauses that carry risk.
  9. The contract review method according to claim 1, characterized in that the contract review method further comprises:
    generating a risk report according to the risk level of each clause and sending it to a client for confirmation by the client; and
    saving the risk report after the client confirms it.
  10. A contract review apparatus, characterized by comprising:
    a text recognition module, configured to recognize text information of a target contract;
    a comparison module, configured to: based on the text information of the target contract, identify, by using a text classification model, a contract type of the target contract and a clause type of each clause of the target contract; and, according to the contract type of the target contract and the clause type of each clause, find a matching clause template in a sample database, compare clause content against it, and output a risk level of each clause according to the comparison result.
  11. The contract review apparatus according to claim 10, characterized in that the step in which the comparison module finds a matching clause template in the sample database, compares clause content, and outputs a risk level of each clause according to the comparison result comprises:
    obtaining, from the sample database, a clause template matching the contract type and clause type of each clause;
    comparing each clause with the matching clause template in terms of word feature vector similarity;
    outputting the risk level of each clause according to the similarity obtained by the comparison and the risk level of the matching clause template.
  12. The contract review apparatus according to claim 10, characterized in that the sample database includes a plurality of clause templates matching the contract type and clause type of each clause, and the step in which the comparison module finds a matching clause template in the sample database, compares clause content, and outputs a risk level of each clause according to the comparison result comprises:
    obtaining, from the sample database, a plurality of clause templates matching the contract type and clause type of each clause;
    comparing each clause with each of the plurality of matching clause templates in terms of word feature vector similarity, to determine the clause template having the greatest similarity to each clause;
    outputting the risk level of each clause according to the greatest similarity obtained by the comparison and the risk level of the clause template having the greatest similarity.
  13. The contract review apparatus according to claim 11 or 12, characterized in that the clause templates in the sample database include: positive samples that carry no risk or whose risk degree is below a first threshold, and/or negative samples that carry risk or whose risk degree is above a second threshold;
    the higher the similarity between a clause and a positive sample, the lower the risk level of the clause; the higher the similarity between a clause and a negative sample, the higher the risk level of the clause.
  14. The contract review apparatus according to claim 13, characterized in that the text classification model is obtained by training on contract samples in advance.
  15. The contract review apparatus according to claim 14, characterized in that the step of training on contract samples to obtain the text classification model comprises:
    annotating the text paragraphs of each contract sample in the sample database, so as to mark the contract type, clause type, and risk level of each text paragraph;
    training on the text paragraphs annotated with contract type, clause type, and risk level by using a neural network model, to obtain the text classification model.
  16. The contract review apparatus according to claim 15, characterized in that the contract review apparatus further comprises: a storage module, configured to save the clauses of the target contract to the sample database, so as to update the text classification model with the new samples in the sample database.
  17. The contract review apparatus according to claim 10, characterized in that the contract review apparatus further comprises an output module, the output module being configured to output the comparison result by using a text clause output model;
    wherein outputting the comparison result includes:
    displaying the differences between the target contract and the matching clause template;
    displaying the favored party of the target contract; and/or
    marking the clauses that carry risk and the standard template clauses that match the clauses that carry risk.
  18. The contract review apparatus according to claim 10, characterized in that the comparison module is further configured to generate a risk report according to the risk level of each clause and send it to a client for confirmation by the client.
  19. A readable storage medium, characterized in that a computer program is stored in the readable storage medium, and when the computer program is executed by a processor, the contract review method according to any one of claims 1 to 9 is implemented.
PCT/CN2021/132929 2020-11-26 2021-11-24 Contract review method and apparatus, and readable storage medium WO2022111548A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011347815.9 2020-11-26
CN202011347815.9A CN112330214A (en) 2020-11-26 2020-11-26 Contract review method and device and readable storage medium

Publications (1)

Publication Number Publication Date
WO2022111548A1 true WO2022111548A1 (en) 2022-06-02

Family

ID=74307972

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/132929 WO2022111548A1 (en) 2020-11-26 2021-11-24 Contract review method and apparatus, and readable storage medium

Country Status (2)

Country Link
CN (1) CN112330214A (en)
WO (1) WO2022111548A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116152843A (en) * 2022-11-22 2023-05-23 南京擎盾信息科技有限公司 Category identification method, device and storage medium for contract template to be filled-in content
CN116384387A (en) * 2023-01-04 2023-07-04 深圳擎盾信息科技有限公司 Automatic combination and examination method and device
CN116976683A (en) * 2023-09-25 2023-10-31 江铃汽车股份有限公司 Automatic auditing method, system, storage medium and device for contract clauses

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330214A (en) * 2020-11-26 2021-02-05 杭州睿胜软件有限公司 Contract review method and device and readable storage medium
CN112950017A (en) * 2021-02-26 2021-06-11 云账户技术(天津)有限公司 Contract risk identification method and device and electronic equipment
CN112926299B (en) * 2021-03-29 2024-04-09 杭州天谷信息科技有限公司 Text comparison method, contract review method and auditing system
CN113326684B (en) * 2021-08-03 2021-11-09 江苏金恒信息科技股份有限公司 Contract signing management method, system and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918635A (en) * 2017-12-12 2019-06-21 中兴通讯股份有限公司 A kind of contract text risk checking method, device, equipment and storage medium
CN110163478A (en) * 2019-04-18 2019-08-23 平安科技(深圳)有限公司 A kind of the risk checking method and device of contract terms
CN112330214A (en) * 2020-11-26 2021-02-05 杭州睿胜软件有限公司 Contract review method and device and readable storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116152843A (en) * 2022-11-22 2023-05-23 南京擎盾信息科技有限公司 Category identification method, device and storage medium for contract template to be filled-in content
CN116152843B (en) * 2022-11-22 2024-01-12 南京擎盾信息科技有限公司 Category identification method, device and storage medium for contract template to be filled-in content
CN116384387A (en) * 2023-01-04 2023-07-04 深圳擎盾信息科技有限公司 Automatic combination and examination method and device
CN116976683A (en) * 2023-09-25 2023-10-31 江铃汽车股份有限公司 Automatic auditing method, system, storage medium and device for contract clauses
CN116976683B (en) * 2023-09-25 2024-02-27 江铃汽车股份有限公司 Automatic auditing method, system, storage medium and device for contract clauses

Also Published As

Publication number Publication date
CN112330214A (en) 2021-02-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 21897048; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 21897048; Country of ref document: EP; Kind code of ref document: A1