CN116227487A

CN116227487A - Legal text risk point intelligent auditing system

Info

Publication number: CN116227487A
Application number: CN202310031069.XA
Authority: CN
Inventors: 华涛; 郝锦程; 周志明
Original assignee: Zhejiang Fazhidao Information Technology Co ltd
Current assignee: Zhejiang Fazhidao Information Technology Co ltd
Priority date: 2023-01-10
Filing date: 2023-01-10
Publication date: 2023-06-06
Anticipated expiration: 2043-01-10
Also published as: CN116227487B

Abstract

The invention relates to the technical field of text data analysis and discloses an intelligent auditing system for legal text risk points, comprising an operation terminal, a server and a service terminal. The operation terminal is used by a user to upload the document to be audited. The server comprises a grabbing module, a word segmentation module, an analysis module, a detection data generation module and a risk point library. The service terminal provides the user with the important sentences related to the risk points. The system grabs important vocabulary in the document to be audited according to preset grabbing rules, locates the important sentences containing that vocabulary, segments those sentences into words, and then generates corresponding detection data in picture format, so that the AI-based risk point library can conveniently judge the risk of the detection data and obtain the risk points. Suspicious, risky or erroneous passages in legal texts can thus be checked efficiently, which improves the efficiency of legal work and reduces the workload of legal staff.

Description

Legal text risk point intelligent auditing system

Technical Field

The invention relates to the technical field of text data analysis, in particular to an intelligent auditing system for legal text risk points.

Background

Risk point auditing is an indispensable step for legal staff when drafting legal documents. It aims to ensure that the legal documents are authentic, lawful and valid, and to avoid possible problems in advance.

Existing risk point auditing is mainly done manually, which is inefficient and has many limitations. It requires senior lawyers, who are few in number, so manual auditing is costly and slow; risks cannot be fully identified, and both the identification rate and the accuracy are low.

To solve this problem, the legal text risk point intelligent auditing system combines machine learning algorithms, text analysis and natural language processing, deep learning, big data and related technologies. Professionals set specific risk point auditing principles and standards, and the machine learns these principles and standards automatically by means of artificial intelligence. The content of the legal document to be audited is then compared against them to find missing data and possible risk points in the text, and risk prompts are issued.

Disclosure of Invention

The invention aims to provide an intelligent auditing system for legal text risk points, which solves the following technical problems:

how to provide a legal text risk point intelligent auditing system capable of improving the working efficiency of legal workers.

The aim of the invention can be achieved by the following technical scheme:

the legal text risk point intelligent auditing system comprises an operation terminal, a server and a service terminal:

the operation terminal is used by a user to upload the document to be audited;

the server includes:

the grabbing module is connected with the operation terminal and used for grabbing important words of the document to be checked according to preset grabbing rules;

the word segmentation module is connected with the grabbing module and used for splitting important sentences where the important vocabularies are located according to a preset word segmentation rule to obtain segmented words;

the analysis module is connected with the word segmentation module and used for tagging the part of speech of the segmented words and performing syntactic analysis on the segmented words according to a dependency analysis model;

the detection data generation module is connected with the word segmentation module and the analysis module and is used for generating corresponding detection data according to the syntactic analysis result of the important sentence;

the risk point library is connected with the detection data generation module and is used for obtaining risk points by comparison against the detection data;

the service terminal is connected with the risk point library and used for providing the important sentences related to the risk points for the user.
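The chain of modules listed above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions: the keyword set `IMPORTANT_WORDS`, the whitespace-based `segment` function and the `score` callable are hypothetical stand-ins for the preset grabbing rules, the preset word segmentation rule and the risk point library, none of which are specified in the text. The analysis and detection data generation steps are folded into `score` for brevity.

```python
import re
from typing import Callable, List

# Hypothetical grabbing rule: a fixed keyword set stands in for the preset rules.
IMPORTANT_WORDS = {"damage", "penalty", "terminate"}


def grab_important_sentences(document: str) -> List[str]:
    """Grabbing module: keep the sentences that contain an important word."""
    sentences = [s.strip() for s in re.split(r"[.!?]", document) if s.strip()]
    return [s for s in sentences if IMPORTANT_WORDS & set(s.lower().split())]


def segment(sentence: str) -> List[str]:
    """Word segmentation module: whitespace splitting as a placeholder rule."""
    return sentence.lower().split()


def audit(document: str,
          score: Callable[[List[str]], float],
          threshold: float = 0.8) -> List[str]:
    """Return the important sentences whose score exceeds the threshold.

    `score` stands in for analysis, detection data generation and the
    risk point library; the threshold value is illustrative only.
    """
    risk_points = []
    for sentence in grab_important_sentences(document):
        if score(segment(sentence)) > threshold:
            risk_points.append(sentence)
    return risk_points


document = ("The tenant shall pay rent monthly. "
            "The landlord may terminate the lease without notice. "
            "Late payment incurs a penalty.")
# Toy scorer: flags any sentence that mentions termination.
flagged = audit(document, lambda words: 0.9 if "terminate" in words else 0.1)
```

Here `flagged` holds the sentences that the service terminal would present to the user.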

According to this technical scheme, important vocabulary in the document to be audited is grabbed according to the preset grabbing rules, the important sentences containing that vocabulary are located and split into words, and corresponding detection data in picture format are then generated, so that the AI-based risk point library can conveniently judge the risk of the detection data and obtain the risk points. Suspicious, risky or erroneous passages in legal texts can thus be checked efficiently, which improves the efficiency of legal work and reduces the workload of legal staff.

As a further scheme of the invention: the syntactic analysis result comprises word sense identification codes corresponding to the segmented words and logic identification codes representing logic relations among the segmented words;

the method for generating corresponding detection data comprises the following steps:

loading, in sequence, the word sense identification codes and the logic identification codes of the segmented words of the important sentence onto a blank picture, thereby obtaining the detection data.

In the invention, the word sense identification codes and the logic identification codes are generated according to a preset correspondence rule, so that each segmented word corresponds one-to-one with a word sense identification code and each logical relationship corresponds one-to-one with a logic identification code.
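The one-to-one correspondence described above can be illustrated with a small code book. This is a sketch only: the class `CodeBook` and its integer codes are hypothetical, since the text does not disclose the preset correspondence rule itself.

```python
from typing import Dict, List


class CodeBook:
    """Give every distinct key exactly one code, and every code exactly one key."""

    def __init__(self) -> None:
        self._code_of: Dict[str, int] = {}
        self._key_of: List[str] = []

    def encode(self, key: str) -> int:
        """Return the code of `key`, allocating a new code on first sight."""
        if key not in self._code_of:
            self._code_of[key] = len(self._key_of)
            self._key_of.append(key)
        return self._code_of[key]

    def decode(self, code: int) -> str:
        """Return the key that owns `code`."""
        return self._key_of[code]


# Separate books for word senses and for logical relationships (illustrative keys).
word_codes = CodeBook()
logic_codes = CodeBook()
plaintiff_code = word_codes.encode("plaintiff")
damage_code = word_codes.encode("damage")
subject_code = logic_codes.encode("subject-of")
```

Because a key is only ever assigned a code once, repeated lookups return the same code and decoding recovers the original key.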

As a further scheme of the invention: the word sense identification code is a rectangular matrix punctuation code with the size specification of 50 x 50 pixels, and the logic identification code is a rectangular matrix punctuation code with the size specification of 10 x 2 pixels;

the black pixels and white pixels in the word sense identification code and in the logic identification code, together with the arrangement order of the pixels, represent respectively the word sense and the logical relationship between the preceding and following segmented words.

Thus, in this embodiment of the present invention, the codes are displayed in the blank picture as word sense identification codes and logic identification codes arranged alternately from left to right, and auxiliary words such as "have" and "ground" are deleted.
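The layout described above can be sketched in plain Python. Several points here are assumptions rather than disclosed details: the sizes are read as width x height, logic codes are aligned to the top of the canvas, and the pixel patterns are derived from SHA-256 bits as a hypothetical stand-in for the preset correspondence rule (a hash is deterministic but not strictly one-to-one).

```python
import hashlib
from typing import List

WHITE, BLACK = 0, 1
WORD_W, WORD_H = 50, 50    # word sense code size given in the text
LOGIC_W, LOGIC_H = 10, 2   # logic code size, read as width x height (assumption)

Matrix = List[List[int]]


def matrix_code(key: str, width: int, height: int) -> Matrix:
    """Derive a deterministic black/white matrix from a key (hypothetical rule)."""
    bits: List[int] = []
    counter = 0
    while len(bits) < width * height:
        digest = hashlib.sha256(f"{key}:{counter}".encode("utf-8")).digest()
        for byte in digest:
            bits.extend((byte >> shift) & 1 for shift in range(8))
        counter += 1
    return [bits[r * width:(r + 1) * width] for r in range(height)]


def paste(canvas: Matrix, block: Matrix, x0: int) -> None:
    """Copy `block` onto `canvas` with its top-left corner at column x0, row 0."""
    for r, row in enumerate(block):
        canvas[r][x0:x0 + len(row)] = row


def render(words: List[str], relations: List[str]) -> Matrix:
    """Place word codes and logic codes alternately, left to right, on a white canvas."""
    if len(relations) != len(words) - 1:
        raise ValueError("need exactly one relation between each pair of words")
    width = len(words) * WORD_W + len(relations) * LOGIC_W
    canvas = [[WHITE] * width for _ in range(WORD_H)]
    x = 0
    for i, word in enumerate(words):
        paste(canvas, matrix_code(word, WORD_W, WORD_H), x)
        x += WORD_W
        if i < len(relations):
            paste(canvas, matrix_code(relations[i], LOGIC_W, LOGIC_H), x)
            x += LOGIC_W
    return canvas


# Illustrative words and relation labels; not taken from the source.
picture = render(["defendant", "damage", "plaintiff"], ["subject", "object"])
```

The resulting `picture` is a list of pixel rows that could be written out with any image library before being passed to the risk point library.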

As a further scheme of the invention: the risk point library comprises a rechecking adjustment unit. When the abnormal probability is lower than the preset abnormal probability but higher than the preset normal probability, the rechecking adjustment unit drives the detection data generation module to regenerate the detection data according to a preset adjustment rule, obtains the abnormal probability of the second detection, and calculates the final risk probability as a weighted combination of the two results:

P = αP1 + βP2

wherein P is the final risk probability, α and β are weight coefficients with α < β, P1 is the abnormal probability obtained by the first detection, and P2 is the abnormal probability obtained by the second detection. In the present embodiment of the present invention, α + β = 1.
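As a worked illustration of the weighted formula, the sketch below computes the final risk probability and enforces the two stated constraints, α < β and α + β = 1. The default weights 0.4 and 0.6 are illustrative assumptions; the text does not give concrete values.

```python
def final_risk_probability(p1: float, p2: float,
                           alpha: float = 0.4, beta: float = 0.6) -> float:
    """Combine first and second detection results as P = alpha*P1 + beta*P2.

    alpha and beta are illustrative; the text only requires alpha < beta
    and, in the described embodiment, alpha + beta = 1.
    """
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    if not alpha < beta:
        raise ValueError("alpha must be smaller than beta")
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha + beta must equal 1 in this embodiment")
    return alpha * p1 + beta * p2


# First detection 0.6, second detection 0.8: 0.4*0.6 + 0.6*0.8, about 0.72.
p_final = final_risk_probability(0.6, 0.8)
```

Because α < β, the second detection, which uses the adjusted detection data, carries more weight than the first.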

As a further scheme of the invention: the preset adjustment rule comprises:

the word sense identification code is adjusted to be a rectangular matrix punctuation code with the size specification of 60 x 60 pixels, and the logic identification code is a rectangular matrix punctuation code with the size specification of 10 x 3 pixels;

respectively adjusting the word sense identification code and the logic identification code into rectangular matrix punctuation codes filled by color pixels;

the color pixels in the word sense identification code and in the logic identification code, together with the arrangement order of the pixels, represent respectively the word sense and the logical relationship between the preceding and following segmented words.
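The difference between the first-pass encoding and the adjusted encoding can be captured as two parameter sets. This is a sketch under assumptions: `EncodingSpec` and `pixel_matrix` are hypothetical names, sizes are read as width x height, and the pixel values come from SHA-256 bytes purely as a placeholder for the preset correspondence rule.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # RGB triple


@dataclass(frozen=True)
class EncodingSpec:
    word_size: Tuple[int, int]   # (width, height) of a word sense code
    logic_size: Tuple[int, int]  # (width, height) of a logic code
    color: bool                  # False: black/white pixels, True: color pixels


FIRST_PASS = EncodingSpec(word_size=(50, 50), logic_size=(10, 2), color=False)
SECOND_PASS = EncodingSpec(word_size=(60, 60), logic_size=(10, 3), color=True)


def pixel_matrix(key: str, size: Tuple[int, int], color: bool) -> List[List[Pixel]]:
    """Build a deterministic pixel matrix for `key` (hypothetical rule)."""
    width, height = size
    channels = 3 if color else 1
    data = bytearray()
    counter = 0
    while len(data) < width * height * channels:
        data += hashlib.sha256(f"{key}:{counter}".encode("utf-8")).digest()
        counter += 1
    rows: List[List[Pixel]] = []
    for r in range(height):
        row: List[Pixel] = []
        for c in range(width):
            i = (r * width + c) * channels
            if color:
                row.append((data[i], data[i + 1], data[i + 2]))
            else:
                row.append((0, 0, 0) if data[i] & 1 else (255, 255, 255))
        rows.append(row)
    return rows


# Regenerate the same word and relation under the adjusted rule.
word_code = pixel_matrix("damage", SECOND_PASS.word_size, SECOND_PASS.color)
logic_code = pixel_matrix("object", SECOND_PASS.logic_size, SECOND_PASS.color)
```

Keeping both specifications side by side makes it explicit that the second detection sees a larger, color representation of the same sentence.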

As a further scheme of the invention: the system further comprises a risk reminding module, which gathers the paragraphs containing key sentences with risks into one place and builds an index for convenient viewing.
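One possible reading of the risk reminding module is an index from paragraph number to the risky sentences found in that paragraph. The sketch below assumes exact substring matching and 1-based paragraph numbers; both choices, and the function name `build_risk_index`, are illustrative rather than taken from the text.

```python
from collections import defaultdict
from typing import Dict, List


def build_risk_index(paragraphs: List[str],
                     risk_sentences: List[str]) -> Dict[int, List[str]]:
    """Map each paragraph number (starting at 1) to the risky sentences it contains."""
    index: Dict[int, List[str]] = defaultdict(list)
    for number, paragraph in enumerate(paragraphs, start=1):
        for sentence in risk_sentences:
            if sentence in paragraph:
                index[number].append(sentence)
    return dict(index)


paragraphs = [
    "Clause 1. Rent is due monthly.",
    "Clause 2. The landlord may terminate the lease without notice.",
    "Clause 3. Either party may assign this lease.",
]
risk_index = build_risk_index(
    paragraphs, ["The landlord may terminate the lease without notice."]
)
```

A reviewer can then jump straight to the indexed paragraphs instead of rereading the whole document.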

The invention has the beneficial effects that: according to the method and the device, important vocabularies in the document to be audited are grabbed according to the preset grabbing rules, the important sentences are relocated, the important sentences are segmented and split, and then corresponding detection data in a picture format are generated, so that risk of the detection data is conveniently judged by a risk point library based on an AI technology, risk points are obtained, suspicious, risk or error positions in legal texts can be efficiently checked, the work efficiency of legal matters is improved, and the work load of the legal matters is reduced.

Drawings

The invention is further described below with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of the principle of operation of the present invention;

FIG. 2 is a schematic diagram of audit logic in the present invention;

FIG. 3 is a schematic diagram of the machine learning training method for grabbing important vocabulary in the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.

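Referring to FIGS. 1-3, the invention discloses an intelligent auditing system for legal text risk points, which comprises an operation terminal, a server and a service terminal. The server includes the grabbing module, the word segmentation module, the analysis module, the detection data generation module and the risk point library described above.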

In the specific embodiment of the invention, after the user uploads the document to be audited through the operation terminal, the document can be audited and its text content can be modified.

After the operation terminal sends the document, the server receives it:

different documents are all processed as Word documents, and any complex words and radical words that may appear in the text are identified.

After receiving the information, the server uses a machine learning algorithm to perform data extraction, grabbing the important words and data in the text. The important sentences containing the important vocabulary are then segmented, so that each grabbed sentence is split into several words. For example, the sentence "the defendant's behavior has seriously damaged the plaintiff's legal rights" is split into words such as "behavior", "serious", "damage", "plaintiff", "legal" and "rights". Meaningless auxiliary words are then removed; in the above example, "already" and "right" are removed. POS tagging (part-of-speech tagging) is then applied to the segmented words. Finally, the risk identification unit included in the risk point library outputs the abnormal probability corresponding to the important sentence according to the detection data, and when the abnormal probability is higher than the preset abnormal probability, the important sentence corresponding to the detection data is marked as a risk point.
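The splitting and auxiliary-word removal steps can be sketched as follows. This is an illustration under assumptions: it works on an English rendering of the example sentence, uses whitespace splitting in place of the preset word segmentation rule (Chinese text would need a real segmenter), and the `AUXILIARY_WORDS` list is hypothetical. Part-of-speech tagging is omitted because the text does not name a tagger.

```python
from typing import List

# Hypothetical auxiliary-word list; the source does not specify one.
AUXILIARY_WORDS = {"the", "of", "has", "already"}


def split_sentence(sentence: str) -> List[str]:
    """Split on whitespace and strip surrounding punctuation (placeholder rule)."""
    words = (w.strip(".,;:!?\"'").lower() for w in sentence.split())
    return [w for w in words if w]


def remove_auxiliary(words: List[str]) -> List[str]:
    """Drop words that carry no meaning for the risk analysis."""
    return [w for w in words if w not in AUXILIARY_WORDS]


sentence = ("The behavior of the defendant has already seriously damaged "
            "the legal rights of the plaintiff.")
content_words = remove_auxiliary(split_sentence(sentence))
```

The remaining `content_words` are what the analysis module would tag and parse in the next step.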

The risk points found in the text to be audited are compared and judged against the legal network protection risk point library, and the legal staff are prompted to check them. In addition, correlation calculations can be performed on the monetary amounts and numerical values that may appear in the text to be audited. After the analysis is complete, the relationships among the important information and the overall context of the matter described in the text are grasped, and the text to be audited can be turned into an error-free legal document with legal effect.

As a further scheme of the invention: the risk point library comprises a risk identification unit, which outputs the abnormal probability corresponding to the important sentence according to the detection data and marks the important sentence corresponding to the detection data as a risk point when the abnormal probability is higher than the preset abnormal probability.
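Taken together with the rechecking adjustment unit, the thresholds define three outcomes for an abnormal probability. The sketch below is illustrative: the threshold values 0.8 and 0.3 are assumptions, and the handling of a probability exactly equal to a threshold is a choice made here, since the text only covers the strictly higher and strictly lower cases.

```python
from enum import Enum


class Verdict(Enum):
    RISK = "risk"        # mark the sentence as a risk point
    RECHECK = "recheck"  # regenerate detection data and detect again
    NORMAL = "normal"    # no action


def classify(p_abnormal: float,
             preset_abnormal: float = 0.8,
             preset_normal: float = 0.3) -> Verdict:
    """Classify an abnormal probability against two preset thresholds.

    Threshold values are illustrative. Boundary values fall into the lower
    band (an assumption; the text does not address equality).
    """
    if not preset_normal < preset_abnormal:
        raise ValueError("preset_normal must be below preset_abnormal")
    if p_abnormal > preset_abnormal:
        return Verdict.RISK
    if p_abnormal > preset_normal:
        return Verdict.RECHECK
    return Verdict.NORMAL


verdicts = [classify(p) for p in (0.9, 0.5, 0.1)]
```

Only the middle band triggers the second detection; clear cases are decided on the first pass.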

The foregoing describes one embodiment of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by the present invention.

Claims

1. The legal text risk point intelligent auditing system is characterized by comprising an operation terminal, a server and a service terminal:

the server includes:

2. The legal text risk point intelligent auditing system according to claim 1, wherein the syntactic analysis result comprises word sense identification codes corresponding to the segmented words and logic identification codes representing the logical relationships among the segmented words;

3. The legal text risk point intelligent auditing system according to claim 2, wherein the word sense identification code is a rectangular matrix punctuation code of 50 x 50 pixel size specification, and the logic identification code is a rectangular matrix punctuation code of 10 x 2 pixel size specification;

4. The legal text risk point intelligent auditing system according to claim 2, wherein the risk point library includes a risk recognition unit for outputting an abnormal probability corresponding to the important sentence according to the detection data, and marking the important sentence corresponding to the detection data output as a risk point when the abnormal probability is higher than a preset abnormal probability.

5. The legal text risk point intelligent auditing system according to claim 2, wherein the risk point library comprises a rechecking adjustment unit, when the abnormal probability is lower than the preset abnormal probability but higher than the preset normal probability, the rechecking adjustment unit drives the detection data generation module to regenerate the detection data according to a preset adjustment rule, so as to obtain the abnormal probability of the second detection, and comprehensively calculate the final risk probability;

P = αP1 + βP2

wherein P is the final risk probability, α and β are weight coefficients with α < β, P1 is the abnormal probability obtained by the first detection, and P2 is the abnormal probability obtained by the second detection.

6. The legal text risk point intelligent auditing system of claim 5, wherein the preset adjustment rules comprise:

7. The legal text risk point intelligent auditing system according to claim 1, further comprising a risk reminding module for integrating and centralizing the paragraphs of the key sentences with risks and establishing an index for convenient viewing.