CN115374239A - Law and regulation analysis method and device, computer equipment and readable storage medium

Publication number
CN115374239A
Authority
CN
China
Prior art keywords
legal
regulation
law
item
catalog
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210818710.XA
Other languages
Chinese (zh)
Inventor
马旭慧
张凯
陈铭
柳进军
李浩浩
武帅兴
张海军
刑凯翔
李俊鹏
陈楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhonghai Zhumeng Technology Co ltd
Original Assignee
Beijing Zhonghai Zhumeng Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhonghai Zhumeng Technology Co ltd
Priority to CN202210818710.XA
Publication of CN115374239A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/328 Management therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/338 Presentation of query results

Abstract

The invention discloses a law and regulation analysis method and device, computer equipment and a readable storage medium. The method comprises the following steps: analyzing the original law and regulation document line by line, comparing the analyzed document with a law and regulation template, and mounting each clause obtained by analysis under its corresponding directory information item in the template to form a law and regulation model; and verifying the law and regulation model. Because the method analyzes the content of laws and regulations based on a template object, the analyzed levels and relationships are more accurate; when the content of a law or regulation is adjusted, every place in the system that quotes or uses it can be flagged in time and updated accordingly.

Description

Law and regulation analysis method and device, computer equipment and readable storage medium
Technical Field
The invention relates to the technical field of policy and law and regulation analysis, in particular to a law and regulation analysis method and device, computer equipment and a readable storage medium.
Background
At present, most laws, administrative regulations, government rules, departmental rules, local regulations, notices, bulletins and the like are stored as text files such as PDF, RTF and OFD. This makes the content of laws and regulations difficult to search and use: it is inconvenient to look up the provisions relevant to penalty-related questions during administrative law enforcement, and stored content cannot be updated in time when a law or regulation changes.
Chinese patent CN202110152861.1 discloses a policy and law-and-regulation analysis method and system based on a regular expression matching algorithm. A small number of representative policies, laws and regulations are analyzed manually to build a partial library of rule algorithms; a large number of historical policy and law-and-regulation documents are then used for training, new matching rules are continuously mined and added, and a complete matching rule model is finally formed. However, that model cannot parse regulation documents of different structures according to different template files, and no parsing method for paragraphs, items and sub-items is disclosed, so the hierarchy it produces is relatively unclear.
Disclosure of Invention
The invention aims to provide a law and regulation analysis method, a law and regulation analysis device, computer equipment and a readable storage medium, which are used for realizing the analysis of law and regulation files based on template objects, realizing the analysis of the law and regulation files with different structures according to different template files, realizing the storage of structured information of the files according to levels and facilitating the searching and the use.
In order to solve the above problem, a first aspect of the present invention provides a law and regulation analysis method, including: pre-analyzing the loaded original law and regulation document to obtain pre-analysis data, wherein the pre-analysis data comprises: directory information items and the maximum index sequence number corresponding to each directory information item;
the directory information items include: a "volume" (卷) directory, a "part" (编) directory, a "chapter" (章) directory and "article" (条) content, wherein the "article" content includes: an "article" directory and "paragraphs" (款);
the "volume" directory, the "part" directory, the "chapter" directory and the "article" directory are all located before the first space of a line of the original law and regulation document, and a "paragraph" is a natural paragraph without numerical ordering under an "article" directory;
generating a law and regulation template according to the pre-analysis data;
analyzing the original law and regulation document line by line, comparing it with the law and regulation template, and mounting each clause obtained by analysis under its corresponding directory information item in the template to form a law and regulation model;
and verifying the law and regulation model.
Preferably, the directory information items further comprise: a "section" directory located before the first space of a line of the original law and regulation document.
Preferably, the law and regulation template is generated by taking the value of the maximum index sequence number as the number of entries of each directory information item.
Preferably, the format of the original law and regulation document comprises: a Word document, a text document, a PDF document, or a web page document.
Preferably, the method for generating the law and regulation model specifically includes:
analyzing the content of the original law and regulation document line by line according to its structure, reading the directory information items and index sequence numbers of the original document line by line, comparing them with the directory information items and index sequence numbers in the law and regulation template, finding the specific position in the template that corresponds to each line of content in the original document, and mounting the content there.
Preferably, the method for forming the law and regulation model further comprises analyzing and mounting the "item" directory and the "sub-item" directory line by line;
an "item" is parsed from a line under a "paragraph" directory that is not a directory information item and begins in the application format of an "item";
a "sub-item" is parsed from a line under an "item" directory that is not a directory information item and begins in the application format of a "sub-item".
Preferably, the verification content of the law and regulation model includes:
verifying the continuity of the index sequence numbers of the directory information items across the whole law or regulation;
verifying the continuity of the index sequence numbers of the "paragraphs" under each "article" content; and/or
verifying the continuity of the index sequence numbers of the "item" directories under each "paragraph"; and/or
verifying the continuity of the index sequence numbers of the "sub-item" directories under each "item" directory;
and verifying the correctness of the content of each clause obtained by analysis.
Preferably, the verification of the continuity of the index sequence numbers of the directory information items across the whole law or regulation includes:
verifying the continuity of the index sequence numbers of the "article" content under each "chapter" and/or "section" directory;
verifying the continuity of the index sequence numbers of the "article" content across the whole law or regulation;
and may further include:
verifying the continuity of the index sequence numbers of the "item" directories across the whole law or regulation.
Preferably, the verification of the law and regulation model further comprises a secondary verification, in which the original law and regulation document is parsed again and compared with the generated law and regulation model in order to mark the clauses that failed verification.
Preferably, the method further comprises: cleaning the loaded original law and regulation document using ASCII codes.
According to a second aspect of the present invention, there is provided a law and regulation analysis device applied to a computer apparatus, including:
the pre-analysis module is used for generating a law and regulation template;
the model generation module is used for analyzing the original law and regulation document, comparing it with the law and regulation template, and mounting each clause obtained by analysis under its corresponding directory information item in the template to form a law and regulation model;
the verification module is used for verifying and secondarily verifying the law and regulation model;
a storage unit for loading the original law and regulation document and storing the law and regulation model;
the device may further comprise: a cleaning module used for cleaning the law and regulation template.
Preferably, the pre-analysis module comprises: a reading unit, a pre-analysis unit, an extraction unit and a template generation unit;
the reading unit is used for reading the content of the original law and regulation document;
the pre-analysis unit is used for analyzing the structure of the original law and regulation document and judging whether directory information items are included;
the extraction unit is used for extracting the directory information items and the value of the maximum index sequence number corresponding to each directory information item;
the template generation unit is used for generating the law and regulation template according to the directory information items and their maximum index sequence number values.
Preferably, the model generation module comprises: an analysis unit and a comparison matching unit;
the analysis unit is used for analyzing and extracting, line by line, the directory information items of the original law and regulation document and their sequence numbers, and for analyzing the "item" directories, the "sub-item" directories and their index sequence numbers;
the comparison matching unit is used for comparing the directory information items, "item" directories and "sub-item" directories of the original law and regulation document, together with their index sequence numbers, with those in the law and regulation model, finding the specific position in the model that corresponds to each line of content in the original document, and mounting the content there.
According to a third aspect of the present invention, there is provided a computer device comprising a processor and a non-volatile memory storing computer instructions which, when executed by the processor, perform the method of law and regulation resolution of at least one possible implementation of the first aspect.
According to a fourth aspect of the present invention, there is provided a readable storage medium, which includes a computer program, and the computer program controls a computer device in which the readable storage medium is located to execute the law and regulation resolving method in at least one possible implementation manner of the first aspect.
The technical scheme of the invention has the following beneficial technical effects:
the method analyzes the rule contents based on the template object method, and the analyzed levels and relationships are more accurate; when the legal and legal content is adjusted, all quoted or used places in the system can be reminded in time, and all quoted places can be updated and adjusted in time.
Drawings
FIG. 1 is a flowchart of a law and regulation resolution method according to a first embodiment of the present invention;
FIG. 2 is a flow diagram of a pre-resolution method of one embodiment of the invention;
FIG. 3 is a flow chart of a legal model generation method of one embodiment of the present invention;
FIG. 4 is a screenshot of a legal template written into a temporary library of one embodiment of the present invention;
FIG. 5 is a logical block diagram of a legal model parsing method according to one embodiment of the invention;
FIG. 6 is a schematic view showing a structure of a law and regulation resolving device according to a second embodiment of the present invention;
FIG. 7 is a schematic diagram of the structure of a pre-analysis module of one embodiment of the present invention;
FIG. 8 is a block diagram of a model generation module according to an embodiment of the invention.
Reference numerals are as follows:
a pre-analysis module 1, a reading unit 11, a pre-analysis unit 12, an extraction unit 13, a template generation unit 14,
A cleaning module 2, a model generation module 3, an analysis unit 31, a comparison and matching unit 32,
A verification module 4 and a storage unit 5.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Fig. 1 is a flow chart of a law and regulation analysis method according to a first embodiment of the present invention, as shown in fig. 1, including the steps of:
S1, pre-analyzing the loaded original law and regulation document to obtain pre-analysis data;
S2, generating a law and regulation template according to the pre-analysis data;
S3, analyzing the original law and regulation document line by line, comparing it with the law and regulation template, and mounting each clause obtained by analysis under its corresponding directory information item in the template to form a law and regulation model;
S4, verifying the law and regulation model.
The invention aims to protect a law and regulation analysis method. The method analyzes the content of laws and regulations by constructing a law and regulation template, so the analyzed levels and relationships are more accurate. In addition, through secondary verification of the model, when the content of a law or regulation is adjusted, every place in the system that quotes or uses it can be found and marked in time as a prompt, so that all such places can be updated accordingly.
Fig. 2 is a flowchart of the pre-analysis method in step S1 according to an embodiment of the present invention, and referring to fig. 2, the method includes the steps of:
S11, reading and analyzing the structure of the original law and regulation document, and extracting directory information items; the directory information items include: the "volume" directory, the "part" directory, the "chapter" directory and "article" content, wherein the "article" content includes the "article" directory and "paragraphs".
S12, extracting the maximum index sequence number corresponding to each analyzed directory information item.
Specifically, the original law and regulation document to be analyzed is loaded into the memory of the computer system, and its directory information items are analyzed and extracted by reading the structure of the document. The computer system may read the original document using technologies such as POI or a web crawler, and the format of the original document includes, but is not limited to, Word documents, text documents, PDF documents and web page documents.
For current regulatory documents, and in particular legal documents, the structural units generally comprise, from the highest level to the lowest: volume (卷), part (编), chapter (章), section (节), article (条), paragraph (款), item (项) and sub-item (目).
Thus, in one embodiment of the present invention, depending on the structure of the original document to be parsed, either the "volume" directory, "part" directory, "chapter" directory and "article" content are extracted, or the "volume" directory, "part" directory, "chapter" directory, "section" directory and "article" content are extracted.
Regarding the extraction of the "volume", "part", "chapter" and "section" directories: the clause text of some law and regulation documents itself contains keywords such as "volume", "part", "chapter" and "section". Therefore, in the pre-analysis it is not necessary to compare the specific contents of the clauses; only an entry that is located before the first space of a line of the original document and contains one of these keywords is determined to be a directory information item, that is, Volume X, Part X, Chapter X and/or Section X.
Regarding the extraction of "article" content: since a "paragraph" is a component of "article" content, the extraction is divided into extraction of the "article" directory and extraction of the "paragraphs".
Specifically, an entry located before the first space of a line of the original document whose keyword is "article" is determined to be an "article" directory, i.e., Article X;
under an "article" directory, natural paragraphs without numerical ordering are determined to be "paragraphs".
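The pre-analysis in steps S11 and S12 can be illustrated with a minimal Python sketch. It assumes Chinese-language input in which headings take the form 第X卷 / 第X编 / 第X章 / 第X节 / 第X条 with Chinese numerals; the names `classify`, `pre_analyze` and `chinese_to_int` are hypothetical and do not come from the patent.

```python
import re

# Assumed mapping from the heading keyword to a level name (hypothetical names).
KEYWORDS = {"卷": "volume", "编": "part", "章": "chapter", "节": "section", "条": "article"}
DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
UNITS = {"十": 10, "百": 100, "千": 1000}
# The whole token before the first space must be "第" + numeral + keyword.
HEADING = re.compile(r"^第([零一二两三四五六七八九十百千]+)([卷编章节条])$")


def chinese_to_int(text):
    """Convert a Chinese numeral below ten thousand (e.g. 二十四) to an int."""
    total, digit = 0, 0
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        else:
            total += (digit or 1) * UNITS[ch]  # a bare 十 counts as 10
            digit = 0
    return total + digit


def classify(line):
    """Return (level, index) when the token before the first space is a heading."""
    parts = line.split(None, 1)  # splits on any whitespace, full-width included
    if not parts:
        return None
    match = HEADING.match(parts[0])
    if not match:
        return None
    return KEYWORDS[match.group(2)], chinese_to_int(match.group(1))


def pre_analyze(lines):
    """Collect the maximum index sequence number seen for each directory level."""
    max_index = {}
    for line in lines:
        hit = classify(line)
        if hit:
            level, index = hit
            max_index[level] = max(max_index.get(level, 0), index)
    return max_index
```

Because only the token before the first space is inspected, a clause such as 本法所称第二章的规定, which merely mentions a chapter in its body text, is not mistaken for a heading.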
After the directory information items are analyzed, the index sequence number attached to each directory information item is extracted, the value of the largest extracted index sequence number is used as the number of entries for each directory information item, and a law and regulation template such as the following is generated:
“Volume 1;
  Part 1;
    Chapter 1;
      Section 1;
        Article 1;
        Article 2;
        Article 24;
    Chapter 2;
      Article 1;
      Article 2;
      Article 24;
Volume 2;
  Part 1;
    Chapter 1;
      Section 1;
        Article 1;
        Article 2;
      Section 2;
        Article 1;
        Article 2;
        Article 24;
  Part 2;
…”
Specifically, if the maximum index sequence number detected for the "volume" directory is five, five "volume" directories, namely Volume 1, Volume 2, … and Volume 5, are generated in the law and regulation template;
if the maximum index sequence number detected for the "part" directory is four, four "part" directories, namely Part 1, Part 2, Part 3 and Part 4, are generated under each "volume" directory in the template;
if the maximum index sequence number of the "chapter" directory is six, six "chapter" directories, namely Chapter 1, Chapter 2, … and Chapter 6, are generated under each "part" directory in the template;
if the maximum index sequence number of the "section" directory is six, six "section" directories, namely Section 1, Section 2, … and Section 6, are generated under each "chapter" directory in the template;
if the maximum index sequence number of the "article" content is twenty, twenty "article" directories, namely Article 1, Article 2, … and Article 20, are generated under each "section" directory in the template; if no "section" directory is detected, the detected "article" directories are mounted directly under the "chapter" directory.
The method uses the law and regulation template to parse laws and regulations into structured data, so that the analyzed levels and relationships are more accurate and subsequent searching and use are convenient.
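As an illustration of the template generation described above, the following Python sketch expands the maximum index sequence numbers into a nested structure. The nested-dict representation and the name `build_template` are assumptions made for this example; the patent does not specify a data structure.

```python
# Directory levels from highest to lowest (level names are hypothetical).
LEVELS = ["volume", "part", "chapter", "section", "article"]


def build_template(max_index):
    """Generate entries 1..max for every detected level under every parent node.

    Levels that were not detected (for example "section") are skipped, so the
    next lower level is mounted directly under the next higher one.
    """
    present = [level for level in LEVELS if max_index.get(level)]

    def expand(depth):
        if depth == len(present):
            return {}
        level = present[depth]
        return {(level, i): expand(depth + 1)
                for i in range(1, max_index[level] + 1)}

    return expand(0)


# Example: no "section" level was detected, so articles hang under chapters.
template = build_template({"chapter": 2, "article": 3})
```

Every parent receives the full range of child entries, which is why entries that never receive content have to be removed later.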
In a preferred embodiment, after the law and regulation template has been constructed and before the law and regulation model is generated, each line of content is read and converted into ASCII-coded content, the sequence numbers of the original law and regulation document are converted, and the content is cleaned to eliminate useless symbols such as \r, \n, Chinese brackets, Chinese (full-width) spaces and English half-width symbols, which makes the parsing of the clauses more accurate.
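A minimal sketch of such a cleaning step is shown below. It is an assumption, not the patent's procedure: Unicode NFKC normalization from the Python standard library stands in for "conversion to ASCII code content", and full-width brackets are converted to their ASCII forms rather than deleted so that markers of numbered entries survive. The name `clean_line` is hypothetical.

```python
import unicodedata


def clean_line(line):
    """Normalize one line of a law or regulation document.

    NFKC maps full-width forms (brackets, spaces, digits) to ASCII equivalents;
    carriage returns and newlines are dropped and whitespace runs are collapsed.
    """
    text = unicodedata.normalize("NFKC", line)
    text = text.replace("\r", "").replace("\n", "")
    return " ".join(text.split())
```

For example, a line that ends with `\r\n` and uses a full-width space after the heading comes out with a single ASCII space and no line terminator.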
Fig. 3 is a flow chart of a method for generating the law and regulation model according to an embodiment of the present invention; as shown in Fig. 3, the method comprises the steps of:
S31, reading the directory information items and index sequence numbers of the original law and regulation document line by line, comparing them with the directory information items and index sequence numbers in the law and regulation template, finding the specific position in the template that corresponds to each line of content in the original document, and mounting the content there;
S32, analyzing and mounting the "article" content line by line;
S33, analyzing and mounting the "item" directory and/or the "sub-item" directory line by line.
Fig. 4 is a screenshot of a law and regulation template temporarily stored in the temporary library; the template corresponds to a law and regulation model in which the maximum index sequence number of the "article" directory in the parsed "article" content is twenty. The process of generating the law and regulation model is described in detail below with reference to Fig. 4.
In a preferred embodiment of the present invention, after the law and regulation template is generated, the content of the original law and regulation document is analyzed line by line from top to bottom according to its structure. The "volume" directory, the "part" directory and the "chapter" directory, their index sequence numbers and the content on the corresponding lines are read in sequence; that is, "Volume 1", "Part 1" and "Chapter 1" and the content following each on its line are read and compared with the positions of "Volume 1", "Part 1" and "Chapter 1" in the template, and once the comparison shows no error, the content read from the original document is mounted at the corresponding position in the template.
Reading the "article" content includes: reading the "article" directory, its index sequence number and the content on the corresponding line, that is, reading "Article 1" and the content following it on the line, comparing it with the position of "Article 1" in the template, and, once the comparison shows no error, mounting the content of "Article 1" read from the original document at the corresponding position in the template;
and parsing any "paragraphs" that may exist under the "article" directory.
The application format of a "paragraph" under an "article" directory is division by natural paragraphs; natural paragraphs under "article" content that do not belong to an "item" directory or a "sub-item" directory are parsed as "paragraphs". "Paragraph" content carries no sequence number information, so, in order to distinguish and order the paragraphs, directory index and ordering information ("Paragraph 1", "Paragraph 2" and so on) is added during parsing to the "paragraphs" temporarily stored in the system's temporary library, and they are mounted under the "article" directory for internal viewing in the system and for maintenance by administrators.
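The numbering of unnumbered natural paragraphs (款) under an article can be sketched as follows. The marker pattern is an assumption about how numbered entries (项 and 目) begin, and `number_paragraphs` is a hypothetical name.

```python
import re

# Assumed marker formats: "(一)" or "（一）" for 项, "1." or "1)" for 目.
MARKER = re.compile(r"^\s*(?:[（(][一二三四五六七八九十]+[)）]|\d+[.)）])")


def number_paragraphs(body_lines):
    """Assign system index numbers 1, 2, ... to the unnumbered natural paragraphs."""
    paragraphs = []
    for line in body_lines:
        if line.strip() and not MARKER.match(line):
            paragraphs.append((len(paragraphs) + 1, line.strip()))
    return paragraphs
```

Lines that begin with a numbered marker are skipped here; only the remaining natural paragraphs receive a system-generated index.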
In this embodiment no "section" directory is parsed from the original law and regulation document, so "Article 1" is mounted directly under "Chapter 1".
Next, the parsing and mounting of the "item" and "sub-item" directories is described.
The application format of an "item" is a paragraph that opens with a bracketed Chinese numeral, i.e. "(one)", "(two)". A paragraph under an "article" or "paragraph" directory that begins with "(one)", "(two)", … is parsed as an "item"; according to the context it is mounted under the corresponding "article" or "paragraph", that is, the "item" content is mounted under the "article" or "paragraph" content, and the system index information "(one)", "(two)" and so on is then added.
Continuing to parse downward, the application format of a "sub-item" is a paragraph numbered with Arabic numerals, in the form "1.", "2.", "3." or "1)", "2)", "3)", with sub-items separated by ";" or a similar punctuation mark. According to this format, a paragraph that begins with an Arabic numeral and ends with a punctuation mark is parsed as a system "sub-item" and is associated with the corresponding "item" directory according to the context structure; for example, after such paragraphs are associated with the content of item (three), the index information 1), 2) and 3) is added.
Parsing continues line by line; when the next line is parsed as an "item", it is mounted after the content of sub-item 3), and the system index information (four) is added;
when the next line is parsed as "Article 2", it is compared with the position of "Article 2" in the law and regulation template, and once the comparison shows no error, the content of "Article 2" read from the original document is mounted at the corresponding position in the template.
By analogy, the superior-subordinate relationships are established and the corresponding clauses are matched to form a complete law and regulation model.
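The line-by-line mounting of 款, 项 and 目 under one article can be sketched as a small state machine. This is an illustrative assumption about the marker formats and the data layout; `mount_article_body` and the dictionary keys are hypothetical.

```python
import re

# Assumed formats: 项 opens with a bracketed Chinese numeral, 目 with "1." or "1)".
ITEM = re.compile(r"^[（(]([一二三四五六七八九十]+)[)）]\s*(.*)$")
SUBITEM = re.compile(r"^(\d+)[.)）]\s*(.*)$")


def mount_article_body(lines):
    """Mount the body lines of one article as paragraphs > items > sub-items."""
    paragraphs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        item, sub = ITEM.match(line), SUBITEM.match(line)
        if item and paragraphs:
            # An item is mounted under the most recent paragraph.
            paragraphs[-1]["items"].append(
                {"label": item.group(1), "text": item.group(2), "subitems": []})
        elif sub and paragraphs and paragraphs[-1]["items"]:
            # A sub-item is mounted under the most recent item.
            paragraphs[-1]["items"][-1]["subitems"].append(
                {"label": int(sub.group(1)), "text": sub.group(2)})
        else:
            # Anything else is an unnumbered natural paragraph.
            paragraphs.append(
                {"index": len(paragraphs) + 1, "text": line, "items": []})
    return paragraphs
```

Each new line is attached to the nearest open parent, which mirrors the context-based association described above.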
Further, if only five articles are parsed from the first chapter of the original law and regulation document, and the second chapter of the actual original document is numbered starting from "Article 6", then when the law and regulation template of the present application is used, the parsed content of "Article 6" only needs to be mounted at the position of "Article 6" under the second chapter of the template, and the system will later automatically delete "Article 1" to "Article 5" under the second chapter, which have no matching clause content.
At present, in most laws and regulations, the index sequence numbers of "article" content continue from those of the previous chapter rather than restarting in every chapter. The law and regulation template of the invention can therefore be applied to the analysis of various law and regulation structures.
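The mounting and later deletion of unmatched template entries can be sketched as follows. The two-level chapter/article layout and the name `mount_and_prune` are simplifying assumptions for illustration only.

```python
def mount_and_prune(template, parsed):
    """Mount parsed article text into the template and drop unmatched entries.

    template: {chapter_no: {article_no: None}}, every chapter holding 1..max.
    parsed:   {(chapter_no, article_no): text} as read from the original document.
    """
    model = {}
    for chapter, articles in template.items():
        kept = {a: parsed[(chapter, a)] for a in articles if (chapter, a) in parsed}
        if kept:
            model[chapter] = kept
    return model


# Chapter 1 holds Articles 1 to 5; Chapter 2 continues with Articles 6 and 7.
template = {c: {a: None for a in range(1, 8)} for c in (1, 2)}
parsed = {(1, a): f"text {a}" for a in range(1, 6)}
parsed.update({(2, 6): "text 6", (2, 7): "text 7"})
model = mount_and_prune(template, parsed)
```

The same template therefore serves documents whose article numbers restart in each chapter and documents whose numbers continue across chapters.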
In other embodiments of the present invention, the "volume", "part", "chapter" and "article" content may be analyzed and mounted first, and the "paragraph", "item" and "sub-item" content then analyzed and mounted at the corresponding locations.
After the law and regulation model is formed, it is written into a temporary library, and the data taken out of the temporary library is verified at least twice, wherein the verification contents comprise:
(1) And verifying the continuity of the index serial numbers of the catalog information items under the whole laws and regulations, namely verifying whether the index serial numbers of the volume catalog, the editing catalog, the chapter catalog, the section catalog and the bar catalog are continuous under the whole laws and regulations.
Regarding the verification of the index number of the volume catalog, as the volume belongs to the largest catalog, only the judgment of whether the index numbers of the catalogs of the volumes stored in the temporary library are continuous is needed;
for verifying the index number of the "catalog", the continuity of the index number of the "catalog" under a certain "volume" catalog needs to be judged;
in other embodiments, in order to avoid renumbering the index numbers of the "catalog" in the next "volume", it is necessary to verify the continuity of the index numbers of the "catalog" under all laws and regulations, and the data is determined to be valid as long as the continuous index numbers are all the index numbers;
the verification of the index sequence numbers of the chapter catalog, the section catalog and the strip catalog is the same as the verification method of the index sequence number of the coding catalog.
(2) Verifying the continuity of the index sequence numbers of the paragraphs under each 'item' directory;
for verification of the index number of the "section" paragraph, the continuity of the index number of the "section" paragraph needs to be judged, and the number of the "section" paragraph under each "bar" catalog must be ordered from the number "one".
(3) The continuity of the index sequence number of the 'item' catalog under each 'section' or 'item' catalog is verified, and the verification method is the same as that of the index sequence number of the 'coding' catalog;
(4) The continuity of the index sequence number of the 'item' catalog in each 'item' catalog is verified, and the verification method is the same as that of the index sequence number of the 'coding' catalog. In other embodiments, it is also necessary to verify the continuity of the index number of the "item" catalog under the whole law and regulation;
(5) Verifying the correctness of the content of each clause obtained by analysis: the content of the original legal document and the data held in the temporary library are read, the corresponding catalog number is located, and the original content is compared with the parsed content to judge whether they differ. For example, if the content of the first item in the original legal document is ABC and the parsed content of the first item is also ABC, the system considers the parsed content correct.
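The continuity checks in items (1) to (4) can be illustrated with a minimal Python sketch. The function names `find_gaps` and `verify_level`, and the representation of parsed entries as `(parent, index)` pairs, are assumptions made for illustration and are not taken from the embodiment.

```python
from collections import defaultdict


def find_gaps(indices):
    """Return (expected, found) pairs wherever a run of index numbers breaks.

    A run is treated as continuous when every number is exactly one more
    than the number before it.
    """
    gaps = []
    for prev, cur in zip(indices, indices[1:]):
        if cur != prev + 1:
            gaps.append((prev + 1, cur))
    return gaps


def verify_level(entries, must_start_at_one=False):
    """Check continuity of one catalog level, grouped by parent catalog.

    entries: list of (parent_key, index) pairs in document order
             (hypothetical representation of the parsed data).
    Returns a dict mapping each offending parent to its list of problems.
    """
    by_parent = defaultdict(list)
    for parent, index in entries:
        by_parent[parent].append(index)

    problems = {}
    for parent, indices in by_parent.items():
        issues = find_gaps(indices)
        if must_start_at_one and indices[0] != 1:
            issues.insert(0, (1, indices[0]))
        if issues:
            problems[parent] = issues
    return problems


# "bar" entries grouped by their parent "chapter"
bars = [("chapter 1", 1), ("chapter 1", 2), ("chapter 2", 3), ("chapter 2", 5)]
per_chapter = verify_level(bars)

# Whole-document check: give every entry the same parent key
whole_document = verify_level([("all", index) for _, index in bars])
```

Passing the same parent key for every entry yields the whole-document check described in item (1), while `must_start_at_one=True` corresponds to the requirement in item (2) that the paragraphs under each "bar" start from "one".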
After the verification passes, the law and regulation model is written into a database for storage; data that fails the verification undergoes secondary verification.
The secondary verification repeats the whole analysis process, that is, steps S1-S3. On the one hand, this avoids failures caused by abnormal communication among the components during analysis. On the other hand, abnormal data can be specially marked during the secondary verification: for example, all quoted, applied, added and deleted positions are distinguished from the law and regulation model obtained by the first analysis, and the content obtained by the second analysis is highlighted so that research and development personnel can adjust it in time. In this way, when a law or regulation is amended, repealed or issued, the corresponding modifications can be made and the modified positions displayed, which facilitates reference and comparison.
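As a rough illustration of how a second analysis pass could be compared with the first and the differing positions marked, consider the following Python sketch. The function name `secondary_verify`, the string keys such as `"bar2"`, and the status labels are hypothetical; the embodiment does not specify a data layout.

```python
def secondary_verify(first_pass, second_pass):
    """Compare two parse results and mark the positions that differ.

    Both arguments map a catalog position (hypothetical string keys such
    as "bar2") to the parsed text found at that position.
    Returns a dict mapping each differing position to a status label.
    """
    marks = {}
    for position in sorted(set(first_pass) | set(second_pass)):
        before = first_pass.get(position)
        after = second_pass.get(position)
        if before == after:
            continue  # both passes agree, nothing to mark
        if before is None:
            marks[position] = "added"
        elif after is None:
            marks[position] = "deleted"
        else:
            marks[position] = "changed"
    return marks


first = {"bar1": "ABC", "bar2": "DEF", "bar3": "GHI"}
second = {"bar1": "ABC", "bar2": "DEX", "bar4": "JKL"}
marks = secondary_verify(first, second)
```

The returned marks could then drive the highlighting mentioned above; how they are rendered is left open here.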
In another preferred embodiment of the invention, data that fails analysis multiple times is written into a temporary library so that research and development personnel can inspect it and troubleshoot problems; the template is then optimized so that regulations of diverse structures can be analyzed.
That is, when the system encounters a regulation structure for which no law and regulation template has been prepared, the research and development personnel can write a template for that structure and adjust the relevant program accordingly, so that laws and regulations of different structures can be analyzed.
Fig. 5 shows a logic block diagram of the first embodiment of the present invention. Referring to Fig. 5, a law and regulation template is generated according to the pre-analysis result of the original legal document. After the content of the original legal document is cleaned using ASCII codes, the document is analyzed line by line, a complete context is established, and the contents of the "volume", "edition", "chapter", "section", "bar", "money", "item" and "destination" levels are matched, so as to form a complete law and regulation model. The law and regulation model is verified at least twice and is stored into a database once the verification passes.
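The embodiment does not spell out what cleaning by ASCII code involves. One plausible reading, assumed here purely for illustration, is that control characters are dropped by code point and that non-ASCII space characters are normalized to the ASCII space, so that the catalog heading reliably sits before the first space of each line. The helper names `clean_line` and `clean_document` are hypothetical.

```python
def clean_line(line):
    """Normalize one line so that headings end at an ASCII space.

    Assumed rules: ideographic spaces, no-break spaces and tabs become an
    ASCII space; other control characters (code < 32, or 127) are dropped;
    runs of spaces collapse to one and the ends are trimmed.
    """
    kept = []
    for ch in line:
        if ch in ("\u3000", "\u00a0", "\t"):
            kept.append(" ")
        elif ord(ch) < 32 or ord(ch) == 127:
            continue
        else:
            kept.append(ch)
    return " ".join("".join(kept).split())


def clean_document(text):
    """Clean every line of a document and discard lines left empty."""
    cleaned = (clean_line(line) for line in text.splitlines())
    return [line for line in cleaned if line]


raw = "第一条\u3000内容\x00\r\n\r\n  第二条\t内容  "
lines = clean_document(raw)
```

After this step, splitting a line once on the first space separates the catalog heading from the clause text.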
Fig. 6 is a schematic structural diagram of a law and regulation analysis device according to a second embodiment of the present invention. As shown in the figure, the device includes: a pre-analysis module 1, a cleaning module 2, a model generation module 3, a verification module 4 and a storage unit 5.
The pre-analysis module 1 is used for generating a law and regulation template;
the cleaning module 2 is used for cleaning the law and regulation template;
the model generation module 3 is used for analyzing the original legal document, comparing it with the law and regulation template, and mounting each analyzed clause under the corresponding catalog information item in the law and regulation template to form a law and regulation model;
the verification module 4 is used for verifying and secondarily verifying the law and regulation model;
and the storage unit 5 is used for loading the original legal document and storing the law and regulation model.
Further, as shown in Fig. 7, the pre-analysis module 1 includes: a reading unit 11, a pre-analysis unit 12, an extraction unit 13 and a template generation unit 14;
the reading unit 11 is used for reading the content of an original legal document, and can read a Word document, a text document, a PDF document or a web page document;
the pre-analysis unit 12 is used for analyzing the structure of the original legal document and determining which categories of directory information items it contains, namely the "volume", "edition", "chapter", "section" and "bar" categories;
the extraction unit 13 is used for extracting the index number values of the directory information items and the maximum index number corresponding to each directory information item;
the template generation unit 14 is used for generating a law and regulation template according to the directory information items and the value of the maximum index number corresponding to each directory information item.
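The cooperation of the pre-analysis, extraction and template generation units can be sketched in Python as follows. The sketch assumes that the source text marks headings in the form 第…卷, 第…编, 第…章, 第…节 and 第…条 (taken here to correspond to the "volume", "edition", "chapter", "section" and "bar" catalogs), that the lines have already been cleaned, and that a template is simply one empty slot per index number. All function and variable names are hypothetical.

```python
import re

DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
UNITS = {"十": 10, "百": 100, "千": 1000}
LEVELS = {"卷": "volume", "编": "edition", "章": "chapter",
          "节": "section", "条": "bar"}
# The heading must be the whole token before the first space.
HEADING = re.compile(r"^第([0-9]+|[零〇一二两三四五六七八九十百千]+)([卷编章节条])$")


def cn_to_int(numeral):
    """Convert an Arabic or simple Chinese numeral (below 10000) to int."""
    if numeral.isdigit():
        return int(numeral)
    total, digit = 0, 0
    for ch in numeral:
        if ch in UNITS:
            total += (digit or 1) * UNITS[ch]  # a bare "十" means 10
            digit = 0
        else:
            digit = DIGITS[ch]
    return total + digit


def pre_analyze(lines):
    """Return the maximum index number found for each catalog level."""
    max_index = {}
    for line in lines:
        token = line.split(" ", 1)[0]
        match = HEADING.match(token)
        if match is None:
            continue  # not a catalog heading, e.g. an unnumbered paragraph
        level = LEVELS[match.group(2)]
        index = cn_to_int(match.group(1))
        max_index[level] = max(max_index.get(level, 0), index)
    return max_index


def build_template(max_index):
    """Create one empty slot per index number for every level present."""
    return {level: {i: None for i in range(1, top + 1)}
            for level, top in max_index.items()}


sample = ["第一章 总则", "第一条 甲", "乙", "第二条 丙", "第二章 丁", "第十二条 戊"]
template = build_template(pre_analyze(sample))
```

Using the maximum index number as the slot count means that a heading missing from the document leaves an empty slot, which a later continuity or completeness check can detect.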
Further, as shown in Fig. 8, the model generation module 3 includes: an analysis unit 31 and a comparison matching unit 32;
the analysis unit 31 is used for analyzing and extracting the directory information items of the original legal document, namely the "volume" catalog, "edition" catalog, "chapter" catalog and "section" catalog and the index numbers thereof, the "bar" content, and the "item" catalog and the index numbers thereof, and for establishing the index numbers and index information of the "money" paragraphs;
the comparison matching unit 32 is used for comparing the "volume" catalog, "edition" catalog, "chapter" catalog, "section" catalog, "bar" content, "item" catalog and "destination" catalog of the original legal document and the index numbers thereof with the "volume" catalog, "edition" catalog, "chapter" catalog, "section" catalog, "bar" content, "item" catalog and "destination" catalog in the law and regulation model and the index numbers thereof, finding the specific position in the law and regulation model that corresponds to each line of the original legal document, and mounting the content there.
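A minimal Python sketch of the compare-and-mount step follows. It assumes that the analysis unit has already produced `(level, index, text)` records in document order and that the template holds one empty slot per expected index number; the record layout, the `mount` function and the restriction to the "chapter", "bar" and "money" levels are simplifications introduced for illustration.

```python
def mount(records, template):
    """Mount parsed records into the slots of a template.

    records:  (level, index, text) tuples in document order (assumed shape).
    template: dict mapping a level name to a dict of empty index slots.
    Returns (model, unmatched), where unmatched lists the records that
    have no corresponding position in the template.
    """
    model = {level: dict.fromkeys(slots) for level, slots in template.items()}
    unmatched = []
    current_bar = None  # context: the "bar" that paragraphs attach to

    for level, index, text in records:
        if level == "money":
            if current_bar is None:
                unmatched.append((level, index, text))
            else:
                current_bar["money"][index] = text
            continue

        current_bar = None  # any new heading closes the previous "bar"
        if index not in model.get(level, {}):
            unmatched.append((level, index, text))
            continue

        node = {"text": text, "money": {}}
        model[level][index] = node
        if level == "bar":
            current_bar = node

    return model, unmatched


template = {"chapter": {1: None}, "bar": {1: None, 2: None}}
records = [
    ("chapter", 1, "General"),
    ("bar", 1, "A"),
    ("money", 1, "A1"),
    ("money", 2, "A2"),
    ("bar", 2, "B"),
    ("bar", 4, "D"),
]
model, unmatched = mount(records, template)
```

Records that find no slot are collected rather than discarded, so that they can be handed to the verification step or inspected manually.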
A third embodiment of the present invention provides a computer apparatus comprising a processor and a non-volatile memory storing computer instructions, wherein the computer instructions, when executed by the processor, cause the computer apparatus to perform the method for law and regulation resolution described above.
A fourth embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed, performs the steps of the law and regulation resolving method described above.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modifications, equivalents, improvements and the like which are made without departing from the spirit and scope of the present invention shall be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundary of the appended claims, or the equivalents of such scope and boundary.
The present invention is described with reference to flowchart illustrations of methods, apparatus, and computer program products according to embodiments of the invention. It will be understood that each flow of the flowcharts, and combinations of flows in the flowcharts, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may carry out the procedures of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The modules in the device provided by the embodiment of the invention can be combined, divided and deleted according to actual needs.

Claims (15)

1. A law and regulation analysis method, comprising: pre-analyzing a loaded original legal document to obtain pre-analysis data, wherein the pre-analysis data comprises: directory information items and the maximum index number corresponding to each directory information item;
the directory information items include: a "volume" catalog, an "edition" catalog, a "chapter" catalog and "bar" content, the "bar" content comprising: a "bar" catalog and "money" paragraphs;
the "volume" catalog, the "edition" catalog, the "chapter" catalog and the "bar" catalog are each located before the first space of a line of the original legal document, and a "money" paragraph is a natural paragraph without numerical ordering under the "bar" catalog;
generating a law and regulation template according to the pre-analysis data;
analyzing the original legal document line by line and comparing it with the law and regulation template, and mounting the content of each clause obtained by analysis under the corresponding directory information item in the law and regulation template to form a law and regulation model;
and verifying the law and regulation model.
2. The legal analysis method according to claim 1, wherein the directory information item further comprises: a "section" directory that precedes the first space of each line of the original legal document.
3. The legal analysis method according to claim 1 or 2, wherein the law and regulation template is generated with the value of the maximum index number as the number of entries of each directory information item.
4. The legal analysis method of claim 3, wherein the format of the original legal document comprises: a Word document, a text document, a PDF document, or a web page document.
5. The legal analysis method of claim 1, wherein the method of forming the legal model specifically comprises:
analyzing the content of the original legal document line by line according to the structure of the original legal document, reading the directory information items and index numbers of the original legal document line by line, comparing them with the directory information items and index numbers of the law and regulation template, finding the specific position in the law and regulation template that corresponds to each line of content in the original legal document, and mounting the content there.
6. The law and regulation analysis method of claim 1 or 2, wherein the method of forming the law and regulation model further comprises: analyzing and mounting the "item" catalog and the "destination" catalog line by line;
under a "bar" catalog or a "money" paragraph, a line that is not a directory information item and that begins with the numbering format of an "item" is parsed as an "item";
under an "item" catalog, a line that is not a directory information item and that begins with the numbering format of a "destination" is parsed as a "destination".
7. The law and regulation parsing method according to claim 1 or 6, wherein the content of the verification of the law and regulation model includes:
verifying the continuity of the index sequence numbers of the catalog information items of the whole laws and regulations; and/or
verifying the continuity of the index numbers of the "money" paragraphs under each "bar" catalog; and/or
Verifying the continuity of the index sequence numbers of the 'item' directory under each 'item' directory;
and verifying the correctness of the contents of each clause obtained by analysis.
8. The legal analysis method according to claim 1 or 7, wherein the verifying the continuity of the index numbers of the directory information items under the whole legal regulations comprises:
verifying the continuity of the index numbers of the "bar" catalog under each "chapter" and/or "section" catalog;
verifying the continuity of the index numbers of the "bar" catalog across the whole law or regulation;
and optionally further comprises:
verifying the continuity of the index numbers of the "item" catalog across the whole law or regulation.
9. The legal analysis method of claim 1 or 8, wherein the verification of the law and regulation model further comprises a secondary verification, in which the original legal document is parsed again and the result is compared with the generated law and regulation model so as to mark the clauses that failed verification.
10. The law and regulation analysis method of claim 1, further comprising: cleaning the loaded original legal document by using ASCII code.
11. A law and regulation analysis device, applied to computer equipment, comprising:
a pre-analysis module (1) for generating a law and regulation template;
a model generation module (3) for analyzing an original legal document, comparing it with the law and regulation template, and mounting each analyzed clause under the corresponding directory information item in the law and regulation template to form a law and regulation model;
a verification module (4) for verifying and secondarily verifying the law and regulation model;
a storage unit (5) for loading the original legal document and storing the law and regulation model;
and a cleaning module (2) for cleaning the law and regulation template.
12. The law and regulation resolution device of claim 11,
the pre-resolution module (1) comprises: a reading unit (11), a pre-analysis unit (12), an extraction unit (13), and a template generation unit (14);
the reading unit (11) is used for reading the content of the original legal document;
the pre-analysis unit (12) is used for analyzing the structure of the original legal document and judging whether a catalogue information item is included;
the extraction unit (13) is used for extracting the sequence number value of the directory information item and the maximum index sequence number corresponding to the directory information item;
the template generating unit (14) is used for generating a law and regulation template according to the directory information item and the maximum index sequence number value corresponding to the directory information item.
13. The law and regulation resolving device according to claim 11, wherein the model generating module (3) comprises: an analysis unit (31) and a comparison matching unit (32);
the analysis unit (31) is used for analyzing and extracting, line by line, the directory information items of the original legal document and the index numbers thereof, and for analyzing the "item" catalog, the "destination" catalog and the index numbers thereof;
the comparison and matching unit (32) is used for comparing the catalog information items, item catalogs and the index numbers thereof of the original legal and legal documents with the catalog information items, item catalogs and the index numbers thereof in the legal and legal models, finding out the specific position of each line of content in the original legal and legal documents corresponding to the legal and legal models and mounting the content.
14. A computer device comprising a processor and a non-volatile memory storing computer instructions that, when executed by the processor, perform the law and regulation resolving method of any one of claims 1 to 10.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed, implements the steps of the law and regulation resolving method according to any one of claims 1 to 10.
CN202210818710.XA 2022-07-13 2022-07-13 Legal and legal analysis method and device, computer equipment and readable storage medium Pending CN115374239A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210818710.XA CN115374239A (en) 2022-07-13 2022-07-13 Legal and legal analysis method and device, computer equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN115374239A (en) 2022-11-22

Family

ID=84061433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210818710.XA Pending CN115374239A (en) 2022-07-13 2022-07-13 Legal and legal analysis method and device, computer equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN115374239A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20221122