CN116029262A - Method for generating codes for laws and regulations, database construction method and device - Google Patents
- Publication number
- CN116029262A (application number CN202310125882.3A)
- Authority
- CN
- China
- Prior art keywords
- information
- legal
- name
- coding
- release
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The embodiment of the invention relates to the field of coding technology and discloses a method for generating codes for laws and regulations, a method for constructing a database, and a corresponding device. The code generation method comprises: encoding the release information based on a preset name encoding rule to obtain the name code information of the corresponding law or regulation; and encoding the law-and-regulation content information based on a preset content encoding rule to obtain the content code information of the corresponding content; the name code information and the content code information together constitute the code information of the law-and-regulation information to be encoded. By adopting a multi-level encoding scheme, the method can encode laws and regulations and keep their codes up to date, makes it easy for other systems to identify and cite them, and keeps the cost of information integration low; the encoding scheme is also unique, universal, simple, consistent, and readable.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a method for generating codes for laws and regulations, a method for constructing a database, and a corresponding device.
Background
At present, although laws, regulations, departmental rules, and normative documents (hereinafter referred to as laws and regulations) all carry document numbers, the numbering rules differ and numbers are repeated (for example, "Presidential Order No. 8": for historical reasons, orders numbered 8 exist from both the eighth and the thirteenth terms). When people cite a law or regulation, they usually describe it by its name plus its document number. This is friendly to readers but inconvenient for information systems; in particular, when several independent systems aggregate and merge their data, the integration cost is extremely high because each system uses different coding rules. Designing a scheme that supports information integration and citation is therefore a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above defects, the invention discloses a method for generating codes for laws and regulations, a method for constructing a database, and a corresponding device. By adopting a multi-level encoding scheme, the invention can encode laws and regulations and keep their codes up to date, makes it easy for other systems to identify and cite them, and keeps the cost of information integration low; the encoding scheme is also unique, universal, simple, consistent, and readable.
The first aspect of the embodiment of the invention discloses a method for generating codes for laws and regulations, which comprises the following steps:
acquiring the law-and-regulation information to be encoded, which comprises release information and law-and-regulation content information;
encoding the release information based on a preset name encoding rule to obtain the name code information of the corresponding law or regulation;
encoding the law-and-regulation content information based on a preset content encoding rule to obtain the content code information of the corresponding content; the name code information and the content code information form the code information of the law-and-regulation information to be encoded.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the release information includes release unit information, release date information, and release name information; the name encoding rule comprises an efficacy level mapping relation, a division code mapping relation, a name definition rule, and a time definition rule;
the encoding of the release information based on the preset name encoding rule to obtain the name code information of the corresponding law or regulation comprises the following steps:
identifying the release unit information to obtain the efficacy level information of the law-and-regulation information to be encoded, and obtaining the efficacy level code of the release information based on the efficacy level information and the efficacy level mapping relation;
identifying the release unit information to obtain the administrative division information of the law-and-regulation information to be encoded, and obtaining the division code of the release information based on the administrative division information and the division code mapping relation;
obtaining the name code of the release information based on the release name information and the name definition rule;
obtaining the time code of the release information based on the release date information and the time definition rule; the efficacy level code, the division code, the name code, and the time code of the release information jointly form the name code information of the corresponding law or regulation.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the step of obtaining the name code of the release information based on the release name information and the name definition rule includes:
performing a character filtering operation on the release name information to obtain filtered release name information; the character filtering operation comprises time and number filtering, punctuation character filtering, and special character filtering, wherein in the special character filtering a special character is filtered out only when it appears at a set position and is otherwise retained;
arranging a preset number of characters of the release name information by reverse encoding according to the name definition rule to obtain the corresponding name code.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the preset number is 10;
the arranging of the preset number of characters of the release name information by reverse encoding according to the name definition rule to obtain the corresponding name code comprises the following steps:
performing character recognition on the filtered release name information to determine the type of each character;
when a character is a Chinese character, acquiring its pinyin initial and the first position information of each pinyin initial in the release name information;
when a character is a digit or a letter, acquiring that digit or letter and the second position information of each digit or letter in the release name information;
arranging the pinyin initials, digits, and letters by reverse encoding according to the first position information and the second position information to obtain the corresponding name code, and, when the number of characters is fewer than 10, adding placeholders to the name code until the total number of code positions is 10.
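The character-by-character mapping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the lookup table `PINYIN_INITIALS` is a hypothetical, deliberately tiny stand-in for a full pinyin dictionary (the source does not name one), the function name `name_code` is invented for the example, and the sketch assumes that "reverse encoding" means taking the last 10 characters while keeping their reading order.

```python
# Hypothetical, deliberately tiny lookup table. A real system would need a
# complete character-to-pinyin dictionary, which the source does not specify.
PINYIN_INITIALS = {"安": "A", "全": "Q", "生": "S", "产": "C", "法": "F"}


def name_code(filtered_name: str, width: int = 10, pad: str = "0") -> str:
    """Build a fixed-width name code from an already filtered name.

    Assumption: characters are taken from the end of the name (at most
    `width` of them) and kept in their original reading order.
    """
    symbols = []
    for ch in filtered_name[-width:]:
        if ch.isascii() and ch.isalnum():
            # Digits and Latin letters are carried over unchanged.
            symbols.append(ch)
        else:
            # Chinese character: use its pinyin initial.
            # Raises KeyError for characters missing from the toy table.
            symbols.append(PINYIN_INITIALS[ch])
    # Pad on the right until the code has `width` positions.
    return "".join(symbols).ljust(width, pad)


print(name_code("安全生产法"))
```

With the toy table, a five-character name yields five initials followed by five placeholder characters.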
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the obtaining of the name code of the release information based on the release name information and the name definition rule further includes:
extracting each character of the filtered release name information;
determining the representation value corresponding to each character according to a preset alphanumeric correspondence table;
determining a weighting factor for each character position based on the representation value and a weighting factor calculation formula, wherein the weighting factor calculation formula is $W_i = 2^{(i-1)} \bmod 11$, and $W_i$ is the weighting factor;
determining the corresponding anti-duplication code according to an anti-duplication code calculation formula and a conversion relation table, wherein the conversion relation table is a mapping table between anti-duplication codes and anti-duplication values, the anti-duplication codes correspond one-to-one to the anti-duplication values, and the anti-duplication code calculation formula is $X_{31} = \ldots$ [formula not reproduced in this text]; wherein $X_{31}$ is the anti-duplication code, $a_i$ is the representation value corresponding to each character, $W_i$ is the weighting factor corresponding to each representation value, and $i$ is the position serial number, counted from left to right, of each character of the filtered release name information.
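The weighting factors $W_i = 2^{(i-1)} \bmod 11$ can be computed as in the sketch below. Only the weights are shown, because the expression for the anti-duplication code itself is not available in this text; the function name `weighting_factors` is an assumption made for illustration.

```python
def weighting_factors(n: int) -> list[int]:
    """Return W_1..W_n, where W_i = 2**(i-1) mod 11 and i counts from 1."""
    # pow(base, exp, mod) performs modular exponentiation directly.
    return [pow(2, i - 1, 11) for i in range(1, n + 1)]


print(weighting_factors(10))  # [1, 2, 4, 8, 5, 10, 9, 7, 3, 6]
```

Because $2^{10} \bmod 11 = 1$, the sequence of weights repeats with period 10.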
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the method for generating the code further includes:
when the release date information is detected to be missing, reading the implementation date information of the law-and-regulation information to be encoded and using it for date encoding;
the law-and-regulation content information comprises hierarchy name information, which comprises a hierarchy name and a hierarchy serial number;
the encoding of the law-and-regulation content information based on the preset content encoding rule to obtain the content code information of the corresponding content comprises:
determining a hierarchy code according to the hierarchy name information and a hierarchy code mapping relation, wherein the hierarchy name information corresponds one-to-one to the hierarchy code; the hierarchy code and the hierarchy serial number constitute the content code information.
The second aspect of the embodiment of the invention discloses a method for constructing a database of laws and regulations, which comprises the following steps:
acquiring law-and-regulation information and the corresponding code information obtained by the code generation method disclosed in the first aspect of the embodiment of the invention;
storing the law-and-regulation information in association with the corresponding code information.
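As a minimal sketch of the associated storage described above, the code can serve as the key under which the text is stored. The table name `regulation`, the column names, and the sample code string are all hypothetical; the source does not prescribe a schema or a storage engine, and SQLite is used here only because it ships with Python.

```python
import sqlite3


def build_database(records, path=":memory:"):
    """Store (code, text) pairs so that each text is retrievable by its code."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS regulation ("
        "code TEXT PRIMARY KEY, text TEXT NOT NULL)"
    )
    # INSERT OR REPLACE keeps one row per code when a record is re-imported.
    conn.executemany(
        "INSERT OR REPLACE INTO regulation (code, text) VALUES (?, ?)", records
    )
    conn.commit()
    return conn


def lookup(conn, code):
    """Return the stored text for a code, or None if the code is unknown."""
    row = conn.execute(
        "SELECT text FROM regulation WHERE code = ?", (code,)
    ).fetchone()
    return row[0] if row else None


# Placeholder code and text, for illustration only.
conn = build_database([("GF000000-ABCDEFGHIJ-20210610", "sample text")])
print(lookup(conn, "GF000000-ABCDEFGHIJ-20210610"))
```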
A third aspect of the embodiment of the present invention discloses a device for generating codes for laws and regulations, including:
an input module, configured to acquire the law-and-regulation information to be encoded, which comprises release information and law-and-regulation content information;
a first encoding module, configured to encode the release information to obtain the name code information of the corresponding law or regulation;
a second encoding module, configured to encode the law-and-regulation content information to obtain the content code information of the corresponding content;
an output module, configured to output the code information of the law-and-regulation information to be encoded according to the name code information and the content code information.
A fourth aspect of an embodiment of the present invention discloses an electronic device, including: a memory storing executable program code; and a processor coupled to the memory; the processor invokes the executable program code stored in the memory to perform the method for generating codes for laws and regulations disclosed in the first aspect of the embodiment of the invention.
A fifth aspect of the embodiments of the present invention discloses a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the method for generating codes for laws and regulations disclosed in the first aspect of the embodiments of the present invention.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
The method divides the code of a law or regulation into two parts: the first part encodes the release information, and the second part encodes the content information. The first part encodes the release information based on a preset name encoding rule; the second part encodes the content information based on a preset content encoding rule. The name code information obtained in the first part is combined with the content code information obtained in the second part to give the complete code information of the law-and-regulation information to be encoded. This multi-level encoding scheme can encode laws and regulations and keep their codes up to date, makes it easy for other systems to identify and cite them, and keeps the cost of information integration low; the encoding scheme is also unique, universal, simple, consistent, and readable.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method for generating codes for laws and regulations disclosed in an embodiment of the invention;
FIG. 2 is a flow chart of the name encoding steps of the method disclosed in an embodiment of the invention;
FIG. 3 is a flow chart of the date removal step of the method disclosed in an embodiment of the invention;
FIG. 4 is a flow chart of the punctuation removal step of the method disclosed in an embodiment of the invention;
FIG. 5 is a schematic structural diagram of the device for generating codes for laws and regulations according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a user interface disclosed in an embodiment of the invention;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present invention are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Referring to FIGS. 1-4, FIG. 1 is a flow chart of a method for generating codes for laws and regulations according to an embodiment of the present invention. The method is applicable to smart devices with processing capability, such as mobile phones and tablet computers, and to computing devices such as computers and servers. As shown in FIG. 1, the method comprises the following steps:
S1: reading the law-and-regulation information to be encoded, identifying each item of its basic information to obtain basic identification information, and encoding according to the basic identification information to obtain the code $X$ of the law or regulation to be encoded;
The encoding of the code $X$ comprises the following steps:
S11: reading the law or regulation to be encoded and identifying each item of its basic information, wherein the basic information comprises efficacy level information, administrative division information, name information, and release date information;
S12: acquiring the efficacy level information, and encoding according to it to obtain the efficacy level code $X_1$;
S13: acquiring the administrative division information, and encoding according to it to obtain the division code $X_2$;
S14: acquiring the name information of the law or regulation to be encoded, and encoding according to it to obtain the name code $X_3$;
S15: acquiring the release date information, and encoding according to it to obtain the release date code $X_4$;
S16: acquiring the efficacy level code $X_1$, the division code $X_2$, the name code $X_3$, and the date code $X_4$, and obtaining the code $X$ of the law or regulation to be encoded according to the pattern $X_1X_2$-$X_3$-$X_4$;
S2: reading the law or regulation to be encoded, identifying its text to obtain text identification information, and encoding according to the text identification information to obtain the chapter-and-clause code $Y$;
S3: acquiring the code $X$ and the chapter-and-clause code $Y$, and obtaining the complete code $X_1X_2$-$X_3$-$X_4$-$Y$ of the law or regulation to be encoded according to the ordering rule $X$-$Y$.
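The assembly in steps S16 and S3 amounts to string concatenation with hyphen separators, as in the sketch below. The function name `assemble_code` and all sample values (name code, date code, clause code) are placeholders chosen for illustration; this passage does not fix the exact format of the date code or of the chapter-and-clause code.

```python
def assemble_code(level: str, division: str, name: str, date: str,
                  clause: str = "") -> str:
    """Join the parts as X1X2-X3-X4 and append -Y when a clause code is given."""
    law_code = f"{level}{division}-{name}-{date}"
    return f"{law_code}-{clause}" if clause else law_code


# Placeholder values: only "GF" and "000000" come from Tables 1 and 2.
print(assemble_code("GF", "000000", "ABCDEFGHIJ", "20210610"))
print(assemble_code("GF", "000000", "ABCDEFGHIJ", "20210610", clause="Y001"))
```

Keeping $Y$ optional lets the same helper produce both the code of the law or regulation as a whole and the code of a specific clause.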
In closed systems, unique IDs are typically used as codes, but IDs are neither universal nor readable. Universality means that whoever or whatever system encodes the same law or regulation, the result is identical as long as the same encoding rule is followed; only then can different systems encode independently and still arrive at consistent codes. The universality principle therefore rules out sequential numbering. Simplicity means that the encoding rule is kept as simple as possible, with an extremely low threshold and no dependence on any machine or equipment, so that anyone can encode accurately by following the rule; in practice, the elements involved in the encoding should be as few as possible and readily available. Consistency means that the main body of the code stays the same for the same law or regulation, while its revisions can still be reflected in the code. Finally, the code should remain as readable as possible, so that a person reading it can tell roughly which law or regulation it refers to.
In view of the above, the method divides the code into two parts: the first part is the code of the law or regulation itself, and the second part is the code of the chapter and clause. In the first part, the efficacy level information, administrative division information, name information, and release date information are acquired, each is encoded according to its own rule, and the results are combined in order to obtain the code of the law or regulation itself. This code is then combined with the chapter-and-clause code obtained in the second part to give the complete code. Codes obtained in this way are easy for other systems to identify and cite, keep the cost of information integration low, and are unique, universal, simple, consistent, and readable.
Specifically, in step S12, the efficacy levels may be classified into national laws, legal interpretations, national administrative regulations, State Council departmental rules, State Council departmental normative documents, local regulations, local government rules, and local normative documents.
When the efficacy level of the law or regulation to be encoded is national law, $X_1$ = GF;
When the efficacy level is legal interpretation, $X_1$ = JS;
When the efficacy level is national administrative regulation, $X_1$ = GG;
When the efficacy level is State Council departmental rule, $X_1$ = GZ;
When the efficacy level is State Council departmental normative document, $X_1$ = GW;
When the efficacy level is local regulation, $X_1$ = DG;
When the efficacy level is local government rule, $X_1$ = DZ;
When the efficacy level is local normative document, $X_1$ = DW;
The specific codes for the efficacy levels are shown in Table 1.
TABLE 1
Code $X_1$ | Efficacy level | Code description |
GF | National law | Pinyin initials of "guo" (国) and "fa" (法) |
JS | Legal interpretation | Pinyin initials of "jie" (解) and "shi" (释) |
GG | National administrative regulation | Pinyin initials of "guo" (国) and "gui" (规) |
GZ | State Council departmental rule | Pinyin initials of "guo" (国) and "zhang" (章) |
GW | State Council departmental normative document | Pinyin initials of "guo" (国) and "wen" (文) |
DG | Local regulation | Pinyin initials of "di" (地) and "gui" (规) |
DZ | Local government rule | Pinyin initials of "di" (地) and "zhang" (章) |
DW | Local normative document | Pinyin initials of "di" (地) and "wen" (文) |
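Table 1 is a fixed lookup, so step S12 can be sketched as a dictionary access. The two-letter codes come from the table; the English level names used as keys and the function name `efficacy_level_code` are assumptions made for this illustration.

```python
# Codes from Table 1. The English keys are translated labels chosen for this
# sketch, not identifiers defined by the source.
EFFICACY_LEVEL_CODES = {
    "national law": "GF",
    "legal interpretation": "JS",
    "national administrative regulation": "GG",
    "state council departmental rule": "GZ",
    "state council departmental normative document": "GW",
    "local regulation": "DG",
    "local government rule": "DZ",
    "local normative document": "DW",
}


def efficacy_level_code(level: str) -> str:
    """Return the two-letter code X1 for an efficacy level name."""
    # Normalise case and surrounding whitespace before the lookup.
    return EFFICACY_LEVEL_CODES[level.strip().lower()]


print(efficacy_level_code("Local regulation"))  # DG
```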
China's regulatory hierarchy is mainly divided into four levels: national, provincial, municipal, and county. The administrative divisions involved in step S13 are therefore also limited to these four levels. In accordance with clauses 5.1 and 5.2 of GB/T 2260-2007, Codes for the Administrative Divisions of the People's Republic of China, the 6-digit division code is used as the administrative division code in this method.
In addition, the following codes are added as extensions: "000000" for the four nationwide efficacy levels (national laws, national administrative regulations, State Council departmental rules, and State Council departmental normative documents), "660000" for the Xinjiang Production and Construction Corps, "810000" for the Hong Kong Special Administrative Region, and "820000" for the Macao Special Administrative Region. Namely:
When the law or regulation to be encoded applies nationwide, $X_2$ = 000000;
When the administrative division of the law or regulation to be encoded is Beijing, $X_2$ = 110000;
When the administrative division is Tianjin, $X_2$ = 120000;
When the administrative division is Hebei Province, $X_2$ = 130000;
When the administrative division is the Xinjiang Production and Construction Corps, $X_2$ = 660000;
When the administrative division is the Hong Kong Special Administrative Region, $X_2$ = 810000;
When the administrative division is the Macao Special Administrative Region, $X_2$ = 820000;
When the administrative division is Taiwan Province, $X_2$ = 830000;
The specific coding method of the administrative division is shown in table 2.
TABLE 2
Code $X_2$ | Administrative division |
000000 | Nationwide |
110000 | Beijing |
120000 | Tianjin |
130000 | Hebei Province |
140000 | Shanxi Province |
150000 | Inner Mongolia Autonomous Region |
... | (others not listed) |
660000 | Xinjiang Production and Construction Corps |
810000 | Hong Kong Special Administrative Region |
820000 | Macao Special Administrative Region |
830000 | Taiwan Province |
Further, for administrative divisions at the city and county (district) levels, the 6-digit administrative division code defined in clause 5.2 of GB/T 2260-2007, Codes for the Administrative Divisions of the People's Republic of China, is used as the division code in this method. This embodiment takes Zhaoqing City in Guangdong Province as an example:
When the administrative division of the law or regulation to be encoded is Zhaoqing City, $X_2$ = 441200;
When the administrative division is Duanzhou District, $X_2$ = 441202;
When the administrative division is Dinghu District, $X_2$ = 441203;
When the administrative division is Gaoyao District, $X_2$ = 441204;
The specific codes for administrative divisions at the city and county (district) levels are shown in Table 3.
TABLE 3
Code | Administrative division |
441200 | Zhaoqing City |
441202 | Duanzhou District |
441203 | Dinghu District |
441204 | Gaoyao District |
441223 | Guangning County |
441224 | Huaiji County |
... | (others not listed) |
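Because the 6-digit GB/T 2260 code is hierarchical (two digits for the province, two for the city, two for the county or district), the parent division of any code can be derived by zeroing the trailing digits. The sketch below illustrates this with a few entries copied from Tables 2 and 3; the dictionary keys are English translations, and the helper names `province_of` and `city_of` are assumptions made for the example.

```python
# A few entries from Tables 2 and 3; the keys are translated names.
DIVISION_CODES = {
    "Nationwide": "000000",
    "Beijing": "110000",
    "Hebei Province": "130000",
    "Zhaoqing City": "441200",
    "Duanzhou District": "441202",
    "Guangning County": "441223",
}


def province_of(code: str) -> str:
    """Province-level code: keep the first two digits, zero the rest."""
    return code[:2] + "0000"


def city_of(code: str) -> str:
    """City-level code: keep the first four digits, zero the last two."""
    return code[:4] + "00"


code = DIVISION_CODES["Duanzhou District"]
print(code, city_of(code), province_of(code))  # 441202 441200 440000
```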
In the names of some laws and regulations, the main part of the name often comes at the end, while the front part mostly carries information such as the scope of application and the issuing unit. If a conventional front-to-back encoding rule were adopted, many code positions would be wasted. These leading characters could of course be removed before encoding, but doing so would obviously make encoding more difficult. Therefore, in step S14, the name is encoded in reverse, from back to front.
This embodiment lists the names of some laws and regulations, as shown in Table 4.
TABLE 4
Efficacy level | Examples |
National law | Work Safety Law of the People's Republic of China |
National administrative regulation | Regulations of the People's Republic of China on Market Registration Administration; Regulations on Grain Circulation Administration; Provisions on the Administration of Enterprise Name Registration |
State Council departmental rule | Interim Measures for Handling Reports on the Supervision of the Use of Medical Security Funds; Decision of the Ministry of Agriculture and Rural Affairs on Amending Rules Such as the "Measures for the Administration of Safety Evaluation of Agricultural Genetically Modified Organisms"; Financial Rules for Public Institutions |
State Council departmental normative document | Announcement of the Ministry of Finance and the State Taxation Administration on Extending the Implementation Period of Certain Preferential Tax Policies; Notice of the Ministry of Water Resources on Further Increasing Incentive Support for Localities with Notable Results in River and Lake Chief System Work; Announcement No. 5 of 2022 of the State Administration for Market Regulation and the People's Bank of China on Issuing the "Fintech Product Certification Catalog (Second Batch)" and the "Fintech Product Certification Rules (2022 Revision)" |
Local regulation | Regulations of the Ningxia Hui Autonomous Region on Promoting the Construction of a Pilot Area for Ecological Protection and High-Quality Development of the Yellow River Basin; Rules of Procedure of the Yangquan Municipal People's Congress; Decision of the Standing Committee of the Yangquan Municipal People's Congress on Amending the "Yangquan City Greening Regulations" and the "Yangquan City Appearance and Environmental Sanitation Management Regulations" |
Local government rule | Measures of Anhui Province for the Supervision and Administration of the Use of Drugs and Medical Devices; Decision of the Harbin Municipal People's Government on Entrusting the Implementation of Transportation Administrative Penalties |
Local normative document | Implementation Opinions of the Yunnan Provincial Development and Reform Commission, the Yunnan Provincial Department of Housing and Urban-Rural Development, the Yunnan Provincial Public Security Department, and the Yunnan Provincial Department of Natural Resources on Promoting the Development of Urban Parking Facilities in the Province; Several Opinions of the Huzhou Municipal People's Government on Supporting Industrial Enterprises in Retaining Staff to Promote Production |
In this embodiment, step S14 further includes:
S141: identifying bracket characters in the name information of the law or regulation to be encoded, and identifying whether the brackets contain a date element; if so, deleting the brackets and all characters inside them to obtain the date-removed name information.
For example, for the name information "Work Safety Law of the People's Republic of China (2021 Amendment)", the name information obtained after step S141 is "Work Safety Law of the People's Republic of China".
To show the version of a law or regulation intuitively, a revision-year mark is often appended to its name, as in "Work Safety Law of the People's Republic of China (2021 Amendment)". To keep the name part of the code identical across versions of the same law or regulation, elements such as "(2021 Amendment)" need to be removed. In this embodiment, the information carried by the removed date element is not lost: it can be reflected in step S15 through the date code $X_4$, which better expresses version differences and revisions of laws and regulations.
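Step S141 can be sketched with a regular expression. The source does not define what counts as a "date element", so this sketch assumes a four-digit year inside a pair of full-width or ASCII brackets; the pattern and the function name `remove_date_brackets` are illustrative only.

```python
import re

# Assumption: a "date element" is a four-digit year inside one pair of
# brackets. Both full-width and ASCII brackets are accepted.
_DATE_BRACKET = re.compile(r"[（(][^（()）]*\d{4}[^（()）]*[)）]")


def remove_date_brackets(name: str) -> str:
    """Delete every bracketed segment that contains a four-digit year."""
    return _DATE_BRACKET.sub("", name)


print(remove_date_brackets("中华人民共和国安全生产法（2021修正）"))
```

Bracketed text without a year, such as "（试行）", is left untouched by this step.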
S142: Further process the name information with the date removed obtained in step S141: determine whether it contains punctuation characters, and identify each punctuation character.
If the punctuation is the bracketed marker "(Trial)" (（试行）), determine whether the marker is located at the end of the name; if so, delete the marker "(Trial)"; if not, delete only the brackets and retain the characters "Trial".
If the punctuation character is not part of the marker "(Trial)", delete the punctuation character.
The name information with punctuation removed is thereby obtained.
For example, for the name information "Decision of the Ministry of Agriculture and Rural Affairs on Amending the 《Measures for the Administration of Safety Evaluation of Agricultural Genetically Modified Organisms》 and Other Rules", the name information with punctuation removed obtained after step S142 is "Decision of the Ministry of Agriculture and Rural Affairs on Amending the Measures for the Administration of Safety Evaluation of Agricultural Genetically Modified Organisms and Other Rules".
For example, for the name information "Notice of the Supreme People's Court on Issuing the 《Several Opinions on Standardizing the Review Work of the People's Courts (Trial)》", the name information with punctuation removed obtained after step S142 is "Notice of the Supreme People's Court on Issuing the Several Opinions on Standardizing the Review Work of the People's Courts Trial". In the Chinese name the marker is followed by further characters, so the characters "Trial" are retained.
For example, for the name information "Supreme People's Court, Supreme People's Procuratorate Interpretation on Several Issues Concerning the Application of Law in Handling Criminal Cases of Endangering Food Safety", the name information with punctuation removed obtained after step S142 is the same name without the separating comma between the two issuing bodies.
For example, for the name information "Rules of the People's Procuratorates on Supervision of Administrative Litigation (Trial)", the name information with punctuation removed obtained after step S142 is "Rules of the People's Procuratorates on Supervision of Administrative Litigation".
Deleting the punctuation marks contained in the names of laws and regulations to be encoded helps keep the codes standardized. In keeping with the principle that the main body of the code for the same law or regulation should remain consistent, the marker "(Trial)" is removed whenever it appears at the end of a name, whereas the characters "Trial" are retained when the marker appears in the middle of the name information.
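The two rules of step S142 can be sketched as below. The sketch assumes that "punctuation" means any character whose Unicode general category starts with "P"; the function name `strip_punctuation` and the shortened third test name are hypothetical and used only for illustration.

```python
import unicodedata

# The bracketed "trial" marker, in full-width and ASCII bracket forms.
TRIAL_MARKERS = ("（试行）", "(试行)")


def strip_punctuation(name: str) -> str:
    """Apply the S142 rules to a name whose date brackets are already removed."""
    # Rule 1: a bracketed "trial" marker at the very end is dropped entirely.
    for marker in TRIAL_MARKERS:
        if name.endswith(marker):
            name = name[: -len(marker)]
            break
    # Rule 2: every remaining punctuation character is deleted, so a "trial"
    # marker in the middle loses its brackets but keeps its two characters.
    return "".join(
        ch for ch in name if not unicodedata.category(ch).startswith("P")
    )


if __name__ == "__main__":
    print(strip_punctuation("人民检察院行政诉讼监督规则（试行）"))
```

Handling the trailing marker before the general punctuation pass keeps the two cases distinct without needing to track bracket positions.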
S143: Reverse-encode the name information with punctuation removed obtained in step S142, which includes the following steps:
S143.1: Read the name information with punctuation removed and count its characters. If the number of characters is greater than 10, take the last 10 characters; if the number of characters is less than 10, append the placeholder "0" after the name information until it is 10 characters long.
S143.2: Encode the 10 characters obtained in step S143.1 by identifying each character in turn: if the character is a Chinese character, extract the initial letter of its pinyin; if the character is an Arabic numeral or a letter, extract the character itself. Arrange the extracted characters in the order of their source characters to obtain the name code X_3.
When the name information is "Work Safety Law of the People's Republic of China" (中华人民共和国安全生产法), X_3 = RMGHGAQSCF.
When the name information is "Decision of the Ministry of Agriculture and Rural Affairs on Amending the 《Measures for the Administration of Safety Evaluation of Agricultural Genetically Modified Organisms》 and Other Rules", X_3 = GLBFDGZDJD.
When the name information is "Rules of the People's Procuratorates on Supervision of Administrative Litigation (Trial)", X_3 = CYXZSSJDGZ.
When the name information is "Regulations on the Administration of Grain Circulation" (粮食流通管理条例), X_3 = LSLTGLTL00.
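Steps S143.1 and S143.2 can be illustrated with the following sketch. The pinyin lookup table is a deliberately tiny, hand-written stand-in that covers only the characters of two of the examples above; a real system would need a complete pinyin dictionary. The names `PINYIN_INITIAL` and `name_code` are assumptions made for this illustration.

```python
# Illustrative pinyin-initial table; it covers only the example characters.
PINYIN_INITIAL = {
    "人": "R", "民": "M", "共": "G", "和": "H", "国": "G",
    "安": "A", "全": "Q", "生": "S", "产": "C", "法": "F",
    "粮": "L", "食": "S", "流": "L", "通": "T", "管": "G",
    "理": "L", "条": "T", "例": "L",
}


def name_code(clean_name: str, length: int = 10) -> str:
    """Build the name code X_3 from a name with dates and punctuation removed."""
    tail = clean_name[-length:]        # S143.1: keep the last 10 characters
    tail = tail.ljust(length, "0")     # S143.1: pad short names with "0"
    out = []
    for ch in tail:                    # S143.2: encode character by character
        if ch.isascii() and ch.isalnum():
            out.append(ch)             # digits and letters are kept as they are
        else:
            out.append(PINYIN_INITIAL[ch])  # KeyError if the table lacks ch
    return "".join(out)


if __name__ == "__main__":
    print(name_code("中华人民共和国安全生产法"))
    print(name_code("粮食流通管理条例"))
```

Taking the tail of the name rather than the head keeps the distinctive ending of a title, since many names share the same leading characters.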
In some preferred embodiments, the name code X_3 further includes an anti-duplication code X_31.
When the name information is "Work Safety Law of the People's Republic of China", X_3 = RMGHGAQSCFX_31, that is, the ten-character code followed by X_31.
When the name information is "Regulations on the Administration of Grain Circulation", X_3 = LSLTGLTL00X_31.
The anti-duplication code X_31 is calculated by the following formula:
(formula image not reproduced in the text)
In the above formula:
i is the position number, counted from left to right, of each character in the name information with punctuation removed obtained in step S142; an example of the character positions i is shown in Table 5.
TABLE 5
Character | 关 | 于 | 修 | 改 | 《 | 教 | 育 | 储 | 蓄 | 管 | 理 | 办 | 法 | 》 | 等 | 规 | 章 | 的 | 决 | 定 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
i | 1 | 2 | 3 | 4 | | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | | 13 | 14 | 15 | 16 | 17 | 18 |
α_i is the representation value of the character at position i; each character is assigned a representation value from 1 to 35 according to the content it represents.
The representation values can be assigned as follows: if the character is a Chinese character, α_i is the representation value corresponding to the initial letter of the character's pinyin; if the character is a letter, α_i is the representation value corresponding to that letter; if the character is a digit, α_i is the numerical value of that digit. The representation values corresponding to the letters are shown in Table 6.
TABLE 6
W_i is the weighting factor at position i, where:
$W_i = 2^{i-1} \pmod{11}$;
In this embodiment, the weighting factors W_i corresponding to positions i = 1 to 35 are listed in Table 7.
TABLE 7
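The weighting factors themselves follow directly from the formula for W_i and can be computed as in the sketch below. Only the weights are computed: the formula that combines α_i and W_i into X_31 is not reproduced in the text, so it is intentionally not implemented. The names `weighting_factor` and `WEIGHTS` are illustrative.

```python
def weighting_factor(i: int) -> int:
    """Return W_i = 2**(i - 1) mod 11 for a 1-based character position i."""
    if i < 1:
        raise ValueError("position i is 1-based")
    return pow(2, i - 1, 11)  # three-argument pow does modular exponentiation


# Weighting factors for positions 1 to 35, the range used in this embodiment.
WEIGHTS = {i: weighting_factor(i) for i in range(1, 36)}

if __name__ == "__main__":
    for position, weight in WEIGHTS.items():
        print(position, weight)
```

Because 2 raised to the tenth power leaves remainder 1 when divided by 11, the sequence of weights repeats with period 10.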
By including the anti-duplication code X_31 in the name code X_3, the method further reduces duplicate codes and ensures code uniqueness: X_31 is computed over the 35 characters that effectively participate in encoding.
Specifically, in step S15, the release date is encoded in the form yyyymmdd, where yyyy is the year of the release date; mm is the month of the release date, padded with a leading "0" when the month has fewer than two digits; and dd is the day of the release date, padded with a leading "0" when the day has fewer than two digits.
For example, when the release date of the law or regulation to be encoded is January 5, 2022, the release date code X_4 = 20220105.
Because older laws and regulations may lack a release date, if no release date is available, the implementation date of the law or regulation to be encoded is identified and used for encoding instead.
Setting a release date code makes the code readable, further strengthens its uniqueness, and reflects the revision history of the same law or regulation.
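The date rule of step S15, including the fallback to the implementation date, can be sketched as follows. The function name `date_code` and its parameter names are assumptions introduced for this illustration.

```python
from datetime import date
from typing import Optional


def date_code(release: Optional[date],
              implementation: Optional[date] = None) -> str:
    """Return X_4 as yyyymmdd, falling back to the implementation date."""
    chosen = release if release is not None else implementation
    if chosen is None:
        raise ValueError("neither a release date nor an implementation date")
    # Zero-pad month and day to two digits, as step S15 requires.
    return f"{chosen.year:04d}{chosen.month:02d}{chosen.day:02d}"


if __name__ == "__main__":
    print(date_code(date(2022, 1, 5)))
```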
This embodiment illustrates the codes of laws and regulations themselves obtained after encoding with the method; the resulting codes are shown in Table 8.
TABLE 8
In step S2, the method for encoding the chapters and entries of the law or regulation to be encoded includes:
reading the law or regulation to be encoded, identifying its body text, dividing the body text into seven levels (part, chapter, section, article, paragraph, item, and subitem), and encoding each level as the pinyin initial of the level name followed by its sequence number.
If the body text of the law or regulation to be encoded is "Part 1", the chapter entry code Y = P1;
if the body text of the law or regulation to be encoded is "Part 1, Chapter 1", the chapter entry code Y = P1Z1;
if the body text of the law or regulation to be encoded is "Part 1, Chapter 1, Section 1", the chapter entry code Y = P1Z1J1.
In this embodiment, the chapter entry codes obtained after encoding the chapters and entries of the law or regulation with the method are listed in Table 9.
TABLE 9
Code | Level | Chapter entry code Y | Interpretation |
---|---|---|---|
P | Part | P1 | Part 1 |
Z | Chapter | P1Z1 | Part 1, Chapter 1 |
J | Section | P1Z1J1 | Part 1, Chapter 1, Section 1 |
T | Article | T10 | Article 10 |
K | Paragraph | T10K1 | Article 10, Paragraph 1 |
X | Item | T10K1X1 | Article 10, Paragraph 1, Item 1 |
M | Subitem | T10K1X1M1 | Article 10, Paragraph 1, Item 1, Subitem 1 |
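The chapter entry encoding of step S2 amounts to concatenating a level letter and a sequence number for each level on the path to a provision. A minimal sketch follows; the English level names used as dictionary keys and the function name `chapter_entry_code` are assumptions made for illustration, while the letters P, Z, J, T, K, X, and M are those of Table 9.

```python
# Level letters from Table 9, listed from the outermost to the innermost level.
LEVEL_CODES = {
    "part": "P", "chapter": "Z", "section": "J", "article": "T",
    "paragraph": "K", "item": "X", "subitem": "M",
}
LEVEL_ORDER = list(LEVEL_CODES)


def chapter_entry_code(path):
    """Encode a path of (level, number) pairs, outermost level first."""
    parts = []
    last_rank = -1
    for level, number in path:
        rank = LEVEL_ORDER.index(level)   # ValueError for an unknown level
        if rank <= last_rank:
            raise ValueError("levels must go from outer to inner")
        if number < 1:
            raise ValueError("sequence numbers start at 1")
        parts.append(f"{LEVEL_CODES[level]}{number}")
        last_rank = rank
    return "".join(parts)


if __name__ == "__main__":
    print(chapter_entry_code([("part", 1), ("chapter", 1), ("section", 1)]))
    print(chapter_entry_code([("article", 10), ("paragraph", 1), ("item", 1)]))
```

The path may start at any level, which matches Table 9, where article-level codes such as T10 carry no part or chapter prefix.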
With this coding method, issuing units and the various information systems can each generate codes independently while adhering, as far as possible, to the principles of uniqueness, universality, and consistency. As "Internet Plus Supervision" advances across the board, there is an urgent need to build a dynamic national database of laws and regulations and to structure its contents so that they can be cited by and associated with licensing matters, supervision matters, authority lists, and the like. First, the method can accelerate the comprehensive filing and registration of newly issued national laws, administrative regulations, departmental rules, normative documents, local regulations, local departmental rules, and local departmental normative documents, with codes generated automatically and the body text structured and stored automatically upon filing and registration. Second, licensing matters, supervision matters, responsibility lists, tracking lists, and the like can be associated by citation with the laws and regulations on which they are based. Third, laws and regulations can be monitored dynamically according to their validity status, with automatic early warning of the validity of licensing matters and the like, ensuring the legal compliance of "Internet Plus Supervision".
Embodiment Two
Embodiment Two discloses a method for constructing a structured database of laws and regulations, which comprises:
acquiring information on laws and regulations; and
encoding the information on laws and regulations by the coding method of Embodiment One to obtain the corresponding name coding information and content coding information.
Encoding existing laws and regulations in this way not only satisfies the basic principles of GB/T 7027-2002, "Basic principles and methods for information classifying and coding", but also offers uniqueness, universality, simplicity, consistency, and readability. The method enables comprehensive filing and registration of newly issued national laws, administrative regulations, departmental rules, normative documents, local regulations, local departmental rules, and local departmental normative documents, with codes generated automatically and the body text structured and stored automatically upon filing and registration. Licensing matters, supervision matters, authority lists, tracking lists, and the like can be associated by citation with the laws and regulations on which they are based. The method also enables dynamic monitoring of laws and regulations according to their validity status and automatic early warning of the validity of licensing matters and the like, ensuring the legal compliance of "Internet Plus Supervision".
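One possible way to store the encoded provisions is sketched below with an in-memory SQLite database. The table layout, the column names, and the functions `build_database`, `store_provision`, and `lookup` are all hypothetical; the embodiment only requires that the information on laws and regulations be stored in association with its name coding information and content coding information.

```python
import sqlite3
from typing import Optional


def build_database(conn: sqlite3.Connection) -> None:
    """Create a table keyed by the name code plus the chapter entry code."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS provision (
               name_code    TEXT NOT NULL,
               content_code TEXT NOT NULL,
               title        TEXT NOT NULL,
               body         TEXT NOT NULL,
               PRIMARY KEY (name_code, content_code)
           )"""
    )


def store_provision(conn: sqlite3.Connection, name_code: str,
                    content_code: str, title: str, body: str) -> None:
    """Store one provision in association with its two codes."""
    conn.execute(
        "INSERT INTO provision VALUES (?, ?, ?, ?)",
        (name_code, content_code, title, body),
    )


def lookup(conn: sqlite3.Connection, name_code: str,
           content_code: str) -> Optional[str]:
    """Return the body text for a code pair, or None if it is not stored."""
    row = conn.execute(
        "SELECT body FROM provision WHERE name_code = ? AND content_code = ?",
        (name_code, content_code),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_database(conn)
    # Placeholder codes and text; they are not taken from a real regulation.
    store_provision(conn, "DEMO-NAME-CODE", "T10K1", "Demo regulation", "Demo text")
    print(lookup(conn, "DEMO-NAME-CODE", "T10K1"))
```

Using the code pair as the primary key lets other systems cite a single provision by its code alone.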
Embodiment Three
Referring to Figs. 5 and 6, Fig. 5 is a schematic structural diagram of an apparatus for generating codes for laws and regulations according to an embodiment of the present invention. As shown in Fig. 5, the apparatus may include:
an input module, configured to enter the law or regulation to be encoded;
a first encoding module, configured to read the law or regulation to be encoded, identify and analyze the various items of information of the target law or regulation, and output the code of the law or regulation itself;
a second encoding module, configured to read the law or regulation to be encoded, intelligently identify and analyze the body text of the target law or regulation, and output the chapter entry code; and
an output module, configured to read the code of the law or regulation itself and the chapter entry code, consolidate them, and output the code of the law or regulation to be encoded; the output module is provided with a user interface for displaying the code of the law or regulation to be encoded.
The apparatus of this embodiment improves encoding efficiency and avoids deviations in the codes. It provides an automatic encoding function and an interface that help drafting units and information systems generate and correct codes quickly. The apparatus can:
a) automatically generate a code from an entered name, release date, efficacy grade, and administrative division;
b) intelligently analyze the name structure and automatically obtain the administrative division code and efficacy grade;
c) match laws and regulations already in the database and generate the code directly from the name.
Embodiment Four
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device may be a computer, a server, or the like, and in some cases may also be a smart device such as a mobile phone, a tablet computer, or a monitoring terminal. As shown in Fig. 7, the electronic device may include:
a memory storing executable program code; and
a processor coupled to the memory;
wherein the processor invokes the executable program code stored in the memory to perform some or all of the steps of the coding method of Embodiment One.
An embodiment of the present invention also discloses a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform some or all of the steps of the coding method of Embodiment One.
An embodiment of the present invention also discloses a computer program product, wherein the computer program product causes a computer to perform some or all of the steps of the coding method of Embodiment One.
An embodiment of the present invention also discloses an application publishing platform for publishing a computer program product, wherein the computer program product, when run on a computer, causes the computer to perform some or all of the steps of the coding method of Embodiment One.
In the various embodiments of the present invention, it should be understood that the sequence numbers of the processes do not imply a necessary order of execution; the order of execution should be determined by the functions and internal logic of the processes, and the sequence numbers should not limit the implementation of the embodiments in any way.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing unit, may each exist physically on their own, or two or more of them may be integrated in one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-accessible memory. Based on this understanding, the technical solution of the present invention, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory and comprising several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like, and in particular may be a processor in a computer device) to perform some or all of the steps of the methods of the embodiments of the present invention.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
Those of ordinary skill in the art will appreciate that some or all of the steps of the methods of the described embodiments may be implemented by a program instructing the associated hardware. The program may be stored in a computer-readable storage medium, including Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
The method, apparatus, electronic device, and storage medium for encoding laws and regulations disclosed in the embodiments of the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present invention, and the description of the embodiments is intended only to aid understanding of the method and its core ideas. Because those skilled in the art may, following the ideas of the present invention, vary the specific implementations and the scope of application, the contents of this description should not be construed as limiting the present invention.
Claims (10)
1. A method for generating codes for laws and regulations, comprising:
acquiring law and regulation information to be encoded, wherein the law and regulation information to be encoded comprises release information and law and regulation content information;
encoding the release information based on a preset name coding rule to obtain name coding information of the corresponding law or regulation; and
encoding the law and regulation content information based on a preset content coding rule to obtain content coding information of the corresponding law and regulation content; wherein the name coding information and the content coding information together constitute the coding information of the law and regulation information to be encoded.
2. The method for generating codes for laws and regulations according to claim 1, wherein the release information comprises release unit information, release date information, and release name information; and the name coding rule comprises an efficacy grade mapping relation, a division code mapping relation, a name definition rule, and a time definition rule;
wherein encoding the release information based on the preset name coding rule to obtain the name coding information of the corresponding law or regulation comprises:
classifying and identifying the release information to obtain the release unit information, the release date information, and the release name information;
identifying the release unit information to obtain efficacy grade information of the law and regulation information to be encoded, and obtaining an efficacy grade code of the release information based on the efficacy grade information and the efficacy grade mapping relation;
identifying the release unit information to obtain administrative division information of the law and regulation information to be encoded, and obtaining a division code of the release information based on the administrative division information and the division code mapping relation;
obtaining a name code of the release information based on the release name information and the name definition rule; and
obtaining a time code of the release information based on the release date information and the time definition rule; wherein the efficacy grade code, the division code, the name code, and the time code of the release information together constitute the name coding information of the corresponding law or regulation.
3. The method for generating codes for laws and regulations according to claim 2, wherein obtaining the name code of the release information based on the release name information and the name definition rule comprises:
performing a character filtering operation on the release name information to obtain filtered release name information, wherein the character filtering operation comprises time and number filtering, punctuation character filtering, and special character filtering, and wherein, in the special character filtering, the special character is filtered out when it appears at a set position and is otherwise not filtered out; and
arranging, according to the name definition rule, a preset number of characters of the release name information in a reverse coding manner to obtain the corresponding name code.
4. The method for generating codes for laws and regulations according to claim 3, wherein the preset number is 10;
and wherein arranging, according to the name definition rule, the preset number of characters of the release name information in a reverse coding manner to obtain the corresponding name code comprises:
performing character recognition on the filtered release name information to determine the corresponding character information;
when the character information is a Chinese character, obtaining the pinyin initial corresponding to the character information and first position information of each pinyin initial in the release name information;
when the character information is a number or a letter, obtaining the corresponding number or letter and second position information of each number or letter in the release name information; and
arranging the pinyin initials, numbers, or letters in a reverse coding manner according to the first position information and the second position information to obtain the corresponding name code, and, when the number of pieces of character information does not exceed 10, adding placeholders to the name code until the total number of code digits is 10.
5. The method for generating codes for laws and regulations according to claim 3, wherein obtaining the name code of the release information based on the release name information and the name definition rule further comprises:
extracting each piece of character information in the filtered release name information;
determining the representation value corresponding to each piece of character information according to a preset alphanumeric correspondence table;
determining a weighting factor for each character position based on the representation value and a weighting factor calculation formula, wherein the weighting factor calculation formula is $W_i = 2^{i-1} \pmod{11}$, in which W_i is the weighting factor; and
determining the corresponding anti-duplication code according to an anti-duplication code calculation formula and a conversion relation table, wherein the conversion relation table is a mapping relation table between anti-duplication codes and anti-duplication values, the anti-duplication codes correspond one-to-one to the anti-duplication values, and the anti-duplication code calculation formula is: X_31 = (formula image not reproduced in the text), in which X_31 is the anti-duplication code, α_i is the representation value corresponding to each piece of character information, W_i is the weighting factor corresponding to the representation value of each character, and i is the position number, counted from left to right, of each character of the filtered release name information.
6. The method for generating codes for laws and regulations according to claim 2, wherein the method further comprises:
when the release date information is detected to be missing, reading implementation time information of the law and regulation information to be encoded and performing date encoding with it;
wherein the law and regulation content information comprises hierarchy name information, and the hierarchy name information comprises a hierarchy name and a hierarchy sequence number;
and wherein encoding the law and regulation content information based on the preset content coding rule to obtain the content coding information of the corresponding law and regulation content comprises:
determining a hierarchy code according to the hierarchy name information and a hierarchy code mapping relation, wherein the hierarchy name information corresponds one-to-one to the hierarchy code; and the hierarchy code and the hierarchy sequence number constitute the content coding information.
7. A method for constructing a database of laws and regulations, comprising:
acquiring law and regulation information and the corresponding coding information obtained by the method for generating codes for laws and regulations according to any one of claims 1 to 6; and
storing the law and regulation information in association with the corresponding coding information.
8. An apparatus for generating codes for laws and regulations, comprising:
an input module, configured to acquire law and regulation information to be encoded, wherein the law and regulation information to be encoded comprises release information and law and regulation content information;
a first encoding module, configured to encode the release information to obtain name coding information of the corresponding law or regulation;
a second encoding module, configured to encode the law and regulation content information to obtain content coding information of the corresponding law and regulation content; and
an output module, configured to output the coding information of the law and regulation information to be encoded according to the name coding information and the content coding information.
9. An electronic device, comprising: a memory storing executable program code; and a processor coupled to the memory; wherein the processor invokes the executable program code stored in the memory to perform the method for generating codes for laws and regulations according to any one of claims 1 to 6.
10. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform the method for generating codes for laws and regulations according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310125882.3A CN116029262B (en) | 2023-02-17 | 2023-02-17 | Legal and legal code generation method, database construction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116029262A true CN116029262A (en) | 2023-04-28 |
CN116029262B CN116029262B (en) | 2023-06-09 |
Family
ID=86081200
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310125882.3A Active CN116029262B (en) | 2023-02-17 | 2023-02-17 | Legal and legal code generation method, database construction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116029262B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116205496A (en) * | 2023-04-04 | 2023-06-02 | 广东远景信息科技有限公司 | Compliance risk management and control method, system, electronic equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106815256A (en) * | 2015-12-01 | 2017-06-09 | 北京国双科技有限公司 | Set up the method and device of laws and regulations bar fund incidence relation |
CN111783399A (en) * | 2020-06-24 | 2020-10-16 | 北京计算机技术及应用研究所 | Legal referee document information extraction method |
CN114461915A (en) * | 2022-02-11 | 2022-05-10 | 哈尔滨学院 | Accurate recommendation system for legal provision |
CN115374239A (en) * | 2022-07-13 | 2022-11-22 | 北京中海住梦科技有限公司 | Legal and legal analysis method and device, computer equipment and readable storage medium |
CN115545671A (en) * | 2022-11-02 | 2022-12-30 | 广州明动软件股份有限公司 | Method and system for structured processing of laws and regulations |
CN115604076A (en) * | 2022-09-27 | 2023-01-13 | 中国建设银行股份有限公司(Cn) | Information processing method, device, equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN116029262B (en) | 2023-06-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||