CN115687632A - Sentencing plot decomposition analysis method and system


Publication number
CN115687632A
Authority
CN
China
Prior art keywords
plot
sentencing
criminal
names
plots
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211026955.5A
Other languages
Chinese (zh)
Other versions
CN115687632B (en
Inventor
段智峰
任呈祥
梁新
郭伟登
刘贤艳
谭晓颖
孙晓锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Judicial Big Data Research Institute Co ltd
Original Assignee
China Judicial Big Data Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Judicial Big Data Research Institute Co ltd filed Critical China Judicial Big Data Research Institute Co ltd
Priority to CN202211026955.5A priority Critical patent/CN115687632B/en
Publication of CN115687632A publication Critical patent/CN115687632A/en
Application granted granted Critical
Publication of CN115687632B publication Critical patent/CN115687632B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method and a system for decomposing and analyzing sentencing plots (sentencing circumstances). The method comprises the following steps: obtaining original plot names from the Criminal Law and sentencing standardization documents, and constructing a sentencing plot map; constructing case plot trees based on referee documents and the constructed sentencing plot map; based on the case plot trees, carrying out model training on the sentencing plots, training prediction tasks between sentencing plots and between sentencing plots and sentencing results; and using the parameter distribution of the trained model to assist the analysis of sentencing plots. The invention extracts sentencing plot definitions from sentencing standardization documents and extracts sentencing plots and judgment results from referee documents, analyzes the sentencing plots and judgment results jointly, and uses the split points of decision trees to automatically identify sentencing plot names and the imbalanced cut points of their corresponding measures, thereby helping legislative and judicial personnel to formulate better judicial reform schemes and to analyze social conditions.

Description

Sentencing plot decomposition analysis method and system
Technical Field
The invention belongs to the field of sentencing research and judicial application, and particularly relates to a method and system, based on machine learning, decision tree splitting decomposition and plot concept extraction technologies, for sentencing plot decomposition analysis, assisted formulation of sentencing norms, and intelligent rule-based sentencing.
Background
With the rapid advancement of court informatization, judges must handle more and more tasks when making sentencing recommendations and sentencing decisions for criminal suspects in criminal proceedings. The Supreme People's Court established a research project on sentencing standardization and began substantive investigation of the issue in 2005. After repeated argumentation and broad consultation with all sectors, it drafted two documents, the Sentencing Guidelines of the People's Courts (Trial) and the Guidelines on Sentencing Procedure of the People's Courts (Trial). On September 16, 2010, the Supreme People's Court decided to implement the sentencing standardization reform in courts nationwide from October 1, with the aims of further standardizing sentencing activities, regulating judicial discretion in criminal cases, bringing sentencing into the court hearing procedure, introducing sentencing recommendations, and enhancing the openness and transparency of sentencing. In a society governed by law in particular, the reasonableness of the plots considered during sentencing must be well explained, and the key plots relating to the criminal facts must be quickly identified for adjudication. Descriptions of key plots contain both non-quantitative and quantitative sentencing plots, and measuring these plots accurately and fairly is an important and extremely difficult part of sentencing research.
Traditional sentencing standardization research is time-consuming and labor-intensive. Driven by the sentencing standardization reform for criminal cases, provinces across the country have published their own implementation rules for the "Sentencing Guidelines for Common Crimes". These provincial rules reference one another and also contain region-specific provisions, but the cases they currently cover are few, accounting for only about 5% of all cases.
Using a big-data approach, under the constraints of the Criminal Law of the People's Republic of China, the judgments of numerous judges serve as supporting data, and the objective regularities of sentencing are summarized from practice. Scientific statistical methods, correlation analysis, deep learning and other means are used to find reasonable decomposition points or intervals for sentencing plots, better supporting the reasonable explanation of sentences and the refinement of sentencing standardization guidelines.
Decision tree splitting is an important technique for reasoning and causal analysis: the dependent variable is decomposed by choosing the splits with the greatest contribution and the lowest global loss, so that the resulting split points are statistically well founded and can also be accepted by criminal law experts. At present, there is no processing method that uses big data to decompose sentencing plots intelligently and to assist in drafting judicial interpretations for handling specific types of cases, in drafting standardized sentencing guidelines, and in intelligent rule-based sentencing.
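To make the splitting idea concrete, the following is a minimal sketch of how a single regression split can be chosen by minimizing the total squared error on both sides of a candidate threshold. It is a generic illustration under stated assumptions, not the implementation of the invention; the function names `sse` and `best_split` and the toy data are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors of values around their mean (0 for empty input)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, loss) for the split x <= threshold with the lowest total SSE.

    Candidate thresholds are midpoints between consecutive distinct feature values.
    Returns None when the feature has fewer than two distinct values.
    """
    distinct = sorted(set(xs))
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        loss = sse(left) + sse(right)
        if best is None or loss < best[1]:
            best = (threshold, loss)
    return best


# Hypothetical toy data: number of deaths in a case and the sentence in months.
deaths = [1, 1, 2, 2, 3, 3, 4]
months = [12, 14, 18, 20, 48, 50, 60]
threshold, loss = best_split(deaths, months)
```

On this made-up data the lowest-loss threshold falls between 2 and 3 deaths, which is the kind of cut point the analysis is meant to surface.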
Disclosure of Invention
In view of the above problems, the invention provides a method and system for sentencing plot decomposition analysis, assisted formulation of sentencing norms and intelligent rule-based sentencing, based on machine learning, decision tree splitting decomposition and plot concept extraction technologies.
The invention solves the above problems through the following technical scheme:
In a first aspect, the invention provides a sentencing plot decomposition analysis method, which is a method for sentencing plot decomposition analysis, assisted formulation of sentencing norms and intelligent rule-based sentencing based on machine learning, decision tree splitting decomposition and plot concept extraction technologies, and comprises the following steps:
obtaining original plot names from the Criminal Law and sentencing standardization documents, and constructing a sentencing plot map;
constructing case plot trees based on referee documents and the constructed sentencing plot map;
carrying out model training on the sentencing plots based on the case plot trees, training prediction tasks between sentencing plots and between sentencing plots and sentencing results;
and using the parameter distribution of the trained model to assist the analysis of sentencing plots.
Further, obtaining the original plot names from the Criminal Law and the sentencing standardization documents and constructing the sentencing plot map comprises the following steps:
obtaining original plot names from the Criminal Law and the sentencing standardization documents;
normalizing the original plot names extracted from each data source to form generalized candidate standard plot names;
obtaining the association relations among the candidate standard plot names, and forming a candidate standard plot name undirected graph with the candidate standard plot names as vertices and the relation types as edges;
clustering the candidate standard plot name undirected graph, and generating recommended candidate standard plot names from each cluster center;
confirming the recommended candidate standard plot names with manual assistance to form standard plot names;
performing tree classification on the standard plot names that have a hypernym-hyponym relation to form tree-structured standard plot names;
and associating the original plot names with the tree-structured standard plot names to form the sentencing plot map.
Further, obtaining the association relations among the candidate standard plot names comprises: calculating the similarity between candidate standard plot names, and regarding names whose similarity exceeds a set threshold as quasi-aligned, forming a quasi-alignment relation between them; meanwhile, obtaining the association relations among the candidate standard plot names from their positions in the paragraphs and sentences of the original sentencing standardization documents, the association relations including but not limited to relation types such as sequential co-occurrence, parallel relation under the same constraint, reference relation and child-parent relation.
Further, clustering the candidate standard plot name undirected graph comprises: clustering the candidate standard plot names in the undirected graph by using graph clustering, community discovery and other algorithms from related fields.
Further, the recommended candidate standard plot names are confirmed with manual assistance to form the standard plot names, so that the hundreds of standard plot names that need to be considered in actual sentencing can be effectively selected from tens of thousands of candidate standard plot names; the standard plot names confirmed by legal experts can serve as a unified terminology standard for sentencing, normalizing the expressions used by provinces and regions across the country to consistent standard plot names.
Further, tree classification is performed on the standard plot names that have a hypernym-hyponym relation; for example, the standard plot names "under fourteen", "under sixteen" and "sixteen to under eighteen" are classified under the hypernym "minor".
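As an illustration of the tree classification, the sketch below stores hypernym-hyponym relations as a parent-to-children table and looks up the chain of hypernyms above a standard plot name. The table contents and the names `PLOT_TREE`, `build_parent_index` and `ancestors` are hypothetical and only mirror the "minor" example from the text.

```python
# Hypothetical hypernym table: parent standard name -> child standard names.
PLOT_TREE = {
    "minor": ["under fourteen", "under sixteen", "sixteen to under eighteen"],
}


def build_parent_index(tree):
    """Invert the parent -> children mapping into child -> parent."""
    return {child: parent for parent, children in tree.items() for child in children}


def ancestors(name, parent_index):
    """Return the chain of hypernyms above a standard plot name, nearest first."""
    chain = []
    while name in parent_index:
        name = parent_index[name]
        chain.append(name)
    return chain


parent_index = build_parent_index(PLOT_TREE)
hypernyms = ancestors("under fourteen", parent_index)
```

A flat child-to-parent index keeps lookups cheap and extends naturally to deeper trees, since `ancestors` simply follows the parent links until it reaches a root.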
Further, constructing the case plot tree based on the referee document and the constructed sentencing plot map comprises the following steps:
preprocessing the referee document, calling a structuring engine, and identifying the important plot regions in the referee document;
for each important plot region, calling a plot extraction engine, which uses the knowledge of the sentencing plot map, to extract the sentencing plots of the case;
extracting the judgment result and treating it as a special case plot;
merging all plots (sentencing plots and case plots) to form the case plot tree.
Further, the model training on the sentencing plots uses machine learning and deep learning methods; the model is a commonly used model or combination of models, including but not limited to LightGBM, XGBoost, Transformer, MLP and CNN; the training process comprises the following steps:
expanding, reshaping and converting all case plot trees according to the input requirements of the training model;
setting different inference tasks, training prediction tasks between sentencing plots and case plots, and using multi-task joint training to jointly predict the sentencing plots, the sentencing results, the facts found in the referee document and the court opinions in the referee document.
Further, using the parameter distribution of the trained model to assist sentencing plot analysis means that, for task models with a significant training effect, the network of the task model is analyzed and its parameter distribution is used to assist sentencing plot analysis, comprising:
extracting the tree model of the Estimator (the decision tree predictor) from the LightGBM regression model, and ranking the plot features by relevance (Importance);
analyzing, from the relevance-ranked data, which plots are effective and which are ineffective with respect to the judgment result;
using a histogram to show, for each sentencing plot, the distribution of decision hits over its value groups during task inference (e.g., for "minor": "under fourteen", "under sixteen" and "sixteen to under eighteen");
analyzing feature_names, threshold, cat_boundaries and cat_threshold in the model parameters of the decision tree, and determining the split points for quantitative sentencing plots. For example, "women (persons)" has the split points [0,3,8], and "theft amount (yuan)" has the split points [0,3000,30000,400000]. Through split point analysis, the prevailing judgment intervals for "theft amount (yuan)" in actual sentencing can be obtained from massive numbers of referee documents, thereby assisting the formation or revision of judicial interpretations and the formation of standardized sentencing guidelines for new offences.
The above method steps are not limited to machine learning models with decision tree features such as LightGBM and XGBoost, but also include other types of deep learning models that can generate feature value splits or groups of candidate features.
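Once split points are known, mapping an observed value to its interval is a simple lookup. The sketch below uses the "theft amount (yuan)" split points quoted above; the function name `interval_of` and the choice of half-open intervals `[low, high)` are assumptions made for illustration, and a tree that routes `value <= threshold` to the left branch would place boundary values in the lower interval instead.

```python
from bisect import bisect_right


def interval_of(value, split_points):
    """Return the interval (low, high) of sorted split_points that contains value.

    Intervals are treated as half-open [low, high); the last one is open-ended,
    which is signalled by high being None.
    """
    if value < split_points[0]:
        raise ValueError("value lies below the first split point")
    i = bisect_right(split_points, value) - 1
    low = split_points[i]
    high = split_points[i + 1] if i + 1 < len(split_points) else None
    return low, high


theft_splits = [0, 3000, 30000, 400000]  # split points quoted in the text
band = interval_of(25000, theft_splits)
```

Here a theft amount of 25000 yuan falls in the band from 3000 to 30000 yuan.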
Further, comparative analysis of the sentencing plots is carried out by the following steps:
analyzing, for each year, the degree of correlation between the sentencing plots and the judgment results, and observing how the significance of each sentencing plot fluctuates over the years, so as to reflect the trend of the sentencing plots during socio-economic development and their changing influence on the term of sentence;
analyzing a sentencing plot-recidivist model, and analyzing the effect on the sentencing result of the penalty increase brought by recidivism and of the length of the interval before re-offending; for example, decomposing recidivism into re-offending within five years, within two years and within one year, to judge whether an obvious deterrent effect is obtained;
analyzing the occurrence rate of the sentencing plots over the years, and observing and summarizing the effect of the sentencing plots in social governance; for example, decomposing "active compensation" into "full active compensation", "majority active compensation" and "partial active compensation", where the proportion of "full active compensation" increases year by year;
and analyzing the standardization of the expression of sentencing plot names by using the sentencing plot map.
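The recidivism decomposition described above can be sketched as a bucketing step followed by a per-bucket average. The bucket boundaries follow the one-, two- and five-year example in the text; the record layout, the sample values and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Bucket limits in years, following the example in the text; labels are illustrative.
BUCKETS = [
    (1, "re-offending within one year"),
    (2, "re-offending within two years"),
    (5, "re-offending within five years"),
]


def bucket_for(gap_years):
    """Map the gap before re-offending to the narrowest matching bucket."""
    for limit, label in BUCKETS:
        if gap_years <= limit:
            return label
    return None  # outside the five-year window


def mean_sentence_by_bucket(cases):
    """cases: iterable of (gap_years, sentence_months) -> {bucket label: mean months}."""
    grouped = defaultdict(list)
    for gap, months in cases:
        label = bucket_for(gap)
        if label is not None:
            grouped[label].append(months)
    return {label: mean(values) for label, values in grouped.items()}


cases = [(0.5, 30), (0.8, 34), (1.5, 24), (4.0, 18), (7.0, 12)]  # made-up records
summary = mean_sentence_by_bucket(cases)
```

Comparing the bucket means is only a descriptive first step; whether a difference reflects a deterrent effect would need the fuller analysis the method describes.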
In a second aspect, the present invention provides a sentencing plot decomposition analysis system, comprising:
a sentencing plot map construction module, used for obtaining original plot names from the Criminal Law and sentencing standardization documents and constructing a sentencing plot map;
a case plot tree construction module, used for constructing case plot trees based on referee documents and the constructed sentencing plot map;
a joint training module, used for carrying out model training on the sentencing plots based on the case plot trees, training prediction tasks between sentencing plots and between sentencing plots and sentencing results;
and a sentencing plot analysis module, used for assisting sentencing plot analysis by using the parameter distribution of the trained model.
Compared with the prior art, the invention has the advantages and beneficial effects that:
1) The method effectively addresses the problem that the sentencing terms used in standardized sentencing are similar but not uniform across the country. Using graph clustering and similarity techniques, non-uniform sentencing terms can be quickly standardized into uniform professional terms, greatly reducing the difficulty and workload of legal experts in sorting them out.
2) Using deep learning and machine learning algorithms, the learned model parameters are used in reverse to explain the influence of each standard sentencing plot, so that non-uniform sentencing scales and standards in different places can be quickly verified, differences in granularity and scale among the standardized sentencing opinions issued by different places can be avoided, and a basis is provided for subsequently revising and improving those opinions.
Drawings
FIG. 1 is a diagram of the sentencing standardization guidelines of the provinces across the country.
FIG. 2 is a flow chart of the sentencing plot analysis.
FIG. 3 is a flow chart of the construction of the sentencing plot map.
FIG. 4 is a flow chart of the multi-task processing of sentencing plots.
FIG. 5 is a flow chart of sentencing plot training and feature split point analysis.
FIG. 6 is a schematic diagram of feature split points.
FIG. 7 is a flow chart of the sentencing plot analysis.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions and specific implementations of the invention are described in further detail below with reference to the accompanying drawings.
As shown in FIG. 2, the invention mainly comprises the following modules:
1. Sentencing plot concept processing module: used for establishing the sentencing plot map.
2. Referee document element extraction module: used for establishing case plot trees based on the sentencing plot map.
3. Joint training module: used for multi-task joint training and prediction based on the case plot trees.
4. Sentencing plot analysis and report module: used for analyzing sentencing plots and forming a comprehensive analysis report on them.
The invention mainly adopts the following technologies:
1. Sentencing plot map construction technology
The sentencing plot map is constructed as follows:
As in step "2.1" of FIG. 3, the original plot names are obtained from the Criminal Law and from the sentencing standardization documents and referee documents of FIG. 1.
As in step "3.1" of FIG. 3, the original plot names extracted from each data source are normalized to form generalized candidate standard plot names.
The similarity between candidate standard plot names is calculated; names whose similarity exceeds a set threshold are regarded as quasi-aligned, forming a quasi-alignment relation between them. Meanwhile, association relations between candidate standard plot names are obtained from their positions in the paragraphs and sentences of the original sentencing standardization documents, including but not limited to relation types such as sequential co-occurrence, parallel relation under the same constraint, reference relation and child-parent relation. This yields a candidate standard plot name undirected graph with the candidate standard plot names as vertices and the relation types as edges.
As in step "4.1" of FIG. 3, the candidate standard plot name undirected graph is clustered using graph clustering, community discovery and other algorithms from related fields, and recommended candidate standard plot names are generated from each cluster center for manual confirmation.
As in step "5.1" of FIG. 3, the recommended candidate standard plot names are confirmed with manual assistance to form the standard plot names. These steps effectively select the hundreds of standard plot names that need to be considered in actual sentencing from tens of thousands of candidate standard plot names. The standard plot names confirmed by legal experts can serve as a unified terminology standard for sentencing, normalizing the expressions used by provinces and regions across the country to consistent standard plot names.
As in step "6.1" of FIG. 3, the standard plot names that have a hypernym-hyponym relation are tree-classified to form tree-structured sentencing plot names. For example, the standard plot names "under fourteen", "under sixteen" and "under eighteen" are classified under "minor".
As in step "7.1" of FIG. 3, the original plot names are associated with the tree-structured sentencing plot names to form the sentencing plot map.
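The graph-building and clustering steps above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: character-level similarity from `difflib` stands in for whatever similarity measure is actually used, connected components stand in for graph clustering or community discovery, only the quasi-alignment edge type is modelled, and the 0.75 threshold, the English toy names and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for any text similarity measure."""
    return SequenceMatcher(None, a, b).ratio()


def build_graph(names, threshold):
    """Undirected graph: vertices are candidate names, edges join quasi-aligned pairs."""
    graph = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if similarity(a, b) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Cluster by connected components (a simple substitute for community discovery)."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


def recommend(cluster, graph):
    """Pick the best-connected name in a cluster; ties are broken alphabetically."""
    return min(cluster, key=lambda name: (-len(graph[name]), name))


names = ["surrender", "surrenders", "burglary", "burglaries", "recidivist"]  # toy data
graph = build_graph(names, threshold=0.75)  # the threshold is an arbitrary assumption
clusters = connected_components(graph)
recommended = [recommend(c, graph) for c in clusters]
```

The recommended name per cluster is only a proposal; in the described method it is passed to legal experts for confirmation before becoming a standard plot name.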
2. Sentencing plot extraction technology
Using the sentencing plot map, the sentencing plots of each referee document are extracted by a plot extraction engine, as follows:
The plot extraction engine loads the knowledge of the sentencing plot map and is initialized.
As in step "2.1" of FIG. 4, the referee document is preprocessed and the structuring engine is called to identify the important plot regions in the referee document. The structuring engine is a comprehensive composite technology covering syntax, lexical analysis, regular expressions, structure definition, word segmentation, part-of-speech tagging, relation extraction, hierarchical decomposition and the like. It converts sequential text into a tree-shaped data structure with references; the output can be XML or JSON, and references can be linked by ID or by entity name.
As in step "3.1" of FIG. 4, for each important plot region, the plot extraction engine is called to extract the sentencing plots of the case.
As in step "3.2" of FIG. 4, the judgment result is extracted and treated as a special case plot.
As in step "4.1" of FIG. 4, all plots (sentencing plots and case plots) are merged to form the case plot tree.
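The final merge step can be pictured as building one nested structure per case. The sketch below is a minimal, assumed layout only: the field names (`case_id`, `children`, `type`, `name`, `value`), the case identifier and the sample plots are hypothetical, and JSON is used because the text names it as one possible output format.

```python
import json


def build_case_plot_tree(case_id, sentencing_plots, judgment_result):
    """Merge extracted sentencing plots and the judgment result into one tree.

    The judgment result is stored as a special plot node, as described in the text.
    """
    plot_nodes = [
        {"type": "sentencing_plot", "name": name, "value": value}
        for name, value in sentencing_plots.items()
    ]
    result_node = {
        "type": "judgment_result",
        "name": "sentence (months)",
        "value": judgment_result,
    }
    return {"case_id": case_id, "children": plot_nodes + [result_node]}


tree = build_case_plot_tree(
    "case-001",  # hypothetical identifier
    {"voluntary surrender": True, "theft amount (yuan)": 25000},
    3,
)
print(json.dumps(tree, indent=2))
```

Keeping the judgment result in the same list as the other plots lets later steps treat it as just another node when the tree is flattened for training.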
3. Joint training and prediction technology for sentencing plots and terms of sentence
Model training is carried out on the sentencing plots using machine learning and deep learning methods; the model is a commonly used model or combination of models, including but not limited to LightGBM, XGBoost, Transformer, MLP and CNN.
As in step "2.1" of FIG. 5, all case plot trees are expanded, reshaped and converted according to the input requirements of the training model. The OneHot-Label in step "2.1" of FIG. 5 refers to the plot feature labels. For example, in a case against the defendant Zhang San, the basic plots comprise three items, "sixteen to under eighteen", "armed with a gun" and "voluntary surrender", and 58 plot factors must be considered in the case; the labels of these three plots are 1 and the labels of the remaining plots are 0, and the resulting vector is used for LightGBM model training.
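The one-hot labelling just described can be sketched as follows. The five-item vocabulary is a hypothetical stand-in for the 58 plot factors, the plot names are illustrative English renderings, and the function name `one_hot` is an assumption.

```python
def one_hot(case_plots, vocabulary):
    """Return a 0/1 vector aligned with vocabulary; 1 where the case has the plot."""
    present = set(case_plots)
    unknown = present - set(vocabulary)
    if unknown:
        raise ValueError(f"plots missing from vocabulary: {sorted(unknown)}")
    return [1 if name in present else 0 for name in vocabulary]


# Hypothetical 5-item vocabulary standing in for the 58 plot factors in the text.
VOCAB = [
    "sixteen to under eighteen",
    "armed with a gun",
    "voluntary surrender",
    "recidivist",
    "burglary",
]
vector = one_hot(
    ["sixteen to under eighteen", "armed with a gun", "voluntary surrender"],
    VOCAB,
)
```

Rejecting plots that are not in the vocabulary, rather than silently dropping them, makes extraction or normalization errors visible before training.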
As in step "3.1" of FIG. 5, different inference tasks are set and prediction tasks between sentencing plots and case plots are trained; for example, inferring "fixed-term imprisonment of 3 months" from the plots "voluntary surrender, armed, entering a dwelling, theft, 25000 yuan", or inferring "mitigation" from the plots "voluntary surrender, minor".
The training model is automatically tuned with FLAML (an efficient, lightweight automated machine learning framework) to obtain the best model performance.
Multi-task joint training is used to jointly predict the sentencing plots, the sentencing results, the facts found in the referee document and the court opinions in the referee document: the sentencing result is predicted from the court opinions (seq), the facts found (seq) and the sentencing plots (label), and the sentencing plots (label) are predicted from the court opinions (seq) and the facts found (seq).
4. Model parameter feature analysis technique
For task models with a significant training effect, the network of the task model is analyzed, and the sentencing plot analysis is assisted by the parameter distribution.
As in steps "4.1" and "5.1" of FIG. 5, the tree model of the Estimator is extracted from the LightGBM regression model, and the plot features are ranked by relevance (Importance). "Plot features" here are the plot factors that increase or reduce the term of sentence, such as the feature "whether a dwelling was entered" in a theft offence.
The plots that are effective and ineffective with respect to the judgment result can then be analyzed from the feature ranking data.
As in step "6.1" of FIG. 5, a histogram is used to show the distribution over value groups of each sentencing plot during task inference.
As in step "6.2" of FIG. 5, the split points in the decision tree are extracted: feature_names, threshold, cat_boundaries and cat_threshold of the decision tree model are analyzed, and the split points of quantitative sentencing plots are determined. For example, "women (persons)" has the split points [0,3,8]; "theft amount (yuan)" has the split points [0,3000,30000,400000]; and "individual illegally absorbing public deposits (yuan)" has the split points [0,200000,1000000] in the eastern region and [0,200000,1000000] in the western region. Through split point analysis, the prevailing judgment intervals for "theft amount (yuan)" in actual sentencing, and possible differences in value judgment between regions, can be obtained from massive numbers of referee documents, thereby assisting the formation or revision of judicial interpretations and the formation of standardized sentencing guidelines for new offences.
As shown in FIG. 6, for the crime of causing a traffic accident, the number of deaths has a numerical cut point at 2.5 (i.e., <= 2.5), so 3 persons is the split point at which the sentence begins to change qualitatively. FIG. 6 visualizes the splitting of feature values in the decision tree: if a value is less than or equal to the split value, evaluation continues down the left branch; if it is greater, evaluation continues according to the rule on the right branch. In actual traffic accident cases the number of deaths may range from 0 to 100, but beyond a certain number of deaths the sentence is markedly aggravated, i.e., quantitative change becomes qualitative change, and the point of that change, namely the split point, can be observed through the visualization. "Understanding" at the top right of FIG. 6 is a plot name, meaning whether the offender obtained the understanding (forgiveness) of the victim or the victim's family; if "understanding" was obtained, the sentence is mitigated as appropriate. Similarly, for economic crimes, the crime amount is a continuous range of values that can be divided into intervals by the split points to observe qualitative changes.
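To show how split points can be read out of a trained tree, the sketch below walks a nested tree dictionary and collects the numeric thresholds used for each feature. The key names (`split_feature`, `threshold`, `left_child`, `right_child`, `leaf_value`) are an assumption modelled on the nested structure returned by LightGBM's `Booster.dump_model()` and should be checked against the library version in use; the toy tree is hand-written for illustration and not the output of a trained model.

```python
from collections import defaultdict


def collect_split_points(node, feature_names, out=None):
    """Walk a nested tree dict and gather the thresholds used per feature name."""
    if out is None:
        out = defaultdict(set)
    if "split_feature" in node:  # internal node; leaves carry no split
        name = feature_names[node["split_feature"]]
        out[name].add(node["threshold"])
        collect_split_points(node["left_child"], feature_names, out)
        collect_split_points(node["right_child"], feature_names, out)
    return out


# Hand-written toy tree; key names are assumed, values are made up.
feature_names = ["number of deaths", "understanding obtained"]
toy_tree = {
    "split_feature": 0,
    "threshold": 2.5,
    "left_child": {
        "split_feature": 1,
        "threshold": 0.5,
        "left_child": {"leaf_value": 30.0},
        "right_child": {"leaf_value": 18.0},
    },
    "right_child": {"leaf_value": 60.0},
}
raw = collect_split_points(toy_tree, feature_names)
splits = {name: sorted(values) for name, values in raw.items()}
```

For an ensemble, the same walk would be repeated over every tree and the per-feature sets merged, after which the sorted thresholds give candidate interval boundaries for each quantitative plot.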
The above method steps are not limited to machine learning models with decision tree features such as LightGBM and XGBoost, but also include other types of deep learning models that can generate feature value splits or groups of candidate features.
5. Sentencing plot analysis technology
Sentencing plot analysis mainly covers the distribution of sentencing plots, the relation between sentencing plots and judgment results, between sentencing plots and the basic circumstances of the parties, and between sentencing plots and socio-economic conditions, as well as changes in the number of cases for each sentencing plot over the years.
The comparative analysis of sentencing plots is carried out by the following steps:
As in step "2.1" of FIG. 7, the degree of correlation between the sentencing plots and the judgment results is analyzed for each year, and the fluctuation of the significance of each sentencing plot over the years is observed, reflecting the trend of the sentencing plots during socio-economic development and their changing influence on the term of sentence.
As in step "2.2" of FIG. 7, the sentencing plot-recidivist model is analyzed, and the effect on the sentencing result of the penalty increase brought by recidivism and of the length of the interval before re-offending is analyzed; for example, recidivism is decomposed into re-offending within five years, within two years and within one year, to determine whether an obvious deterrent effect is obtained.
As in steps "2.3" and "3.1" of FIG. 7, the occurrence rate of the sentencing plots over the years is analyzed, and the effect of the sentencing plots in social governance is observed and summarized; for example, "active compensation" is decomposed into "full active compensation", "majority active compensation" and "partial active compensation", where the proportion of "full active compensation" increases year by year.
The standardization of sentencing plot name expressions is analyzed by using the sentencing plot map. For example, "defrauding disaster relief, emergency rescue, flood prevention, poverty relief and medical funds", "defrauding disaster relief, emergency rescue, flood prevention, poverty relief, immigration and medical funds" and "defrauding disaster relief, emergency rescue, flood prevention, poverty relief, immigration, epidemic prevention and medical funds" can be consolidated into "emergency rescue, disaster relief, relief, epidemic prevention, flood prevention, preferential relief, poverty relief, immigration and medical funds", improving the standardized expressions used by the provinces.
As in step "4.1" of FIG. 7, a comprehensive analysis report on the sentencing plots is formed.
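The year-by-year occurrence analysis can be illustrated with a small counting routine that computes, for each year, the share of records carrying a given plot. The record layout, the sample data and the function name `yearly_share` are hypothetical.

```python
from collections import Counter


def yearly_share(records, target):
    """records: iterable of (year, plot_name) -> {year: share of target that year}."""
    totals, hits = Counter(), Counter()
    for year, name in records:
        totals[year] += 1
        if name == target:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Made-up records for two years, one compensation plot per case.
records = [
    (2019, "full active compensation"), (2019, "partial active compensation"),
    (2019, "partial active compensation"), (2019, "majority active compensation"),
    (2020, "full active compensation"), (2020, "full active compensation"),
    (2020, "partial active compensation"), (2020, "majority active compensation"),
]
shares = yearly_share(records, "full active compensation")
```

In this toy data the share of "full active compensation" rises from one year to the next, which is the kind of trend the analysis report is meant to surface.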
In the invention, the candidate plot names may also be extracted by concept and entity extraction methods from natural language processing.
In the invention, the joint multi-task training model may also adopt other implementations, such as a distributed training model.
In the invention, the analysis of sentencing plot features may also use other scientific methods of correlation analysis and causal analysis.
Based on the same inventive concept, another embodiment of the present invention provides a sentencing plot decomposition analysis system, comprising:
a sentencing plot map construction module, used for obtaining original plot names from the criminal law and sentencing standardization documents and constructing a sentencing plot map;
a case plot tree construction module, used for constructing a case plot tree based on the referee document and the constructed sentencing plot map;
a joint training module, used for carrying out model training on the sentencing plots based on the case plot trees, and training prediction tasks between the sentencing plots and between the sentencing plots and the sentencing results;
and a sentencing plot analysis module, used for assisting sentencing plot analysis by utilizing the parameter distribution of the trained model.
For the specific implementation process of each module, refer to the description of the method of the invention.
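The data flow between the four modules can be sketched with toy stand-ins. Everything below is hypothetical: the function names, the `CaseTree` structure and, in particular, the "training" step, which merely averages sentence lengths per plot and is not the multi-task joint training of the invention.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CaseTree:
    plots: list          # standard sentencing plots found in the case
    result_months: int   # sentencing result, treated as a special plot


def build_plot_map(original_names):
    """Module 1 (toy): map each original plot name to a standard name."""
    return {name: name.strip().lower() for name in original_names}


def build_case_tree(document, plot_map, result_months):
    """Module 2 (toy): keep the standard plots whose original name occurs in the text."""
    found = sorted({std for orig, std in plot_map.items() if orig in document})
    return CaseTree(plots=found, result_months=result_months)


def train(trees):
    """Module 3 (toy stand-in for joint training): mean sentence length per plot."""
    totals, counts = Counter(), Counter()
    for tree in trees:
        for plot in tree.plots:
            totals[plot] += tree.result_months
            counts[plot] += 1
    return {plot: totals[plot] / counts[plot] for plot in counts}


def analyse(params):
    """Module 4 (toy): rank plots by the learned parameter, largest first."""
    return sorted(params, key=params.get, reverse=True)


plot_map = build_plot_map(["Recidivism", "Surrender"])
trees = [
    build_case_tree("Recidivism found.", plot_map, 36),
    build_case_tree("Surrender and Recidivism.", plot_map, 24),
    build_case_tree("Surrender.", plot_map, 6),
]
params = train(trees)
ranking = analyse(params)
```

The point of the sketch is only the hand-off: the plot map feeds the case plot trees, the trees feed training, and the trained parameters feed the analysis.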
Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the inventive method.
Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program, which when executed by a computer, performs the steps of the inventive method.
Portions of the invention not described in detail are within the knowledge of those skilled in the art.
The foregoing describes the preferred embodiments of the invention. It is to be understood that the invention is not limited to the precise forms disclosed herein, and that various other combinations, modifications and environments may be used within the scope of the inventive concept disclosed herein, whether as described above or as apparent to those skilled in the relevant art. Modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A sentencing plot decomposition analysis method, comprising the steps of:
obtaining original plot names from the criminal law and sentencing standardization documents, and constructing a sentencing plot map;
constructing a case plot tree based on the referee document and the constructed sentencing plot map;
carrying out model training on the sentencing plots based on the case plot trees, and training prediction tasks between the sentencing plots and between the sentencing plots and the sentencing results;
and using the parameter distribution of the trained model to assist sentencing plot analysis.
2. The method according to claim 1, wherein said obtaining original plot names from the criminal law and sentencing standardization documents and constructing a sentencing plot map comprises:
obtaining original plot names from the criminal law and the sentencing standardization documents;
normalizing the original plot names to form candidate standard plot names;
acquiring the association relations among the candidate standard plot names, and forming an undirected graph of candidate standard plot names, with the candidate standard plot names as vertices and the relation categories as edges;
clustering the undirected graph of candidate standard plot names, and generating recommended candidate standard plot names from each cluster center;
manually confirming the recommended candidate standard plot names to form standard plot names;
arranging the standard plot names that have superordinate-subordinate (upper-lower) relations into a tree, to form standard plot names of a tree structure;
and associating the original plot names with the standard plot names of the tree structure to form the sentencing plot map.
3. The method of claim 2, wherein acquiring the association relations among the candidate standard plot names comprises:
calculating the similarity between candidate standard plot names, treating plot names whose similarity exceeds a set threshold as quasi-aligned, and forming a quasi-alignment relation between them;
and acquiring the association relations among the candidate standard plot names according to their positions in the paragraphs and sentences of the original sentencing standardization documents, the association relations comprising a front-back co-occurrence relation, a parallel relation under the same constraint, a reference relation and a child-parent relation.
4. The method according to claim 1, wherein said constructing a case plot tree based on the referee document and the constructed sentencing plot map comprises:
preprocessing the referee document, calling a structuring engine, and identifying the important plot areas in the referee document;
for each important plot area, calling a plot extraction engine, which uses the knowledge in the sentencing plot map, to extract the sentencing plots of the case;
extracting the judgment result, and taking the judgment result as a special case plot;
and combining all the sentencing plots and case plots to form the case plot tree.
5. The method according to claim 1, wherein the model training on the sentencing plots is carried out by machine learning and deep learning methods, and the training process comprises the following steps:
expanding, deforming and converting all case plot trees according to the input requirements of the training model;
adopting multi-task joint training to jointly predict over the sentencing plots, the sentencing results, the facts found in the referee document and the court opinions in the referee document, including: predicting the sentencing result from the court opinions and the found facts in the referee document, and predicting the sentencing plots from the court opinions and the found facts in the referee document.
6. The method according to claim 1, wherein said using the parameter distribution of the trained model to assist sentencing plot analysis comprises:
ranking the plot features by relevance;
analyzing, from the relevance-ranked data, which plots are effective and which are ineffective with respect to the judgment result;
using histograms to display the grouped value distribution of each sentencing plot during task inference;
and determining split points for quantitative sentencing plots, and obtaining, through split-point analysis over massive referee documents, the intervals actually applied by judges in their judgments, so as to assist the formation or correction of judicial interpretations and the formation of sentencing standardization guidance opinions for new crimes.
7. The method according to claim 6, wherein said using the parameter distribution of the trained model to assist sentencing plot analysis further comprises:
analyzing the degree of correlation between the sentencing plots and the judgment results for each year, and observing and analyzing how the importance of each sentencing plot fluctuates over the years, so as to reflect the changing trend of the sentencing plots in the course of social and economic development and their changing influence on the sentence term;
analyzing a sentencing plot-recidivism model, using the interval between a prior offence and the re-offence to analyze the punitive effect of the sentencing result, so as to judge whether an obvious deterrent effect is obtained;
analyzing the occurrence rate of the sentencing plots over the years, and observing and summarizing the role of the sentencing plots in the social governance process;
and analyzing the standardization of sentencing plot name expressions by using the sentencing plot map.
8. A sentencing plot decomposition analysis system, comprising:
a sentencing plot map construction module, used for obtaining original plot names from the criminal law and sentencing standardization documents and constructing a sentencing plot map;
a case plot tree construction module, used for constructing a case plot tree based on the referee document and the constructed sentencing plot map;
a joint training module, used for carrying out model training on the sentencing plots based on the case plot trees, and training prediction tasks between the sentencing plots and between the sentencing plots and the sentencing results;
and a sentencing plot analysis module, used for assisting sentencing plot analysis by utilizing the parameter distribution of the trained model.
9. An electronic apparatus, comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a computer, implements the method of any one of claims 1 to 7.
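The graph-construction and clustering steps recited in claims 2 and 3 can be sketched as follows. This is a simplified illustration under several assumptions: string similarity is measured with Python's `difflib.SequenceMatcher`, clustering is approximated by connected components of the quasi-alignment graph, and the shortest name in each component stands in for the cluster center. None of these choices is specified in the claims, and the plot names and the 0.75 threshold are invented.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def build_graph(names, threshold):
    """Undirected graph: an edge joins two names whose similarity exceeds the threshold."""
    graph = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if similarity(a, b) > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group vertices into connected components (a stand-in for clustering)."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical candidate standard plot names.
names = [
    "voluntary surrender",
    "voluntary surrender to police",
    "meritorious service",
    "major meritorious service",
]
graph = build_graph(names, threshold=0.75)
clusters = connected_components(graph)
# Recommend the shortest member of each cluster for manual confirmation.
recommended = [min(cluster, key=len) for cluster in clusters]
```

The recommended names would then go to the manual confirmation step; only the quasi-alignment relation is modelled here, not the positional relations (co-occurrence, parallel, reference, child-parent).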
CN202211026955.5A 2022-08-25 2022-08-25 Criminal investigation plot decomposition analysis method and system Active CN115687632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211026955.5A CN115687632B (en) 2022-08-25 2022-08-25 Criminal investigation plot decomposition analysis method and system

Publications (2)

Publication Number Publication Date
CN115687632A true CN115687632A (en) 2023-02-03
CN115687632B CN115687632B (en) 2024-04-09

Family

ID=85061442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211026955.5A Active CN115687632B (en) 2022-08-25 2022-08-25 Criminal investigation plot decomposition analysis method and system

Country Status (1)

Country Link
CN (1) CN115687632B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120191502A1 (en) * 2011-01-20 2012-07-26 John Nicholas Gross System & Method For Analyzing & Predicting Behavior Of An Organization & Personnel
US20130297540A1 (en) * 2012-05-01 2013-11-07 Robert Hickok Systems, methods and computer-readable media for generating judicial prediction information
KR20150039357A (en) * 2013-10-02 2015-04-10 김성인 An expert system for determining penalty and operating method thereof
AU2018101652A4 (en) * 2018-11-04 2018-12-06 GU, Ming MR Criminal Case Intelligent Management and Processing System (CCIMPS)
CN111104798A (en) * 2018-10-27 2020-05-05 北京智慧正安科技有限公司 Analysis method, system and computer readable storage medium for criminal plot in legal document
CN111126057A (en) * 2019-12-09 2020-05-08 航天科工网络信息发展有限公司 Case plot accurate criminal measuring system of hierarchical neural network
CN111428466A (en) * 2018-12-24 2020-07-17 北京国双科技有限公司 Legal document analysis method and device
CN112100212A (en) * 2020-09-04 2020-12-18 中国航天科工集团第二研究院 Case scenario extraction method based on machine learning and rule matching
CN112732865A (en) * 2020-12-29 2021-04-30 长春市把手科技有限公司 Method and device for measuring and calculating criminal period influence ratio of criminal case plots
CN112948571A (en) * 2019-12-11 2021-06-11 中国司法大数据研究院有限公司 Calendar trial case association method and device based on referee document, electronic equipment and computer readable medium
CN113239130A (en) * 2021-06-18 2021-08-10 广东博维创远科技有限公司 Criminal judicial literature-based knowledge graph construction method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN115687632B (en) 2024-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant