CN110990560B - Judicial data processing method and system - Google Patents


Info

Publication number
CN110990560B
Authority
CN
China
Prior art keywords
network model
vector
processing
word
legal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811156961.6A
Other languages
Chinese (zh)
Other versions
CN110990560A (en)
Inventor
戴威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201811156961.6A priority Critical patent/CN110990560B/en
Publication of CN110990560A publication Critical patent/CN110990560A/en
Application granted granted Critical
Publication of CN110990560B publication Critical patent/CN110990560B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a judicial data processing method and system. The method comprises: obtaining case information of a pending case whose judicial data are to be predicted; performing word segmentation on the text information to obtain document word-segmentation data; and processing the document word-segmentation data through a pre-trained network model to obtain the predicted charges and legal provisions corresponding to the pending case. Because the pre-trained network model processes the document word-segmentation data and outputs the charges and legal provisions corresponding to the pending case, judicial staff are assisted in predicting the charges and applicable legal provisions of a case.

Description

Judicial data processing method and system
Technical Field
The invention relates to the technical field of data processing, and in particular to a method and system for assisted charge and legal-provision recommendation.
Background
With the development of modern society, law has become one of the products of the evolution of civilized society. Law generally refers to a special code of conduct that is formulated or recognized by the state through legislation and guaranteed by state coercive force, that defines the rights and obligations of parties, and that is generally binding on all members of society. When disputes arise among members of society, judicial authorities adjudicate them according to the law.
After a case is filed and during adjudication, judicial staff render a final judgment based on the legal provisions applicable to the specific facts. In the prior art, judges and lawyers read and understand legal texts and case analyses in order to determine the charges against the defendants, retrieve the legal provisions relevant to the case, determine the length of the defendants' sentences, and so on.
However, when a case involves many legal provisions, the workload of the judicial staff in handling the case is heavy and the efficiency is low.
Disclosure of Invention
In view of the above, the embodiments of the invention provide a method and system for assisted charge and legal-provision recommendation, which build a model by fusing a convolutional network and a capsule network, and achieve assisted charge and legal-provision recommendation through the model's predictions.
In order to achieve the above purpose, the present invention provides the following technical solutions:
the first aspect of the invention discloses a judicial data processing method, which comprises the following steps:
acquiring case information of a pending case whose judicial data are to be predicted, wherein the judicial data comprise charges and legal provisions, and the case information comprises text information of the pending case;
performing word segmentation on the text information to obtain document word-segmentation data;
and processing the document word-segmentation data through a pre-trained network model to obtain the predicted charges and legal provisions corresponding to the pending case, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
Preferably, the network model is obtained by fusion in the following manner:
acquiring published judicial documents as training texts, and performing word-vector training on the training texts to obtain a word vector model, wherein the training texts comprise case information and the adjudicated charges and legal provisions;
taking the word vector model as the input layer of a neural network model, and taking the TextCNN convolutional network and the capsule network in parallel as the second layer of the neural network model, to construct the neural network model;
and training the neural network model on the training texts, and taking the neural network model whose iterations reach a preset number or whose training rounds reach a specified number as the network model.
Preferably, processing the document word-segmentation data through the pre-trained network model to obtain the predicted charges and legal provisions corresponding to the pending case comprises:
mapping the document word-segmentation data into a word vector model for word-vector processing to obtain word vectors, wherein the word vector model is the input layer of the network model, and the word vectors comprise charge feature vectors representing charges and legal-provision feature vectors representing legal provisions;
inputting the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and aggregating the output vectors of the TextCNN convolutional network and the capsule network model to obtain an aggregate vector;
and connecting the aggregate vector, through the fully connected layer of the network model, to pre-established charge class labels and legal-provision class labels, obtaining from the aggregate vector the charge regression results corresponding to the charge class labels and the legal-provision regression results corresponding to the legal-provision class labels, and taking the charge and legal-provision regression results as the predicted judicial data.
Preferably, inputting the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing and aggregating their output vectors to obtain the aggregate vector comprises:
inputting the word vectors into the TextCNN convolutional network model for processing to obtain a first vector with a first number of dimensions;
inputting the word vectors into the capsule network model for processing to obtain a second vector with a second number of dimensions;
and aggregating the first vector and the second vector to obtain the aggregate vector, wherein the number of dimensions of the aggregate vector is the sum of the first number of dimensions and the second number of dimensions.
A second aspect of the present invention discloses a judicial data processing system, comprising:
the system comprises an acquisition unit, a storage unit and a processing unit, wherein the acquisition unit is used for acquiring case information of a to-be-handled case of judicial data to be predicted, the judicial data comprise crime names and laws, and the case information comprises text information of the to-be-handled case;
the word segmentation unit is used for carrying out word segmentation processing on the text information to obtain document word segmentation data;
the prediction unit is used for processing the text word segmentation data through a pre-trained network model to obtain a criminal name and a legal prediction result corresponding to the to-be-handled case, and the network model is obtained by fusing a textCNN convolutional network, a capsule network model and a neural network model.
Preferably, the system further comprises a network model generation unit, which includes:
a word vector training module, configured to acquire published judicial documents as training texts and perform word-vector training on the training texts to obtain a word vector model, wherein the training texts comprise case information and the adjudicated charges and legal provisions;
a fusion module, configured to take the word vector model as the input layer of a neural network model, and take the TextCNN convolutional network and the capsule network in parallel as the second layer of the neural network model, to construct the neural network model;
a training module, configured to train the neural network model on the training texts, and take the neural network model whose iterations reach a preset number or whose training rounds reach a specified number as the network model.
Preferably, the prediction unit includes:
a word vector processing module, configured to map the document word-segmentation data into a word vector model for word-vector processing to obtain word vectors, wherein the word vector model is the input layer of the network model, and the word vectors comprise charge feature vectors representing charges and legal-provision feature vectors representing legal provisions;
a processing module, configured to input the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and aggregate the output vectors of the TextCNN convolutional network and the capsule network model to obtain an aggregate vector;
an output module, configured to connect the aggregate vector, through the fully connected layer of the network model, to pre-established charge class labels and legal-provision class labels, obtain from the aggregate vector the charge regression results corresponding to the charge class labels and the legal-provision regression results corresponding to the legal-provision class labels, and take the charge and legal-provision regression results as the predicted judicial data.
Preferably, the processing module includes:
a first processing submodule, configured to input the word vectors into the TextCNN convolutional network model for processing to obtain a first vector with a first number of dimensions;
a second processing submodule, configured to input the word vectors into the capsule network model for processing to obtain a second vector with a second number of dimensions;
a vector aggregation module, configured to aggregate the first vector and the second vector to obtain the aggregate vector, wherein the number of dimensions of the aggregate vector is the sum of the first number of dimensions and the second number of dimensions.
A third aspect of the present invention discloses a storage medium including a stored program, wherein the program, when executed, controls a device in which the storage medium is located to execute the judicial data processing method disclosed in the first aspect of the present invention.
A fourth aspect of the present invention discloses a processor for running a program, wherein the program when run performs the judicial data processing method disclosed in the first aspect of the present invention.
According to the judicial data processing method and system provided by the invention, the method comprises: obtaining case information of a pending case whose judicial data are to be predicted; performing word segmentation on the text information to obtain document word-segmentation data; and processing the document word-segmentation data through a pre-trained network model to obtain the predicted charges and legal provisions corresponding to the pending case. Because the pre-trained network model processes the document word-segmentation data and outputs the charges and legal provisions corresponding to the pending case, judicial staff are assisted in predicting the charges and applicable legal provisions of cases.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required by the embodiments or by the description of the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a judicial data processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another judicial data processing method according to an embodiment of the present invention;
FIG. 3 is a flowchart of another judicial data processing method according to an embodiment of the present invention;
FIG. 4 is a flowchart of another judicial data processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a judicial data processing system according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another judicial data processing system according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another judicial data processing system according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of another judicial data processing system according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In this application, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
The embodiment of the invention provides a judicial data processing method, referring to fig. 1, the method at least comprises the following steps:
step S101: obtaining case information of to-be-handled cases for which judicial data are required to be predicted, wherein the judicial data comprise crime names and legal strips, and the case information comprises text information of the to-be-handled cases.
In the process of embodying step S101, judicial data includes, but is not limited to, crime names and laws. The case information includes, but is not limited to, text information of the transaction case.
Step S102: and performing word segmentation processing on the text information to obtain document word segmentation data.
It should be noted that the text information here is the fact-description paragraph in the case information of the pending case, and the fact-description paragraph includes: the principal criminal facts, descriptions of the crimes, and other case-related content such as the court's findings and circumstances of voluntary surrender.
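Word segmentation (step S102) is not tied to any particular segmenter by this method. As an illustration only, it can be sketched as a dictionary-based forward-maximum-matching pass; the toy legal-term dictionary and function below are hypothetical, and a production system would more likely use an off-the-shelf segmenter with a legal-domain dictionary:

```python
# Toy forward-maximum-matching word segmenter (illustrative only; a real
# system would use a mature Chinese segmenter with a legal-domain lexicon).
# The dictionary entries below are hypothetical examples.
LEGAL_DICT = {"盗窃", "抢劫", "被告人", "有期徒刑", "罪"}
MAX_WORD_LEN = 4  # length of the longest dictionary entry, in characters

def segment(text: str) -> list[str]:
    """Greedily match the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first, falling back to a single character.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in LEGAL_DICT:
                tokens.append(candidate)
                i += length
                break
    return tokens

print(segment("被告人盗窃罪"))  # -> ['被告人', '盗窃', '罪']
```

The output of such a pass is the "document word-segmentation data" that the later steps consume.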
Step S103: and processing the text word segmentation data through a pre-trained network model to obtain a criminal name and a legal prediction result corresponding to the to-be-handled case.
In the process of executing step S103, as shown in fig. 2, the method specifically includes the following steps:
step S201: and mapping the text word segmentation data into a word vector model to perform word vector processing to obtain a word vector, wherein the word vector model is an input layer of the network model, and the word vector comprises a criminal name feature vector for representing a criminal name and a normal feature vector for representing a normal.
In step S201, the processing procedure of the word vector model on text word segmentation data is that the text word segmentation data is mapped into a space of 100 dimensions, and word vectors are obtained by representing the similarity between words. The dimension value here is generally 50 to 250, and 100 dimensions are preferable here, as the case may be.
In addition, the word vector model comprises low-frequency long-tail words appearing in the corpus, and the low-frequency long-tail times have unique word vector expression in the word vector model.
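The lookup behavior described above can be sketched as follows. This is a toy stand-in for a trained embedding model: the vectors here are seeded pseudo-randomly per word rather than learned from judgment documents, and serve only to show that every word, including a low-frequency long-tail word, maps to its own fixed 100-dimensional vector:

```python
import random

EMBED_DIM = 100  # preferred dimension here; 50-250 is the typical range

def word_vector(word: str, dim: int = EMBED_DIM) -> list[float]:
    """Deterministic per-word vector: a toy stand-in for a trained embedding.

    In the actual method the vectors come from Word2vec/GloVe training on
    published judgment documents; here each word merely seeds a PRNG, so
    every word - including rare long-tail words - maps to its own fixed
    vector of the chosen dimension.
    """
    rng = random.Random(word)  # same word -> same seed -> same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

v1 = word_vector("盗窃")
assert len(v1) == 100                        # fixed dimensionality
assert v1 == word_vector("盗窃")             # deterministic lookup
assert v1 != word_vector("抢劫")             # distinct words, distinct vectors
```

A trained model additionally places similar words near each other, which the seeded stand-in of course does not.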
Step S202: and respectively inputting the word vectors into a textCNN convolutional network and a capsule network model for processing, and collecting output vectors of the textCNN convolutional network and the capsule network model to obtain a collection vector.
In the process of executing step S202, as shown in fig. 3, the specific process includes the following steps:
step S301: and inputting the word vector into a textCNN convolutional network model for processing to obtain a first vector with a first dimension number.
It should be noted that the first of the first vectors is only for distinguishing between the two different vectors.
Step S302: and inputting the word vector into a capsule network model for processing to obtain a second vector with a second dimension number.
Step S303: and collecting the first vector and the second vector to obtain a collection vector, wherein the number of dimensions of the collection vector is the sum of the number of the first dimensions and the number of the second dimensions.
To clearly describe how the word vectors are processed by the TextCNN convolutional network and the capsule network model in step S202, an example is given below.
For example, suppose the TextCNN convolutional network has convolution kernels of sizes 1 to 5 with 256 channels each, and the capsule network model contains 5 neuron units, each with an output dimension of 256. After a document is processed by the word vector model, 100-dimensional word vectors are obtained, and these are fed along two paths. One path first converts the 100-dimensional word vectors into 256-dimensional vectors through a fully connected layer and then passes them through the TextCNN convolutional network, which outputs a 1280-dimensional vector. The other path processes the 100-dimensional word vectors through the 5 neuron units of the capsule network model, which also outputs a 1280-dimensional vector.
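The dimension bookkeeping in this example can be checked with a short sketch (shapes only, no actual convolution or capsule routing; the aggregation follows step S303, where the aggregate dimension is the sum of the two branch dimensions):

```python
# Shape bookkeeping for the two-branch fusion described above.
# TextCNN branch: kernel sizes 1..5, 256 channels each; pooling over
# positions yields one 256-dim vector per kernel size, concatenated.
KERNEL_SIZES = [1, 2, 3, 4, 5]
CHANNELS = 256
textcnn_out_dim = len(KERNEL_SIZES) * CHANNELS    # 5 * 256 = 1280

# Capsule branch: 5 neuron (capsule) units, each emitting a 256-dim vector.
NUM_CAPSULES = 5
CAPSULE_DIM = 256
capsule_out_dim = NUM_CAPSULES * CAPSULE_DIM      # 5 * 256 = 1280

# Step S303: the aggregate vector concatenates both branch outputs, so its
# dimension is the sum of the branch dimensions.
aggregate_dim = textcnn_out_dim + capsule_out_dim
print(textcnn_out_dim, capsule_out_dim, aggregate_dim)  # 1280 1280 2560
```

The 2560-dimensional aggregate vector is what the fully connected layer of step S203 consumes.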
Step S203: and connecting the set vector to a pre-established crime class mark and a pre-established legal class mark based on a full connection layer of the network model, acquiring a crime regression result corresponding to the crime class mark and a legal regression result corresponding to the legal class mark in the set vector, and taking the crime and the legal regression result as predicted judicial data.
It should be noted that, the criminal name class mark and the legal strip class mark are multi-label and multi-object multi-classification. Multiple classifications refer to, for example: criminals have theft, robbery, dangerous driving, etc.; a multi-label refers to a piece of text that is not only assigned to one category, but there may be multiple categories or classifications. For example, a case may be both a robbery crime and a deliberate injury crime, i.e., multiple tags.
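A minimal sketch of such multi-label decoding follows. It assumes a per-label sigmoid score thresholded at 0.5, which is a common choice for multi-label heads that the disclosure does not fix explicitly; the label names and threshold are illustrative:

```python
import math

# Hypothetical charge labels; the real label set comes from the corpus.
CHARGE_LABELS = ["theft", "robbery", "dangerous driving", "intentional injury"]
THRESHOLD = 0.5  # assumed decision threshold

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def decode_charges(logits: list[float]) -> list[str]:
    """Return every label whose sigmoid score clears the threshold.

    Unlike softmax multi-class decoding, several labels can fire at once,
    which is how one case can be predicted as e.g. both robbery and
    intentional injury.
    """
    return [label for label, z in zip(CHARGE_LABELS, logits)
            if sigmoid(z) >= THRESHOLD]

# Positive logits for robbery and intentional injury -> both predicted.
print(decode_charges([-2.0, 1.5, -3.0, 0.8]))
# -> ['robbery', 'intentional injury']
```

Legal-provision labels would be decoded the same way with their own label set.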
The embodiment of the invention discloses a judicial data processing method comprising: obtaining case information of a pending case whose judicial data are to be predicted; performing word segmentation on the text information to obtain document word-segmentation data; and processing the document word-segmentation data through a pre-trained network model to obtain the predicted charges and legal provisions corresponding to the pending case. Thus, based on the pre-trained network model, the document word-segmentation data are processed and the charges and legal provisions corresponding to the pending case are output, which assists judicial staff in predicting the charges and applicable legal provisions of cases.
Based on the judicial data processing method disclosed in the above embodiment of the invention, the network model in step S103 is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model, as shown in fig. 4; the specific process is as follows:
Step S401: acquiring published judicial documents as training texts, and performing word-vector training on the training texts to obtain a word vector model, wherein the training texts comprise case information and the adjudicated charges and legal provisions.
It should be noted that the word-vector training on the training texts to obtain the word vector model may be performed with Word2vec or with GloVe, but is not limited to these methods.
In addition, the principle for selecting the word-vector dimension in step S401 is the same as that in step S201, and is not repeated here.
Step S402: and taking the word vector model as an input layer of a neural network model, and taking the textCNN convolutional network and the capsule network in parallel as a second layer of the neural network model to construct the neural network model.
Step S403: and training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training rounds reaching the specified times as a network model.
It should be noted that the neural network model is trained on the training texts through the following specific process:
First, a rule-based judgment system is used to extract the "facts ascertained upon trial" paragraphs from the training texts, i.e., the paragraphs of each document that describe the circumstances of the case in detail.
Then, the charges, legal provisions, and information such as whether the judgment involves a single crime or multiple crimes are obtained through the document parsing system.
Finally, the neural network model is trained on the documents and the parsed information, and the trained neural network model is taken as the network model.
Further, in order to obtain a better network model, the number of training passes is set before training, an initial learning rate is chosen, and the learning rate is decayed in preset steps during learning so as to optimize the learning process. When the network model has been trained the preset number of times, training stops and the desired network model is obtained. An example is given below for ease of understanding.
For example, documents to be learned are selected from the document library and input into the network model for learning. Starting from an initial learning rate of 1e-3, the learning rate is decayed to 0.65 times its current value every 25,000 training steps; this constitutes the network model's learning process. After the input documents have been trained for 15 passes, the acquisition of documents from the library for training stops.
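The decay schedule in this example can be written out directly from the stated numbers (initial rate 1e-3, multiplied by 0.65 every 25,000 training steps); this staircase form is one plausible reading of the description:

```python
# Staircase learning-rate decay: lr = 1e-3 * 0.65 ** (step // 25000)
INITIAL_LR = 1e-3
DECAY_FACTOR = 0.65
DECAY_EVERY = 25_000  # training steps between decays

def learning_rate(step: int) -> float:
    """Learning rate after a given number of training steps."""
    return INITIAL_LR * DECAY_FACTOR ** (step // DECAY_EVERY)

assert learning_rate(0) == 1e-3                          # initial rate
assert abs(learning_rate(25_000) - 6.5e-4) < 1e-12      # after 1st decay
assert abs(learning_rate(50_000) - 4.225e-4) < 1e-12    # after 2nd decay
```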
It should be noted that the training data of the network model are generally very large, ranging from hundreds of thousands to millions of samples. Because of GPU-memory limitations, one batch of data is read per training iteration, and reading one batch constitutes one training step. For example, if a batch contains 256 samples, then reading 256 samples once is one training step.
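The relationship described above between dataset size, batch size, and training steps can be sketched as:

```python
import math

def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """One training step reads one batch; a full pass over the data
    therefore needs ceil(num_samples / batch_size) steps."""
    return math.ceil(num_samples / batch_size)

# e.g. 500,000 training documents read in batches of 256:
print(steps_per_epoch(500_000, 256))  # -> 1954
```

At this scale, a 25,000-step decay interval corresponds to roughly a dozen passes over the data.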
According to the embodiment of the invention, the neural network model is trained by the disclosed judicial data processing method, and the trained neural network model can more accurately predict the charges and legal provisions of the cases handled by judicial staff.
Corresponding to the judicial data processing method provided in the embodiments of the present application, the embodiments further provide a corresponding judicial data processing system. Referring to fig. 5, the judicial data processing system disclosed in the embodiment of the present application includes:
an acquisition unit 501, configured to acquire case information of a pending case whose judicial data are to be predicted, wherein the judicial data include charges and legal provisions, and the case information includes text information of the pending case;
a word segmentation unit 502, configured to perform word segmentation on the text information to obtain document word-segmentation data;
a prediction unit 503, configured to process the document word-segmentation data through a pre-trained network model to obtain the predicted charges and legal provisions corresponding to the pending case, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
Preferably, the system further comprises a network model generation unit 601; as shown in fig. 6, the network model generation unit includes:
a word vector training module 602, configured to acquire published judicial documents as training texts and perform word-vector training on the training texts to obtain a word vector model, wherein the training texts comprise case information and the adjudicated charges and legal provisions;
a fusion module 603, configured to take the word vector model as the input layer of a neural network model, and take the TextCNN convolutional network and the capsule network in parallel as the second layer of the neural network model, to construct the neural network model;
a training module 604, configured to train the neural network model on the training texts, and take the neural network model whose iterations reach a preset number or whose training rounds reach a specified number as the network model.
Preferably, the prediction unit 503, as shown in fig. 7, includes:
a word vector processing module 701, configured to map the document word-segmentation data into a word vector model for word-vector processing to obtain word vectors, wherein the word vector model is the input layer of the network model, and the word vectors comprise charge feature vectors representing charges and legal-provision feature vectors representing legal provisions;
a processing module 702, configured to input the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and aggregate the output vectors of the TextCNN convolutional network and the capsule network model to obtain an aggregate vector;
an output module 703, configured to connect the aggregate vector, through the fully connected layer of the network model, to pre-established charge class labels and legal-provision class labels, obtain from the aggregate vector the charge regression results corresponding to the charge class labels and the legal-provision regression results corresponding to the legal-provision class labels, and take the charge and legal-provision regression results as the predicted judicial data.
Preferably, the processing module 702, as shown in fig. 8, includes:
a first processing submodule 801, configured to input the word vectors into the TextCNN convolutional network model for processing to obtain a first vector with a first number of dimensions;
a second processing submodule 802, configured to input the word vectors into the capsule network model for processing to obtain a second vector with a second number of dimensions;
a vector aggregation module 803, configured to aggregate the first vector and the second vector to obtain the aggregate vector, wherein the number of dimensions of the aggregate vector is the sum of the first number of dimensions and the second number of dimensions.
The execution principle and process of each unit in the judicial data processing system disclosed in the embodiment of the present invention are the same as those of the judicial data processing method disclosed in the embodiment of the present invention; reference may be made to the corresponding parts of the method, and redundant description is omitted here.
Based on the judicial data processing system disclosed in the embodiment of the invention, each module can be realized by a hardware device formed by a processor and a memory. Specifically: the above modules are stored in the memory as program units, and the processor executes the program units stored in the memory to implement judicial data processing.
The processor comprises a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels may be provided, and judicial data processing is implemented by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, such as random-access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
Further, an embodiment of the present invention provides a processor, where the processor is configured to run a program, and the program, when run, performs the judicial data processing method.
Further, an embodiment of the present invention provides an apparatus, including a processor, a memory, and a program stored in the memory and executable on the processor, where the processor executes the program to implement the following steps:
acquiring case information of a to-be-handled case for which judicial data is to be predicted, wherein the judicial data includes a crime name and a legal provision, and the case information includes text information of the to-be-handled case; performing word segmentation processing on the text information to obtain text word segmentation data; and processing the text word segmentation data through a pre-trained network model to obtain a crime-name and legal-provision prediction result corresponding to the to-be-handled case, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model, and a neural network model.
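The word segmentation step can be illustrated with a toy forward-maximum-matching segmenter. This is only a sketch of the idea: the patent does not name a segmentation tool, and a production system would typically use a mature segmenter such as jieba. The sample sentence and vocabulary below are hypothetical:

```python
# Toy forward-maximum-matching word segmenter: at each position, greedily
# match the longest dictionary word (falling back to a single character).
def segment(text, vocab, max_len=4):
    words, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in vocab:
                words.append(candidate)
                i += length
                break
    return words

# Hypothetical vocabulary and case text ("the defendant stole property").
vocab = {"被告人", "盗窃", "财物"}
print(segment("被告人盗窃财物", vocab))  # → ['被告人', '盗窃', '财物']
```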
The network model is obtained by fusion in the following way: acquiring published judicial documents as training text, and performing word vector training on the training text to obtain a word vector model, wherein the training text includes case information and the adjudicated crime names and legal provisions; taking the word vector model as the input layer of a neural network model, the TextCNN convolutional network as the second layer, and the capsule network as the third layer to construct the neural network model; and training the neural network model on the training text, taking the neural network model whose iteration count reaches a preset number of iterations or whose training rounds reach a specified number as the network model.
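The layered structure described above — a word-vector input layer feeding a TextCNN branch and a capsule branch whose outputs are aggregated — can be sketched as a single NumPy forward pass. All sizes are illustrative assumptions, and the capsule branch is reduced to a linear map plus the "squash" nonlinearity (dynamic routing is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the patent.
seq_len, embed_dim = 10, 8      # words per document, word-vector size
n_filters, kernel = 4, 3        # TextCNN filter count and window width
capsule_dim = 6                 # capsule-branch output size

# Input layer: word vectors for one segmented document.
embeddings = rng.standard_normal((seq_len, embed_dim))

# TextCNN branch: 1-D convolution over word windows, then max-over-time pooling.
filters = rng.standard_normal((n_filters, kernel, embed_dim))
conv = np.array([
    [np.sum(embeddings[t:t + kernel] * f) for t in range(seq_len - kernel + 1)]
    for f in filters
])
first_vector = conv.max(axis=1)                 # shape: (n_filters,)

# Capsule branch, reduced here to a linear map plus the squash nonlinearity;
# a real capsule network would also apply dynamic routing.
W = rng.standard_normal((capsule_dim, seq_len * embed_dim))
s = W @ embeddings.ravel()
norm = np.linalg.norm(s)
second_vector = (norm**2 / (1 + norm**2)) * (s / norm)

# Aggregate: concatenate the two branch outputs.
aggregate = np.concatenate([first_vector, second_vector])
assert aggregate.shape == (n_filters + capsule_dim,)
```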
Processing the text word segmentation data through the pre-trained network model to obtain the crime-name and legal-provision prediction result corresponding to the to-be-handled case comprises: mapping the text word segmentation data into the word vector model for word vector processing to obtain word vectors, wherein the word vector model is the input layer of the network model, and the word vectors include a crime-name feature vector representing a crime name and a legal-provision feature vector representing a legal provision; inputting the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and aggregating their output vectors into an aggregate vector; and connecting the aggregate vector, through a fully connected layer of the network model, to pre-established crime-name category labels and legal-provision category labels, obtaining from the aggregate vector a crime-name regression result corresponding to the crime-name category labels and a legal-provision regression result corresponding to the legal-provision category labels, and taking the crime-name and legal-provision regression results as the predicted judicial data.
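The final fully connected stage can be sketched as two linear heads over the aggregate vector: one scoring crime-name category labels and one scoring legal-provision category labels. The label counts and weights below are illustrative placeholders; in the real system the label sets enumerate actual crime names and provisions, and the weights come from training:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative placeholders, not values from the patent.
agg_dim, n_crime_labels, n_provision_labels = 10, 5, 7
aggregate_vector = rng.standard_normal(agg_dim)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Two fully connected heads share the aggregate vector: one scores the
# crime-name category labels, the other the legal-provision category labels.
W_crime = rng.standard_normal((n_crime_labels, agg_dim))
W_provision = rng.standard_normal((n_provision_labels, agg_dim))

crime_scores = softmax(W_crime @ aggregate_vector)
provision_scores = softmax(W_provision @ aggregate_vector)

predicted_crime = int(np.argmax(crime_scores))          # index of predicted crime name
predicted_provision = int(np.argmax(provision_scores))  # index of predicted provision
```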
Inputting the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and aggregating their output vectors into an aggregate vector, comprises: inputting the word vectors into the TextCNN convolutional network model for processing to obtain a first vector with a first number of dimensions; inputting the word vectors into the capsule network model for processing to obtain a second vector with a second number of dimensions; and aggregating the first vector and the second vector into an aggregate vector, where the number of dimensions of the aggregate vector is the sum of the first and second numbers of dimensions.
The client disclosed in the embodiment of the invention can be a PC, a tablet (PAD), a mobile phone, or the like.
Further, the embodiment of the invention also provides a storage medium on which a program is stored, and the program, when executed by the processor, implements the judicial data processing method.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to perform a program initialized with the following method steps:
acquiring case information of a to-be-handled case for which judicial data is to be predicted, wherein the judicial data includes a crime name and a legal provision, and the case information includes text information of the to-be-handled case; performing word segmentation processing on the text information to obtain text word segmentation data; and processing the text word segmentation data through a pre-trained network model to obtain a crime-name and legal-provision prediction result corresponding to the to-be-handled case, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model, and a neural network model.
The network model is obtained by fusion in the following way: acquiring published judicial documents as training text, and performing word vector training on the training text to obtain a word vector model, wherein the training text includes case information and the adjudicated crime names and legal provisions; taking the word vector model as the input layer of a neural network model, the TextCNN convolutional network as the second layer, and the capsule network as the third layer to construct the neural network model; and training the neural network model on the training text, taking the neural network model whose iteration count reaches a preset number of iterations or whose training rounds reach a specified number as the network model.
Processing the text word segmentation data through the pre-trained network model to obtain the crime-name and legal-provision prediction result corresponding to the to-be-handled case comprises: mapping the text word segmentation data into the word vector model for word vector processing to obtain word vectors, wherein the word vector model is the input layer of the network model, and the word vectors include a crime-name feature vector representing a crime name and a legal-provision feature vector representing a legal provision; inputting the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and aggregating their output vectors into an aggregate vector; and connecting the aggregate vector, through a fully connected layer of the network model, to pre-established crime-name category labels and legal-provision category labels, obtaining from the aggregate vector a crime-name regression result corresponding to the crime-name category labels and a legal-provision regression result corresponding to the legal-provision category labels, and taking the crime-name and legal-provision regression results as the predicted judicial data.
Inputting the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and aggregating their output vectors into an aggregate vector, comprises: inputting the word vectors into the TextCNN convolutional network model for processing to obtain a first vector with a first number of dimensions; inputting the word vectors into the capsule network model for processing to obtain a second vector with a second number of dimensions; and aggregating the first vector and the second vector into an aggregate vector, where the number of dimensions of the aggregate vector is the sum of the first and second numbers of dimensions.
According to the embodiment of the invention, a hardware device formed by a processor and a memory acquires the case information of a to-be-handled case for which judicial data is to be predicted, where the judicial data includes a crime name and a legal provision and the case information includes text information of the to-be-handled case; performs word segmentation processing on the text information to obtain text word segmentation data; and processes the text word segmentation data through a pre-trained network model to obtain the crime-name and legal-provision prediction result corresponding to the to-be-handled case, the network model being trained by fusing a TextCNN convolutional network and a capsule network model. In this way, the crime name and legal provision of a to-be-handled case can be obtained from the pre-trained network model, so that the hardware device consisting of the processor and the memory can predict the crime name and legal provision of the to-be-handled case for legal staff.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, client, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random-access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the others. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief, and reference may be made to the description of the method embodiments for relevant parts. The systems and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without undue effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A judicial data processing method, comprising:
acquiring case information of a to-be-handled case for which judicial data is to be predicted, wherein the judicial data comprises a crime name and a legal provision, and the case information comprises text information of the to-be-handled case;
word segmentation processing is carried out on the text information to obtain text word segmentation data;
processing the text word segmentation data through a pre-trained network model to obtain a crime-name and legal-provision prediction result corresponding to the to-be-handled case, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model, and a neural network model;
wherein processing the text word segmentation data through the pre-trained network model to obtain the crime-name and legal-provision prediction result corresponding to the to-be-handled case comprises:
mapping the text word segmentation data into a word vector model for word vector processing to obtain word vectors, wherein the word vector model is an input layer of the network model, and the word vectors comprise a crime-name feature vector representing a crime name and a legal-provision feature vector representing a legal provision;
inputting the word vectors into the TextCNN convolutional network model for processing to obtain a first vector with a first number of dimensions;
inputting the word vectors into the capsule network model for processing to obtain a second vector with a second number of dimensions;
aggregating the first vector and the second vector to obtain an aggregate vector, wherein the number of dimensions of the aggregate vector is the sum of the first and second numbers of dimensions; and
connecting the aggregate vector, through a fully connected layer of the network model, to pre-established crime-name category labels and legal-provision category labels, obtaining from the aggregate vector a crime-name regression result corresponding to the crime-name category labels and a legal-provision regression result corresponding to the legal-provision category labels, and taking the crime-name and legal-provision regression results as the predicted judicial data.
2. The method of claim 1, wherein the network model is fused by:
acquiring published judicial documents as training text, and performing word vector training on the training text to obtain a word vector model, wherein the training text comprises case information and the adjudicated crime names and legal provisions;
taking the word vector model as an input layer of a neural network model, and taking the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model to construct the neural network model; and
training the neural network model based on the training text, and taking the neural network model whose iteration count reaches a preset number of iterations or whose training rounds reach a specified number as the network model.
3. A judicial data processing system, comprising:
an acquisition unit, configured to acquire case information of a to-be-handled case for which judicial data is to be predicted, wherein the judicial data comprises a crime name and a legal provision, and the case information comprises text information of the to-be-handled case;
a word segmentation unit, configured to perform word segmentation processing on the text information to obtain text word segmentation data;
a prediction unit, configured to process the text word segmentation data through a pre-trained network model to obtain a crime-name and legal-provision prediction result corresponding to the to-be-handled case, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model, and a neural network model;
the prediction unit includes:
a word vector processing module, configured to map the text word segmentation data into a word vector model for word vector processing to obtain word vectors, wherein the word vector model is an input layer of the network model, and the word vectors comprise a crime-name feature vector representing a crime name and a legal-provision feature vector representing a legal provision;
a processing module, configured to input the word vectors into the TextCNN convolutional network and the capsule network model respectively for processing, and to aggregate the output vectors of the two into an aggregate vector; and
an output module, configured to connect the aggregate vector, through a fully connected layer of the network model, to pre-established crime-name category labels and legal-provision category labels, to obtain from the aggregate vector a crime-name regression result corresponding to the crime-name category labels and a legal-provision regression result corresponding to the legal-provision category labels, and to take the crime-name and legal-provision regression results as the predicted judicial data;
the processing module comprises:
a first processing sub-module, configured to input the word vectors into the TextCNN convolutional network model for processing to obtain a first vector with a first number of dimensions;
a second processing sub-module, configured to input the word vectors into the capsule network model for processing to obtain a second vector with a second number of dimensions; and
a vector superposition module, configured to aggregate the first vector and the second vector into an aggregate vector, wherein the number of dimensions of the aggregate vector is the sum of the first and second numbers of dimensions.
4. A system according to claim 3, further comprising: a network model generation unit including:
a word vector training module, configured to acquire published judicial documents as training text and perform word vector training on the training text to obtain a word vector model, wherein the training text comprises case information and the adjudicated crime names and legal provisions;
a fusion module, configured to take the word vector model as an input layer of a neural network model, and to take the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model to construct the neural network model; and
a training module, configured to train the neural network model based on the training text, and to take the neural network model whose iteration count reaches a preset number of iterations or whose training rounds reach a specified number as the network model.
5. A storage medium comprising a stored program, wherein the program, when run, controls a device in which the storage medium is located to perform the judicial data processing method according to any one of claims 1-2.
6. A processor for running a program, wherein the program when run performs the judicial data processing method according to any of claims 1-2.
CN201811156961.6A 2018-09-30 2018-09-30 Judicial data processing method and system Active CN110990560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811156961.6A CN110990560B (en) 2018-09-30 2018-09-30 Judicial data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811156961.6A CN110990560B (en) 2018-09-30 2018-09-30 Judicial data processing method and system

Publications (2)

Publication Number Publication Date
CN110990560A CN110990560A (en) 2020-04-10
CN110990560B true CN110990560B (en) 2023-07-07

Family

ID=70059786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811156961.6A Active CN110990560B (en) 2018-09-30 2018-09-30 Judicial data processing method and system

Country Status (1)

Country Link
CN (1) CN110990560B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552808A (en) * 2020-04-20 2020-08-18 北京北大软件工程股份有限公司 Administrative illegal case law prediction method and tool based on convolutional neural network
CN113626557A (en) * 2021-05-17 2021-11-09 四川大学 Intelligent law enforcement recommendation auxiliary system based on element labeling and BERT and RCNN algorithms
CN113065005B (en) * 2021-05-19 2024-01-09 南京烽火星空通信发展有限公司 Legal provision recommendation method based on knowledge graph and text classification model
CN113360657B (en) * 2021-06-30 2023-10-24 安徽商信政通信息技术股份有限公司 Intelligent document distribution handling method and device and computer equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106952193A (en) * 2017-03-23 2017-07-14 北京华宇信息技术有限公司 A kind of criminal case aid decision-making method based on fuzzy depth belief network
US9754219B1 (en) * 2016-07-01 2017-09-05 Intraspexion Inc. Using classified text and deep learning algorithms to identify entertainment risk and provide early warning
CN107609009A (en) * 2017-07-26 2018-01-19 北京大学深圳研究院 Text emotion analysis method, device, storage medium and computer equipment
CN107818138A (en) * 2017-09-28 2018-03-20 银江股份有限公司 A kind of case legal regulation recommends method and system
CN108133436A (en) * 2017-11-23 2018-06-08 科大讯飞股份有限公司 Automatic method and system of deciding a case


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Exploration of capsule networks based on dynamic routing for text classification; maomao2017; 《https://blog.csdn.net/sumiyou8385/article/details/80045058?》; 2018-04-23; pp. 1-9 *

Also Published As

Publication number Publication date
CN110990560A (en) 2020-04-10

Similar Documents

Publication Publication Date Title
CN110990560B (en) Judicial data processing method and system
Žliobaitė Measuring discrimination in algorithmic decision making
Bobadilla et al. A collaborative filtering approach to mitigate the new user cold start problem
WO2018204701A1 (en) Systems and methods for providing machine learning model explainability information
KR20200039852A (en) Method for analysis of business management system providing machine learning algorithm for predictive modeling
Corlosquet-Habart et al. Big data for insurance companies
Rosati et al. A novel deep ordinal classification approach for aesthetic quality control classification
CN112015896A (en) Emotion classification method and device based on artificial intelligence
Nirav Shah et al. A systematic literature review and existing challenges toward fake news detection models
Asim et al. An adaptive model for identification of influential bloggers based on case-based reasoning using random forest
CN111709225A (en) Event cause and effect relationship judging method and device and computer readable storage medium
CN110969549B (en) Judicial data processing method and system
CN111143665A (en) Fraud qualitative method, device and equipment
Subhiksha et al. Prediction of phone prices using machine learning techniques
CN116467523A (en) News recommendation method, device, electronic equipment and computer readable storage medium
CN110163470B (en) Event evaluation method and device
CN111191007A (en) Article keyword filtering method and device based on block chain and medium
Chopra et al. An adaptive RNN algorithm to detect shilling attacks for online products in hybrid recommender system
Shyr et al. Automated data analysis
Hemanth et al. Intelligent Data Communication Technologies and Internet of Things: ICICI 2019
CN113221762A (en) Cost balance decision method, insurance claim settlement decision method, device and equipment
CN110990522B (en) Legal document determining method and system
CN114020757A (en) Scenic spot passenger dishonest list updating method based on block chain and related device
CN113837836A (en) Model recommendation method, device, equipment and storage medium
Janev Chapter 1 Ecosystem of Big Data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant