CN110990560A - Judicial data processing method and system - Google Patents

Judicial data processing method and system

Info

Publication number
CN110990560A
CN110990560A (application CN201811156961.6A)
Authority
CN
China
Prior art keywords
network model
vector
word
processing
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811156961.6A
Other languages
Chinese (zh)
Other versions
CN110990560B (en)
Inventor
戴威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201811156961.6A priority Critical patent/CN110990560B/en
Publication of CN110990560A publication Critical patent/CN110990560A/en
Application granted granted Critical
Publication of CN110990560B publication Critical patent/CN110990560B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services; Handling legal documents
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a judicial data processing method and system. The method comprises: acquiring case information of a case to be handled for which judicial data need to be predicted; performing word segmentation on the text information to obtain document word-segmentation data; and processing the document word-segmentation data with a pre-trained network model to obtain the crime-name and legal-provision prediction results for the case to be handled. Because the document word-segmentation data are processed by a pre-trained network model and the crime names and legal provisions corresponding to the case to be handled are output, the method helps legal staff predict the crime name and the applicable legal provisions of a case.

Description

Judicial data processing method and system
Technical Field
The invention relates to the technical field of data processing, and in particular to a method and a system for assisted conviction and legal-provision recommendation.
Background
With the development of modern society, law has become one of the products of civilization. Law is generally a body of rules of conduct enacted or recognized by the state, binding on all members of society and guaranteed by state coercion, whose content defines the rights and obligations of the parties. When disputes arise among members of society, the judicial authorities file and adjudicate cases according to law.
In the process of filing and adjudicating a case, legal staff reach a final judgment based on the laws applicable to the specific facts. In the prior art, the legal provisions applicable to a case, the defendant's term of imprisonment and the like are determined by judges and lawyers through reading and understanding legal texts and analyzing the case.
However, when a case involves many legal provisions, handling it imposes a heavy workload on legal staff and is inefficient.
Disclosure of Invention
In view of this, the embodiment of the present invention provides a method and a system for assisted conviction and legal-provision recommendation, which build a model by fusing a convolutional network and a capsule network and achieve assisted conviction and legal-provision recommendation through the model's predictions.
In order to achieve the purpose, the invention provides the following technical scheme:
the invention discloses a judicial data processing method in a first aspect, which comprises the following steps:
acquiring case information of a case to be handled, of which judicial data needs to be predicted, wherein the judicial data comprises a criminal name and a legal bar, and the case information comprises text information of the case to be handled;
performing word segmentation processing on the text information to obtain document word segmentation data;
and processing the text word segmentation data through a pre-trained network model to obtain the criminal name and legal item prediction result corresponding to the case to be handled, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
Preferably, the network model is obtained by fusing the following modes:
acquiring a published judicial writing as a training text, and carrying out word vector training on the training text to obtain a word vector model, wherein the training text comprises case information, and a crime name and a law bar which are judged;
taking the word vector model as an input layer of a neural network model, and taking the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model to construct the neural network model;
and training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training turns reaching the specified times as the network model.
Preferably, the processing the text word segmentation data by a pre-trained network model to obtain the criminal name and the law bar prediction result corresponding to the case to be handled comprises:
mapping the text word segmentation data to a word vector model for word vector processing to obtain a word vector, wherein the word vector model is an input layer of the network model, and the word vector comprises a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar;
respectively inputting the word vectors into a TextCNN convolutional network and a capsule network model for processing, and collecting output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector;
and connecting the set vector to a pre-established criminal name class mark and a pre-established legal title class mark based on a full connection layer of the network model, obtaining a criminal name regression result corresponding to the criminal name class mark and a legal title regression result corresponding to the legal title class mark in the set vector, and taking the criminal name and the legal title regression result as predicted judicial data.
Preferably, the inputting the word vector into the TextCNN convolutional network and the capsule network model respectively for processing, and collecting the output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector includes:
inputting the word vector into a TextCNN convolution network model for processing to obtain a first vector with a first dimension quantity;
inputting the word vectors into a capsule network model for processing to obtain second vectors with a second dimension quantity;
and collecting the first vector and the second vector to obtain a collection vector, wherein the dimension number of the collection vector is the sum of the first dimension number and the second dimension number.
The second aspect of the present invention discloses a judicial data processing system, comprising:
the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring case information of a case to be handled, of which judicial data needs to be predicted, the judicial data comprises a criminal name and a legal provision, and the case information comprises text information of the case to be handled;
the word segmentation unit is used for carrying out word segmentation processing on the text information to obtain document word segmentation data;
and the prediction unit is used for processing the text word segmentation data through a pre-trained network model to obtain the criminal name and legal provision prediction results corresponding to the case to be handled, and the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
Preferably, the method further comprises the following steps: a network model generation unit, the network model generation unit comprising:
the word vector training module is used for acquiring a published judicial literature as a training text, carrying out word vector training on the training text to obtain a word vector model, wherein the training text comprises case information, and a crime name and a law bar which are judged;
the fusion module is used for taking the word vector model as an input layer of a neural network model, taking the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model and constructing the neural network model;
and the training module is used for training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training turns reaching the specified times as the network model.
Preferably, the prediction unit includes:
the word vector processing module is used for mapping the text word segmentation data to a word vector model for word vector processing to obtain a word vector, the word vector model is an input layer of the network model, and the word vector comprises a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar;
the processing module is used for inputting the word vectors into a TextCNN convolutional network and a capsule network model respectively for processing, and collecting output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector;
and the output module is used for connecting the set vector to a pre-established criminal name class mark and a pre-established legal title based on the full connection layer of the network model, acquiring a criminal name regression result corresponding to the criminal name class mark and a legal title regression result corresponding to the legal title class mark in the set vector, and taking the criminal name and the legal title regression result as predicted judicial data.
Preferably, the processing module includes:
the first processing submodule is used for inputting the word vectors into a TextCNN convolution network model for processing to obtain first vectors with a first dimensionality;
the second processing submodule is used for inputting the word vectors into the capsule network model for processing to obtain second vectors with a second dimension quantity;
and the vector superposition module is used for aggregating the first vector and the second vector to obtain an aggregate vector, and the dimension number of the aggregate vector is the sum of the first dimension number and the second dimension number.
In a third aspect, the present invention discloses a storage medium, which includes a stored program, wherein the program controls a device on which the storage medium is located to execute the judicial data processing method disclosed in the first aspect of the present invention when running.
In a fourth aspect of the present invention, a processor is disclosed, the processor is configured to run a program, wherein the program executes the judicial data processing method disclosed in the first aspect of the present invention.
Based on the judicial data processing method and system provided by the invention, the method acquires case information of a case to be handled for which judicial data need to be predicted, performs word segmentation on the text information to obtain document word-segmentation data, and then processes the document word-segmentation data with a pre-trained network model to obtain the crime-name and legal-provision prediction results for the case to be handled. Because the document word-segmentation data are processed by a pre-trained network model and the corresponding crime names and legal provisions are output, legal staff are helped to predict the crime name and the applicable legal provisions of a case.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a judicial data processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart of another judicial data processing method according to an embodiment of the present invention;
FIG. 3 is a flow chart of another judicial data processing method according to an embodiment of the present invention;
FIG. 4 is a flow chart of another judicial data processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a judicial data processing system according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another judicial data processing system according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another judicial data processing system according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of another judicial data processing system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
An embodiment of the present invention provides a judicial data processing method, which, referring to fig. 1, at least includes the following steps:
step S101: acquiring case information of a case to be handled, of which judicial data needs to be predicted, wherein the judicial data comprises a criminal name and a legal bar, and the case information comprises text information of the case to be handled.
In the specific implementation of step S101, the judicial data include, but are not limited to, crime names and legal provisions, and the case information includes, but is not limited to, the text information of the case to be handled.
Step S102: and performing word segmentation processing on the text information to obtain document word segmentation data.
It should be noted that the text information here is the fact-description paragraph in the case information of the case to be handled, which includes the facts of the alleged crime, the description of the criminal circumstances, the findings of the procuratorate, and other related matters of the case.
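As a minimal, non-authoritative sketch of this word-segmentation step (the patent does not name a particular tokenizer), the following assumes the jieba segmenter for Chinese text; the sample sentence and the whitespace filtering are purely illustrative.

    # Word-segmentation sketch for the fact-description paragraph (step S102).
    # Assumption: the jieba tokenizer; the patent does not prescribe a tool.
    import jieba

    def segment_fact_paragraph(text):
        """Split the fact-description paragraph of a case document into words."""
        tokens = jieba.lcut(text)                # jieba.lcut returns a list of tokens
        return [t for t in tokens if t.strip()]  # drop pure-whitespace tokens

    if __name__ == "__main__":
        sample = "被告人在某地实施盗窃，经检察机关指控……"  # illustrative sentence only
        print(segment_fact_paragraph(sample))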
Step S103: and processing the text word segmentation data through a pre-trained network model to obtain the criminal names and law bar prediction results corresponding to the cases to be handled.
In the process of executing step S103, as shown in fig. 2, the method specifically includes the following steps:
step S201: and mapping the text word segmentation data to a word vector model for word vector processing to obtain a word vector, wherein the word vector model is an input layer of the network model, and the word vector comprises a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar.
In step S201, the word vector model processes the document word-segmentation data by mapping it into a 100-dimensional space, in which similarity between words is represented, to obtain word vectors. The dimension is generally between 50 and 250 and may be chosen according to the specific situation; 100 dimensions are preferred here.
In addition, the word vector model includes the low-frequency long-tail words appearing in the corpus, and each such word has its own word vector representation in the model.
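A minimal sketch of this word-vector lookup, assuming a gensim KeyedVectors model trained to 100 dimensions (the preferred size above); the zero-vector fallback for tokens absent from the model is an illustrative choice, not part of the patent.

    # Map document tokens to a (sequence_length, 100) matrix of word vectors (step S201).
    import numpy as np
    from gensim.models import KeyedVectors

    def tokens_to_matrix(tokens, wv: KeyedVectors, dim: int = 100) -> np.ndarray:
        rows = []
        for tok in tokens:
            if tok in wv:            # low-frequency long-tail words keep their own vectors
                rows.append(wv[tok])
            else:
                rows.append(np.zeros(dim, dtype=np.float32))  # fallback for unseen tokens
        return np.stack(rows)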
Step S202: and respectively inputting the word vectors into a TextCNN convolutional network and a capsule network model for processing, and collecting output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector.
In the process of executing step S202, as shown in fig. 3, the specific process includes the following steps:
step S301: and inputting the word vector into a TextCNN convolution network model for processing to obtain a first vector with a first dimension quantity.
It should be noted that "first" in "first vector" is used only to distinguish between the two different vectors.
Step S302: and inputting the word vectors into a capsule network model for processing to obtain second vectors with a second dimension quantity.
Step S303: and collecting the first vector and the second vector to obtain a collection vector, wherein the dimension number of the collection vector is the sum of the first dimension number and the second dimension number.
In order to clearly describe the process of inputting the word vectors into the TextCNN convolutional network and the capsule network model for processing in step S202, the following example is given.
For example, the TextCNN convolutional network has convolution kernels of sizes 1 × 1 to 5 × 5 with 256 channels each, and the capsule network model comprises 5 neuron units, each with an output dimension of 256. After a document is processed by the word vector model, 100-dimensional word vectors are obtained and split into two paths. One path is converted to 256 dimensions through a fully connected layer and processed by the TextCNN convolutional network, which outputs a 1280-dimensional vector; the other path is processed by the 5 neuron units of the capsule network model, which likewise outputs a 1280-dimensional vector.
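The following PyTorch sketch illustrates this two-branch fusion under the dimensions of the example: five kernel sizes with 256 channels each in the TextCNN branch (5 × 256 = 1280 dimensions) and five 256-dimensional units in the capsule branch (1280 dimensions), concatenated into the aggregate vector. The capsule branch is a simplified stand-in without dynamic routing, the fully connected pre-projection mentioned above is omitted, and the module names and pooling choices are assumptions for illustration only.

    import torch
    import torch.nn as nn

    class TextCNNBranch(nn.Module):
        def __init__(self, embed_dim=100, channels=256, kernel_sizes=(1, 2, 3, 4, 5)):
            super().__init__()
            self.convs = nn.ModuleList(
                [nn.Conv1d(embed_dim, channels, kernel_size=k, padding=k // 2)
                 for k in kernel_sizes]
            )

        def forward(self, x):                      # x: (batch, seq_len, embed_dim)
            x = x.transpose(1, 2)                  # -> (batch, embed_dim, seq_len)
            pooled = [conv(x).max(dim=2).values for conv in self.convs]  # max over time
            return torch.cat(pooled, dim=1)        # -> (batch, 5 * 256) = (batch, 1280)

    class CapsuleBranch(nn.Module):
        """Simplified capsule stand-in: 5 units, each emitting a 256-dim vector."""
        def __init__(self, embed_dim=100, num_units=5, unit_dim=256):
            super().__init__()
            self.units = nn.ModuleList([nn.Linear(embed_dim, unit_dim) for _ in range(num_units)])

        def forward(self, x):                      # x: (batch, seq_len, embed_dim)
            pooled = x.mean(dim=1)                 # crude sequence pooling
            outs = [torch.tanh(u(pooled)) for u in self.units]
            return torch.cat(outs, dim=1)          # -> (batch, 5 * 256) = (batch, 1280)

    class FusionBackbone(nn.Module):
        """Parallel TextCNN and capsule branches; outputs the concatenated aggregate vector."""
        def __init__(self, embed_dim=100):
            super().__init__()
            self.textcnn = TextCNNBranch(embed_dim)
            self.capsule = CapsuleBranch(embed_dim)

        def forward(self, word_vectors):           # (batch, seq_len, embed_dim)
            cnn_out = self.textcnn(word_vectors)   # (batch, 1280)
            caps_out = self.capsule(word_vectors)  # (batch, 1280)
            return torch.cat([cnn_out, caps_out], dim=1)  # (batch, 2560)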
Step S203: and connecting the set vector to a pre-established criminal name class mark and a pre-established legal title class mark based on a full connection layer of the network model, obtaining a criminal name regression result corresponding to the criminal name class mark and a legal title regression result corresponding to the legal title class mark in the set vector, and taking the criminal name and the legal title regression result as predicted judicial data.
It should be noted that crime-name and legal-provision prediction is a multi-label, multi-target, multi-class task. Multi-class means, for example, that the crime name may be theft, robbery, dangerous driving and so on; multi-label means that a piece of text is not necessarily assigned to only one class but may carry several classes or categories. For example, a case may constitute both robbery and a crime of injury, i.e., it is multi-label.
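A minimal sketch of the fully connected output stage for this multi-label prediction, using independent sigmoid outputs so that one case can activate several crime-name labels and several legal-provision labels at once; the label counts (200 crime names, 500 legal provisions) are illustrative placeholders, not values from the patent.

    import torch
    import torch.nn as nn

    class JudicialHead(nn.Module):
        def __init__(self, in_dim=2560, num_crimes=200, num_articles=500):
            super().__init__()
            self.crime_fc = nn.Linear(in_dim, num_crimes)      # crime-name class labels
            self.article_fc = nn.Linear(in_dim, num_articles)  # legal-provision class labels

        def forward(self, aggregate_vector):
            # Independent sigmoids: a case may carry several crime names and
            # several applicable legal provisions simultaneously (multi-label).
            crime_scores = torch.sigmoid(self.crime_fc(aggregate_vector))
            article_scores = torch.sigmoid(self.article_fc(aggregate_vector))
            return crime_scores, article_scores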
The embodiment of the invention discloses a judicial data processing method: case information of a case to be handled for which judicial data need to be predicted is acquired, the text information is segmented to obtain document word-segmentation data, and the word-segmentation data are processed by a pre-trained network model to obtain the crime-name and legal-provision prediction results for the case to be handled. In this way, the document word-segmentation data are processed by the pre-trained network model and the corresponding crime names and legal provisions are output, helping legal staff predict the crime name and the applicable legal provisions of a case.
Based on the judicial data processing method disclosed in the embodiment of the invention, in step S103, the network model is obtained by fusing the TextCNN convolutional network, the capsule network model and the neural network model, as shown in fig. 4, the specific process is as follows:
step S401: the method comprises the steps of obtaining a published judicial writing as a training text, carrying out word vector training on the training text to obtain a word vector model, wherein the training text comprises case information, and a crime name and a law bar which are judged.
It should be noted that the word vectors may be trained with Word2vec or GloVe to obtain the word vector model, but training is not limited to these methods.
In addition, the principle of selecting the dimension of the word vector in step S401 is the same as that of selecting the dimension of the word vector in step S201, and thus, the description thereof is omitted.
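A minimal sketch of this word-vector training step, assuming gensim's Word2Vec implementation (the patent allows Word2vec or GloVe, among others); the window and worker settings are illustrative.

    from gensim.models import Word2Vec

    def train_word_vectors(segmented_docs, dim=100):
        """segmented_docs: a list of token lists, one per published judgment document."""
        model = Word2Vec(
            sentences=segmented_docs,
            vector_size=dim,   # 100-dimensional space, as preferred in step S201
            window=5,          # illustrative context window
            min_count=1,       # keep low-frequency long-tail words in the vocabulary
            workers=4,
        )
        return model.wv        # the word vectors used as the network model's input layer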
Step S402: and taking the word vector model as an input layer of a neural network model, and taking the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model to construct the neural network model.
Step S403: and training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training turns reaching the specified times as the network model.
It should be noted that, the neural network model is trained based on the training text, and the specific training process is as follows:
firstly, an abridged finding and factual determination section in a training text is obtained by using a rule determination system, wherein the abridged finding and factual determination section is a section which is described in detail about a scenario in a document.
Then, the file analysis system can obtain the information of crime names, law rules, single-person or multi-person crimes and the like of the judgment books.
And finally, training the neural network model through the document and the analyzed information, and taking the obtained trained neural network model as a network model.
Furthermore, in order to obtain a better network model, the number of training rounds and an initial learning rate are set during training, and the learning rate is decayed at preset step intervals during learning, so as to optimize the learning ability. For ease of understanding, an example is given here.
For example, documents to be learned are selected from a document library. The first document is input into the network model for learning with an initial learning rate of 1e-3, and the learning rate is then decayed to 0.65 times its current value every 25,000 training steps; this constitutes one round of network model learning. After 15 input documents have been trained, no further documents are drawn from the library for training and learning.
It should be noted that the training data of the network model are typically large, ranging from hundreds of thousands to millions of records. Owing to hardware video-memory limitations, one batch of data is generally read per training pass, and reading one batch of data constitutes one training step. For example, if the batch size is 256, reading 256 records once is one training step.
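A minimal sketch of this training schedule, assuming PyTorch with an Adam optimizer and a binary cross-entropy loss for the multi-label targets; only the initial learning rate of 1e-3, the 0.65 decay every 25,000 steps, and the one-batch-per-step convention come from the example above.

    import torch

    def train(model, data_loader, total_steps):
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial learning rate 1e-3
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=25_000, gamma=0.65)
        loss_fn = torch.nn.BCELoss()                               # multi-label targets in [0, 1]

        step = 0
        for word_vectors, crime_targets, article_targets in data_loader:
            crime_scores, article_scores = model(word_vectors)
            loss = loss_fn(crime_scores, crime_targets) + loss_fn(article_scores, article_targets)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.step()   # one batch read (e.g. 256 records) = one training step
            step += 1
            if step >= total_steps:
                break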
By training the neural network model through the judicial data processing method disclosed above, the embodiment of the invention obtains a trained neural network model that can more accurately predict the crime names and legal provisions of the cases handled by legal staff.
Corresponding to the judicial data processing method provided by the embodiment of the present application, the embodiment of the present application further provides a corresponding judicial data processing system, and referring to fig. 5, the judicial data processing system disclosed by the embodiment of the present application includes:
the acquiring unit 501 is configured to acquire case information of a case to be handled, where judicial data needs to be predicted, where the judicial data includes a criminal name and a legal provision, and the case information includes text information of the case to be handled.
And a word segmentation unit 502, configured to perform word segmentation processing on the text information to obtain document word segmentation data.
The prediction unit 503 is configured to process the text word segmentation data through a pre-trained network model to obtain a crime name and law bar prediction result corresponding to the case to be handled, where the network model is obtained by fusing a TextCNN convolutional network, a capsule network model, and a neural network model.
Preferably, the method further comprises the following steps: the network model generating unit 601, as shown in fig. 6, includes:
the word vector training unit 602 is configured to obtain a published judicial literature as a training text, perform word vector training on the training text, and obtain a word vector model, where the training text includes case information, and a crime name and a law bar that have been determined.
And a fusion module 603, configured to use the word vector model as an input layer of a neural network model, and use the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model to construct the neural network model.
The training module 604 is configured to train the neural network model based on the training text, and use the neural network model with iteration times reaching a preset iteration time or training rounds reaching a specified time as the network model.
Preferably, as shown in fig. 7, the prediction unit 503 includes:
a word vector processing module 701, configured to map the text word segmentation data to a word vector model for word vector processing to obtain a word vector, where the word vector model is an input layer of the network model, and the word vector includes a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar.
And the processing module 702 is configured to input the word vector into a TextCNN convolutional network and a capsule network model respectively for processing, and aggregate output vectors of the TextCNN convolutional network and the capsule network model to obtain an aggregate vector.
An output module 703, configured to connect the set vector to a pre-established guilty name class label and a law bar class label based on a full connection layer of the network model, obtain a guilty name regression result corresponding to the guilty name class label and a law bar regression result corresponding to the law bar class label in the set vector, and use the guilty name and the law bar regression result as predicted judicial data.
Preferably, the processing module 702, as shown in fig. 8, includes:
the first processing sub-module 801 is configured to input the word vector into a TextCNN convolutional network model for processing, so as to obtain a first vector with a first dimension number.
The second processing sub-module 802 is configured to input the word vectors into a capsule network model for processing, so as to obtain second vectors of a second dimension quantity.
A vector superposition module 803, configured to aggregate the first vector and the second vector to obtain an aggregate vector, where the dimension number of the aggregate vector is a sum of the first dimension number and the second dimension number.
The specific execution principle and the further execution process of each unit in the judicial data processing system disclosed in the embodiment of the invention are the same as the judicial data processing method disclosed in the embodiment of the invention, and reference may be made to the corresponding parts in the judicial data processing method disclosed in the embodiment of the invention, so that redundant description is not repeated here.
Based on the judicial data processing system disclosed by the embodiment of the invention, the modules can be realized by a hardware device consisting of a processor and a memory. The method specifically comprises the following steps: the modules are stored in the memory as program units, and the processor executes the program units stored in the memory to realize judicial data processing.
The processor comprises a kernel, and the kernel calls a corresponding program unit from the memory. The kernel can be set to be one or more, and judicial data processing is realized by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
Further, an embodiment of the present invention provides a processor, where the processor is configured to execute a program, where the program executes the judicial data processing method when running.
Further, an embodiment of the present invention provides an apparatus, where the apparatus includes a processor, a memory, and a program stored in the memory and executable on the processor, and the processor implements the following steps when executing the program:
acquiring case information of a case to be handled, of which judicial data needs to be predicted, wherein the judicial data comprises a criminal name and a legal bar, and the case information comprises text information of the case to be handled; performing word segmentation processing on the text information to obtain document word segmentation data; and processing the text word segmentation data through a pre-trained network model to obtain the criminal name and legal item prediction result corresponding to the case to be handled, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
The network model is obtained by fusing the following modes, including: acquiring a published judicial writing as a training text, and carrying out word vector training on the training text to obtain a word vector model, wherein the training text comprises case information, and a crime name and a law bar which are judged; taking the word vector model as an input layer of a neural network model, taking the TextCNN convolutional network as a second layer of the neural network model, and taking the capsule network as a third layer of the neural network model to construct the neural network model; and training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training turns reaching the specified times as the network model.
The method comprises the following steps of processing the text word segmentation data through a pre-trained network model to obtain the criminal names and the law bar prediction results corresponding to the cases to be handled, and comprises the following steps: mapping the text word segmentation data to a word vector model for word vector processing to obtain a word vector, wherein the word vector model is an input layer of the network model, and the word vector comprises a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar; respectively inputting the word vectors into a TextCNN convolutional network and a capsule network model for processing, and collecting output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector; and connecting the set vector to a pre-established criminal name class mark and a pre-established legal title class mark based on a full connection layer of the network model, obtaining a criminal name regression result corresponding to the criminal name class mark and a legal title regression result corresponding to the legal title class mark in the set vector, and taking the criminal name and the legal title regression result as predicted judicial data.
Wherein, the said word vector is input into TextCNN convolution network and capsule network model to process separately, the output vector of the said TextCNN convolution network and capsule network model is collected, get the collection vector, include: inputting the word vector into a TextCNN convolution network model for processing to obtain a first vector with a first dimension quantity; inputting the word vectors into a capsule network model for processing to obtain second vectors with a second dimension quantity; and collecting the first vector and the second vector to obtain a collection vector, wherein the dimension number of the collection vector is the sum of the first dimension number and the second dimension number.
The client disclosed in the embodiment of the invention can be a PC, a PAD, a mobile phone and the like.
Further, an embodiment of the present invention also provides a storage medium having a program stored thereon, wherein the program, when executed by a processor, implements the judicial data processing method.
The present application further provides a computer program product adapted to perform a program for initializing the following method steps when executed on a data processing device:
acquiring case information of a case to be handled, of which judicial data needs to be predicted, wherein the judicial data comprises a criminal name and a legal bar, and the case information comprises text information of the case to be handled; performing word segmentation processing on the text information to obtain document word segmentation data; and processing the text word segmentation data through a pre-trained network model to obtain the criminal name and legal item prediction result corresponding to the case to be handled, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
The network model is obtained by fusing the following modes, including: acquiring a published judicial writing as a training text, and carrying out word vector training on the training text to obtain a word vector model, wherein the training text comprises case information, and a crime name and a law bar which are judged; taking the word vector model as an input layer of a neural network model, taking the TextCNN convolutional network as a second layer of the neural network model, and taking the capsule network as a third layer of the neural network model to construct the neural network model; and training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training turns reaching the specified times as the network model.
The method comprises the following steps of processing the text word segmentation data through a pre-trained network model to obtain the criminal names and the law bar prediction results corresponding to the cases to be handled, and comprises the following steps: mapping the text word segmentation data to a word vector model for word vector processing to obtain a word vector, wherein the word vector model is an input layer of the network model, and the word vector comprises a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar; respectively inputting the word vectors into a TextCNN convolutional network and a capsule network model for processing, and collecting output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector; and connecting the set vector to a pre-established criminal name class mark and a pre-established legal title class mark based on a full connection layer of the network model, obtaining a criminal name regression result corresponding to the criminal name class mark and a legal title regression result corresponding to the legal title class mark in the set vector, and taking the criminal name and the legal title regression result as predicted judicial data.
Wherein, the said word vector is input into TextCNN convolution network and capsule network model to process separately, the output vector of the said TextCNN convolution network and capsule network model is collected, get the collection vector, include: inputting the word vector into a TextCNN convolution network model for processing to obtain a first vector with a first dimension quantity; inputting the word vectors into a capsule network model for processing to obtain second vectors with a second dimension quantity; and collecting the first vector and the second vector to obtain a collection vector, wherein the dimension number of the collection vector is the sum of the first dimension number and the second dimension number.
Through a hardware device consisting of a processor and a memory, case information of a case to be handled for which judicial data need to be predicted is acquired, the judicial data including a crime name and legal provisions and the case information including text information of the case; the text information is then segmented to obtain document word-segmentation data, and the word-segmentation data are processed by a pre-trained network model, trained by fusing a TextCNN convolutional network and a capsule network model, to obtain the crime-name and legal-provision prediction results for the case. Because the crime names and legal provisions of the case to be handled can be obtained by processing with the pre-trained network model, the hardware device consisting of the processor and the memory disclosed by the invention can predict the crime name and the applicable legal provisions for legal staff.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus, client, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, the system or system embodiments are substantially similar to the method embodiments and therefore are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described system and system embodiments are only illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A judicial data processing method, comprising:
acquiring case information of a case to be handled, of which judicial data needs to be predicted, wherein the judicial data comprises a criminal name and a legal bar, and the case information comprises text information of the case to be handled;
performing word segmentation processing on the text information to obtain document word segmentation data;
and processing the text word segmentation data through a pre-trained network model to obtain the criminal name and legal item prediction result corresponding to the case to be handled, wherein the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
2. The method of claim 1, wherein the network model is fused by:
acquiring a published judicial writing as a training text, and carrying out word vector training on the training text to obtain a word vector model, wherein the training text comprises case information, and a crime name and a law bar which are judged;
taking the word vector model as an input layer of a neural network model, and taking the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model to construct the neural network model;
and training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training turns reaching the specified times as the network model.
3. The method of claim 1, wherein the text participle data is processed through a pre-trained network model to obtain the criminal name and legal item prediction results corresponding to the case to be handled, and the method comprises the following steps:
mapping the text word segmentation data to a word vector model for word vector processing to obtain a word vector, wherein the word vector model is an input layer of the network model, and the word vector comprises a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar;
respectively inputting the word vectors into a TextCNN convolutional network and a capsule network model for processing, and collecting output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector;
and connecting the set vector to a pre-established criminal name class mark and a pre-established legal title class mark based on a full connection layer of the network model, obtaining a criminal name regression result corresponding to the criminal name class mark and a legal title regression result corresponding to the legal title class mark in the set vector, and taking the criminal name and the legal title regression result as predicted judicial data.
4. The method according to claim 3, wherein the inputting the word vector into a TextCNN convolutional network and a capsule network model for processing, respectively, and aggregating output vectors of the TextCNN convolutional network and the capsule network model to obtain an aggregate vector comprises:
inputting the word vector into a TextCNN convolution network model for processing to obtain a first vector with a first dimension quantity;
inputting the word vectors into a capsule network model for processing to obtain second vectors with a second dimension quantity;
and collecting the first vector and the second vector to obtain a collection vector, wherein the dimension number of the collection vector is the sum of the first dimension number and the second dimension number.
5. A judicial data processing system, comprising:
the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring case information of a case to be handled, of which judicial data needs to be predicted, the judicial data comprises a criminal name and a legal provision, and the case information comprises text information of the case to be handled;
the word segmentation unit is used for carrying out word segmentation processing on the text information to obtain document word segmentation data;
and the prediction unit is used for processing the text word segmentation data through a pre-trained network model to obtain the criminal name and legal provision prediction results corresponding to the case to be handled, and the network model is obtained by fusing a TextCNN convolutional network, a capsule network model and a neural network model.
6. The system of claim 5, further comprising: a network model generation unit, the network model generation unit comprising:
the word vector training module is used for acquiring a published judicial literature as a training text, carrying out word vector training on the training text to obtain a word vector model, wherein the training text comprises case information, and a crime name and a law bar which are judged;
the fusion module is used for taking the word vector model as an input layer of a neural network model, taking the TextCNN convolutional network and the capsule network in parallel as a second layer of the neural network model and constructing the neural network model;
and the training module is used for training the neural network model based on the training text, and taking the neural network model with the iteration times reaching the preset iteration times or the training turns reaching the specified times as the network model.
7. The system of claim 5, wherein the prediction unit comprises:
the word vector processing module is used for mapping the text word segmentation data to a word vector model for word vector processing to obtain a word vector, the word vector model is an input layer of the network model, and the word vector comprises a crime name feature vector for representing a crime name and a law bar feature vector for representing a law bar;
the processing module is used for inputting the word vectors into a TextCNN convolutional network and a capsule network model respectively for processing, and collecting output vectors of the TextCNN convolutional network and the capsule network model to obtain a collection vector;
and the output module is used for connecting the set vector to a pre-established criminal name class mark and a pre-established legal title based on the full connection layer of the network model, acquiring a criminal name regression result corresponding to the criminal name class mark and a legal title regression result corresponding to the legal title class mark in the set vector, and taking the criminal name and the legal title regression result as predicted judicial data.
8. The system of claim 7, wherein the processing module comprises:
the first processing submodule is used for inputting the word vectors into a TextCNN convolution network model for processing to obtain first vectors with a first dimensionality;
the second processing submodule is used for inputting the word vectors into the capsule network model for processing to obtain second vectors with a second dimension quantity;
and the vector superposition module is used for aggregating the first vector and the second vector to obtain an aggregate vector, and the dimension number of the aggregate vector is the sum of the first dimension number and the second dimension number.
9. A storage medium, characterized in that the storage medium comprises a stored program, wherein a device on which the storage medium is located is controlled to perform the judicial data processing method according to any one of claims 1 to 4 when the program is run.
10. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the judicial data processing method of any one of claims 1 to 4.
CN201811156961.6A 2018-09-30 2018-09-30 Judicial data processing method and system Active CN110990560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811156961.6A CN110990560B (en) 2018-09-30 2018-09-30 Judicial data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811156961.6A CN110990560B (en) 2018-09-30 2018-09-30 Judicial data processing method and system

Publications (2)

Publication Number Publication Date
CN110990560A true CN110990560A (en) 2020-04-10
CN110990560B CN110990560B (en) 2023-07-07

Family

ID=70059786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811156961.6A Active CN110990560B (en) 2018-09-30 2018-09-30 Judicial data processing method and system

Country Status (1)

Country Link
CN (1) CN110990560B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552808A (en) * 2020-04-20 2020-08-18 北京北大软件工程股份有限公司 Administrative illegal case law prediction method and tool based on convolutional neural network
CN113065005A (en) * 2021-05-19 2021-07-02 南京烽火星空通信发展有限公司 Legal provision recommendation method based on knowledge graph and text classification model
CN113360657A (en) * 2021-06-30 2021-09-07 安徽商信政通信息技术股份有限公司 Intelligent document distribution and handling method and device and computer equipment
CN113626557A (en) * 2021-05-17 2021-11-09 四川大学 Intelligent law enforcement recommendation auxiliary system based on element labeling and BERT and RCNN algorithms

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106952193A (en) * 2017-03-23 2017-07-14 北京华宇信息技术有限公司 A kind of criminal case aid decision-making method based on fuzzy depth belief network
US9754219B1 (en) * 2016-07-01 2017-09-05 Intraspexion Inc. Using classified text and deep learning algorithms to identify entertainment risk and provide early warning
CN107609009A (en) * 2017-07-26 2018-01-19 北京大学深圳研究院 Text emotion analysis method, device, storage medium and computer equipment
CN107818138A (en) * 2017-09-28 2018-03-20 银江股份有限公司 A kind of case legal regulation recommends method and system
CN108133436A (en) * 2017-11-23 2018-06-08 科大讯飞股份有限公司 Automatic method and system of deciding a case

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9754219B1 (en) * 2016-07-01 2017-09-05 Intraspexion Inc. Using classified text and deep learning algorithms to identify entertainment risk and provide early warning
CN106952193A (en) * 2017-03-23 2017-07-14 北京华宇信息技术有限公司 A kind of criminal case aid decision-making method based on fuzzy depth belief network
CN107609009A (en) * 2017-07-26 2018-01-19 北京大学深圳研究院 Text emotion analysis method, device, storage medium and computer equipment
CN107818138A (en) * 2017-09-28 2018-03-20 银江股份有限公司 A kind of case legal regulation recommends method and system
CN108133436A (en) * 2017-11-23 2018-06-08 科大讯飞股份有限公司 Automatic method and system of deciding a case

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MAOMAO2017: "Exploration of capsule networks based on dynamic routing for text classification" (基于动态路由的胶囊网络在文本分类上的探索), 《HTTPS://BLOG.CSDN.NET/SUMIYOU8385/ARTICLE/DETAILS/80045058?》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552808A (en) * 2020-04-20 2020-08-18 北京北大软件工程股份有限公司 Administrative illegal case law prediction method and tool based on convolutional neural network
CN113626557A (en) * 2021-05-17 2021-11-09 四川大学 Intelligent law enforcement recommendation auxiliary system based on element labeling and BERT and RCNN algorithms
CN113065005A (en) * 2021-05-19 2021-07-02 南京烽火星空通信发展有限公司 Legal provision recommendation method based on knowledge graph and text classification model
CN113065005B (en) * 2021-05-19 2024-01-09 南京烽火星空通信发展有限公司 Legal provision recommendation method based on knowledge graph and text classification model
CN113360657A (en) * 2021-06-30 2021-09-07 安徽商信政通信息技术股份有限公司 Intelligent document distribution and handling method and device and computer equipment
CN113360657B (en) * 2021-06-30 2023-10-24 安徽商信政通信息技术股份有限公司 Intelligent document distribution handling method and device and computer equipment

Also Published As

Publication number Publication date
CN110990560B (en) 2023-07-07

Similar Documents

Publication Publication Date Title
CN110990560A (en) Judicial data processing method and system
US11080475B2 (en) Predicting spreadsheet properties
CN108920654A (en) A kind of matched method and apparatus of question and answer text semantic
US20140279583A1 (en) Systems and Methods for Classifying Entities
CN111383030B (en) Transaction risk detection method, device and equipment
CN110674188A (en) Feature extraction method, device and equipment
WO2020063524A1 (en) Method and system for determining legal instrument
CN110968688A (en) Judicial data processing method and system
CN111882426A (en) Business risk classifier training method, device, equipment and storage medium
CN110969549B (en) Judicial data processing method and system
CN110020134B (en) Knowledge service information pushing method and system, storage medium and processor
CN113743618A (en) Time series data processing method and device, readable medium and electronic equipment
CN107016028B (en) Data processing method and apparatus thereof
CN110969017A (en) Judicial data processing method and system
Zaffar et al. Comparing the performance of FCBF, Chi-Square and relief-F filter feature selection algorithms in educational data mining
Ghofrani et al. Applying product line engineering concepts to deep neural networks
Al-Ahmari et al. Analysis of a multimachine flexible manufacturing cell using stochastic Petri nets
CN110163470B (en) Event evaluation method and device
CN111191007A (en) Article keyword filtering method and device based on block chain and medium
Volna et al. Pattern recognition and classification in time series data
Santos et al. Modelling a deep learning framework for recognition of human actions on video
Liu et al. A sequential Latin hypercube sampling method for metamodeling
Shyr et al. Automated data analysis
CN114969253A (en) Market subject and policy matching method and device, computing device and medium
CN110990522B (en) Legal document determining method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant