CN116070624A - Class case pushing method based on environment-friendly case elements - Google Patents
- Publication number: CN116070624A (application number CN202310359002.9A)
- Authority
- CN
- China
- Prior art keywords: case, representation, handled, candidate, key element
- Legal status: Pending (an assumption, not a legal conclusion)
Classifications
- G06F40/279 — Recognition of textual entities (natural language analysis)
- G06F40/30 — Semantic analysis
- G06N3/049 — Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08 — Neural network learning methods
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application is applicable to the technical field of judicial case pushing and provides a class case pushing method based on environment-friendly case elements, comprising the following steps: converting the case text information of the case to be handled into a text sequence and obtaining the vectorized representation and entity sequence of the text sequence; acquiring a feature representation of the text sequence; inputting the feature representation into a BiLSTM neural network to model forward and backward semantic dependencies, obtaining a key feature representation; extracting the key element representation of the case to be handled; fusing the key element representation with the entity sequence to obtain a global semantic information representation of the case to be handled; determining candidate cases of the case to be handled from a retrieval case pool; constructing a matching objective function from the key element representations and global semantic information representations of the case to be handled and the candidate cases; and completing a Top-K recommendation task through the matching objective function to obtain the Top-K candidate cases of the case to be handled. The method can improve class-case matching precision for ecological environmental-protection cases.
Description
Technical Field
The application belongs to the technical field of judicial case pushing, and particularly relates to a case pushing method based on environment-friendly case elements.
Background
In recent years, with the deepening fusion of judicial work and new information technologies, text mining based on deep learning can capture the key elements of case documents, enabling targeted retrieval and screening of similar cases. However, conventional class case pushing methods and systems suffer from inaccurate mining of key element information and imprecise similar-case pushing, which seriously limits their practicality in assisting judges with case analysis and judgment; this is especially evident when handling ecological environmental-protection cases. Such cases involve more complex case elements: the facts are generally complicated, causal relationships link different factors, and the interests of the various parties overlap and interweave. This poses a real challenge to case analysis and class case pushing for ecological environmental-protection cases, and conventional class case pushing methods and systems cannot meet the current judicial adjudication requirements of such cases.
Disclosure of Invention
The embodiment of the application provides a class case pushing method based on environment-friendly case elements, which can address the problem of low class-case matching precision for ecological environmental-protection cases.
The embodiment of the application provides a class case pushing method based on environment-friendly case elements, which comprises the following steps:
converting case text information of a case to be handled into a text sequence, and acquiring vectorized representation of the text sequence and an entity sequence corresponding to the text sequence;
acquiring a characteristic representation of the text sequence according to the vectorization representation;
inputting the feature representation into a BiLSTM neural network to perform modeling of front-back semantic dependency relationship, and obtaining key feature representation integrating context semantic information; the key feature representation refers to a vectorized representation of key information in a text sequence;
extracting core key elements in the key feature representation by using an attention mechanism to obtain key element representations of the to-be-handled cases; the core key element refers to vectorized representation of words which can represent the subject and content of the case to be handled in the key feature representation;
inputting the key element representation and the entity sequence into a fully-connected neural network for information fusion to obtain the global semantic information representation of the case to be handled;
based on the key element representation, a plurality of candidate cases of the case to be handled are retrieved from a retrieval case pool, and the key element representation and the global semantic information representation of each candidate case are obtained;
constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case;
constructing a Top-K recommendation task through the constructed matching objective function, and iteratively optimizing the Top-K recommendation task to obtain Top-K candidate cases;
and taking the Top-K candidate cases as the pushing cases of the cases to be handled.
The scheme of the application has the following beneficial effects:
in the embodiments of the application, a BiLSTM neural network is used to mine the key features and semantic information of the case to be handled, and an attention mechanism is applied to the extraction of core key elements, improving the retrieval efficiency of candidate cases. Meanwhile, based on the key element representations and global semantic information representations of the case to be handled and the candidate cases, a matching objective function between them is constructed so that their matching degree is computed from multiple perspectives, which improves the accuracy of the matching degree. The Top-K recommendation task is then completed through the matching objective function, so that the obtained Top-K candidate cases match the case to be handled with high precision, achieving the effect of improving class-case matching precision.
Other advantages of the present application will be described in detail in the detailed description section that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the following description will briefly introduce the drawings that are needed in the embodiments or the description of the prior art, it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a class pushing method based on environment-friendly case elements according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a case pushing system based on environmental case elements according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to a determination" or "in response to detection". Similarly, the phrases "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Aiming at the current problem of low class-case matching precision for ecological environmental-protection cases, the embodiments of the application provide a class case pushing method based on environment-friendly case elements. A BiLSTM neural network is used to mine the key features and semantic information of the case to be handled, and an attention mechanism is applied to the extraction of core key elements, improving the retrieval efficiency of candidate cases. Meanwhile, a matching objective function between the case to be handled and each candidate case is constructed based on their key element representations and global semantic information representations, so that the matching degree is computed from multiple perspectives and its accuracy is improved. The Top-K recommendation task is then completed through the matching objective function, so that the obtained Top-K candidate cases match the case to be handled with high precision, achieving the effect of improving class-case matching precision.
An exemplary description of the case pushing method based on the environment-friendly case element provided in the application is provided below with reference to specific embodiments.
As shown in fig. 1, the class case pushing method based on the environment-friendly case element provided in the embodiment of the application includes the following steps:
and 11, converting the case text information of the case to be handled into a text sequence, and acquiring vectorized representation of the text sequence and an entity sequence corresponding to the text sequence.
The vectorized representation refers to converting case text information into computer-processable digital vectors for subsequent processing and analysis.
For convenience of description, the case text information of the case to be handled may be denoted C. In some embodiments of the present application, the specific process of obtaining the text sequence, the vectorized representation and the entity sequence may be:
step 11.1, filtering stop words and useless words in case text information of a case to be handled;
step 11.2, word segmentation processing is carried out on the filtered case text information;
step 11.3, performing data processing such as text segmentation and text serialization on the word-segmented case text information to obtain the text sequence T = {s_1, s_2, …, s_n} of the case text information, where s_i denotes the i-th sentence of the text sequence, 1 ≤ i ≤ n, and n is the number of sentences in the text sequence;
step 11.4, inputting the text sequence T into the embedding layer of the ERNIE model to obtain the vectorized representation X = {x_1, x_2, …, x_n} of the text sequence and the entity sequence E = {e_1, e_2, …, e_m} corresponding to the text sequence.
Here the ERNIE model is a pre-trained language model, x_i is the vectorized representation of sentence s_i, e_j denotes the j-th entity object of the text sequence T, 1 ≤ j ≤ m, and m is the number of entity objects in T.
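The preprocessing pipeline of steps 11.1–11.3 can be sketched as follows. This is a minimal illustration: the stop-word list, the regex-based segmentation and the English sample text are stand-ins, since a real system would use a Chinese word-segmentation tool and then feed the result to the ERNIE embedding layer of step 11.4.

```python
import re

# Illustrative stop-word list; a production system would use a full
# Chinese stop-word dictionary and a proper word-segmentation tool.
STOP_WORDS = {"the", "a", "of", "and", "to"}

def preprocess(case_text: str) -> list[list[str]]:
    """Filter stop words, segment words, and split the case text into a
    sentence sequence T = [s_1, ..., s_n], each sentence a token list."""
    sentences = [s for s in re.split(r"[.!?]", case_text) if s.strip()]
    text_sequence = []
    for sentence in sentences:
        tokens = [w.lower() for w in re.findall(r"\w+", sentence)]
        tokens = [w for w in tokens if w not in STOP_WORDS]
        if tokens:
            text_sequence.append(tokens)
    return text_sequence

T = preprocess("The factory discharged acid wastewater. Sludge polluted the ditch.")
print(T)  # [['factory', 'discharged', 'acid', 'wastewater'], ['sludge', 'polluted', 'ditch']]
```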
And step 12, acquiring the characteristic representation of the text sequence according to the vectorization representation.
In some embodiments of the present application, the feature representation of the text sequence may be obtained by inputting the vectorized representation into the encoder (e.g., a T-Encoder) of the pre-trained ERNIE model. Specifically, H = {h_1, h_2, …, h_n}, where h_i is the case feature representation generated by the encoder for sentence s_i.
The feature representation refers to the vectorized representation of words or phrases in the case text information, such as names of people, place names, institution names, times and emotion words, which can be used to extract the topics and key points of the text.
And 13, inputting the feature representation into a BiLSTM neural network to perform modeling of the front-back semantic dependency relationship, and obtaining the key feature representation of the fused context semantic information.
In the related art, a BiLSTM neural network is formed by combining a forward long short-term memory (LSTM) network with a backward LSTM. The key feature representation refers to a vectorized representation of the key information in the text sequence; that is, it is the vectorized representation of the words extracted from the feature representation that contribute most to the text content and can be used to describe and understand the topic or emotional tendency of the text.
In some embodiments of the present application, the feature representation H is input into the BiLSTM neural network to model forward and backward semantic dependencies, yielding the key feature representation K = {k_1, k_2, …, k_u} that fuses the contextual semantic information, where k_i denotes the i-th key element output by the BiLSTM neural network, 1 ≤ i ≤ u, u is the number of key elements output by the network, and n is the number of sentences in the text sequence. Each k_i = [f_i ; b_i] concatenates f_i, the result output by the forward LSTM at the i-th moment, and b_i, the result output by the backward LSTM at the i-th moment.
It is worth mentioning that the forward LSTM extracts key elements from the feature representation to obtain a forward feature representation, while the backward LSTM does the same to obtain a backward feature representation; splicing the contextual semantic information of the two directions then yields the key feature representation fusing the contextual semantic information. This effectively mitigates the incompleteness of a one-directional semantic representation and thereby improves the accuracy of case key element extraction.
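The splicing of the two directions described above can be illustrated with a minimal sketch. The hidden-state vectors here are toy values, not the outputs of a real LSTM; the point is only the per-time-step concatenation that fuses forward and backward context.

```python
def bilstm_concat(forward_states, backward_states):
    """Splice the forward and backward LSTM outputs at each time step,
    fusing both directions of context into one key feature per step."""
    return [f + b for f, b in zip(forward_states, backward_states)]

fwd = [[0.1, 0.2], [0.3, 0.4]]   # toy forward LSTM outputs per time step
bwd = [[0.5, 0.6], [0.7, 0.8]]   # toy backward LSTM outputs per time step
K = bilstm_concat(fwd, bwd)
print(K)  # [[0.1, 0.2, 0.5, 0.6], [0.3, 0.4, 0.7, 0.8]]
```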
Step 14, extracting the core key elements in the key feature representation by using an attention mechanism to obtain the key element representation of the case to be handled.
The core key elements are the vectorized representations of the words in the key feature representation that best represent the subject and content of the case to be handled; extracting them from the key feature representation yields the key element representation. These words help a judge or a computer quickly grasp the core topic and content of the text, and provide a retrieval basis for the subsequent retrieval of candidate cases.
In some embodiments of the present application, applying an attention mechanism to the extraction of the core key elements further improves the BiLSTM neural network's extraction of case key elements and reduces the attention paid to non-key information, thereby further improving the accuracy of case key element extraction.
In some embodiments of the present application, the core key elements in the key feature representation may be extracted to obtain the key element representation A = {a_1, a_2, …, a_v} of the case to be handled by first scoring each key feature, u_i = tanh(W k_i + b), then normalizing the scores (e.g., with a softmax) into attention weight coefficients α_i, and forming the attention-weighted combination of the key features. Here a_j denotes the j-th core key element, 1 ≤ j ≤ v, v is the number of core key elements, α_i is the attention weight coefficient of k_i, u_i is the weight obtained by passing k_i through the tanh activation function, W is the weight matrix of the attention layer, and b is its bias.
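Under the notation above, the attention pooling can be sketched as follows. The scalar parameters `w` and `b` stand in for the learned weight matrix and bias, and the softmax normalization is an assumption about how the coefficients are obtained:

```python
import math

def attention_pool(key_features, w, b):
    """Score each key feature k_i with u_i = tanh(w * sum(k_i) + b),
    normalize the scores with a softmax into attention coefficients
    alpha_i, and return the coefficients plus the weighted combination."""
    scores = [math.tanh(w * sum(k) + b) for k in key_features]
    exp_scores = [math.exp(s) for s in scores]
    total = sum(exp_scores)
    alphas = [e / total for e in exp_scores]
    dim = len(key_features[0])
    pooled = [sum(a * k[d] for a, k in zip(alphas, key_features))
              for d in range(dim)]
    return alphas, pooled

alphas, pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], w=0.5, b=0.0)
```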
And step 15, inputting the key element representation and the entity sequence into a fully-connected neural network for information fusion to obtain the global semantic information representation of the case to be handled.
In some embodiments of the present application, the global semantic information representation G of the case to be handled may be calculated as G = σ(W_g [A ; E] + b_g), where σ is the activation function of the fully connected neural network, W_g is its weight matrix, b_g is its bias vector, [· ; ·] is the stitching (concatenation) function, and E is the entity sequence.
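A minimal sketch of this fusion step, assuming a sigmoid activation and scalar inputs standing in for the key element representation and entity sequence vectors:

```python
import math

def fuse(key_element_repr, entity_repr, weights, bias):
    """One fully connected layer over the concatenation [A; E]:
    G = sigmoid(weights . concat + bias). The sigmoid activation and the
    per-input weight list are illustrative; a trained network learns them."""
    concat = key_element_repr + entity_repr        # the stitching function
    z = sum(w * x for w, x in zip(weights, concat)) + bias
    return 1.0 / (1.0 + math.exp(-z))              # sigmoid activation

G = fuse([0.2, 0.4], [0.6], weights=[1.0, 1.0, 1.0], bias=0.0)
```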
For example, assume the case text information of the case to be handled reads: following a public tip-off, law enforcement officers conducted an on-site investigation of a waste grinding-disc processing factory within their jurisdiction and found that hazardous wastes such as acid-containing wastewater and waste sludge generated in production were discharged directly into nearby ditches without effective treatment, seriously affecting the surrounding environment. Through the above steps, key features such as "law enforcement officers", "waste grinding-disc processing factory", "on-site investigation", "production", "acid-containing wastewater", "waste sludge", "hazardous waste", "ditches" and "surrounding environment" can be obtained from the case text, along with core key elements such as "acid-containing wastewater", "waste sludge", "hazardous waste", "illegal dumping" and "environmental pollution", and the global semantic information representation of the case text: illegal dumping of hazardous wastes such as acid-containing wastewater and waste sludge, resulting in serious pollution of the surrounding environment.
And step 16, searching a plurality of candidate cases of the case to be handled from the search case pool based on the key element representation, and acquiring the key element representation and the global semantic information representation of each candidate case.
In some embodiments of the present application, the key element representation of the case to be handled includes a plurality of core key elements. Accordingly, when retrieving candidate cases, any case in the retrieval case pool containing at least one of those core key elements can serve as a candidate case of the case to be handled; that is, for any case in the retrieval case pool, if the case contains a core key element of the key element representation A, that case is taken as a candidate case. It should be noted that the retrieval case pool stores a large number of historical cases (such as ecological environmental-protection cases from recent years).
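The retrieval rule of step 16 — keep any pooled case sharing at least one core key element with the case to be handled — can be sketched as a set-intersection filter; the case ids and element sets below are illustrative:

```python
def retrieve_candidates(core_elements, case_pool):
    """Return every historical case whose element set shares at least
    one core key element with the case to be handled. Cases are modeled
    as (case_id, element_set) pairs for illustration."""
    query = set(core_elements)
    return [cid for cid, elements in case_pool if query & set(elements)]

pool = [
    ("case_A", {"acid wastewater", "hazardous waste"}),
    ("case_B", {"noise pollution"}),
    ("case_C", {"waste sludge", "illegal dumping"}),
]
hits = retrieve_candidates({"acid wastewater", "illegal dumping"}, pool)
print(hits)  # ['case_A', 'case_C']
```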
It should be noted that, in some embodiments of the present application, after determining the candidate cases of the to-be-handled case, the key element representation and the global semantic information representation of each candidate case may be obtained according to the above-mentioned manners from step 11 to step 15.
And step 17, constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case.
In some embodiments of the present application, the degree of matching between the to-be-handled case and the candidate case may be calculated from the two different angles of the key element representation and the global semantic information representation, so that the accuracy of the degree of matching between the to-be-handled case and the candidate case is greatly improved.
And step 18, constructing a Top-K recommendation task through the constructed matching objective function, and iteratively optimizing the Top-K recommendation task to obtain Top-K candidate cases.
In some embodiments of the present application, after executing the Top-K recommendation task, K candidate cases can be selected from the plurality of candidate cases according to the order of the matching degree with the to-be-handled case from high to low.
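The Top-K selection described above amounts to sorting the candidate cases by matching degree and keeping the first K; a minimal sketch with illustrative scores:

```python
def top_k(candidates, k):
    """Sort candidate cases by matching degree (descending) and keep the
    first k, as in the Top-K recommendation task. Candidates are
    (case_id, matching_degree) pairs; ids and scores are illustrative."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return [cid for cid, _ in ranked[:k]]

scores = [("case_A", 0.91), ("case_B", 0.42), ("case_C", 0.77)]
print(top_k(scores, 2))  # ['case_A', 'case_C']
```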
And step 19, taking the Top-K candidate cases as the pushing cases of the cases to be handled.
In some embodiments of the present application, after the pushing cases are determined, the corresponding cases may be pushed to the judge according to actual requirements, assisting the judge in case analysis and judicial judgment. It should be noted that the Top-K recommendation task screens out, from the plurality of candidate cases, the K candidate cases that best match the case to be handled, so that the cases finally pushed to the user (such as a judge) match the case to be handled with high precision.
It is worth mentioning that the above class case pushing method uses a BiLSTM neural network to mine the key features and semantic information of the case to be handled and applies an attention mechanism to the extraction of core key elements, improving the retrieval efficiency of candidate cases. Meanwhile, based on the key element representations and global semantic information representations of the case to be handled and the candidate cases, a matching objective function between them is constructed so that their matching degree is computed from multiple perspectives, which improves its accuracy. The Top-K recommendation task is then completed through the matching objective function, so that the obtained Top-K candidate cases match the case to be handled with high precision, achieving the effect of improving class-case matching precision.
The construction process of the matching objective function is exemplarily described below with reference to specific embodiments.
In some embodiments of the present application, step 17 — constructing the matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and of each candidate case — may be implemented through the following steps:
and 17.1, calculating word shift distance between the case to be handled and each candidate case, which is expressed by the key element.
Specifically, it can be expressed by the formulaCalculating to-be-handled cases and the +.>Candidate case->Word shift distance between representations with respect to key elements。
wherein ,representing the case to be handled->Indicate->Candidate case,/->,/>Representing the number of candidate classes, +.>Key element representation +.>The%>Key element of core->Representation->The number of key elements of the middle core,/-, and>representation->The +.o in the key element representation of (2)>Key element of core->Representation->The key elements of (2) represent the number of core key elements in the core. />Representation->And->The weight between the two can be calculated by adopting semantic similarity between word vectors, namely cosine similarity; />Representation->And->The semantic distance between the two can be calculated by Euclidean distance.
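A sketch of this simplified word shift distance, with cosine-similarity weights and Euclidean semantic distances over toy word vectors; note that a full word mover's distance would additionally solve an optimal-transport problem:

```python
import math

def cosine_sim(u, v):
    """Cosine similarity between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def euclidean(u, v):
    """Euclidean (semantic) distance between two word vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def word_shift_distance(elems_p, elems_q):
    """Simplified word shift distance over the core-key-element vectors of
    two cases: each cross-case pair is weighted by cosine similarity and
    penalized by Euclidean distance, following the description above."""
    return sum(cosine_sim(a, b) * euclidean(a, b)
               for a in elems_p for b in elems_q)

d_wmd = word_shift_distance([[1.0, 0.0]], [[0.6, 0.8]])
```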
And 17.2, calculating cosine distances between the to-be-handled case and each candidate case about the global semantic information representation.
Specifically, the cosine distance s_i between the case to be handled and the i-th candidate case with respect to the global semantic information representation can be calculated by the formula

s_i = <g_a, g_{b_i}> / (||g_a||_2 · ||g_{b_i}||_2)

wherein g_a denotes the global semantic information representation of the case to be handled, g_{b_i} denotes the global semantic information representation of the i-th candidate case, <·,·> denotes the inner product of the matrices, and ||·||_2 denotes the L2 norm.
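A minimal sketch of this step, with the inner product over the product of L2 norms exactly as defined above (the function name is ours):

```python
import math

def cosine_between(g_a, g_b):
    # inner product of the two global semantic representations,
    # divided by the product of their L2 norms
    dot = sum(x * y for x, y in zip(g_a, g_b))
    na = math.sqrt(sum(x * x for x in g_a))
    nb = math.sqrt(sum(y * y for y in g_b))
    return dot / (na * nb)
```

Orthogonal representations score 0; collinear ones score 1 regardless of magnitude.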
Step 17.3: determine the integration distance between the case to be handled and each candidate case according to the calculated word shift distance and cosine distance, and take the expression of the integration distance corresponding to each candidate case as the matching objective function corresponding to that candidate case.
In some embodiments of the present application, the integration distance D_i between the case to be handled and the i-th candidate case is calculated from the word shift distance and the cosine distance obtained above. Correspondingly, the matching objective function corresponding to the i-th candidate case is the expression of the integration distance D_i.
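The publication gives the integration-distance formula only as an image. One common way to combine a word shift distance and a cosine similarity into a single score is a convex mix, sketched below purely as an illustrative assumption; the weight `lam` and the exact combination are ours, not taken from the source.

```python
def integration_distance(d_word_shift, cos_sim, lam=0.5):
    """Hypothetical combination: convex mix of the word shift distance
    and (1 - cosine similarity). NOT the patent's exact formula -- the
    original is unrecoverable from the published text."""
    return lam * d_word_shift + (1.0 - lam) * (1.0 - cos_sim)
```

Under this assumption, a candidate with zero word shift distance and perfect cosine similarity gets integration distance 0, i.e. the best possible match.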
Iterative optimization of Top-K recommendation tasks is described in exemplary detail below in connection with specific embodiments.
In some embodiments of the present application, after the matching objective function corresponding to each candidate class is constructed, the Top-K recommendation task may be constructed by using the constructed matching objective function, and the Top-K recommendation task may be iteratively optimized to determine Top-K candidate classes.
In the process of iteratively optimizing the Top-K recommendation task, the optimization objective of the Top-K recommendation task is as follows:
wherein p_j denotes the proportion (specific gravity) of the j-th core key element of the case to be handled within the case to be handled, and q_{i,k} denotes the proportion of the k-th core key element of the i-th candidate case within that candidate case.
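Once a matching score exists for every candidate, the Top-K selection itself amounts to ranking candidates by integration distance and keeping the K closest. A minimal sketch (the iterative re-weighting by core-element proportions described above is omitted; the function name is ours):

```python
def top_k_candidates(distances, k):
    """Return the indices of the k candidate cases with the smallest
    integration distance (smaller distance = better match)."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return order[:k]
```

For instance, with distances `[0.9, 0.1, 0.5]` and k = 2, candidates 1 and 2 are pushed.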
The class case pushing system based on environment-friendly case elements provided in the present application is described below by way of example in connection with specific embodiments.
As shown in fig. 2, the class pushing system based on environment-friendly case elements provided in the embodiment of the present application includes a data processing module, a semantic enhancement module, a candidate class searching module, and a class matching and pushing module, which is specifically as follows:
the data processing module is used for carrying out data preprocessing on the basic case information of the case to be handled and the candidate case (namely the candidate similar case), serializing the preprocessed case information and vectorizing the text sentence.
Specifically, the data processing module in the embodiment of the present application comprises a preprocessing unit and a vectorization unit. The preprocessing unit is used for preprocessing operations on the case text information, such as removing stop words, filtering useless words, word segmentation, text segmentation, and serialization; the vectorization unit is used for inputting the text sequence into the embedding layer of an ERNIE model to obtain the vectorized representation of the text and its corresponding entity sequence.
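The preprocessing unit's steps can be sketched as follows. This is an illustrative stand-in only: the stop-word list and vocabulary are invented, whitespace splitting stands in for real Chinese word segmentation, and the integer-id mapping stands in for the ERNIE embedding lookup.

```python
STOP_WORDS = {"的", "了", "和"}  # illustrative stop-word list, not the patent's

def preprocess(text, vocab):
    """Filter stop words, segment into tokens (here: naive whitespace
    split as a stand-in for a real segmenter), and serialize to ids.
    `vocab` maps token -> integer id; unknown tokens map to 0."""
    tokens = [t for t in text.split() if t and t not in STOP_WORDS]
    return [vocab.get(t, 0) for t in tokens]
```

The resulting id sequence is what would then be fed to the embedding layer.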
The semantic enhancement module is used for extracting case key elements representing case fact information and carrying out information fusion of global semantic representation through a fully connected neural network.
Specifically, the semantic enhancement module in the embodiment of the application comprises an element extraction unit and an information fusion unit, wherein the element extraction unit is used for sequentially carrying out semantic learning on the vectorized case information through an encoder in an ERNIE model and a BiLSTM model so as to extract key elements in the case information; the information fusion unit is used for inputting the key elements and the entity sequences into the fully-connected neural network for information fusion.
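The attention step of the element extraction unit, i.e. pooling the BiLSTM output vectors into a single representation via softmax-normalized scores, can be sketched in pure Python. The scoring function here (tanh of the mean activation) is a simplified stand-in for the patent's learned weight matrix and bias:

```python
import math

def attention_pool(features):
    """Attention-style pooling over BiLSTM output vectors: score each
    feature, softmax the scores into weights, and return the weighted
    sum. The scoring rule is an illustrative simplification."""
    scores = [math.tanh(sum(f) / len(f)) for f in features]
    m = max(scores)                      # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features))
            for i in range(dim)]
```

When all inputs are identical, the weights are uniform and the pooled vector equals the input, which is a quick sanity check on the softmax normalization.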
And the candidate case retrieval module is used for acquiring a plurality of candidate cases with key elements similar to the case to be handled.
Specifically, the candidate case search module in the embodiment of the application includes a case search unit, where the case search unit is configured to search a plurality of candidate cases with similar case elements from a search case pool.
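Retrieving every pooled case that shares at least one core key element with the case to be handled is essentially a set-intersection lookup, sketched below (the data layout and names are assumptions, not the patent's):

```python
def retrieve_candidates(case_elements, case_pool):
    """Return ids of pooled cases sharing at least one core key element
    with the case to be handled. `case_pool` maps case id -> the set of
    that case's core key elements."""
    query = set(case_elements)
    return [cid for cid, elems in case_pool.items() if query & elems]
```

In practice an inverted index over key elements would serve the same purpose at scale.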
The class matching and pushing module is used for constructing a matching objective function according to the key elements and global semantic representations of the case to be handled and the candidate cases, constructing a Top-K recommendation task through the matching objective function, iteratively optimizing the Top-K recommendation task, and pushing similar cases to a judge according to actual requirements, assisting the judge in case analysis and judicial judgment.
Specifically, the class matching and pushing module in the embodiment of the present application comprises a matching calculation unit and a class pushing unit. The matching calculation unit is used for constructing the matching objective function, constructing the Top-K recommendation task through the objective function, and iteratively optimizing the Top-K recommendation task; the class pushing unit is used for pushing the Top-K similar cases to a judge according to actual requirements, assisting the judge in case analysis and judicial judgment.
As shown in fig. 3, an embodiment of the present application provides an electronic device. The device D10 of this embodiment includes: at least one processor D100 (only one processor is shown in the figure), a memory D101, and a computer program D102 stored in the memory D101 and executable on the at least one processor D100. When the processor D100 executes the computer program D102, the steps in any of the above-described method embodiments are implemented, so as to improve the matching accuracy of similar cases for environment-friendly cases.
The processor D100 may be a central processing unit (CPU, Central Processing Unit); the processor D100 may also be another general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), an application-specific integrated circuit (ASIC, Application Specific Integrated Circuit), a field-programmable gate array (FPGA, Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory D101 may in some embodiments be an internal storage unit of the device D10, for example a hard disk or internal memory of the device D10. In other embodiments, the memory D101 may also be an external storage device of the device D10, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the device D10. Further, the memory D101 may include both an internal storage unit and an external storage device of the device D10. The memory D101 is used for storing an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory D101 may also be used to temporarily store data that has been output or is to be output.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The embodiment of the application also provides a computer readable storage medium for intelligent pushing of the class, wherein a computer executable program is stored on the computer readable storage medium, and the computer executable program is used for realizing the class pushing method based on the environment-friendly case elements when being executed by a computer processor.
In summary, the embodiment of the present application performs text preprocessing and pre-training on the case to be handled, models forward and backward semantic dependency through a BiLSTM neural network, extracts key elements from the judicial case information, and retrieves similar cases using those case key elements, so that invalid potential similar cases can be effectively filtered out and the set of candidate cases is refined. The key elements and entity sequences of the case text are input into a fully connected neural network for information fusion to obtain a global semantic representation, which effectively improves the extraction and characterization of the case's key information. Meanwhile, the matching degree between the case to be handled and the candidate cases is considered comprehensively in terms of both key elements and global semantic representation, which effectively improves the accuracy of matching similar cases, and suitable similar cases are recommended to assist judges in case analysis and judicial judgment.
While the foregoing is directed to the preferred embodiments of the present application, it should be noted that modifications and adaptations to those embodiments may occur to one skilled in the art and that such modifications and adaptations are intended to be comprehended within the scope of the present application without departing from the principles set forth herein.
Claims (10)
1. A class case pushing method based on environment-friendly case elements, characterized by comprising the following steps:
converting case text information of a case to be handled into a text sequence, and acquiring vectorized representation of the text sequence and an entity sequence corresponding to the text sequence;
acquiring a characteristic representation of the text sequence according to the vectorization representation;
inputting the characteristic representation into a BiLSTM neural network to perform modeling of front-back semantic dependency relationship, and obtaining key characteristic representation of fused context semantic information; the key feature representation refers to a vectorized representation of key information in the text sequence;
extracting core key elements in the key feature representation by using an attention mechanism to obtain the key element representation of the to-be-handled case; the core key elements refer to vectorized representations of words which can represent the subject and the content of the to-be-handled case in the key feature representation;
inputting the key element representation and the entity sequence into a fully-connected neural network for information fusion to obtain the global semantic information representation of the to-be-handled case;
based on the key element representation, a plurality of candidate cases of the case to be handled are retrieved from a retrieval case pool, and the key element representation and the global semantic information representation of each candidate case are obtained;
constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case;
constructing a Top-K recommendation task through the constructed matching objective function, and iteratively optimizing the Top-K recommendation task to obtain Top-K candidate cases;
and taking the Top-K candidate cases as the pushing cases of the to-be-handled cases.
2. The case pushing method according to claim 1, wherein the key feature representation comprises key elements h_i, h_i denoting the i-th key element output by the BiLSTM neural network, i = 1, 2, …, n, n denoting the number of key elements output by the BiLSTM neural network, and s denoting the number of sentences in the text sequence;
the extracting the core key elements in the key feature representation by using an attention mechanism comprises the following steps:
by the formulaExtracting core key elements in the key feature representation to obtain a key element representation of the to-be-handled case>;
wherein ,,/>representation->The%>Key element of core->,/>Representing the number of key elements of the core,/->Representation->Attention weighting coefficient of->,/>Representation->By->The weight obtained by the activation function is used,,/>representation->Weight matrix of>Representation->Is a deviation of (2).
3. The case pushing method according to claim 2, wherein the inputting the key element representation and the entity sequence into a fully connected neural network for information fusion to obtain the global semantic information representation of the case to be handled includes:
by the formulaCalculating to obtain global semantic information representation of the to-be-handled case>;
4. The case pushing method according to claim 1, wherein the key element representation of the case to be handled includes a plurality of core key elements;
the retrieving, based on the key element representation, a plurality of candidate cases of the case to be handled from a retrieval case pool, including:
and taking the cases containing any core key element in the plurality of core key elements in the search case pool as candidate cases of the cases to be handled.
5. The case pushing method according to claim 4, wherein the constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case includes:
calculating word shift distance between the to-be-handled case and each candidate case, wherein the word shift distance is expressed by the key element;
calculating cosine distance between the to-be-handled case and each candidate case about the global semantic information representation;
and determining the integration distance between the to-be-handled case and each candidate case according to the calculated word shift distance and cosine distance, and taking an expression of the integration distance corresponding to each candidate case as a matching objective function corresponding to the candidate case.
6. The case pushing method according to claim 5, wherein the calculating a word shift distance between the case to be handled and each candidate case with respect to the key element representation includes:
by the formulaCalculating to-be-handled cases and the +.>Word shift distance +.between candidate classes with respect to key element representation>;
The calculating the cosine distance between the to-be-handled case and each candidate case about the global semantic information representation includes:
by the formulaCalculating to-be-handled cases and the +.>Cosine distance between candidate classes about global semantic information representation +.>;
wherein ,representing the to-be-handled case +.>Indicate->Candidate case,/->,/>Representing the number of candidate classes, +.>The key element representation of the to-be-handled case>The%>Key element of core->Representation->The number of key elements of the middle core,/-, and>representation->The +.o in the key element representation of (2)>Key element of core->Representation->The number of key elements of the core in the representation, < +.>Representation->And->Weights between->Representation->And->Semantic distance between->Global semantic information representation representing the to-be-handled case,>representation->Is represented by global semantic information->Representing the inner product of the matrix>Representing the L2 norm.
8. The case pushing method according to claim 7, wherein in the process of iteratively optimizing the Top-K recommendation task, an optimization objective of the Top-K recommendation task is:
9. The case pushing method according to claim 1, wherein the converting case text information of a case to be handled into a text sequence, and obtaining a vectorized representation of the text sequence and an entity sequence corresponding to the text sequence, includes:
filtering the stop words and useless words of the case text information of the case to be handled;
word segmentation processing is carried out on the filtered case text information;
carrying out text serialization processing on the case text information subjected to word segmentation processing to obtain a text sequence of the case text information;
and inputting the text sequence into an embedding layer in an ERNIE model to obtain the vectorized representation of the text sequence and an entity sequence corresponding to the text sequence.
10. The case pushing method according to claim 1, wherein the obtaining, from the vectorized representation, a feature representation of the text sequence includes:
and (3) inputting the vectorized representation into an encoder in an ERNIE model for pre-training to obtain the characteristic representation of the text sequence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310359002.9A CN116070624A (en) | 2023-04-06 | 2023-04-06 | Class case pushing method based on environment-friendly case elements |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116070624A true CN116070624A (en) | 2023-05-05 |
Family
ID=86170074
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116070624A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111027306A (en) * | 2019-12-23 | 2020-04-17 | 园宝科技(武汉)有限公司 | Intellectual property matching technology based on keyword extraction and word shifting distance |
CN111797247A (en) * | 2020-09-10 | 2020-10-20 | 平安国际智慧城市科技股份有限公司 | Case pushing method and device based on artificial intelligence, electronic equipment and medium |
CN112905793A (en) * | 2021-02-23 | 2021-06-04 | 山西同方知网数字出版技术有限公司 | Case recommendation method and system based on Bilstm + Attention text classification |
US11194972B1 (en) * | 2021-02-19 | 2021-12-07 | Institute Of Automation, Chinese Academy Of Sciences | Semantic sentiment analysis method fusing in-depth features and time sequence models |
CN114490946A (en) * | 2022-02-16 | 2022-05-13 | 中南大学 | Xlnet model-based class case retrieval method, system and equipment |
CN114547257A (en) * | 2022-04-25 | 2022-05-27 | 湖南工商大学 | Class matching method and device, computer equipment and storage medium |
CN114547237A (en) * | 2022-01-24 | 2022-05-27 | 河海大学 | French recommendation method fusing French keywords |
CN114610891A (en) * | 2022-05-12 | 2022-06-10 | 湖南工商大学 | Law recommendation method and system for unbalanced judicial official document data |
Non-Patent Citations (4)
Title |
---|
CHEN H, WU L, CHEN J, et al.: "A comparative study of automated legal text classification using random forests and deep learning", Information Processing & Management, 2022, vol. 59, no. 2, p. 102798 |
LIU L, AN D.: "Law Recommendation Based on Self-Attention Mechanism and Feature Fusion", Proceedings of the 5th International Conference on Software Engineering and Information Management (ICSIM 2022), pp. 106-112 |
NURANTI E Q, YULIANTI E, HUSIN H S.: "Predicting the Category and the Length of Punishment in Indonesian Courts Based on Previous Court Decision Documents", Computers, vol. 11, no. 6, p. 88 |
ZHAO Chengding; GUO Junjun; YU Zhengtao; HUANG Yuxin; LIU Quan; SONG Ran: "Correlation Analysis between News and Cases Based on an Asymmetric Siamese Network", Journal of Chinese Information Processing, no. 03 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |