CN116070624A - Class case pushing method based on environment-friendly case elements - Google Patents

Info

Publication number
CN116070624A
CN116070624A (Application CN202310359002.9A)
Authority
CN
China
Prior art keywords
case
representation
handled
candidate
key element
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310359002.9A
Other languages
Chinese (zh)
Inventor
陈晓红
陈姣龙
梁伟
胡东滨
刘朝明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202310359002.9A priority Critical patent/CN116070624A/en
Publication of CN116070624A publication Critical patent/CN116070624A/en
Pending legal-status Critical Current

Classifications

    • G06F40/279 Recognition of textual entities (G06F40/20 Natural language analysis; G06F40/00 Handling natural language data; G06F Electric digital data processing; G06 Computing; G Physics)
    • G06F40/30 Semantic analysis (G06F40/00 Handling natural language data)
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs (G06N3/04 Architecture, e.g. interconnection topology; G06N3/02 Neural networks; G06N3/00 Computing arrangements based on biological models; G06N Computing arrangements based on specific computational models)
    • G06N3/08 Learning methods (G06N3/02 Neural networks; G06N3/00 Computing arrangements based on biological models)
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management (Y02D Climate change mitigation technologies in information and communication technologies; Y02 Technologies or applications for mitigation or adaptation against climate change)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application is applicable to the technical field of judicial case pushing and provides a case pushing method based on environment-friendly case elements, which comprises the following steps: converting the case text information of a case to be handled into a text sequence, and obtaining a vectorized representation and an entity sequence of the text sequence; acquiring a feature representation of the text sequence; inputting the feature representation into a BiLSTM neural network to model the forward and backward semantic dependency relationships and obtain a key feature representation; obtaining the key element representation of the case to be handled; fusing the key element representation with the entity sequence to obtain a global semantic information representation of the case to be handled; determining candidate cases of the case to be handled from a retrieval case pool; constructing a matching objective function from the key element representations and global semantic information representations of the case to be handled and the candidate cases; and completing a Top-K recommendation task with the matching objective function to obtain the Top-K candidate cases of the case to be handled. The method can improve the matching precision of similar cases for ecological environmental protection cases.

Description

Class case pushing method based on environment-friendly case elements
Technical Field
The application belongs to the technical field of judicial case pushing, and particularly relates to a case pushing method based on environment-friendly case elements.
Background
In recent years, with the deep integration of judicial work and new information technologies, text mining techniques based on deep learning can capture key elements of texts from case documents, so that retrieval and screening of similar cases can be performed in a targeted manner. However, conventional case pushing methods and systems suffer from inaccurate mining of key element information and inaccurate similar-case pushing, which seriously affects their practicability in assisting judges in case analysis and judgment, and this weakness is even more evident when handling ecological environmental protection cases. Ecological environmental protection cases involve more complex case elements: the facts are generally complicated, the causal relationships among different factors are intertwined, and the interests of the parties overlap and interweave, which poses challenges for case analysis and similar-case pushing. Conventional case pushing methods and systems therefore cannot meet the judicial judgment requirements of current ecological environmental protection cases.
Disclosure of Invention
The embodiment of the application provides a class case pushing method based on environment-friendly case elements, which can solve the problem of low matching precision of the class case of the ecological environment-friendly case.
The embodiment of the application provides a class case pushing method based on environment-friendly case elements, which comprises the following steps:
converting case text information of a case to be handled into a text sequence, and acquiring vectorized representation of the text sequence and an entity sequence corresponding to the text sequence;
acquiring a characteristic representation of the text sequence according to the vectorization representation;
inputting the feature representation into a BiLSTM neural network to perform modeling of front-back semantic dependency relationship, and obtaining key feature representation integrating context semantic information; the key feature representation refers to a vectorized representation of key information in a text sequence;
extracting core key elements in the key feature representation by using an attention mechanism to obtain key element representations of the to-be-handled cases; the core key element refers to vectorized representation of words which can represent the subject and content of the case to be handled in the key feature representation;
inputting the key element representation and the entity sequence into a fully-connected neural network for information fusion to obtain the global semantic information representation of the case to be handled;
based on the key element representation, a plurality of candidate cases of the case to be handled are retrieved from a retrieval case pool, and the key element representation and the global semantic information representation of each candidate case are obtained;
constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case;
constructing a Top-K recommendation task through the constructed matching objective function, and iteratively optimizing the Top-K recommendation task to obtain Top-K candidate cases;
and taking the Top-K candidate cases as the pushing cases of the cases to be handled.
The scheme of the application has the following beneficial effects:
in the embodiment of the application, the BiLSTM neural network is used to mine the key features and semantic information in the case to be handled, and an attention mechanism is applied to the extraction of the core key elements, which improves the retrieval efficiency of candidate cases. Meanwhile, based on the key element representations and the global semantic information representations of the to-be-handled case and the candidate cases, a matching objective function between the to-be-handled case and each candidate case is constructed, so that the matching degree between them is calculated from multiple perspectives and the accuracy of the matching degree is improved. Further, the Top-K recommendation task is completed through the matching objective function, so that the obtained Top-K candidate cases match the to-be-handled case with high precision, thereby achieving the effect of improving the matching precision of similar cases.
Other advantages of the present application will be described in detail in the detailed description section that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the following description will briefly introduce the drawings that are needed in the embodiments or the description of the prior art, it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a class pushing method based on environment-friendly case elements according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a case pushing system based on environmental case elements according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to a determination" or "in response to detection". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Aiming at the problem of low matching precision of the current ecological environment-friendly case, the embodiment of the application provides a case pushing method based on environment-friendly case elements, which improves the searching efficiency of candidate cases by utilizing a BiLSTM neural network to mine key features and semantic information in the case to be handled and applying an attention mechanism to the extraction of core key elements; meanwhile, the aim of calculating the matching degree between the to-be-handled case and the candidate case from multiple perspectives is fulfilled by constructing a matching objective function between the to-be-handled case and the candidate case based on the key element representation and the global semantic information representation of the to-be-handled case and the candidate case, so that the accuracy of the matching degree between the to-be-handled case and the candidate case is improved, further, the Top-K recommendation task is completed through the matching objective function, the matching precision of the obtained Top-K candidate cases and the to-be-handled case is high, and the effect of improving the matching precision of the cases is achieved.
An exemplary description of the case pushing method based on the environment-friendly case element provided in the application is provided below with reference to specific embodiments.
As shown in fig. 1, the class case pushing method based on the environment-friendly case element provided in the embodiment of the application includes the following steps:
and 11, converting the case text information of the case to be handled into a text sequence, and acquiring vectorized representation of the text sequence and an entity sequence corresponding to the text sequence.
The vectorized representation refers to converting case text information into computer-processable digital vectors for subsequent processing and analysis.
For convenience of description, the case text information of the to-be-handled case may be recorded as $D$. In some embodiments of the present application, the specific process of obtaining the text sequence, the vectorized representation and the entity sequence may be:
step 11.1, filtering stop words and useless words in the case text information of the case to be handled;
step 11.2, performing word segmentation on the filtered case text information;
step 11.3, performing data processing such as text segmentation and text serialization on the word-segmented case text information to obtain the text sequence of the case text information $X=\{x_1, x_2, \ldots, x_n\}$, where $x_i$ denotes the $i$-th sentence of the text sequence, $1 \le i \le n$, and $n$ denotes the number of sentences in the text sequence;
step 11.4, inputting the text sequence into the embedding layer of the ERNIE model to obtain the vectorized representation of the text sequence $E=\{e_1, e_2, \ldots, e_n\}$ and the entity sequence corresponding to the text sequence $T=\{t_1, t_2, \ldots, t_m\}$, where the ERNIE model is a pre-trained language model, $e_i$ denotes the vectorized representation of $x_i$, $t_j$ denotes the $j$-th entity object in the text sequence $X$, $1 \le j \le m$, and $m$ denotes the number of entity objects in the text sequence $X$.
And step 12, acquiring the characteristic representation of the text sequence according to the vectorization representation.
In some embodiments of the present application, the feature representation of the text sequence $H=\{h_1, h_2, \ldots, h_n\}$ may be obtained by inputting the vectorized representation into the pre-trained encoder (e.g., the T-Encoder) of the ERNIE model. Specifically, $h_i$ is the case feature representation generated by the encoder for sentence $x_i$.
The feature representation refers to the vectorized representation of words or phrases in the case text information such as person names, place names, institution names, time expressions and emotion words; these words or phrases can be used to extract the topics and key points of the text information.
And 13, inputting the feature representation into a BiLSTM neural network to perform modeling of the front-back semantic dependency relationship, and obtaining the key feature representation of the fused context semantic information.
In the related art, the BiLSTM neural network is formed by combining a forward long short-term memory (LSTM) with a backward LSTM. The key feature representation refers to a vectorized representation of key information in a text sequence. That is, the key feature representation refers to a vectorized representation of extracting words or terms from the feature representation that contribute significantly to the text content that may be used to describe and understand the topic or emotional propensity of the text.
In some embodiments of the present application, the feature representation is input into the BiLSTM neural network for modeling of the contextual semantic dependency relationships, so as to obtain the key feature representation fusing the context semantic information $U=\{u_1, u_2, \ldots, u_k\}$, where $u_i$ denotes the $i$-th key element output by the BiLSTM neural network, $1 \le i \le k$, $k$ denotes the number of key elements output by the BiLSTM neural network, $k \le n$, and $n$ denotes the number of sentences in the text sequence.
$u_i$ is calculated by the following formulas:
$\overrightarrow{h}_i = \overrightarrow{\mathrm{LSTM}}(h_i, \overrightarrow{h}_{i-1})$
$\overleftarrow{h}_i = \overleftarrow{\mathrm{LSTM}}(h_i, \overleftarrow{h}_{i+1})$
$u_i = \overrightarrow{h}_i \oplus \overleftarrow{h}_i$
where $\overrightarrow{h}_i$ denotes the result output by the forward LSTM at the $i$-th moment, $\overleftarrow{h}_i$ denotes the result output by the backward LSTM at the $i$-th moment, and $\oplus$ denotes concatenation.
It is worth mentioning that the forward LSTM extracts key elements from the feature representation to obtain a forward feature representation, while the backward LSTM extracts key elements from the feature representation to obtain a backward feature representation; finally, the contextual semantic information of the two directions is concatenated to obtain the key feature representation fusing the context semantic information. This effectively alleviates the incompleteness of a one-way semantic representation and thereby improves the accuracy of extracting the case key elements.
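As a concrete illustration of this step, the sketch below is a minimal PyTorch implementation of the BiLSTM modeling, assuming sentence feature vectors have already been produced by the ERNIE encoder; the module name, dimensions and framework choice are assumptions of this illustration and are not specified by the patent.

```python
import torch
import torch.nn as nn

class KeyFeatureBiLSTM(nn.Module):
    """Minimal sketch: fuse forward/backward context over sentence features (hypothetical dims)."""

    def __init__(self, feat_dim: int = 768, hidden_dim: int = 256):
        super().__init__()
        # bidirectional LSTM: forward and backward passes are concatenated per time step
        self.bilstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, sentence_feats: torch.Tensor) -> torch.Tensor:
        # sentence_feats: (batch, n_sentences, feat_dim) from the ERNIE encoder
        key_feats, _ = self.bilstm(sentence_feats)
        # key_feats: (batch, n_sentences, 2 * hidden_dim), i.e. forward state ⊕ backward state
        return key_feats

# Usage sketch with random stand-in features
feats = torch.randn(1, 12, 768)        # 12 sentences, 768-dim features (assumed)
key_feats = KeyFeatureBiLSTM()(feats)  # (1, 12, 512) fused key feature representation
```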
The core key elements refer to vectorized representations of words in the key feature representation that can represent the subject and content of the to-be-handled case. That is, in some embodiments of the present application, the core key elements are obtained by extracting from the key feature representation the vectorized representations of the words that best represent the subject matter and content of the case information, which yields the key element representation. These words can help a judge or a computer quickly grasp the core topic and content of the text, and provide a retrieval basis for the subsequent retrieval of candidate cases.
And step 14, extracting core key elements in the key feature representation by using an attention mechanism to obtain the key element representation of the case to be handled.
In some embodiments of the present application, by applying Attention mechanism (Attention) to the extraction process of the core key element, the extraction effect of the BiLSTM neural network on the case key element is further improved, and the Attention on the non-case key element information is reduced, so that the accuracy of extracting the case key element is further improved.
In some embodiments of the present application, the core key elements in the key feature representation may be extracted by weighting each key element with its attention coefficient, $c_i = \alpha_i\, u_i$, so as to obtain the key element representation of the to-be-handled case $C=\{c_1, c_2, \ldots, c_q\}$,
where $c_j$ denotes the $j$-th core key element in $C$, $1 \le j \le q$, $q$ denotes the number of core key elements, $\alpha_i$ denotes the attention weighting coefficient of $u_i$, $\alpha_i = \exp(s_i) / \sum_{l=1}^{k}\exp(s_l)$, $s_i$ denotes the weight of $u_i$ obtained through the $\tanh$ activation function, $s_i = \tanh(W_a u_i + b_a)$, $W_a$ denotes the weight matrix of the attention layer, and $b_a$ denotes its bias.
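As a concrete illustration of the attention step, the sketch below scores each BiLSTM key element with a tanh-projected weight, normalizes the scores with softmax, and keeps the most heavily weighted elements as core key elements. The projection size, the number of retained elements and all names are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

class CoreElementAttention(nn.Module):
    """Sketch: softmax attention over key elements; retains the top-q weighted ones."""

    def __init__(self, key_dim: int = 512):
        super().__init__()
        self.score = nn.Linear(key_dim, 1)  # W_a, b_a (assumed single-score projection)

    def forward(self, key_feats: torch.Tensor, q: int = 5):
        # key_feats: (n_elements, key_dim) BiLSTM outputs for one case
        s = torch.tanh(self.score(key_feats)).squeeze(-1)  # s_i = tanh(W_a u_i + b_a)
        alpha = torch.softmax(s, dim=0)                     # attention coefficients alpha_i
        weighted = alpha.unsqueeze(-1) * key_feats          # c_i = alpha_i * u_i
        top_idx = torch.topk(alpha, k=min(q, alpha.numel())).indices
        return weighted[top_idx], alpha                     # core key elements and weights
```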
And step 15, inputting the key element representation and the entity sequence into a fully-connected neural network for information fusion to obtain the global semantic information representation of the case to be handled.
In some embodiments of the present application, the global semantic information representation of the to-be-handled case $g$ may be calculated by the formula $g = \sigma\big(W_f\,[C \oplus T] + b_f\big)$,
where $\sigma$ denotes the activation function of the fully connected neural network, $W_f$ denotes its weight matrix, $b_f$ denotes the bias vector, $\oplus$ denotes the stitching (concatenation) function, $C$ denotes the key element representation, and $T$ denotes the entity sequence.
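The fusion step can be sketched as a single fully connected layer over the concatenation of the two inputs. The dimensions, the mean-pooling of the element and entity sets into fixed-size vectors, and the choice of ReLU are assumptions made only for this illustration.

```python
import torch
import torch.nn as nn

class GlobalSemanticFusion(nn.Module):
    """Sketch: g = sigma(W_f [C ⊕ T] + b_f) with mean-pooled element/entity sets (assumed)."""

    def __init__(self, elem_dim: int = 512, ent_dim: int = 768, out_dim: int = 256):
        super().__init__()
        self.fc = nn.Linear(elem_dim + ent_dim, out_dim)

    def forward(self, core_elems: torch.Tensor, entity_seq: torch.Tensor) -> torch.Tensor:
        # core_elems: (q, elem_dim), entity_seq: (m, ent_dim) for one case
        c = core_elems.mean(dim=0)          # pool core key elements (assumption)
        t = entity_seq.mean(dim=0)          # pool entity embeddings (assumption)
        fused = torch.cat([c, t], dim=-1)   # C ⊕ T
        return torch.relu(self.fc(fused))   # sigma(W_f [C ⊕ T] + b_f)
```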
For example, assume that the case text information of the case to be handled is: acting on a report from the public, law enforcement officers conducted an on-site investigation of a waste grinding-disc processing factory and found that hazardous wastes generated in production, such as acid-containing wastewater and waste sludge, were discharged directly into nearby ditches without effective treatment, seriously affecting the surrounding environment. Through the above steps, key features such as law enforcement officers, waste grinding-disc processing factory, on-site investigation, production, acid-containing wastewater, waste sludge, hazardous waste, ditches and surrounding environment can be obtained from the case text information; the core key elements are, for example, acid-containing wastewater, waste sludge, hazardous waste, illegal dumping and surrounding environment pollution; and the global semantic information representation of the case text is: illegal dumping of hazardous wastes such as acid-containing wastewater and waste sludge, resulting in serious environmental pollution.
And step 16, searching a plurality of candidate cases of the case to be handled from the search case pool based on the key element representation, and acquiring the key element representation and the global semantic information representation of each candidate case.
In some embodiments of the present application, the key element representation of the to-be-handled case includes a plurality of core key elements. Accordingly, when retrieving candidate cases, any case in the retrieval case pool that contains at least one of these core key elements can be taken as a candidate case of the case to be handled. That is, for any case in the retrieval case pool, if the case contains any core key element $c_i$ of the key element representation $C$, the case is taken as a candidate case of the case to be handled. It should be noted that the retrieval case pool stores a large number of historical cases (such as ecological environmental protection cases from recent years).
It should be noted that, in some embodiments of the present application, after determining the candidate cases of the to-be-handled case, the key element representation and the global semantic information representation of each candidate case may be obtained according to the above-mentioned manners from step 11 to step 15.
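A minimal sketch of this retrieval filter is shown below; it treats each pooled case as a set of core key element identifiers and keeps every case that shares at least one element with the to-be-handled case. The data layout (string identifiers for elements) is an assumption for illustration only.

```python
from typing import Dict, List, Set

def retrieve_candidates(query_elements: Set[str],
                        case_pool: Dict[str, Set[str]]) -> List[str]:
    """Return ids of pool cases sharing at least one core key element with the query."""
    return [case_id for case_id, elems in case_pool.items() if elems & query_elements]

# Usage sketch with hypothetical element labels
pool = {
    "case_001": {"acid-containing wastewater", "illegal dumping"},
    "case_002": {"noise pollution"},
}
print(retrieve_candidates({"illegal dumping", "waste sludge"}, pool))  # ['case_001']
```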
And step 17, constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case.
In some embodiments of the present application, the degree of matching between the to-be-handled case and the candidate case may be calculated from the two different angles of the key element representation and the global semantic information representation, so that the accuracy of the degree of matching between the to-be-handled case and the candidate case is greatly improved.
And step 18, constructing a Top-K recommendation task through the constructed matching objective function, and iteratively optimizing the Top-K recommendation task to obtain Top-K candidate cases.
In some embodiments of the present application, after executing the Top-K recommendation task, K candidate cases can be selected from the plurality of candidate cases according to the order of the matching degree with the to-be-handled case from high to low.
And step 19, taking the Top-K candidate cases as the pushing cases of the cases to be handled.
In some embodiments of the present application, after determining the pushing cases, the corresponding cases may be pushed to the judges according to the actual requirements, to assist the judges in case analysis and judicial judgment. It should be noted that, K candidate cases with higher matching degree with the case to be handled can be screened out from a plurality of candidate cases by utilizing the Top-K recommendation task, so that matching precision of the case finally pushed to the user (such as a judge) and the case to be handled is high.
It is worth mentioning that, the above-mentioned class pushing method of the application utilizes BiLSTM neural network to excavate key features and semantic information in the case to be handled, and applies the attention mechanism to the extraction of the key elements of the core, has improved the search efficiency of the candidate class; meanwhile, based on key element representation and global semantic information representation of the to-be-handled case and the candidate case, the purpose of calculating the matching degree between the to-be-handled case and the candidate case from multiple perspectives is achieved by constructing a matching objective function between the to-be-handled case and the candidate case, so that the accuracy of the matching degree between the to-be-handled case and the candidate case is improved, further, top-K recommendation tasks are completed through the matching objective function, the matching precision of the obtained Top-K candidate cases and the to-be-handled case is high, and the effect of improving the matching precision of the cases is achieved.
The construction process of the matching objective function is exemplarily described below with reference to specific embodiments.
In some embodiments of the present application, the step 17, according to the key element representation and the global semantic information representation of the to-be-handled case, and the key element representation and the global semantic information representation of each candidate case, constructs a specific implementation manner of the matching objective function corresponding to each candidate case, including the following steps:
and 17.1, calculating word shift distance between the case to be handled and each candidate case, which is expressed by the key element.
Specifically, the word shift distance between the to-be-handled case and the $r$-th candidate case with respect to the key element representation, denoted $d^{\mathrm{wmd}}_r$, can be calculated by the formula
$d^{\mathrm{wmd}}_r = \sum_{i=1}^{q}\sum_{j=1}^{q_r} w_{ij}\, \delta(c_i, \tilde{c}_j)$,
where $P$ denotes the to-be-handled case, $P_r$ denotes the $r$-th candidate case, $1 \le r \le R$, $R$ denotes the number of candidate cases, $c_i$ denotes the $i$-th core key element in the key element representation $C$ of the to-be-handled case, $q$ denotes the number of core key elements in $C$, $\tilde{c}_j$ denotes the $j$-th core key element in the key element representation of $P_r$, $q_r$ denotes the number of core key elements in that representation, $w_{ij}$ denotes the weight between $c_i$ and $\tilde{c}_j$, which can be calculated using the semantic similarity between the word vectors, namely cosine similarity, and $\delta(c_i, \tilde{c}_j)$ denotes the semantic distance between $c_i$ and $\tilde{c}_j$, which can be calculated using the Euclidean distance.
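The computation described above can be sketched directly in NumPy. The block below pairs every core key element of the to-be-handled case with every core key element of one candidate case, weights each pair by cosine similarity and accumulates Euclidean distances; the array shapes and the absence of any flow constraints are simplifying assumptions of this sketch.

```python
import numpy as np

def word_shift_distance(query_elems: np.ndarray, cand_elems: np.ndarray) -> float:
    """Sketch of the key-element word shift distance.

    query_elems: (q, d) core key element vectors of the to-be-handled case
    cand_elems:  (q_r, d) core key element vectors of one candidate case
    """
    total = 0.0
    for c in query_elems:
        for c_tilde in cand_elems:
            cos_w = float(np.dot(c, c_tilde) /
                          (np.linalg.norm(c) * np.linalg.norm(c_tilde) + 1e-12))
            eucl = float(np.linalg.norm(c - c_tilde))
            total += cos_w * eucl  # weight (cosine similarity) x semantic distance
    return total
```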
And 17.2, calculating the cosine distance between the to-be-handled case and each candidate case with respect to the global semantic information representation.
Specifically, the cosine distance between the to-be-handled case and the $r$-th candidate case with respect to the global semantic information representation, denoted $d^{\mathrm{cos}}_r$, can be calculated by the formula
$d^{\mathrm{cos}}_r = 1 - \dfrac{\langle g, \tilde{g}_r\rangle}{\lVert g\rVert_2\, \lVert \tilde{g}_r\rVert_2}$,
where $g$ denotes the global semantic information representation of the to-be-handled case, $\tilde{g}_r$ denotes the global semantic information representation of $P_r$, $\langle\cdot,\cdot\rangle$ denotes the inner product, and $\lVert\cdot\rVert_2$ denotes the L2 norm.
And 17.3, determining the integration distance between the case to be handled and each candidate case according to the calculated word shift distance and cosine distance, and taking an expression of the integration distance corresponding to each candidate case as a matching objective function corresponding to the candidate case.
In some embodiments of the present application, the integration distance between the to-be-handled case and the $r$-th candidate case, denoted $d_r$, may be calculated as a weighted combination of the two distances, for example $d_r = \lambda\, d^{\mathrm{wmd}}_r + (1-\lambda)\, d^{\mathrm{cos}}_r$, where $\lambda \in [0, 1]$ is a weighting coefficient balancing the key element term and the global semantic term. Correspondingly, the matching objective function corresponding to the $r$-th candidate case is the expression of $d_r$.
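Continuing the sketch above, the global-semantic cosine distance and a simple weighted integration of the two distances could look as follows; the default value of the weighting coefficient is an assumption for illustration.

```python
import numpy as np

def cosine_distance(g_query: np.ndarray, g_cand: np.ndarray) -> float:
    """1 - cosine similarity of the two global semantic representations."""
    sim = float(np.dot(g_query, g_cand) /
                (np.linalg.norm(g_query) * np.linalg.norm(g_cand) + 1e-12))
    return 1.0 - sim

def integration_distance(d_wmd: float, d_cos: float, lam: float = 0.5) -> float:
    """Weighted combination of key-element and global-semantic distances (lam assumed)."""
    return lam * d_wmd + (1.0 - lam) * d_cos
```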
Iterative optimization of Top-K recommendation tasks is described in exemplary detail below in connection with specific embodiments.
In some embodiments of the present application, after the matching objective function corresponding to each candidate class is constructed, the Top-K recommendation task may be constructed by using the constructed matching objective function, and the Top-K recommendation task may be iteratively optimized to determine Top-K candidate classes.
In the process of iteratively optimizing the Top-K recommendation task, the optimization objective of the Top-K recommendation task is as follows:
$\min_{f_{ij} \ge 0} \sum_{i=1}^{q}\sum_{j=1}^{q_r} f_{ij}\,\delta(c_i, \tilde{c}_j), \quad \text{s.t.}\ \sum_{j=1}^{q_r} f_{ij} = p_i,\ \ \sum_{i=1}^{q} f_{ij} = \tilde{p}_j,$
where $p_i$ denotes the specific gravity (proportion) of the core key element $c_i$ of the to-be-handled case within the to-be-handled case, and $\tilde{p}_j$ denotes the specific gravity of the core key element $\tilde{c}_j$ of the $r$-th candidate case within the $r$-th candidate case. After the iterative optimization, the K candidate cases with the smallest integration distances are taken as the Top-K candidate cases.
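Putting the pieces together, the Top-K step can be sketched as scoring every retrieved candidate with the integration distance and keeping the K lowest-scoring ones; the helper names reuse the earlier sketches and are assumptions of this illustration, not APIs defined by the patent.

```python
from typing import Dict, List, Tuple

def top_k_candidates(distances: Dict[str, float], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k candidate case ids with the smallest integration distance."""
    ranked = sorted(distances.items(), key=lambda item: item[1])
    return ranked[:k]

# Usage sketch: integration distances computed per candidate beforehand
scores = {"case_001": 0.37, "case_002": 1.42, "case_003": 0.88}
print(top_k_candidates(scores, k=2))  # [('case_001', 0.37), ('case_003', 0.88)]
```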
An exemplary description of the environment-friendly case element-based case pushing system provided in the present application is provided below in connection with specific embodiments.
As shown in fig. 2, the class pushing system based on environment-friendly case elements provided in the embodiment of the present application includes a data processing module, a semantic enhancement module, a candidate class searching module, and a class matching and pushing module, which is specifically as follows:
the data processing module is used for carrying out data preprocessing on the basic case information of the case to be handled and the candidate case (namely the candidate similar case), serializing the preprocessed case information and vectorizing the text sentence.
Specifically, the data processing module in the embodiment of the application comprises a preprocessing unit and a vectorization unit, wherein the preprocessing unit is used for preprocessing operations such as stopping words, filtering useless words, text word segmentation, text segmentation, serialization and the like on case text information; the vectorization unit is used for inputting the text sequence into an embedded layer in the ERNIE model so as to obtain vectorization representation of the text and a corresponding entity sequence thereof.
The semantic enhancement module is used for extracting case key elements representing case fact information and carrying out information fusion of global semantic representation through a fully connected neural network.
Specifically, the semantic enhancement module in the embodiment of the application comprises an element extraction unit and an information fusion unit, wherein the element extraction unit is used for sequentially carrying out semantic learning on the vectorized case information through an encoder in an ERNIE model and a BiLSTM model so as to extract key elements in the case information; the information fusion unit is used for inputting the key elements and the entity sequences into the fully-connected neural network for information fusion.
And the candidate case retrieval module is used for acquiring a plurality of candidate cases with key elements similar to the case to be handled.
Specifically, the candidate case search module in the embodiment of the application includes a case search unit, where the case search unit is configured to search a plurality of candidate cases with similar case elements from a search case pool.
The class matching and pushing module is used for constructing a matching objective function according to key elements and global semantic representations of the to-be-handled cases and the candidate classes, constructing a Top-K recommendation task through the matching objective function, iteratively optimizing the Top-K recommendation task, pushing the to-be-handled cases to a judge according to actual requirements, and assisting the judge to analyze the cases and judge judicially.
Specifically, the class matching and pushing module in the embodiment of the application comprises a matching calculation unit and a class pushing unit, wherein the matching calculation unit is used for matching construction of an objective function, construction of a Top-K recommendation task is carried out through the objective function, the Top-K recommendation task is optimized in an iteration mode, and the class pushing unit is used for pushing front Top-K similar cases to a judge according to actual requirements and assisting the judge in carrying out case analysis and judicial judgment.
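By way of example, the cooperation of these four modules can be expressed as a thin wiring sketch in Python; the class name, method names and argument layout are illustrative assumptions, and each callable would delegate to the components sketched earlier in the description.

```python
class CasePushingPipeline:
    """Illustrative wiring of the four modules described above (names assumed)."""

    def __init__(self, data_processing, semantic_enhancement, candidate_retrieval, matcher):
        self.data_processing = data_processing            # preprocessing + vectorization
        self.semantic_enhancement = semantic_enhancement  # element extraction + fusion
        self.candidate_retrieval = candidate_retrieval    # key-element based search
        self.matcher = matcher                            # objective construction + Top-K

    def push(self, case_text: str, k: int = 5):
        seq, vectors, entities = self.data_processing(case_text)
        core_elems, global_repr = self.semantic_enhancement(vectors, entities)
        candidates = self.candidate_retrieval(core_elems)
        return self.matcher(core_elems, global_repr, candidates, k=k)
```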
As shown in fig. 3, an embodiment of the present application provides an electronic device, where the sending end device D10 of the embodiment includes: at least one processor D100 (only one processor is shown in the figure), a memory D101, and a computer program D102 stored in the memory D101 and executable on the at least one processor D100, wherein the steps in any of the above-described method embodiments are implemented by the processor D100 when the computer program D102 is executed, so as to improve the matching accuracy of the class of the eco-friendly class case.
The processor D100 may be a central processing unit (CPU); the processor D100 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The storage D101 may in some embodiments be an internal storage unit of the sender device D10, for example, a hard disk or a memory of the sender device D10. The memory D101 may also be an external storage device of the sender device D10 in other embodiments, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the sender device D10. Further, the memory D101 may include both an internal storage unit and an external storage device of the transmitting device D10. The memory D101 is used for storing an operating system, an application program, a boot loader (BootLoader), data, other programs, etc., such as program codes of the computer program. The memory D101 may also be used to temporarily store data that has been output or is to be output.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The embodiment of the application also provides a computer readable storage medium for intelligent pushing of the class, wherein a computer executable program is stored on the computer readable storage medium, and the computer executable program is used for realizing the class pushing method based on the environment-friendly case elements when being executed by a computer processor.
In summary, the embodiment of the application performs text preprocessing and pre-training on the to-be-transacted cases, models the front-back semantic dependency relationship through the BiLSTM neural network, extracts key elements in judicial case information, and searches similar cases by using the case key elements, so that invalid potential similar cases can be effectively filtered, and the composition of candidate cases is further refined; the key elements and the entity sequences of the case text are input into the fully-connected neural network for information fusion so as to obtain global semantic representation, and the extraction and characterization of the case key information are effectively improved. Meanwhile, the matching degree between the to-be-handled cases and the candidate cases in the aspects of key elements and global semantic representation is comprehensively considered, the accuracy of matching similar cases is effectively improved, and proper similar case auxiliary judges are recommended to conduct case analysis and judicial judgment.
While the foregoing is directed to the preferred embodiments of the present application, it should be noted that modifications and adaptations to those embodiments may occur to one skilled in the art and that such modifications and adaptations are intended to be comprehended within the scope of the present application without departing from the principles set forth herein.

Claims (10)

1. The class case pushing method based on the environment-friendly case elements is characterized by comprising the following steps of:
converting case text information of a case to be handled into a text sequence, and acquiring vectorized representation of the text sequence and an entity sequence corresponding to the text sequence;
acquiring a characteristic representation of the text sequence according to the vectorization representation;
inputting the characteristic representation into a BiLSTM neural network to perform modeling of front-back semantic dependency relationship, and obtaining key characteristic representation of fused context semantic information; the key feature representation refers to a vectorized representation of key information in the text sequence;
extracting core key elements in the key feature representation by using an attention mechanism to obtain the key element representation of the to-be-handled case; the core key elements refer to vectorized representations of words which can represent the subject and the content of the to-be-handled case in the key feature representation;
inputting the key element representation and the entity sequence into a fully-connected neural network for information fusion to obtain the global semantic information representation of the to-be-handled case;
based on the key element representation, a plurality of candidate cases of the case to be handled are retrieved from a retrieval case pool, and the key element representation and the global semantic information representation of each candidate case are obtained;
constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case;
constructing a Top-K recommendation task through the constructed matching objective function, and iteratively optimizing the Top-K recommendation task to obtain Top-K candidate cases;
and taking the Top-K candidate cases as the pushing cases of the to-be-handled cases.
2. The case pushing method according to claim 1, wherein the key feature representation is $U=\{u_1, u_2, \ldots, u_k\}$, where $u_i$ denotes the $i$-th key element output by the BiLSTM neural network, $1 \le i \le k$, $k$ denotes the number of key elements output by the BiLSTM neural network, $k \le n$, and $n$ denotes the number of sentences in the text sequence;
the extracting the core key elements in the key feature representation by using an attention mechanism comprises the following steps:
extracting the core key elements in the key feature representation by the formula $c_i = \alpha_i\, u_i$ to obtain the key element representation of the to-be-handled case $C=\{c_1, c_2, \ldots, c_q\}$,
wherein $c_j$ denotes the $j$-th core key element in $C$, $1 \le j \le q$, $q$ denotes the number of core key elements, $\alpha_i$ denotes the attention weighting coefficient of $u_i$, $\alpha_i = \exp(s_i) / \sum_{l=1}^{k}\exp(s_l)$, $s_i$ denotes the weight of $u_i$ obtained through the $\tanh$ activation function, $s_i = \tanh(W_a u_i + b_a)$, $W_a$ denotes the weight matrix of the attention layer, and $b_a$ denotes its bias.
3. The case pushing method according to claim 2, wherein the inputting the key element representation and the entity sequence into a fully connected neural network for information fusion to obtain the global semantic information representation of the case to be handled includes:
by the formula
Figure QLYQS_27
Calculating to obtain global semantic information representation of the to-be-handled case>
Figure QLYQS_28
wherein ,
Figure QLYQS_29
representing the activation function of a fully connected neural network, +.>
Figure QLYQS_30
Representing a weight matrix, +.>
Figure QLYQS_31
Representing the bias vector +_>
Figure QLYQS_32
Representing a stitching function->
Figure QLYQS_33
Representing the sequence of entities.
4. The case pushing method according to claim 1, wherein the key element representation of the case to be handled includes a plurality of core key elements;
the retrieving, based on the key element representation, a plurality of candidate cases of the case to be handled from a retrieval case pool, including:
and taking the cases containing any core key element in the plurality of core key elements in the search case pool as candidate cases of the cases to be handled.
5. The case pushing method according to claim 4, wherein the constructing a matching objective function corresponding to each candidate case according to the key element representation and the global semantic information representation of the case to be handled and the key element representation and the global semantic information representation of each candidate case includes:
calculating word shift distance between the to-be-handled case and each candidate case, wherein the word shift distance is expressed by the key element;
calculating cosine distance between the to-be-handled case and each candidate case about the global semantic information representation;
and determining the integration distance between the to-be-handled case and each candidate case according to the calculated word shift distance and cosine distance, and taking an expression of the integration distance corresponding to each candidate case as a matching objective function corresponding to the candidate case.
6. The case pushing method according to claim 5, wherein the calculating a word shift distance between the case to be handled and each candidate case with respect to the key element representation includes:
by the formula
Figure QLYQS_34
Calculating to-be-handled cases and the +.>
Figure QLYQS_35
Word shift distance +.between candidate classes with respect to key element representation>
Figure QLYQS_36
The calculating the cosine distance between the to-be-handled case and each candidate case about the global semantic information representation includes:
by the formula
Figure QLYQS_37
Calculating to-be-handled cases and the +.>
Figure QLYQS_38
Cosine distance between candidate classes about global semantic information representation +.>
Figure QLYQS_39
wherein ,
Figure QLYQS_42
representing the to-be-handled case +.>
Figure QLYQS_51
Indicate->
Figure QLYQS_58
Candidate case,/->
Figure QLYQS_45
,/>
Figure QLYQS_50
Representing the number of candidate classes, +.>
Figure QLYQS_57
The key element representation of the to-be-handled case>
Figure QLYQS_64
The%>
Figure QLYQS_44
Key element of core->
Figure QLYQS_47
Representation->
Figure QLYQS_54
The number of key elements of the middle core,/-, and>
Figure QLYQS_61
representation->
Figure QLYQS_41
The +.o in the key element representation of (2)>
Figure QLYQS_48
Key element of core->
Figure QLYQS_56
Representation->
Figure QLYQS_63
The number of key elements of the core in the representation, < +.>
Figure QLYQS_46
Representation->
Figure QLYQS_53
And->
Figure QLYQS_59
Weights between->
Figure QLYQS_62
Representation->
Figure QLYQS_40
And->
Figure QLYQS_52
Semantic distance between->
Figure QLYQS_60
Global semantic information representation representing the to-be-handled case,>
Figure QLYQS_65
representation->
Figure QLYQS_43
Is represented by global semantic information->
Figure QLYQS_49
Representing the inner product of the matrix>
Figure QLYQS_55
Representing the L2 norm.
7. The case pushing method according to claim 6, wherein the matching objective function corresponding to the $r$-th candidate case is:
$d_r = \lambda\, d^{\mathrm{wmd}}_r + (1-\lambda)\, d^{\mathrm{cos}}_r$,
wherein $d_r$ denotes the integration distance between the to-be-handled case and the $r$-th candidate case, and $\lambda$ denotes a weighting coefficient.
8. The case pushing method according to claim 7, wherein in the process of iteratively optimizing the Top-K recommendation task, an optimization objective of the Top-K recommendation task is:
$\min_{f_{ij} \ge 0} \sum_{i=1}^{q}\sum_{j=1}^{q_r} f_{ij}\,\delta(c_i, \tilde{c}_j), \quad \text{s.t.}\ \sum_{j=1}^{q_r} f_{ij} = p_i,\ \ \sum_{i=1}^{q} f_{ij} = \tilde{p}_j,$
wherein $p_i$ denotes the specific gravity of the core key element $c_i$ of the to-be-handled case within the to-be-handled case, and $\tilde{p}_j$ denotes the specific gravity of the core key element $\tilde{c}_j$ of the $r$-th candidate case within the $r$-th candidate case.
9. The case pushing method according to claim 1, wherein the converting case text information of a case to be handled into a text sequence, and obtaining a vectorized representation of the text sequence and an entity sequence corresponding to the text sequence, includes:
filtering the stop words and useless words of the case text information of the case to be handled;
word segmentation processing is carried out on the filtered case text information;
carrying out text serialization processing on the case text information subjected to word segmentation processing to obtain a text sequence of the case text information;
and inputting the text sequence into an embedding layer in an ERNIE model to obtain the vectorized representation of the text sequence and an entity sequence corresponding to the text sequence.
10. The case pushing method according to claim 1, wherein the obtaining, from the vectorized representation, a feature representation of the text sequence includes:
and (3) inputting the vectorized representation into an encoder in an ERNIE model for pre-training to obtain the characteristic representation of the text sequence.
CN202310359002.9A 2023-04-06 2023-04-06 Class case pushing method based on environment-friendly case elements Pending CN116070624A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310359002.9A CN116070624A (en) 2023-04-06 2023-04-06 Class case pushing method based on environment-friendly case elements

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310359002.9A CN116070624A (en) 2023-04-06 2023-04-06 Class case pushing method based on environment-friendly case elements

Publications (1)

Publication Number Publication Date
CN116070624A 2023-05-05

Family

ID=86170074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310359002.9A Pending CN116070624A (en) 2023-04-06 2023-04-06 Class case pushing method based on environment-friendly case elements

Country Status (1)

Country Link
CN (1) CN116070624A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027306A (en) * 2019-12-23 2020-04-17 园宝科技(武汉)有限公司 Intellectual property matching technology based on keyword extraction and word shifting distance
CN111797247A (en) * 2020-09-10 2020-10-20 平安国际智慧城市科技股份有限公司 Case pushing method and device based on artificial intelligence, electronic equipment and medium
US11194972B1 (en) * 2021-02-19 2021-12-07 Institute Of Automation, Chinese Academy Of Sciences Semantic sentiment analysis method fusing in-depth features and time sequence models
CN112905793A (en) * 2021-02-23 2021-06-04 山西同方知网数字出版技术有限公司 Case recommendation method and system based on Bilstm + Attention text classification
CN114547237A (en) * 2022-01-24 2022-05-27 河海大学 French recommendation method fusing French keywords
CN114490946A (en) * 2022-02-16 2022-05-13 中南大学 Xlnet model-based class case retrieval method, system and equipment
CN114547257A (en) * 2022-04-25 2022-05-27 湖南工商大学 Class matching method and device, computer equipment and storage medium
CN114610891A (en) * 2022-05-12 2022-06-10 湖南工商大学 Law recommendation method and system for unbalanced judicial official document data

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHEN H, WU L, CHEN J, et al.: "A comparative study of automated legal text classification using random forests and deep learning", Information Processing & Management, 2022, vol. 59, no. 2, p. 102798 *
LIU L, AN D.: "Law Recommendation Based on Self-Attention Mechanism and Feature Fusion", Proceedings of the 5th International Conference on Software Engineering and Information Management (ICSIM), 2022, pp. 106-112 *
NURANTI E Q, YULIANTI E, HUSIN H S.: "Predicting the Category and the Length of Punishment in Indonesian Courts Based on Previous Court Decision Documents", Computers, vol. 11, no. 6, p. 88 *
赵承鼎, 郭军军, 余正涛, 黄于欣, 刘权, 宋燃: "Correlation analysis between news and cases based on an asymmetric Siamese network" (基于非对称孪生网络的新闻与案件相关性分析), Journal of Chinese Information Processing (中文信息学报), no. 03 *

Similar Documents

Publication Publication Date Title
Tay et al. Compare, compress and propagate: Enhancing neural architectures with alignment factorization for natural language inference
Rudolph et al. Dynamic embeddings for language evolution
Augenstein et al. Stance detection with bidirectional conditional encoding
US10289952B2 (en) Semantic frame identification with distributed word representations
Yu et al. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach
Sharma et al. Self-supervised contextual keyword and keyphrase retrieval with self-labelling
Chen et al. Discriminative soft bag-of-visual phrase for mobile landmark recognition
CN101542531A (en) Image recognizing apparatus and image recognizing method
CN104572958A (en) Event extraction based sensitive information monitoring method
Zhou et al. Recognizing software bug-specific named entity in software bug repository
Xu et al. Post2vec: Learning distributed representations of Stack Overflow posts
Li et al. Automatic identification of decisions from the hibernate developer mailing list
Wang et al. DM_NLP at semeval-2018 task 12: A pipeline system for toponym resolution
CN110826323B (en) Comment information validity detection method and comment information validity detection device
Mi et al. Knowledge-aware cross-modal text-image retrieval for remote sensing images
CN117668292A (en) Cross-modal sensitive information identification method
Nair et al. Fake News Detection Model for Regional Language
CN116070624A (en) Class case pushing method based on environment-friendly case elements
CN116186241A (en) Event element extraction method and device based on semantic analysis and prompt learning, electronic equipment and storage medium
CN113836297B (en) Training method and device for text emotion analysis model
Tang et al. Interpretability rules: Jointly bootstrapping a neural relation extractorwith an explanation decoder
CN112837148B (en) Risk logic relationship quantitative analysis method integrating domain knowledge
CN115455155B (en) Method for extracting subject information of government affair text and storage medium
US20240062570A1 (en) Detecting unicode injection in text
Lee et al. Extracting fallen objects on the road from accident reports using a natural language processing model-based approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination