CN111639153A - Query method and device based on legal knowledge graph, electronic equipment and medium - Google Patents
- Publication number
- CN111639153A CN111639153A CN202010334998.4A CN202010334998A CN111639153A CN 111639153 A CN111639153 A CN 111639153A CN 202010334998 A CN202010334998 A CN 202010334998A CN 111639153 A CN111639153 A CN 111639153A
- Authority
- CN
- China
- Prior art keywords
- query
- matrix
- encoder
- channel
- output data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/338—Presentation of query results
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of data processing, and provides a query method based on a legal knowledge graph. The method links to a weight vector of a query statement from an element list of the legal knowledge graph based on an attention mechanism, thereby distinguishing the contribution rate of each word; it calculates a feature matrix based on the weights and inputs the feature matrix into an encoder to obtain the output data of the encoder, the encoder comprising two BiGRU networks; and it processes the output data of the encoder with a decoder to obtain a machine query language, the decoder comprising four BiGRU networks. Because the structures of the encoder and the decoder are each optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in a database, and the query result is output. Because the machine query language obtained through this data processing is more accurate, the output query result is more accurate and reliable; automatic conversion and querying of the query statement are realized, and query efficiency is improved.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a query method and device based on a legal knowledge graph, electronic equipment and a medium.
Background
Natural language generation is a very important research area in the artificial intelligence industry: it is an innate capability for humans, and it represents one of the highest levels of progress for artificial intelligence. Research on natural language generation can help users find the answers they need from a database faster, more accurately, and at lower cost.
In the legal field, some terms are highly similar and not easy to distinguish, and a query result often has to be obtained through multiple steps. Each step may introduce errors into the query result due to unclear transmission of intent, deviations in understanding, and the like, so accuracy is low.
Meanwhile, existing query methods have weak generalization capability and cannot cope with new, complicated, and variable problems; when facing new problems, the model needs to be retrained, which is costly.
Disclosure of Invention
In view of the above, it is necessary to provide a query method and device based on a legal knowledge graph, an electronic device, and a medium, which can distinguish the contribution rate of each word based on an attention mechanism and optimize the structures of an encoder and a decoder respectively, so that the conversion of query statements is more accurate and stable.
A legal knowledge graph-based query method, the method comprising:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
link to a weight vector of the query statement from a list of elements of a legal knowledge graph based on an attention mechanism;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of an SQL statement from a second preset dictionary, and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and executing the machine query language in a database, and outputting a query result.
According to a preferred embodiment of the present invention, each BiGRU network in the encoder comprises a plurality of subunits, the method further comprising:
for each subunit, at the initial moment, acquiring a pre-configured initialization value and an initial feature matrix, inputting the initialization value and the initial feature matrix into the subunit, and outputting an initial state; or
And acquiring the output state of the previous moment and the current feature matrix at other moments except the initial moment, inputting the output state of the previous moment and the current feature matrix into the subunit, and outputting the current state.
According to a preferred embodiment of the invention, the method further comprises:
taking the output of the plurality of subunits after being serialized as the output state of each BiGRU network;
vector splicing is carried out on the output state of each BiGRU network to be used as output data of the encoder;
and uploading the output data of the encoder to a blockchain.
According to the preferred embodiment of the present invention, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing of the output data of the encoder by the decoder to obtain the machine query language includes:
predicting the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel;
determining the word with the maximum probability in the channel to which each SQL word belongs as the participle corresponding to each SQL word based on the attention mechanism;
and combining the participles corresponding to each SQL word to obtain the machine query language.
According to a preferred embodiment of the present invention, the predicting, by using the category prediction channel, a channel to which each SQL word belongs in the output data of the encoder includes:
for each SQL word in the output data of the encoder, obtaining the probability value output by the word in the SQL channel, the probability value output in the element list channel, and the probability value output in the numerical channel;
and determining the channel with the maximum probability value as the channel of the next SQL word.
According to a preferred embodiment of the invention, the method further comprises:
and when the SQL word in the output data of the encoder is a stop symbol, controlling the category prediction channel to stop prediction.
According to a preferred embodiment of the present invention, the encoder and the decoder are constructed into a language translation model according to an attention mechanism and a cross-entropy function, and the method further comprises:
calculating a first loss of the category prediction channel, and calculating a second loss based on the weight vector of the query statement linked by the attention mechanism;
calculating a sum of the first loss and the second loss as a loss function of the language translation model;
and optimizing the loss function by using a preconfigured optimization algorithm.
A legal knowledge graph-based querying device, the device comprising:
the query unit is used for querying a matrix of the query statement from a first preset dictionary when the query statement is received;
a linking unit for linking to a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism;
the calculation unit is used for calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
the query unit is further used for querying a second matrix of the SQL statement from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
the splicing unit is used for splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
the input unit is used for inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
the processing unit is used for processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and the execution unit is used for executing the machine query language in the database and outputting a query result.
According to the preferred embodiment of the present invention, each BiGRU network in the encoder includes a plurality of subunits, and the input unit is further configured to, for each subunit, at an initial time, acquire a preconfigured initialization value and acquire an initial feature matrix, input the initialization value and the initial feature matrix into the subunit, and output an initial state; or
The input unit is further configured to obtain an output state at a previous time and a current feature matrix at other times except the initial time, input the output state at the previous time and the current feature matrix into the subunit, and output a current state.
According to a preferred embodiment of the present invention, the apparatus further comprises:
the determining unit is used for taking the output of the plurality of subunits after being serialized as the output state of each BiGRU network;
and the splicing unit is also used for carrying out vector splicing on the output state of each BiGRU network to be used as the output data of the encoder.
According to the preferred embodiment of the present invention, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing unit processes the output data of the encoder by using a decoder to obtain the machine query language includes:
predicting the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel;
determining the word with the maximum probability in the channel to which each SQL word belongs as the participle corresponding to each SQL word based on the attention mechanism;
and combining the participles corresponding to each SQL word to obtain the machine query language.
According to a preferred embodiment of the present invention, the processing unit predicting, by using the category prediction channel, a channel to which each SQL word belongs in the output data of the encoder includes:
for each SQL word in the output data of the encoder, obtaining the probability value output by the word in the SQL channel, the probability value output in the element list channel, and the probability value output in the numerical channel;
and determining the channel with the maximum probability value as the channel of the next SQL word.
According to a preferred embodiment of the invention, the apparatus further comprises:
and the control unit is used for controlling the category prediction channel to stop prediction when the SQL words in the output data of the encoder are stop symbols.
According to a preferred embodiment of the present invention, the encoder and the decoder are constructed into a language translation model according to an attention mechanism and a cross-entropy function, and the calculation unit is further configured to calculate a first loss of the category prediction channel and a second loss based on the weight vector of the query statement linked by the attention mechanism;
the calculation unit is further configured to calculate a sum of the first loss and the second loss as a loss function of the language translation model;
and the optimization unit is used for optimizing the loss function by using a preconfigured optimization algorithm.
An electronic device, the electronic device comprising:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the legal knowledge graph-based query method.
A computer-readable storage medium having at least one instruction stored therein, the at least one instruction being executable by a processor in an electronic device to implement the legal knowledge graph-based query method.
From the above technical solutions, when a query statement is received, the present invention can query a matrix of the query statement from a first preset dictionary and link to a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism; the introduction of the attention mechanism distinguishes the contribution rate of each word. The product of the matrix of the query statement and the weight vector is then calculated to obtain a first matrix corresponding to the query statement; a second matrix of an SQL statement is queried from a second preset dictionary, and a third matrix of the element list is queried from a third preset dictionary. The first matrix, the second matrix, and the third matrix are spliced to obtain a feature matrix, which is input into an encoder to obtain output data of the encoder, the encoder comprising two BiGRU networks. The output data of the encoder is processed by a decoder to obtain a machine query language, the decoder comprising four BiGRU networks. Because the structures of the encoder and the decoder are each optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in a database, and a query result is output.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the legal knowledge graph-based query method of the present invention.
FIG. 2 is a functional block diagram of a preferred embodiment of the legal knowledge graph-based query device of the present invention.
FIG. 3 is a schematic structural diagram of an electronic device implementing the legal knowledge graph-based query method according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a preferred embodiment of the legal knowledge graph-based query method according to the present invention. The order of the steps in the flow chart may be changed, and some steps may be omitted, according to different needs.
The legal knowledge graph-based query method is applied to one or more electronic devices, which are devices capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; their hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud computing (cloud computing) based cloud consisting of a large number of hosts or network servers.
The Network where the electronic device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
S10, when a query sentence is received, querying a matrix of the query sentence from the first preset dictionary.
Wherein the query statement may be a legally relevant query statement, such as: "agent name of the prover" and the like.
The first preset dictionary can be configured in a user-defined mode, and the first preset dictionary comprises all words related to the query sentence.
Therefore, the electronic device may directly perform a query in the first preset dictionary and determine the matrix of the query statement.
S11, linking to the weight vector of the query statement from the list of elements of the legal knowledge graph based on Attention mechanism (Attention).
For example: suppose the vector matrix of the query statement is V1 (dm dimensions); if the query statement is a sentence of 10 words and dm = 256, V1 is a 10 × 256 matrix. The vector matrix of the element list names is V2 (dm dimensions); if the element list contains 100 keywords and dm = 256, V2 is a 100 × 256 matrix. The two matrices are multiplied, namely V1 × V2ᵀ: the 10 × 256 matrix is multiplied by the 256 × 100 matrix to obtain a 10 × 100 matrix. The calculated matrix is then normalized: the 100 values in each row are added to obtain a 10 × 1 vector, the root SQRT(SUM(V1 × V2ᵀ, axis=0)) of the 10 values is calculated, and all ten values are divided by it, that is: SUM(V1 × V2ᵀ, axis=0) / SQRT(SUM(V1 × V2ᵀ, axis=0)). This yields a new 10 × 1 vector, which is the weight vector W of the query statement.
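The weight-vector computation described above can be sketched in NumPy. The shapes (10 words, 100 keywords, dm = 256) follow the example in the text; the matrices here are random placeholders, not real embeddings, and the row-wise sum corresponds to the 10 × 1 vector described above:

```python
import numpy as np

rng = np.random.default_rng(0)
V1 = rng.random((10, 256))    # query statement: 10 words, dm = 256 (placeholder)
V2 = rng.random((100, 256))   # element list: 100 keywords, dm = 256 (placeholder)

S = V1 @ V2.T                 # V1 x V2^T: 10 x 100 similarity matrix
s = S.sum(axis=1)             # add the 100 values in each row -> 10 values
W = s / np.sqrt(s.sum())      # divide all ten values by SQRT(SUM(...))

assert W.shape == (10,)       # W is the 10 x 1 weight vector of the query statement
```

Positive placeholder values are used so the square root is well defined; with real embeddings the similarities would typically be normalized (e.g. via softmax) before use.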
It will be appreciated that each word in the query statement has a different importance during querying; for example, in the query statement "agent name of the prover", the individual words should not all be assigned the same attention during analysis. When the query statement is short this causes no obvious problem, but when it is long, representing every word with a single intermediate semantic vector weakens or even erases the information of each individual word and loses much detail. The electronic device therefore introduces the attention mechanism.
Specifically, the electronic device links to a weight vector of the query statement from an element list of a legal knowledge base based on an attention mechanism to distinguish the contribution of each word in the query statement to a query process, so that the query statement can be more accurately utilized for query.
In at least one embodiment of the present invention, the legal knowledge-graph may include a variety of legally relevant features, such as: legal entities, features specific to legal relationships, and the like.
It should be noted that the conventional features are usually text statistic features, such as: text length features, word frequency statistics class features, and the like.
In comparison, the legal knowledge graph mainly includes, but is not limited to, the following two types:
(1) Legal abstract class features, which are extracted according to laws, regulations, judicial interpretations, and the like.
For example: whether the plaintiff or defendant is a natural person, a legal person, or another organization, the borrowing intention of the borrower, the interest mode selected in the loan contract, the form of loan delivery, and the like can be extracted from a loan contract.
In particular, the legal abstract class features are generalized in accordance with each legal sub-category.
(2) Features constructed according to legal theory.
For example: whether the process of concluding the contract involves an invitation to offer, an offer, and an acceptance; whether the contract is concluded in written or oral form; whether the contract is a consensual contract or a real contract; whether the legal relationship established by the contract is a unilateral civil legal relationship or a multilateral civil legal relationship; whether the contract establishes an obligation of prior performance; and the like.
Specifically, the features constructed according to legal theory are obtained by combing and summarizing legal theory.
In at least one embodiment of the invention, the legal knowledge graph is constructed in a list form, and each element in the legal knowledge graph is displayed in an element list form.
S12, calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
In at least one embodiment of the present invention, before performing a query using the query statement, the electronic device first needs to perform an initialization process on the query statement.
Specifically, the electronic device calculates a product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
S13, querying a second matrix of SQL (Structured Query Language) statements from a second predetermined dictionary, and querying a third matrix of the list of elements from a third predetermined dictionary.
In at least one embodiment of the present invention, the electronic device may perform a custom configuration on the second preset dictionary and the third preset dictionary.
The second preset dictionary comprises SQL sentences, and the third preset dictionary comprises each element in the element list.
It should be noted that, since dictionary construction technology is relatively mature, details are not described herein.
And S14, splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix.
In at least one embodiment of the present invention, the electronic device splicing the first matrix, the second matrix, and the third matrix to obtain the feature matrix includes:
the electronic device splicing the first matrix, the second matrix, and the third matrix in a transverse (horizontal) or longitudinal (vertical) splicing manner to obtain the feature matrix.
Through this implementation, the obtained feature matrix has feature attributes of multiple layers, which facilitates accurate querying.
And S15, inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks.
In at least one embodiment of the present invention, the electronic device improves on the Seq2Seq (Sequence to Sequence) architecture and trains it to obtain a language translation model, which includes, but is not limited to: an encoder and a decoder.
Further, before the feature matrix is input into the encoder to obtain the output data of the encoder, the method further includes:
the electronic device trains the encoder.
Specifically, each BiGRU network in the encoder includes a plurality of subunits, the electronic device training the encoder includes:
for each subunit, at the initial moment, acquiring a pre-configured initialization value and an initial feature matrix, inputting the initialization value and the initial feature matrix into the subunit, and outputting an initial state; or
And acquiring the output state of the previous moment and the current feature matrix at other moments except the initial moment, inputting the output state of the previous moment and the current feature matrix into the subunit, and outputting the current state.
The initialization value may be configured by self-definition, which is not limited in the present invention.
Further, the electronic device may acquire training data, and construct the initial feature matrix and the current feature matrix in a manner of constructing the feature matrix in combination with the training data.
Further, the method further comprises:
the electronic equipment takes the output of the plurality of subunits after being serialized as the output state of each BiGRU network, and carries out vector splicing on the output state of each BiGRU network as the output data of the encoder;
and uploading the output data of the encoder to a block chain.
Corresponding digest information is obtained based on the output data of the encoder; specifically, the digest information is obtained by hashing the output data of the encoder, for example with the SHA-256 algorithm. Uploading the digest information to the blockchain ensures security and verifiable transparency for the user. The user equipment may download the digest information from the blockchain to verify whether the output data of the encoder has been tampered with.
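A minimal sketch of computing such a digest in Python, assuming the "sha256s" mentioned above refers to standard SHA-256 and using a placeholder array in place of the real encoder output:

```python
import hashlib

import numpy as np

# Placeholder for the encoder's output data (real shape/values are model-dependent).
encoder_output = np.ones((100, 256), dtype=np.float32)

# Hash the raw bytes of the output; the hex digest is the summary uploaded on-chain.
digest = hashlib.sha256(encoder_output.tobytes()).hexdigest()

assert len(digest) == 64  # SHA-256 produces a 64-character hex string
```

To verify against tampering, the user equipment would recompute the digest from the downloaded output data and compare it with the digest stored on the blockchain.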
The blockchain referred to in this embodiment is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Wherein the vector splicing is performed on the output state of each BiGRU network by the electronic device, and the vector splicing is performed as the output data of the encoder, and the vector splicing includes:
and the electronic equipment carries out vector splicing on the output state of each BiGRU network in a transverse splicing or longitudinal splicing mode to serve as the output data of the encoder.
For example: when the output state of one BiGRU network is a 100 × 128 matrix and the output state of the other BiGRU network is also 100 × 128, splicing along dimension 0 (i.e., vertical splicing) yields a 200 × 128 output matrix, while splicing along dimension 1 (i.e., horizontal splicing) yields a 100 × 256 output matrix.
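The shape arithmetic in this example can be reproduced with NumPy's `concatenate`; zero matrices stand in for the two BiGRU output states:

```python
import numpy as np

h1 = np.zeros((100, 128))  # output state of one BiGRU network (placeholder)
h2 = np.zeros((100, 128))  # output state of the other BiGRU network (placeholder)

vertical = np.concatenate([h1, h2], axis=0)    # dimension 0: 200 x 128
horizontal = np.concatenate([h1, h2], axis=1)  # dimension 1: 100 x 256

assert vertical.shape == (200, 128)
assert horizontal.shape == (100, 256)
```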
Through this implementation, training of the encoder can be realized. Unlike the existing Seq2Seq architecture, the encoder is provided with two BiGRU networks, so the accuracy of model prediction is remarkably improved.
And S16, processing the output data of the encoder by using a decoder to obtain the machine query language, wherein the decoder comprises four BiGRU networks.
In at least one embodiment of the present invention, the output data of the encoder is used as the input data of the decoder, and the query statement can be converted into the machine query language after the output data of the encoder is processed by the decoder.
Specifically, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing of the output data of the encoder by the electronic device using the decoder to obtain the machine query language includes:
The electronic device predicts, by using the category prediction channel, the channel to which each SQL word in the output data of the encoder belongs, determines the word with the maximum probability in that channel as the participle corresponding to each SQL word based on an attention mechanism, and merges the participles corresponding to the SQL words to obtain the machine query language.
Wherein the electronic device predicting, by using the category prediction channel, the channel to which each SQL word in the output data of the encoder belongs includes:
For each SQL word in the output data of the encoder, the electronic device acquires the probability value output by the word in the SQL channel, in the element list channel, and in the numerical channel, and determines the channel with the maximum probability value as the channel of the next SQL word.
Further, the method further comprises:
When an SQL word in the output data of the encoder is a stop token, the electronic device controls the category prediction channel to stop predicting.
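The channel-selection step described above can be sketched as follows. The probability values are hypothetical placeholders; in the patent they are the outputs of the SQL channel, the element list channel, and the numerical channel for the current SQL word.

```python
def select_channel(p_sql, p_element_list, p_numeric):
    """Pick the channel of the next SQL word: the one with the
    maximum probability value among the three decoder channels."""
    probs = {"sql": p_sql, "element_list": p_element_list, "numeric": p_numeric}
    return max(probs, key=probs.get)

print(select_channel(0.6, 0.3, 0.1))  # sql
```

In the full decoder this selection repeats word by word until a stop token is produced, at which point the category prediction channel stops predicting.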
Through this implementation, the query statement can be converted accurately based on the four channels in the decoder and the type of each channel. Compared with the existing Seq2Seq architecture, the decoder has a different internal structure and four BiGRU networks, and therefore achieves higher accuracy.
It should be noted that the principle by which the electronic device trains the decoder is the same as the working principle of the decoder, except that a large amount of training data is used as the training basis, and details are not described herein again.
In at least one embodiment of the present invention, the encoder and the decoder are constructed into a language conversion model according to an attention mechanism and a cross-entropy function, and the method further comprises:
The electronic device calculates a first loss of the category prediction channel and a second loss based on the weight vector of the query statement linked by the attention mechanism, and further calculates the sum of the first loss and the second loss as the loss function of the language conversion model; the electronic device then optimizes the loss function using a configuration optimization algorithm.
The configuration optimization algorithm may be any loss function optimization algorithm, and the present invention is not limited thereto.
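An illustrative sketch of the composite loss described above, not the patent's exact formulation: the first loss is a cross-entropy on the category prediction channel, the second loss a cross-entropy tied to the attention weight vector, and the model's loss function is their sum. All probability distributions below are hypothetical placeholders.

```python
import math

def cross_entropy(probs, target_idx, eps=1e-12):
    # negative log-probability assigned to the correct index
    return -math.log(probs[target_idx] + eps)

channel_probs = [0.7, 0.2, 0.1]  # hypothetical channel distribution
word_probs = [0.1, 0.8, 0.1]     # hypothetical vocabulary distribution

first_loss = cross_entropy(channel_probs, 0)   # loss of the category prediction channel
second_loss = cross_entropy(word_probs, 1)     # loss tied to the attention weights
total_loss = first_loss + second_loss          # loss function to be optimized
```

Any standard loss-function optimization algorithm (e.g., gradient descent) could then minimize this sum, consistent with the text's statement that the configuration optimization algorithm is not limited.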
Through this implementation, both the accuracy of the conversion result of the language conversion model and its intermediate process can be evaluated, taking into account the probability of selecting a channel and the probability of generating a word. The language conversion model can thus be optimized more accurately, its generalization ability and interpretability are further improved, and its conversion results become more accurate.
And S17, executing the machine query language in the database and outputting a query result.
After the machine query language is obtained, the electronic device can query the database based on the machine query language. Since the machine query language obtained through the language conversion model is more accurate, the final query result is more accurate and reliable.
From the above technical solutions, when a query statement is received, the present invention can query a matrix of the query statement from a first preset dictionary and link a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism; the introduction of the attention mechanism distinguishes the contribution rate of each word. The product of the matrix of the query statement and the weight vector is then calculated to obtain a first matrix corresponding to the query statement; a second matrix of SQL statements is queried from a second preset dictionary, and a third matrix of the element list is queried from a third preset dictionary. The first matrix, the second matrix, and the third matrix are spliced to obtain a feature matrix, which is input into an encoder comprising two BiGRU networks to obtain the output data of the encoder. The output data of the encoder is processed by a decoder comprising four BiGRU networks to obtain the machine query language. Because the structures of the encoder and the decoder are each optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in a database, and a query result is output. The method can be applied to scenarios such as an intelligent court, thereby promoting the construction of smart cities.
Fig. 2 is a functional block diagram of a preferred embodiment of the query device based on the legal knowledge graph according to the present invention. The query device 11 based on the legal knowledge graph comprises a query unit 110, a link unit 111, a calculation unit 112, a splicing unit 113, an input unit 114, a processing unit 115, an execution unit 116, a determination unit 117, a control unit 118, and an optimization unit 119. A module/unit referred to in the present invention is a series of computer program segments that can be executed by the processor 13, can perform a fixed function, and are stored in the memory 12. In this embodiment, the functions of the modules/units will be described in detail in the following embodiments.
When a query statement is received, the query unit 110 queries a matrix of the query statement from a first preset dictionary.
Wherein the query statement may be a legally relevant query statement, such as: "agent name of the prover" and the like.
The first preset dictionary can be custom-configured and includes all words related to the query statement.
Therefore, the query unit 110 may directly query the first preset dictionary and determine the matrix of the query statement.
The linking unit 111 links the weight vector of the query statement from the list of elements of the legal knowledge base based on the Attention mechanism (Attention).
For example: if the vector matrix of the question query is V1 (of dimension dm), the question query is a sentence of 10 words, and dm is 256, a 10 × 256 matrix is obtained; if the vector matrix of the element list names is V2 (of dimension dm) and there are 100 keywords in the element list, with dm being 256, a 100 × 256 matrix is obtained. The two matrices are then multiplied, namely V1 × V2^T: the 10 × 256 matrix is multiplied by the 256 × 100 matrix to obtain a 10 × 100 matrix. The calculated matrix is then normalized, i.e., the 100 values in each dimension are added to obtain a 10 × 1 vector, the square root of the sum of the 10 values, SQRT(SUM(V1 × V2^T, axis=0)), is calculated, and all ten values are divided by it, that is: SUM(V1 × V2^T, axis=0)/SQRT(SUM(V1 × V2^T, axis=0)). This yields a new 10 × 1 vector, which is the weight vector W of the question query.
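A sketch of the weight-vector example above, under one reading of the normalization it describes. The random matrices are placeholders for the question-query embeddings V1 (10 × 256) and the element-list embeddings V2 (100 × 256); entries are kept non-negative so the square root is well defined.

```python
import numpy as np

rng = np.random.default_rng(0)
V1 = rng.random((10, 256))    # hypothetical question-query embeddings
V2 = rng.random((100, 256))   # hypothetical element-list embeddings

scores = V1 @ V2.T            # V1 * V2^T -> a 10 x 100 similarity matrix
summed = scores.sum(axis=1)   # add the 100 values per word -> 10 values
W = summed / np.sqrt(summed.sum())  # divide by the square root of the sum

print(W.shape)  # (10,): one weight per word of the question query
```

The resulting W plays the role of the weight vector linked from the element list: each of its ten entries weights one word of the question query.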
It will be appreciated that each word in the query statement has a different importance for the query. For example, for the query statement "agent name of the prover", the other words should not be assigned the same attention when the word "query" is analyzed. When the query statement is short this causes no obvious problem, but when it is long, representing every word by a single intermediate semantic vector weakens or even erases the information of each individual word and loses much detail; the link unit 111 therefore introduces the attention mechanism.
Specifically, the linking unit 111 links the weight vector of the query statement from the element list of the legal knowledge graph based on an attention mechanism, so as to distinguish the contribution of each word in the query statement to the query process and thereby query more accurately with the query statement.
In at least one embodiment of the present invention, the legal knowledge-graph may include a variety of legally relevant features, such as: legal entities, features specific to legal relationships, and the like.
It should be noted that the conventional features are usually text statistic features, such as: text length features, word frequency statistics class features, and the like.
In comparison, the features of the legal knowledge graph mainly include, but are not limited to, the following two types:
(1) Legal abstract class features, which are extracted according to laws, regulations, judicial interpretations, and the like.
For example: whether the attribute of the plaintiff or defendant is a natural person, a legal person, or another organization, the borrowing intention of the borrower, the interest mode selected in the loan contract, the form of loan delivery, and the like can be extracted from the loan contract.
In particular, the legal abstract class features are generalized in accordance with each legal sub-category.
(2) Features constructed according to the law theory.
For example: whether the process of concluding the contract involves an invitation to offer, an offer, and an acceptance; whether the contract is concluded in written or oral form; whether the contract is a promise (consensual) contract or a practice (real) contract; whether the legal relationship established by the contract is a unilateral or a multilateral civil legal relationship; whether the contract establishes an obligation to perform first; and the like.
Specifically, the characteristics constructed according to the law theory are obtained by combing and inducing according to the law theory.
In at least one embodiment of the invention, the legal knowledge graph is constructed in a list form, and each element in the legal knowledge graph is displayed in an element list form.
The calculating unit 112 calculates a product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
In at least one embodiment of the present invention, before performing a query using the query statement, the computing unit 112 first needs to perform an initialization process on the query statement.
Specifically, the calculating unit 112 calculates a product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
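The weighting step can be sketched as follows, under the assumption that "the product of the matrix of the query statement and the weight vector" means scaling each word's embedding row by its attention weight; the arrays are hypothetical placeholders with shapes from the earlier 10-word, 256-dimension example.

```python
import numpy as np

query_matrix = np.ones((10, 256))          # matrix of the query statement
weight_vector = np.linspace(0.1, 1.0, 10)  # weight vector W, one weight per word

# Broadcasting multiplies each word's row by its attention weight,
# yielding the first matrix corresponding to the query statement.
first_matrix = query_matrix * weight_vector[:, None]  # 10 x 256
```

Under this reading, words the attention mechanism deems more important contribute larger rows to the first matrix fed onward to the encoder.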
The query unit 110 queries a second matrix of SQL (Structured Query Language) statements from a second preset dictionary, and queries a third matrix of the element list from a third preset dictionary.
In at least one embodiment of the present invention, the query unit 110 may custom-configure the second preset dictionary and the third preset dictionary.
The second preset dictionary comprises SQL statements, and the third preset dictionary comprises each element in the element list.
It should be noted that, since the construction technology of dictionaries is relatively mature, details are not described herein.
The splicing unit 113 splices the first matrix, the second matrix, and the third matrix to obtain a feature matrix.
In at least one embodiment of the present invention, the splicing unit 113 splices the first matrix, the second matrix, and the third matrix to obtain a feature matrix, where the feature matrix includes:
the splicing unit 113 splices the first matrix, the second matrix and the third matrix in a transverse splicing or longitudinal splicing manner to obtain a feature matrix.
The query device 11 based on the legal knowledge graph further includes an uploading unit that uploads the output data of the encoder to the blockchain.
Through this implementation, the obtained feature matrix has feature attributes at multiple levels, which facilitates accurate querying.
The input unit 114 inputs the feature matrix into an encoder, which includes two BiGRU networks, to obtain output data of the encoder.
In at least one embodiment of the present invention, the input unit 114 improves on the Seq2Seq (Sequence to Sequence) architecture and trains it to obtain a language conversion model, which includes, but is not limited to: an encoder and a decoder.
Further, before the feature matrix is input into the encoder to obtain the output data of the encoder, the encoder is trained.
Specifically, each BiGRU network in the encoder includes a plurality of subunits. For each subunit, at the initial time, the input unit 114 acquires a pre-configured initialization value and an initial feature matrix, inputs them into the subunit, and outputs an initial state; or
At times other than the initial time, the input unit 114 acquires the output state at the previous time and the current feature matrix, inputs them into the subunit, and outputs the current state.
The initialization value may be custom-configured, which is not limited in the present invention.
Further, the input unit 114 may obtain training data and construct the initial feature matrix and the current feature matrix from the training data in the same manner as the feature matrix is constructed.
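The per-subunit recurrence described above can be sketched as a toy loop. The update function is only a stand-in (a real BiGRU cell would replace it), and the initialization value and feature matrices are hypothetical.

```python
import numpy as np

def subunit_step(prev_state, feature):
    # toy state update; a real implementation would apply a GRU cell here
    return np.tanh(0.5 * prev_state + 0.5 * feature)

init_value = np.zeros(4)                             # pre-configured initialization value
features = [np.full(4, float(t)) for t in range(3)]  # feature matrix per time step

state = subunit_step(init_value, features[0])  # initial time: initialization value
for feature in features[1:]:                   # other times: previous output state
    state = subunit_step(state, feature)       # plus the current feature matrix
```

The final `state` corresponds to the output state of one subunit; the outputs of the serial subunits together form the output state of a BiGRU network.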
Further, the determination unit 117 uses the outputs of the plurality of serially connected subunits as the output state of each BiGRU network, and the splicing unit 113 performs vector splicing on the output states of the BiGRU networks as the output data of the encoder.
Wherein the splicing unit 113 performing vector splicing on the output state of each BiGRU network as the output data of the encoder includes:
the splicing unit 113 performs vector splicing on the output state of each BiGRU network in a transverse splicing or longitudinal splicing manner, and uses the vector splicing as the output data of the encoder.
For example: when the output state of one BiGRU network is a 100 × 128 matrix and the output state of the other BiGRU network is also a 100 × 128 matrix, splicing in dimension 0 (i.e., longitudinal splicing) yields a 200 × 128 matrix as the output data, while splicing in dimension 1 (i.e., transverse splicing) yields a 100 × 256 matrix.
Through this implementation, the encoder can be trained. Unlike the existing Seq2Seq architecture, the encoder is provided with two BiGRU networks, which significantly improves the accuracy of model prediction.
The processing unit 115 processes the output data of the encoder with a decoder comprising four BiGRU networks, resulting in a machine query language.
In at least one embodiment of the present invention, the output data of the encoder is used as the input data of the decoder, and the query statement can be converted into the machine query language after the output data of the encoder is processed by the decoder.
Specifically, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing unit 115 processes the output data of the encoder by using a decoder to obtain the machine query language includes:
The processing unit 115 predicts, by using the category prediction channel, the channel to which each SQL word in the output data of the encoder belongs, determines the word with the highest probability in that channel as the participle corresponding to each SQL word based on an attention mechanism, and merges the participles corresponding to the SQL words to obtain the machine query language.
Wherein the predicting, by the processing unit 115, the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel includes:
For each SQL word in the output data of the encoder, the processing unit 115 obtains the probability value output by the word in the SQL channel, in the element list channel, and in the numerical channel, and determines the channel with the highest probability value as the channel of the next SQL word.
Further, when an SQL word in the output data of the encoder is a stop token, the control unit 118 controls the category prediction channel to stop predicting.
Through this implementation, the query statement can be converted accurately based on the four channels in the decoder and the type of each channel. Compared with the existing Seq2Seq architecture, the decoder has a different internal structure and four BiGRU networks, and therefore achieves higher accuracy.
It should be noted that the principle of training the decoder is the same as the working principle of the decoder, except that a large amount of training data is used as the training basis, and details are not described herein again.
In at least one embodiment of the present invention, the encoder and the decoder are constructed into a language conversion model according to an attention mechanism and a cross-entropy function. The calculation unit 112 calculates a first loss of the category prediction channel and a second loss based on the weight vector of the query statement linked by the attention mechanism, and further calculates the sum of the first loss and the second loss as the loss function of the language conversion model; the optimization unit 119 then optimizes the loss function using a configuration optimization algorithm.
The configuration optimization algorithm may be any loss function optimization algorithm, and the present invention is not limited thereto.
Through this implementation, both the accuracy of the conversion result of the language conversion model and its intermediate process can be evaluated, taking into account the probability of selecting a channel and the probability of generating a word. The language conversion model can thus be optimized more accurately, its generalization ability and interpretability are further improved, and its conversion results become more accurate.
The execution unit 116 executes the machine query language in the database and outputs a query result.
After obtaining the machine query language, the execution unit 116 may query the database based on the machine query language. Since the machine query language obtained through the language conversion model is more accurate, the final query result is more accurate and reliable.
From the above technical solutions, when a query statement is received, the present invention can query a matrix of the query statement from a first preset dictionary and link a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism; the introduction of the attention mechanism distinguishes the contribution rate of each word. The product of the matrix of the query statement and the weight vector is then calculated to obtain a first matrix corresponding to the query statement; a second matrix of SQL statements is queried from a second preset dictionary, and a third matrix of the element list is queried from a third preset dictionary. The first matrix, the second matrix, and the third matrix are spliced to obtain a feature matrix, which is input into an encoder comprising two BiGRU networks to obtain the output data of the encoder. The output data of the encoder is processed by a decoder comprising four BiGRU networks to obtain the machine query language. Because the structures of the encoder and the decoder are each optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in a database, and a query result is output.
Fig. 3 is a schematic structural diagram of an electronic device for implementing a query method based on a legal knowledge base according to a preferred embodiment of the present invention.
The electronic device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program, such as a legal knowledge graph based query program, stored in the memory 12 and executable on the processor 13.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the electronic device 1 and does not constitute a limitation on it. The electronic device 1 may have a bus-type or star-type structure, may include more or fewer hardware or software components than shown, or may have a different arrangement of components; for example, it may further include input/output devices, network access devices, and the like.
It should be noted that the electronic device 1 is only an example; other existing or future electronic products that can be adapted to the present invention should also fall within the protection scope of the present invention and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium, which includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 12 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the electronic device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of a legal knowledge base query program, etc., but also to temporarily store data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the electronic device 1, connects various components of the electronic device 1 by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (for example, executing a query program based on a legal knowledge map, and the like) stored in the memory 12 and calling data stored in the memory 12.
The processor 13 executes an operating system of the electronic device 1 and various installed application programs. The processor 13 executes the application program to implement the steps in each of the above-mentioned embodiments of the legal knowledge base query method, such as steps S10, S11, S12, S13, S14, S15, S16, S17 shown in fig. 1.
Alternatively, the processor 13, when executing the computer program, implements the functions of the modules/units in the above device embodiments, for example:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
link to a weight vector of the query statement from a list of elements of a legal knowledge graph based on an attention mechanism;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of SQL sentences from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
inputting the characteristic matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and executing the machine query language in a database, and outputting a query result.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the electronic device 1. For example, the computer program may be divided into a query unit 110, a link unit 111, a calculation unit 112, a concatenation unit 113, an input unit 114, a processing unit 115, an execution unit 116, a determination unit 117, a control unit 118, an optimization unit 119.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute parts of the methods according to the embodiments of the present invention.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM).
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one arrow is shown in FIG. 3, but this does not indicate only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
Although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 only shows the electronic device 1 with components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
In conjunction with fig. 1, the memory 12 in the electronic device 1 stores a plurality of instructions to implement a legal knowledge base query method, and the processor 13 can execute the plurality of instructions to implement:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
link to a weight vector of the query statement from a list of elements of a legal knowledge graph based on an attention mechanism;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of SQL sentences from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
inputting the characteristic matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and executing the machine query language in a database, and outputting a query result.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims (10)
1. A legal knowledge graph-based query method, comprising:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
obtaining a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of SQL statements from a second preset dictionary, and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and executing the machine query language in a database, and outputting a query result.
2. The legal knowledge graph-based query method of claim 1, wherein each BiGRU network in the encoder comprises a plurality of subunits, the method further comprising:
for each subunit, at the initial moment, acquiring a pre-configured initialization value and an initial feature matrix, inputting the initialization value and the initial feature matrix into the subunit, and outputting an initial state; or
at moments other than the initial moment, acquiring the output state of the previous moment and the current feature matrix, inputting the output state of the previous moment and the current feature matrix into the subunit, and outputting the current state.
3. The legal knowledge graph-based query method of claim 2, wherein the method further comprises:
taking the serialized outputs of the plurality of subunits as the output state of each BiGRU network;
splicing the output states of the BiGRU networks into a vector as the output data of the encoder;
and uploading the output data of the encoder to a blockchain.
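The subunit recurrence of claims 2 and 3 (a pre-configured initial state, then previous state plus current feature at each step, with the two directions spliced) can be sketched with a hand-rolled GRU cell. This is a toy numpy sketch under assumed dimensions; the weight layout and gate equations are the standard GRU formulation, not weights from the patent.

```python
import numpy as np

def gru_cell(h_prev, x, Wz, Wr, Wh):
    """One GRU subunit step: previous state + current feature -> current state."""
    z = 1 / (1 + np.exp(-(Wz @ np.concatenate([h_prev, x]))))  # update gate
    r = 1 / (1 + np.exp(-(Wr @ np.concatenate([h_prev, x]))))  # reset gate
    h_cand = np.tanh(Wh @ np.concatenate([r * h_prev, x]))     # candidate state
    return (1 - z) * h_prev + z * h_cand

rng = np.random.default_rng(1)
d_in, d_h, T = 4, 3, 5
Wz, Wr, Wh = (rng.normal(size=(d_h, d_h + d_in)) for _ in range(3))
features = rng.normal(size=(T, d_in))  # one feature-matrix slice per moment

def run_gru(seq, h0):
    """Serialize the subunit outputs over all moments (claim 3)."""
    states, h = [], h0
    for x in seq:
        h = gru_cell(h, x, Wz, Wr, Wh)
        states.append(h)
    return np.stack(states)

# Initial moment starts from a pre-configured initialization value (zeros here);
# the backward pass over the reversed sequence gives the second BiGRU direction.
fwd = run_gru(features, np.zeros(d_h))
bwd = run_gru(features[::-1], np.zeros(d_h))[::-1]

# Output state of one BiGRU: per-moment splice of both directions; the encoder
# splices the output states of its two BiGRU networks the same way.
bigru_out = np.concatenate([fwd, bwd], axis=1)
print(bigru_out.shape)  # (5, 6)
```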
4. The legal knowledge graph-based query method of claim 1, wherein the four BiGRU networks are a category prediction channel, an SQL channel, an element list channel, and a value channel, respectively, and the processing the output data of the encoder by using a decoder to obtain a machine query language comprises:
predicting the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel;
determining, based on the attention mechanism, the word with the maximum probability in the channel to which each SQL word belongs as the word segment corresponding to that SQL word;
and combining the word segments corresponding to the SQL words to obtain the machine query language.
5. The legal knowledge graph-based query method of claim 4, wherein the predicting the channel to which each SQL word belongs in the output data of the encoder using the category prediction channel comprises:
for each SQL word in the output data of the encoder, obtaining the probability value output by the SQL channel, the probability value output by the element list channel, and the probability value output by the value channel for the word;
and determining the channel with the maximum probability value as the channel of the next SQL word.
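The two-stage selection of claims 4 and 5 (pick the channel with the maximum probability, then the most probable word within that channel, then combine the word segments) can be sketched as below. The channel vocabularies and probability values are invented toy data; only the argmax-then-argmax control flow reflects the claims.

```python
import numpy as np

# Hypothetical per-channel vocabularies (illustrative only).
channels = {
    "sql":     ["SELECT", "FROM", "WHERE", "="],
    "element": ["case_name", "statute", "court"],
    "value":   ["'theft'", "'2019'"],
}

def decode_step(channel_probs, word_probs):
    """Claim 5: channel with the maximum probability value wins;
    claim 4: the most probable word in that channel becomes the segment."""
    channel = max(channel_probs, key=channel_probs.get)
    return channels[channel][int(np.argmax(word_probs[channel]))]

# Toy decoder outputs for four steps.
steps = [
    ({"sql": 0.8, "element": 0.1, "value": 0.1}, {"sql": [0.9, 0.02, 0.05, 0.03]}),
    ({"sql": 0.2, "element": 0.7, "value": 0.1}, {"element": [0.1, 0.8, 0.1]}),
    ({"sql": 0.6, "element": 0.2, "value": 0.2}, {"sql": [0.05, 0.05, 0.85, 0.05]}),
    ({"sql": 0.1, "element": 0.2, "value": 0.7}, {"value": [0.95, 0.05]}),
]

# Combine the word segments into the machine query language.
query = " ".join(decode_step(cp, wp) for cp, wp in steps)
print(query)  # SELECT statute WHERE 'theft'
```

Decoding would stop once a stop symbol is emitted, as claim 6 describes.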
6. The legal knowledge graph-based query method of claim 4, wherein the method further comprises:
and when the SQL word in the output data of the encoder is a stop symbol, controlling the category prediction channel to stop prediction.
7. The legal knowledge graph-based query method of claim 4, wherein the encoder and the decoder are constructed as a language translation model based on an attention mechanism and a cross-entropy function, the method further comprising:
calculating a first loss of the category prediction channel, and calculating a second loss of the weight vector of the query statement linked by the attention mechanism;
calculating a sum of the first loss and the second loss as a loss function of the language translation model;
and optimizing the loss function using a configured optimization algorithm.
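The two-part loss of claim 7 can be sketched numerically: a cross-entropy term for the category prediction channel plus a cross-entropy term for the attention weights over the element list, summed into the model's loss. All probability values and targets below are invented toy data; the patent does not specify the exact loss forms beyond "cross-entropy".

```python
import numpy as np

def cross_entropy(probs, target_idx):
    """Cross-entropy of one softmax distribution against a target index."""
    return -np.log(probs[target_idx])

# First loss: category prediction channel vs. the true channel at each step.
channel_probs = np.array([[0.8, 0.1, 0.1],
                          [0.2, 0.7, 0.1]])
true_channels = [0, 1]
first_loss = sum(cross_entropy(p, t) for p, t in zip(channel_probs, true_channels))

# Second loss: attention weights over the element list vs. the element
# actually linked by the query statement.
attn_weights = np.array([0.1, 0.7, 0.2])
linked_element = 1
second_loss = cross_entropy(attn_weights, linked_element)

# The loss function of the language translation model is the sum of the two;
# a configured optimizer (e.g. a gradient method) would then minimize it.
total_loss = first_loss + second_loss
print(round(float(total_loss), 4))  # 0.9365
```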
8. A legal knowledge graph-based query device, the device comprising:
the query unit is used for querying a matrix of the query statement from a first preset dictionary when the query statement is received;
a linking unit for obtaining a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism;
the calculation unit is used for calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
the query unit is further used for querying a second matrix of the SQL statement from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
the splicing unit is used for splicing the first matrix, the second matrix and the third matrix to obtain a characteristic matrix;
the input unit is used for inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
the processing unit is used for processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and the execution unit is used for executing the machine query language in the database and outputting a query result.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement a legal knowledge graph-based query method as recited in any one of claims 1 to 7.
10. A computer-readable storage medium characterized by: the computer-readable storage medium has stored therein at least one instruction that is executable by a processor in an electronic device to implement the legal knowledge graph-based query method of any one of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010334998.4A CN111639153B (en) | 2020-04-24 | 2020-04-24 | Query method and device based on legal knowledge graph, electronic equipment and medium |
PCT/CN2020/104968 WO2021212683A1 (en) | 2020-04-24 | 2020-07-27 | Law knowledge map-based query method and apparatus, and electronic device and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010334998.4A CN111639153B (en) | 2020-04-24 | 2020-04-24 | Query method and device based on legal knowledge graph, electronic equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111639153A (en) | 2020-09-08 |
CN111639153B CN111639153B (en) | 2024-07-02 |
Family
ID=72333231
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010334998.4A Active CN111639153B (en) | 2020-04-24 | 2020-04-24 | Query method and device based on legal knowledge graph, electronic equipment and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111639153B (en) |
WO (1) | WO2021212683A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112328960A (en) * | 2020-11-02 | 2021-02-05 | 中国平安财产保险股份有限公司 | Data operation optimization method and device, electronic equipment and storage medium |
CN113221975A (en) * | 2021-04-26 | 2021-08-06 | 中国科学技术大学先进技术研究院 | Working condition construction method based on improved Markov analysis method and storage medium |
CN115455149A (en) * | 2022-09-20 | 2022-12-09 | 城云科技(中国)有限公司 | Database construction method based on coding query mode and application thereof |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115063666A (en) * | 2022-07-06 | 2022-09-16 | 京东科技信息技术有限公司 | Decoder training method, target detection method, device and storage medium |
CN115658926B (en) * | 2022-11-21 | 2023-05-05 | 中国科学院自动化研究所 | Element estimation method and device of knowledge graph, electronic equipment and storage medium |
CN116258521A (en) * | 2022-12-02 | 2023-06-13 | 东莞盟大集团有限公司 | Secondary node identification application integral management method based on blockchain technology |
CN115983379B (en) * | 2023-03-20 | 2023-10-10 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Reachable path query method and system of MDTA knowledge graph |
CN116225973B (en) * | 2023-05-10 | 2023-06-30 | 贵州轻工职业技术学院 | Chip code testing method and device based on embedded implementation electronic equipment |
CN117743590B (en) * | 2023-11-30 | 2024-07-26 | 北京汉勃科技有限公司 | Legal assistance method and system based on large language model |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180285740A1 (en) * | 2017-04-03 | 2018-10-04 | Royal Bank Of Canada | Systems and methods for malicious code detection |
CN109766355A (en) * | 2018-12-28 | 2019-05-17 | 上海汇付数据服务有限公司 | A kind of data query method and system for supporting natural language |
CN109977200A (en) * | 2019-01-25 | 2019-07-05 | 上海凯岸信息科技有限公司 | Speech polling assistant based on SQL Auto |
CN110489102A (en) * | 2019-07-29 | 2019-11-22 | 东北大学 | A method of Python code is automatically generated from natural language |
CN110945495A (en) * | 2017-05-18 | 2020-03-31 | 易享信息技术有限公司 | Conversion of natural language queries to database queries based on neural networks |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291871B (en) * | 2017-06-15 | 2021-02-19 | 北京百度网讯科技有限公司 | Matching degree evaluation method, device and medium for multi-domain information based on artificial intelligence |
CN107943874B (en) * | 2017-11-13 | 2019-08-23 | 平安科技(深圳)有限公司 | Knowledge mapping processing method, device, computer equipment and storage medium |
CN109656952B (en) * | 2018-10-31 | 2021-04-13 | 北京百度网讯科技有限公司 | Query processing method and device and electronic equipment |
CN110990536A (en) * | 2019-12-06 | 2020-04-10 | 重庆邮电大学 | CQL generation method based on BERT and knowledge graph perception |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180285740A1 (en) * | 2017-04-03 | 2018-10-04 | Royal Bank Of Canada | Systems and methods for malicious code detection |
CN110945495A (en) * | 2017-05-18 | 2020-03-31 | 易享信息技术有限公司 | Conversion of natural language queries to database queries based on neural networks |
CN109766355A (en) * | 2018-12-28 | 2019-05-17 | 上海汇付数据服务有限公司 | A kind of data query method and system for supporting natural language |
CN109977200A (en) * | 2019-01-25 | 2019-07-05 | 上海凯岸信息科技有限公司 | Speech polling assistant based on SQL Auto |
CN110489102A (en) * | 2019-07-29 | 2019-11-22 | 东北大学 | A method of Python code is automatically generated from natural language |
Non-Patent Citations (2)
Title |
---|
BEN BOGIN ET AL: "Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing", arXiv:1905.06241v2 [cs.CL], 3 June 2019 (2019-06-03), pages 1-7 *
DONGJUN LEE ET AL: "Clause-Wise and Recursive Decoding for Complex and Cross-Domain Text-to-SQL Generation", arXiv:1904.08835v2 [cs.CL], 19 August 2019 (2019-08-19), pages 1-7 *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112328960A (en) * | 2020-11-02 | 2021-02-05 | 中国平安财产保险股份有限公司 | Data operation optimization method and device, electronic equipment and storage medium |
CN112328960B (en) * | 2020-11-02 | 2023-09-19 | 中国平安财产保险股份有限公司 | Optimization method and device for data operation, electronic equipment and storage medium |
CN113221975A (en) * | 2021-04-26 | 2021-08-06 | 中国科学技术大学先进技术研究院 | Working condition construction method based on improved Markov analysis method and storage medium |
CN115455149A (en) * | 2022-09-20 | 2022-12-09 | 城云科技(中国)有限公司 | Database construction method based on coding query mode and application thereof |
CN115455149B (en) * | 2022-09-20 | 2023-05-30 | 城云科技(中国)有限公司 | Database construction method based on coding query mode and application thereof |
Also Published As
Publication number | Publication date |
---|---|
WO2021212683A1 (en) | 2021-10-28 |
CN111639153B (en) | 2024-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111639153A (en) | Query method and device based on legal knowledge graph, electronic equipment and medium | |
CN112541338A (en) | Similar text matching method and device, electronic equipment and computer storage medium | |
CN111460797B (en) | Keyword extraction method and device, electronic equipment and readable storage medium | |
CN113821622B (en) | Answer retrieval method and device based on artificial intelligence, electronic equipment and medium | |
CN112883190A (en) | Text classification method and device, electronic equipment and storage medium | |
CN114461777B (en) | Intelligent question-answering method, device, equipment and storage medium | |
CN112507663A (en) | Text-based judgment question generation method and device, electronic equipment and storage medium | |
CN112559687A (en) | Question identification and query method and device, electronic equipment and storage medium | |
CN113807973B (en) | Text error correction method, apparatus, electronic device and computer readable storage medium | |
CN113887941B (en) | Business process generation method, device, electronic equipment and medium | |
CN112528013A (en) | Text abstract extraction method and device, electronic equipment and storage medium | |
CN116821373A (en) | Map-based prompt recommendation method, device, equipment and medium | |
CN113627160B (en) | Text error correction method and device, electronic equipment and storage medium | |
CN113344125B (en) | Long text matching recognition method and device, electronic equipment and storage medium | |
CN116521867A (en) | Text clustering method and device, electronic equipment and storage medium | |
CN112347739A (en) | Application rule analysis method and device, electronic equipment and storage medium | |
CN114548114B (en) | Text emotion recognition method, device, equipment and storage medium | |
CN111414452B (en) | Search word matching method and device, electronic equipment and readable storage medium | |
CN113449037B (en) | AI-based SQL engine calling method, device, equipment and medium | |
CN113221578B (en) | Disease entity retrieval method, device, equipment and medium | |
CN115346095A (en) | Visual question answering method, device, equipment and storage medium | |
CN115146064A (en) | Intention recognition model optimization method, device, equipment and storage medium | |
CN113704616A (en) | Information pushing method and device, electronic equipment and readable storage medium | |
CN112214594A (en) | Text briefing generation method and device, electronic equipment and readable storage medium | |
CN115221875B (en) | Word weight generation method, device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |