CN111639153A - Query method and device based on legal knowledge graph, electronic equipment and medium - Google Patents


Info

Publication number
CN111639153A
CN111639153A (application CN202010334998.4A)
Authority
CN
China
Prior art keywords
query
matrix
encoder
channel
output data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010334998.4A
Other languages
Chinese (zh)
Other versions
CN111639153B (en)
Inventor
于溦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202010334998.4A priority Critical patent/CN111639153B/en
Priority to PCT/CN2020/104968 priority patent/WO2021212683A1/en
Publication of CN111639153A publication Critical patent/CN111639153A/en
Application granted granted Critical
Publication of CN111639153B publication Critical patent/CN111639153B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/248 Presentation of query results
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/3346 Query execution using probabilistic model
    • G06F16/338 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data processing, and provides a query method based on a legal knowledge graph. The method links to a weight vector of a query statement from the element list of the legal knowledge graph based on an attention mechanism, thereby distinguishing the contribution rate of each word. A feature matrix is calculated from the weights and input into an encoder, which comprises two BiGRU networks, to obtain the output data of the encoder; the output data of the encoder is then processed by a decoder, which comprises four BiGRU networks, to obtain a machine query language. Because the structures of the encoder and the decoder are separately optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in the database, and the query result is output. Because the machine query language obtained through this data processing is more accurate, the output query result is more accurate and reliable; automatic conversion and querying of the query statement are realized, and query efficiency is improved.

Description

Query method and device based on legal knowledge graph, electronic equipment and medium
Technical Field
The invention relates to the technical field of data processing, in particular to a query method and device based on a legal knowledge graph, electronic equipment and a medium.
Background
Natural language generation is a very important research area in the artificial intelligence industry: it is an innate capability for humans, and it represents one of the highest levels of progress for artificial intelligence. Research on natural language generation can help users find the answers they need from a database in a faster, more accurate, and less costly manner.
In the legal field, some terms are highly similar and not easy to distinguish, and a query result often has to be obtained through multiple steps. Each step may introduce errors into the query result due to unclear transmission of intent, deviations in understanding, and the like, so accuracy is low.
Meanwhile, existing query methods have weak generalization capability and cannot cope with new, complicated, and variable problems; when facing new problems, the model needs to be retrained, which is costly.
Disclosure of Invention
In view of the above, it is necessary to provide a query method, device, electronic device, and medium based on a legal knowledge graph that can distinguish the contribution rate of each word based on an attention mechanism and separately optimize the structures of an encoder and a decoder, so that the conversion of query statements is more accurate and stable.
A legal knowledge graph-based query method, the method comprising:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
linking to a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of SQL statements from a second preset dictionary, and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix, and the third matrix to obtain a feature matrix;
inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks; and
executing the machine query language in a database, and outputting a query result.
According to a preferred embodiment of the present invention, each BiGRU network in the encoder comprises a plurality of subunits, the method further comprising:
for each subunit, at the initial moment, acquiring a preconfigured initialization value and an initial feature matrix, inputting the initialization value and the initial feature matrix into the subunit, and outputting an initial state; or
at each moment other than the initial moment, acquiring the output state of the previous moment and the current feature matrix, inputting the output state of the previous moment and the current feature matrix into the subunit, and outputting the current state.
According to a preferred embodiment of the invention, the method further comprises:
concatenating the serialized outputs of the plurality of subunits as the output state of each BiGRU network;
performing vector splicing on the output states of the BiGRU networks to form the output data of the encoder;
and uploading the output data of the encoder to a blockchain.
According to the preferred embodiment of the present invention, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing of the output data of the encoder by the decoder to obtain the machine query language includes:
predicting the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel;
determining the word with the maximum probability in the channel to which each SQL word belongs as the participle corresponding to each SQL word based on the attention mechanism;
and combining the participles corresponding to each SQL word to obtain the machine query language.
According to a preferred embodiment of the present invention, the predicting, by using the category prediction channel, a channel to which each SQL word belongs in the output data of the encoder includes:
for each SQL word in the output data of the encoder, obtaining the probability value output for the word by the SQL channel, the probability value output by the element list channel, and the probability value output by the numerical channel;
and determining the channel with the maximum probability value as the channel of the next SQL word.
According to a preferred embodiment of the invention, the method further comprises:
and when the SQL word in the output data of the encoder is a stop sign, controlling the category prediction channel to stop prediction.
According to a preferred embodiment of the present invention, the encoder and the decoder are constructed into a language translation model according to an attention mechanism and a cross-entropy function, the method further comprising:
calculating a first loss of the category prediction channel, and calculating a second loss of the weight vector of the query statement linked to based on the attention mechanism;
calculating the sum of the first loss and the second loss as the loss function of the language translation model;
and optimizing the loss function by adopting a configuration optimization algorithm.
A legal knowledge graph-based querying device, the device comprising:
the query unit is used for querying a matrix of the query statement from a first preset dictionary when the query statement is received;
a linking unit for linking to a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism;
the calculation unit is used for calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
the query unit is further used for querying a second matrix of the SQL statement from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
the splicing unit is used for splicing the first matrix, the second matrix and the third matrix to obtain a characteristic matrix;
the input unit is used for inputting the characteristic matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
the processing unit is used for processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and the execution unit is used for executing the machine query language in the database and outputting a query result.
According to the preferred embodiment of the present invention, each BiGRU network in the encoder includes a plurality of subunits, and the input unit is further configured to, for each subunit, at an initial time, acquire a preconfigured initialization value and acquire an initial feature matrix, input the initialization value and the initial feature matrix into the subunit, and output an initial state; or
The input unit is further configured to obtain an output state at a previous time and a current feature matrix at other times except the initial time, input the output state at the previous time and the current feature matrix into the subunit, and output a current state.
According to a preferred embodiment of the present invention, the apparatus further comprises:
the determining unit is used for taking the output of the plurality of subunits after being serialized as the output state of each BiGRU network;
and the splicing unit is also used for carrying out vector splicing on the output state of each BiGRU network to be used as the output data of the encoder.
According to the preferred embodiment of the present invention, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing unit processes the output data of the encoder by using a decoder to obtain the machine query language includes:
predicting the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel;
determining the word with the maximum probability in the channel to which each SQL word belongs as the participle corresponding to each SQL word based on the attention mechanism;
and combining the participles corresponding to each SQL word to obtain the machine query language.
According to a preferred embodiment of the present invention, the processing unit predicting, by using the category prediction channel, a channel to which each SQL word belongs in the output data of the encoder includes:
for each SQL word in the output data of the encoder, obtaining the probability value output for the word by the SQL channel, the probability value output by the element list channel, and the probability value output by the numerical channel;
and determining the channel with the maximum probability value as the channel of the next SQL word.
According to a preferred embodiment of the invention, the apparatus further comprises:
and the control unit is used for controlling the category prediction channel to stop prediction when the SQL words in the output data of the encoder are stop symbols.
According to a preferred embodiment of the present invention, the encoder and the decoder are constructed into a language translation model according to an attention mechanism and a cross-entropy function, and the calculation unit is further configured to calculate a first loss of the category prediction channel and a second loss of the weight vector of the query statement linked to based on the attention mechanism;
the calculation unit is further configured to calculate the sum of the first loss and the second loss as the loss function of the language translation model;
and the optimization unit is used for optimizing the loss function by adopting a configuration optimization algorithm.
An electronic device, the electronic device comprising:
a memory storing at least one instruction; and
a processor executing the instructions stored in the memory to implement the legal knowledge graph-based query method.
A computer-readable storage medium having at least one instruction stored therein, the at least one instruction being executable by a processor in an electronic device to implement the legal knowledge graph-based query method.
From the above technical solutions, when a query statement is received, the present invention can query a matrix of the query statement from a first preset dictionary and link to a weight vector of the query statement from an element list of a legal knowledge graph based on an attention mechanism; the introduction of the attention mechanism distinguishes the contribution rate of each word. The product of the matrix of the query statement and the weight vector is then calculated to obtain a first matrix corresponding to the query statement; a second matrix of SQL statements is queried from a second preset dictionary, and a third matrix of the element list is queried from a third preset dictionary; and the first matrix, the second matrix, and the third matrix are spliced to obtain a feature matrix. The feature matrix is input into an encoder, which comprises two BiGRU networks, to obtain output data of the encoder, and the output data of the encoder is processed by a decoder, which comprises four BiGRU networks, to obtain a machine query language. Because the structures of the encoder and the decoder are separately optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in a database, and a query result is output.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the legal knowledge graph-based query method of the present invention.
FIG. 2 is a functional block diagram of a preferred embodiment of the legal knowledge graph-based query device of the present invention.
FIG. 3 is a schematic structural diagram of an electronic device implementing the legal knowledge graph-based query method according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a preferred embodiment of the legal knowledge graph-based query method according to the present invention. The order of the steps in the flow chart may be changed, and some steps may be omitted, according to different needs.
The legal knowledge graph-based query method is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of human-computer interaction with a user, for example, a personal computer, a tablet computer, a smartphone, a Personal Digital Assistant (PDA), a game machine, an Interactive Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud computing (cloud computing) based cloud consisting of a large number of hosts or network servers.
The network where the electronic device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
S10, when a query sentence is received, querying a matrix of the query sentence from the first preset dictionary.
Wherein the query statement may be a legally relevant query statement, such as: "agent name of the prover" and the like.
The first preset dictionary can be custom-configured, and the first preset dictionary includes all words related to the query statement.
Therefore, the electronic device may directly perform a query in the first preset dictionary and determine the matrix of the query statement.
S11, linking to the weight vector of the query statement from the list of elements of the legal knowledge graph based on Attention mechanism (Attention).
For example, suppose the vector matrix of the query statement is V1 with embedding dimension dm = 256; if the query statement is a sentence of 10 words, V1 is a 10 × 256 matrix. Suppose the vector matrix of the element list names is V2, also with dm = 256; if the element list contains 100 keywords, V2 is a 100 × 256 matrix. The two matrices are multiplied, namely V1 × V2ᵀ: a 10 × 256 matrix multiplied by a 256 × 100 matrix, yielding a 10 × 100 matrix. The calculated matrix is then normalized: the 100 values in each dimension are added to obtain a 10 × 1 vector, the square root SQRT(SUM(V1 × V2ᵀ, axis=0)) of the summed values is calculated, and all ten values are divided by it, that is: SUM(V1 × V2ᵀ, axis=0) / SQRT(SUM(V1 × V2ᵀ, axis=0)). This produces a new 10 × 1 vector, and this vector is the weight vector W of the query statement.
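Under one reading of this example (summing over the keyword axis, which is axis 1 in NumPy's convention, so that one value remains per query word), the computation can be sketched with placeholder embeddings. The random matrices and variable names below are illustrative only, not the patent's trained parameters; non-negative values keep the SUM/SQRT(SUM) normalization well defined:

```python
import numpy as np

# Dimensions from the worked example: a 10-word query statement,
# a 100-keyword element list, embedding dimension dm = 256.
rng = np.random.default_rng(0)
dm = 256
V1 = rng.random((10, dm))     # vector matrix of the query statement
V2 = rng.random((100, dm))    # vector matrix of the element list names

scores = V1 @ V2.T            # 10 x 100: each query word vs. each keyword
s = scores.sum(axis=1)        # 10 values: sum the 100 scores per word
W = s / np.sqrt(s)            # normalization described in the example

print(scores.shape, W.shape)
```

A real system would replace the random matrices with embeddings looked up from the preset dictionaries.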
It will be appreciated that each word in the query statement has a different importance for the query. For example, for the query statement "agent name of the prover", when any one word is analyzed, the remaining words should not all be assigned the same attention. When the query statement is short, this causes no obvious problem; but when the query statement is long, if every word in it is represented by a single intermediate semantic vector, the information of each individual word is weakened or even disappears, and much detail is lost. The electronic device therefore introduces the attention mechanism.
Specifically, the electronic device links to a weight vector of the query statement from an element list of a legal knowledge graph based on the attention mechanism to distinguish the contribution of each word in the query statement to the query process, so that the query statement can be used more accurately for querying.
In at least one embodiment of the present invention, the legal knowledge-graph may include a variety of legally relevant features, such as: legal entities, features specific to legal relationships, and the like.
It should be noted that the conventional features are usually text statistic features, such as: text length features, word frequency statistics class features, and the like.
In comparison, the legal knowledge graph mainly includes, but is not limited to, the following two types:
(1) and the law abstract class characteristics are extracted according to laws, regulations, judicial interpretations and the like.
For example: whether the plaintiff and the defendant are natural persons, legal persons, or other organizations, the borrower's purpose for the loan, the interest arrangement selected in the loan contract, the form of loan delivery, and the like can be extracted from a loan contract.
In particular, the legal abstract class features are generalized in accordance with each legal sub-category.
(2) Features constructed according to the law theory.
For example: whether the contract-making process involves an invitation to offer, an offer, and an acceptance; whether the contract is made in written or oral form; whether the contract is a consensual contract or a real contract; whether the legal relationship established by the contract is a unilateral civil legal relationship or a multi-party civil legal relationship; whether the contract establishes an obligation to perform first; and the like.
Specifically, the characteristics constructed according to the law theory are obtained by combing and inducing according to the law theory.
In at least one embodiment of the invention, the legal knowledge graph is constructed in a list form, and each element in the legal knowledge graph is displayed in an element list form.
S12, calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
In at least one embodiment of the present invention, before performing a query using the query statement, the electronic device first needs to perform an initialization process on the query statement.
Specifically, the electronic device calculates a product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
S13, querying a second matrix of SQL (Structured Query Language) statements from a second predetermined dictionary, and querying a third matrix of the list of elements from a third predetermined dictionary.
In at least one embodiment of the present invention, the electronic device may perform a custom configuration on the second preset dictionary and the third preset dictionary.
The second preset dictionary comprises SQL sentences, and the third preset dictionary comprises each element in the element list.
It should be noted that, since dictionary construction techniques are relatively mature, they are not described in detail herein.
And S14, splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix.
In at least one embodiment of the present invention, the electronic device splicing the first matrix, the second matrix, and the third matrix to obtain a feature matrix includes:
the electronic device splices the first matrix, the second matrix, and the third matrix by means of horizontal splicing or vertical splicing to obtain the feature matrix.
Through the implementation mode, the obtained feature matrix has the feature attributes of a plurality of layers, and accurate query is facilitated.
And S15, inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks.
In at least one embodiment of the present invention, a language translation model is obtained by improving on the Seq2Seq (Sequence to Sequence) architecture and training it; the model includes, but is not limited to: an encoder and a decoder.
Further, before the feature matrix is input into the encoder to obtain the output data of the encoder, the method further includes:
the electronic device trains the encoder.
Specifically, each BiGRU network in the encoder includes a plurality of subunits, and the electronic device training the encoder includes:
for each subunit, at the initial moment, acquiring a preconfigured initialization value and an initial feature matrix, inputting the initialization value and the initial feature matrix into the subunit, and outputting an initial state; or
at each moment other than the initial moment, acquiring the output state of the previous moment and the current feature matrix, inputting the output state of the previous moment and the current feature matrix into the subunit, and outputting the current state.
The initialization value may be configured by self-definition, which is not limited in the present invention.
Further, the electronic device may acquire training data and construct the initial feature matrix and the current feature matrix from the training data in the same manner as the feature matrix is constructed.
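The per-moment recurrence described above can be sketched with a minimal single-direction GRU cell: the initial moment consumes a preconfigured initialization value, and every later moment consumes the state output at the previous moment together with the current feature vector. A BiGRU runs such a cell over the sequence in both directions and concatenates the results. All weights and sizes below are random placeholders, not the patent's trained parameters:

```python
import numpy as np

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One subunit step: combine the current feature vector with the
    output state of the previous moment to produce the current state."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(Wz @ x_t + Uz @ h_prev)            # update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)            # reset gate
    h_cand = np.tanh(Wh @ x_t + Uh @ (r * h_prev)) # candidate state
    return (1.0 - z) * h_prev + z * h_cand

rng = np.random.default_rng(1)
feat_dim, hidden, steps = 256, 128, 10
Wz, Wr, Wh = (0.01 * rng.standard_normal((hidden, feat_dim)) for _ in range(3))
Uz, Ur, Uh = (0.01 * rng.standard_normal((hidden, hidden)) for _ in range(3))
X = rng.standard_normal((steps, feat_dim))  # rows of the feature matrix

h = np.zeros(hidden)                        # preconfigured initialization value
states = []
for t in range(steps):                      # t = 0 uses h0; later steps use h_{t-1}
    h = gru_step(X[t], h, Wz, Uz, Wr, Ur, Wh, Uh)
    states.append(h)
```

The serialized `states` list corresponds to the output state of one direction of one BiGRU network.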
Further, the method further comprises:
the electronic equipment takes the output of the plurality of subunits after being serialized as the output state of each BiGRU network, and carries out vector splicing on the output state of each BiGRU network as the output data of the encoder;
and uploading the output data of the encoder to a block chain.
Corresponding digest information is obtained from the output data of the encoder; specifically, the digest information is obtained by hashing the output data of the encoder, for example, with the SHA-256 algorithm. Uploading the digest information to the blockchain ensures security, fairness, and transparency for the user. The user equipment may download the digest information from the blockchain to verify whether the output data of the encoder has been tampered with.
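A minimal sketch of this digest step, hashing placeholder encoder output with SHA-256 from the standard library (the array contents and shape below are illustrative only):

```python
import hashlib
import numpy as np

# Placeholder encoder output; a real system would hash the actual output data.
encoder_output = np.arange(100 * 256, dtype=np.float32).reshape(100, 256)

# Only this 64-character hex digest is uploaded to the blockchain; the user
# equipment later re-hashes downloaded output data and compares digests to
# detect tampering.
digest = hashlib.sha256(encoder_output.tobytes()).hexdigest()
print(digest)
```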
The blockchain referred to in this embodiment is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated with one another using cryptographic methods, where each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Wherein the electronic device performing vector splicing on the output state of each BiGRU network to form the output data of the encoder includes:
the electronic device performs vector splicing on the output states of the BiGRU networks by means of horizontal splicing or vertical splicing to form the output data of the encoder.
For example: when the output state of one BiGRU network is a 100 × 128 matrix and the output state of the other BiGRU network is also a 100 × 128 matrix, splicing them along dimension 0 (i.e., vertical splicing) yields a 200 × 128 matrix as the output data, while splicing them along dimension 1 (i.e., horizontal splicing) yields a 100 × 256 matrix.
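The shape arithmetic in this example can be checked directly with NumPy (zero matrices stand in for the actual output states):

```python
import numpy as np

a = np.zeros((100, 128))   # output state of one BiGRU network
b = np.zeros((100, 128))   # output state of the other BiGRU network

vertical = np.concatenate([a, b], axis=0)    # splicing in the 0 dimension
horizontal = np.concatenate([a, b], axis=1)  # splicing in the 1 dimension

print(vertical.shape, horizontal.shape)      # (200, 128) (100, 256)
```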
Through this implementation, training of the encoder is realized. Unlike the existing Seq2Seq architecture, the encoder is provided with two BiGRU networks, which remarkably improves the accuracy of model prediction.
And S16, processing the output data of the encoder by using a decoder to obtain the machine query language, wherein the decoder comprises four BiGRU networks.
In at least one embodiment of the present invention, the output data of the encoder is used as the input data of the decoder, and the query statement can be converted into the machine query language after the output data of the encoder is processed by the decoder.
Specifically, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing of the output data of the encoder by the electronic device using the decoder to obtain the machine query language includes:
the electronic equipment predicts a channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel, determines a word with the maximum probability in the channel to which each SQL word belongs as a participle corresponding to each SQL word based on an attention mechanism, and merges the participles corresponding to each SQL word to obtain the machine query language.
The electronic equipment predicts the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel, and comprises the following steps:
For each SQL word in the output data of the encoder, the electronic device acquires the probability value output by the word in the SQL channel, in the element list channel, and in the numerical channel, and determines the channel with the highest probability value as the channel of the next SQL word.
Further, the method further comprises:
When an SQL word in the output data of the encoder is a stop symbol, the electronic device controls the category prediction channel to stop prediction.
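A schematic sketch of the decoding loop described in S16, assuming illustrative channel names, words, and probabilities (in the real model these come from the four BiGRU channels, not hard-coded dicts):

```python
STOP = "<eos>"  # hypothetical stop symbol

def decode(step_probs):
    """step_probs: per decoding step, a dict mapping each channel
    (SQL / element list / numerical) to its best (word, probability)."""
    tokens = []
    for probs in step_probs:
        # the category prediction channel picks the channel whose best
        # word has the highest probability; that word is the participle
        channel = max(probs, key=lambda c: probs[c][1])
        word = probs[channel][0]
        if word == STOP:          # a stop symbol halts prediction
            break
        tokens.append(word)
    return " ".join(tokens)       # merge the participles into the query

query = decode([
    {"sql": ("SELECT", 0.9), "elements": ("agent_name", 0.4), "numeric": ("1", 0.1)},
    {"sql": ("FROM", 0.3), "elements": ("agent_name", 0.8), "numeric": ("2", 0.2)},
    {"sql": (STOP, 0.95), "elements": ("name", 0.1), "numeric": ("3", 0.1)},
])
print(query)  # SELECT agent_name
```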
Through this implementation, the query statement can be accurately converted based on the four channels in the decoder and the type of each channel. Compared with the existing Seq2Seq architecture, the decoder here has a different internal structure with four BiGRU networks, giving it higher accuracy.
It should be noted that the principle by which the electronic device trains the decoder is the same as the working principle of the decoder, except that a large amount of training data is used as the training basis; details are not repeated herein.
In at least one embodiment of the present invention, the encoder and the decoder are constructed into a language conversion model according to an attention mechanism and a cross-entropy function, the method further comprising:
The electronic device calculates a first loss of the category prediction channel and a second loss based on the weight vector of the query statement linked via the attention mechanism, and calculates the sum of the first loss and the second loss as the loss function of the language conversion model. The electronic device then optimizes the loss function using a configuration optimization algorithm.
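A NumPy sketch of summing the two losses. The cross-entropy over channels stands in for the first loss; the squared-error term over the attention weight vector is an assumed form of the second loss, and all logits and targets are illustrative stand-ins:

```python
import numpy as np

def cross_entropy(logits, target):
    # standard softmax cross-entropy, averaged over decoding steps
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(target)), target].mean()

rng = np.random.default_rng(1)
channel_logits = rng.standard_normal((5, 4))   # 5 decoding steps, 4 channels
channel_target = np.array([0, 1, 1, 2, 0])     # illustrative true channels
first_loss = cross_entropy(channel_logits, channel_target)

attn_weights = rng.random(10)                  # weight vector of the query statement
attn_reference = rng.random(10)                # assumed reference for the second loss
second_loss = ((attn_weights - attn_reference) ** 2).mean()

# loss function of the language conversion model: sum of the two losses
total_loss = first_loss + second_loss
```

Any standard optimizer (e.g. gradient descent on the model parameters) can then be applied to `total_loss`, matching the "any loss function optimization algorithm" wording below.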
The configuration optimization algorithm may be any loss function optimization algorithm, and the present invention is not limited thereto.
Through this implementation, both the accuracy of the conversion result of the language conversion model and its intermediate process can be evaluated, taking into account the probability of selecting a channel and the probability of generating a word. The language conversion model can thus be optimized more precisely, which further improves its generalization ability and interpretability and makes its conversion results more accurate.
S17: the machine query language is executed in the database, and a query result is output.
After the machine query language is obtained, the electronic device can query the database based on it. Since the machine query language obtained through the language conversion model is more accurate, the final query result is more accurate and reliable.
From the above technical solutions, when a query statement is received, the present invention queries the matrix of the query statement from a first preset dictionary and links to the weight vector of the query statement from the element list of a legal knowledge graph based on an attention mechanism, where the attention mechanism distinguishes the contribution rate of each word. The product of the matrix of the query statement and the weight vector is then calculated to obtain a first matrix corresponding to the query statement. A second matrix of SQL statements is queried from a second preset dictionary, and a third matrix of the element list is queried from a third preset dictionary. The first matrix, the second matrix, and the third matrix are spliced to obtain a feature matrix, which is input into an encoder containing two BiGRU networks to obtain the output data of the encoder. The output data of the encoder is processed by a decoder containing four BiGRU networks to obtain a machine query language. Since the structures of the encoder and the decoder are each optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in a database, and a query result is output. The method can be applied to scenarios such as smart courts, thereby promoting the construction of smart cities.
Fig. 2 is a functional block diagram of a preferred embodiment of the query device based on the legal knowledge graph according to the present invention. The query device 11 based on the legal knowledge graph comprises a query unit 110, a link unit 111, a calculation unit 112, a splicing unit 113, an input unit 114, a processing unit 115, an execution unit 116, a determination unit 117, a control unit 118, and an optimization unit 119. A module/unit referred to in the present invention is a series of computer program segments that can be executed by the processor 13, can perform a fixed function, and are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
When a query statement is received, the query unit 110 queries a matrix of the query statement from a first preset dictionary.
Wherein the query statement may be a legally relevant query statement, such as: "agent name of the prover" and the like.
The first preset dictionary can be configured in a user-defined mode, and the first preset dictionary comprises all words related to the query sentence.
Therefore, the query unit 110 may directly perform a query in the first preset dictionary and determine the matrix of the query statement.
The linking unit 111 links the weight vector of the query statement from the list of elements of the legal knowledge base based on the Attention mechanism (Attention).
For example, suppose the vector matrix of the question query is V1 (of dimension dm). If the question query is a sentence of 10 words and dm = 256, V1 is a 10 × 256 matrix. Suppose the vector matrix of the element list names is V2 (also of dimension dm); with 100 keywords in the element list and dm = 256, V2 is a 100 × 256 matrix. The two matrices are multiplied, namely V1 × V2ᵀ: the 10 × 256 matrix is multiplied by the 256 × 100 matrix to obtain a 10 × 100 matrix. The result is then normalized: the 100 values in each row are added to obtain a 10 × 1 vector, the root mean square SQRT(SUM(V1 × V2ᵀ, axis=0)) of the 10 values is calculated, and all ten values are divided by the root mean square, that is, SUM(V1 × V2ᵀ, axis=0) / SQRT(SUM(V1 × V2ᵀ, axis=0)). This yields a new 10 × 1 vector, which is the weight vector W of the question query.
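A minimal NumPy sketch of the recipe above, taking the normalization formula literally (sum the per-word scores over the keyword axis, then divide each by its root mean square); the random matrices are stand-ins for the real word embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
V1 = rng.random((10, 256))   # question query: 10 words, dm = 256 (stand-in embeddings)
V2 = rng.random((100, 256))  # element list: 100 keywords, dm = 256 (stand-in embeddings)

S = V1 @ V2.T                # V1 x V2^T -> 10 x 100 similarity matrix
s = S.sum(axis=1)            # add the 100 values per word -> vector of 10 scores
W = s / np.sqrt(s)           # divide by the root mean square, as written in the text

print(W.shape)               # the weight vector W of the question query
```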
It will be appreciated that each word in the query statement has a different importance for the query. For example, for the query statement "agent name of the prover", the other words should not all be assigned the same attention when a given word is analyzed. When the query statement is short, this causes no obvious problem; but when it is long, representing each word by a single intermediate semantic vector weakens or even erases the information of the individual words and loses much detail. The linking unit 111 therefore introduces the attention mechanism.
Specifically, the linking unit 111 links the weight vector of the query statement from the element list of the legal knowledge base based on an attention mechanism to distinguish the contribution of each word in the query statement to the query process, so as to more accurately query with the query statement.
In at least one embodiment of the present invention, the legal knowledge-graph may include a variety of legally relevant features, such as: legal entities, features specific to legal relationships, and the like.
It should be noted that the conventional features are usually text statistic features, such as: text length features, word frequency statistics class features, and the like.
In comparison, the features in the legal knowledge graph mainly include, but are not limited to, the following two types:
(1) Legal abstract features, which are extracted according to laws, regulations, judicial interpretations, and the like.
For example, whether the plaintiff and the defendant are natural persons, legal persons, or other organizations, the borrowing intention of the borrower, the interest mode selected in the borrowing contract, the form of loan delivery, and the like can all be extracted from the borrowing contract.
In particular, the legal abstract class features are generalized in accordance with each legal sub-category.
(2) Features constructed according to legal theory.
For example: whether the contract-making process involves an invitation to offer, an offer, and an acceptance; whether the contract is made in written or oral form; whether it is a consensual contract or a practical contract; whether the legal relationship established by the contract is a single-party or multi-party civil legal relationship; whether the contract establishes an obligation to perform first; and the like.
Specifically, the features constructed according to legal theory are obtained by organizing and summarizing legal theory.
In at least one embodiment of the invention, the legal knowledge graph is constructed in a list form, and each element in the legal knowledge graph is displayed in an element list form.
The calculating unit 112 calculates a product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
In at least one embodiment of the present invention, before performing a query using the query statement, the computing unit 112 first needs to perform an initialization process on the query statement.
Specifically, the calculating unit 112 calculates a product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement.
The Query unit 110 queries a second matrix of SQL (Structured Query Language) statements from a second predetermined dictionary, and queries a third matrix of the list of elements from a third predetermined dictionary.
In at least one embodiment of the present invention, the querying unit 110 may perform a custom configuration on the second predetermined dictionary and the third predetermined dictionary.
The second preset dictionary comprises SQL sentences, and the third preset dictionary comprises each element in the element list.
It should be noted that, since dictionary construction technology is relatively mature, it is not described in detail herein.
The splicing unit 113 splices the first matrix, the second matrix, and the third matrix to obtain a feature matrix.
In at least one embodiment of the present invention, the splicing, by the splicing unit 113, of the first matrix, the second matrix, and the third matrix to obtain the feature matrix includes:
the splicing unit 113 splices the first matrix, the second matrix and the third matrix in a transverse splicing or longitudinal splicing manner to obtain a feature matrix.
The query device 11 based on the legal knowledge graph further includes an uploading unit that uploads the output data of the encoder to a blockchain.
Through the implementation mode, the obtained feature matrix has the feature attributes of a plurality of layers, and accurate query is facilitated.
The input unit 114 inputs the feature matrix into an encoder, which includes two BiGRU networks, to obtain output data of the encoder.
In at least one embodiment of the present invention, the input unit 114 improves on the Seq2Seq (Sequence to Sequence) architecture and trains it to obtain a language conversion model, which includes, but is not limited to, an encoder and a decoder.
Further, before the feature matrix is input into the encoder to obtain the output data of the encoder, the encoder is trained.
Specifically, each BiGRU network in the encoder includes a plurality of sub-units, and for each sub-unit, at an initial time, the input unit 114 acquires a pre-configured initialization value and an initial feature matrix, inputs the initialization value and the initial feature matrix into the sub-unit, and outputs an initial state; or
At other times except the initial time, the input unit 114 obtains the output state at the previous time and obtains the current feature matrix, and inputs the output state at the previous time and the current feature matrix into the subunit to output the current state.
The initialization value may be configured by self-definition, which is not limited in the present invention.
Further, the input unit 114 may obtain training data, and combine the training data to construct the initial feature matrix and the current feature matrix in a manner of constructing the feature matrix.
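The recurrence described above (a pre-configured initialization value at the initial time, the previous output state at later times) can be sketched as follows; the tanh update is a stand-in for a real GRU cell, and the feature matrices are illustrative:

```python
import numpy as np

def subunit(prev_state, feature):
    # stand-in for one BiGRU sub-unit: previous state and current
    # feature matrix in, current state out
    return np.tanh(prev_state + feature)

T, dm = 4, 8
features = [np.full(dm, 0.1 * t) for t in range(T)]  # illustrative feature matrices

state = np.zeros(dm)   # pre-configured initialization value (initial time)
states = []
for t in range(T):     # at other times, the previous output state is fed back in
    state = subunit(state, features[t])
    states.append(state)

print(len(states), states[-1].shape)
```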
Further, the determining unit 117 uses the output of the plurality of serial subunits as the output state of each BiGRU network, and the splicing unit 113 performs vector splicing on the output state of each BiGRU network as the output data of the encoder.
The splicing unit 113 performing vector splicing on the output state of each BiGRU network as the output data of the encoder includes:
the splicing unit 113 performs vector splicing on the output state of each BiGRU network in a transverse splicing or longitudinal splicing manner, and uses the vector splicing as the output data of the encoder.
For example, when the output state of one BiGRU network is a 100 × 128 matrix and the output state of the other BiGRU network is also 100 × 128, splicing along dimension 0 (i.e., vertical splicing) yields a 200 × 128 matrix as the output data, while splicing along dimension 1 (i.e., horizontal splicing) yields a 100 × 256 matrix.
Through this implementation, the training of the encoder can be realized. Unlike the existing Seq2Seq architecture, the encoder here is provided with two BiGRU networks, which significantly improves the accuracy of model prediction.
The processing unit 115 processes the output data of the encoder with a decoder comprising four BiGRU networks, resulting in a machine query language.
In at least one embodiment of the present invention, the output data of the encoder is used as the input data of the decoder, and the query statement can be converted into the machine query language after the output data of the encoder is processed by the decoder.
Specifically, the four BiGRU networks are respectively a category prediction channel, an SQL channel, an element list channel, and a numerical channel, and the processing unit 115 processes the output data of the encoder by using a decoder to obtain the machine query language includes:
The processing unit 115 predicts the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel. Based on an attention mechanism, the word with the highest probability in the channel to which each SQL word belongs is determined as the participle corresponding to that SQL word, and the processing unit 115 merges the participles corresponding to all SQL words to obtain the machine query language.
Wherein the predicting, by the processing unit 115, the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel includes:
for each SQL word in the output data of the encoder, the processing unit 115 obtains a probability value output by the word in the SQL channel, a probability value output by the element list channel, and a probability value output by the numerical channel, and determines the channel with the highest probability value as the channel of the next SQL word.
Further, when an SQL word in the output data of the encoder is a stop symbol, the control unit 118 controls the category prediction channel to stop prediction.
Through this implementation, the query statement can be accurately converted based on the four channels in the decoder and the type of each channel. Compared with the existing Seq2Seq architecture, the decoder here has a different internal structure with four BiGRU networks, giving it higher accuracy.
It should be noted that the principle of training the decoder is the same as the working principle of the decoder, except that a large amount of training data is used as the training basis; details are not repeated herein.
In at least one embodiment of the present invention, the encoder and the decoder are constructed into a language conversion model according to an attention mechanism and a cross-entropy function. The calculation unit 112 calculates a first loss of the category prediction channel and a second loss based on the weight vector of the query statement linked via the attention mechanism, and calculates the sum of the first loss and the second loss as the loss function of the language conversion model. The optimization unit 119 optimizes the loss function using a configuration optimization algorithm.
The configuration optimization algorithm may be any loss function optimization algorithm, and the present invention is not limited thereto.
Through this implementation, both the accuracy of the conversion result of the language conversion model and its intermediate process can be evaluated, taking into account the probability of selecting a channel and the probability of generating a word. The language conversion model can thus be optimized more precisely, which further improves its generalization ability and interpretability and makes its conversion results more accurate.
The execution unit 116 executes the machine query language in the database and outputs a query result.
After obtaining the machine query language, the execution unit 116 may perform query in the database based on the machine query language, and since the machine query language obtained through the language conversion model is more accurate, the finally obtained query result is more accurate and reliable.
From the above technical solutions, when a query statement is received, the present invention queries the matrix of the query statement from a first preset dictionary and links to the weight vector of the query statement from the element list of a legal knowledge graph based on an attention mechanism, where the attention mechanism distinguishes the contribution rate of each word. The product of the matrix of the query statement and the weight vector is then calculated to obtain a first matrix corresponding to the query statement. A second matrix of SQL statements is queried from a second preset dictionary, and a third matrix of the element list is queried from a third preset dictionary. The first matrix, the second matrix, and the third matrix are spliced to obtain a feature matrix, which is input into an encoder containing two BiGRU networks to obtain the output data of the encoder. The output data of the encoder is processed by a decoder containing four BiGRU networks to obtain a machine query language. Since the structures of the encoder and the decoder are each optimized, the conversion of the query statement is more accurate and stable. The machine query language is then executed in a database, and a query result is output.
Fig. 3 is a schematic structural diagram of an electronic device for implementing the query method based on the legal knowledge graph according to a preferred embodiment of the present invention.
The electronic device 1 may comprise a memory 12, a processor 13, and a bus, and may further comprise a computer program, such as a query program based on the legal knowledge graph, stored in the memory 12 and executable on the processor 13.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the electronic device 1 and does not constitute a limitation on it. The electronic device 1 may have a bus-type or star-type structure, may include more or fewer hardware or software components than shown, or may have a different arrangement of components; for example, it may further include input/output devices, network access devices, and the like.
It should be noted that the electronic device 1 is only an example, and other existing or future electronic products, such as those that can be adapted to the present invention, should also be included in the scope of the present invention, and are included herein by reference.
The memory 12 includes at least one type of readable storage medium, such as flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory), magnetic memory, magnetic disks, and optical disks. In some embodiments, the memory 12 may be an internal storage unit of the electronic device 1, for example a hard disk of the electronic device 1. In other embodiments, the memory 12 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the electronic device 1. Further, the memory 12 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the query program based on the legal knowledge graph, but also to temporarily store data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the electronic device 1, connects various components of the electronic device 1 by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (for example, executing a query program based on a legal knowledge map, and the like) stored in the memory 12 and calling data stored in the memory 12.
The processor 13 executes the operating system of the electronic device 1 and various installed application programs. By executing the application program, the processor 13 implements the steps in each of the above embodiments of the query method based on the legal knowledge graph, such as steps S10, S11, S12, S13, S14, S15, S16, and S17 shown in Fig. 1.
Alternatively, the processor 13, when executing the computer program, implements the functions of the modules/units in the above device embodiments, for example:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
link to a weight vector of the query statement from a list of elements of a legal knowledge graph based on an attention mechanism;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of SQL sentences from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
inputting the characteristic matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and executing the machine query language in a database, and outputting a query result.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the electronic device 1. For example, the computer program may be divided into a query unit 110, a link unit 111, a calculation unit 112, a concatenation unit 113, an input unit 114, a processing unit 115, an execution unit 116, a determination unit 117, a control unit 118, an optimization unit 119.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute parts of the methods according to the embodiments of the present invention.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one arrow is shown in FIG. 3, but this does not indicate only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
Although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 only shows the electronic device 1 with components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
In conjunction with Fig. 1, the memory 12 in the electronic device 1 stores a plurality of instructions to implement the query method based on the legal knowledge graph, and the processor 13 can execute the plurality of instructions to implement:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
link to a weight vector of the query statement from a list of elements of a legal knowledge graph based on an attention mechanism;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of SQL sentences from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
inputting the characteristic matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and executing the machine query language in a database, and outputting a query result.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention is described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope.

Claims (10)

1. A legal knowledge graph-based query method, comprising:
when a query statement is received, querying a matrix of the query statement from a first preset dictionary;
linking, based on an attention mechanism, to a weight vector of the query statement from an element list of a legal knowledge graph;
calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
querying a second matrix of SQL sentences from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and executing the machine query language in a database, and outputting a query result.
2. The legal knowledge graph-based query method of claim 1, wherein each BiGRU network in the encoder comprises a plurality of subunits, the method further comprising:
for each subunit, at the initial moment, acquiring a pre-configured initialization value and an initial feature matrix, inputting the initialization value and the initial feature matrix into the subunit, and outputting an initial state; or
at moments other than the initial moment, acquiring the output state of the previous moment and the current feature matrix, inputting the output state of the previous moment and the current feature matrix into the subunit, and outputting the current state.
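The subunit recurrence of claim 2 can be sketched as a standard GRU step (a minimal NumPy sketch, not the claimed network; the zero initialization value and all dimensions are assumptions):

```python
import numpy as np

def gru_step(h_prev, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One subunit step: previous output state + current feature -> current state."""
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_tilde

hidden, feat = 4, 3
rng = np.random.default_rng(0)
W = [rng.standard_normal((hidden, feat)) for _ in range(3)]    # Wz, Wr, Wh
U = [rng.standard_normal((hidden, hidden)) for _ in range(3)]  # Uz, Ur, Uh

h = np.zeros(hidden)  # pre-configured initialization value (zeros is an assumption)
for t in range(5):    # at later moments: previous state + current feature matrix
    x_t = rng.standard_normal(feat)
    h = gru_step(h, x_t, W[0], U[0], W[1], U[1], W[2], U[2])
print(h.shape)  # (4,)
```

At the initial moment the pre-configured value plays the role of `h_prev`; thereafter each step feeds the previous output state back in.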
3. The legal knowledge graph-based query method of claim 2, wherein the method further comprises:
serializing the outputs of the plurality of subunits to obtain the output state of each BiGRU network;
performing vector splicing on the output state of each BiGRU network to obtain the output data of the encoder;
and uploading the output data of the encoder to a blockchain.
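The serialization and vector splicing of claim 3 amount to two levels of concatenation (an illustrative sketch only; state sizes and counts are assumptions):

```python
import numpy as np

# Hypothetical per-subunit output states of the two BiGRU networks (sizes assumed)
network_a_states = [np.full(4, float(t)) for t in range(3)]
network_b_states = [np.full(4, float(-t)) for t in range(3)]

# Serialize each network's subunit outputs into that network's output state
out_a = np.concatenate(network_a_states)  # shape (12,)
out_b = np.concatenate(network_b_states)  # shape (12,)

# Vector-splice the two output states to form the encoder's output data
encoder_output = np.concatenate([out_a, out_b])
print(encoder_output.shape)  # (24,)
```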
4. The legal knowledge graph-based query method of claim 1, wherein the four BiGRU networks are a category prediction channel, an SQL channel, an element list channel and a numerical channel, respectively, and the processing the output data of the encoder by using a decoder to obtain the machine query language comprises:
predicting the channel to which each SQL word belongs in the output data of the encoder by using the category prediction channel;
determining, based on the attention mechanism, the word with the maximum probability in the channel to which each SQL word belongs as the token corresponding to that SQL word;
and combining the tokens corresponding to each SQL word to obtain the machine query language.
5. The legal knowledge graph-based query method of claim 4, wherein the predicting the channel to which each SQL word belongs in the output data of the encoder using the category prediction channel comprises:
for each SQL word in the output data of the encoder, obtaining the probability value output by the word in the SQL channel, the probability value output by the element list channel and the probability value output by the value channel;
and determining the channel with the maximum probability value as the channel of the next SQL word.
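The channel selection of claim 5 is an argmax over the three channels' probability values at each decoding step (a non-limiting sketch; all probability values below are illustrative assumptions):

```python
def pick_channel(probs):
    """The channel with the maximum probability value supplies the next SQL word."""
    return max(probs, key=probs.get)

# Hypothetical per-step probability values from the SQL, element-list, and value channels
steps = [
    {"sql": 0.62, "element_list": 0.25, "value": 0.13},
    {"sql": 0.10, "element_list": 0.80, "value": 0.10},
    {"sql": 0.55, "element_list": 0.05, "value": 0.40},
]
chosen = [pick_channel(p) for p in steps]
print(chosen)  # ['sql', 'element_list', 'sql']
```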
6. The legal knowledge graph-based query method of claim 4, wherein the method further comprises:
and when the SQL word in the output data of the encoder is a stop sign, controlling the category prediction channel to stop prediction.
7. The legal knowledge graph-based query method of claim 4, wherein the encoder and the decoder are constructed into a language translation model according to an attention mechanism and a cross-entropy function, the method further comprising:
calculating a first loss of the category prediction channel, and calculating a second loss of the weight vector of the query statement linked by the attention mechanism;
calculating a sum of the first loss and the second loss as a loss function of the language translation model;
and optimizing the loss function by adopting a configuration optimization algorithm.
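The summed loss of claim 7 can be sketched with a simple cross-entropy on each head (an illustrative sketch, not the claimed model; the prediction vectors and target indices are assumptions):

```python
import numpy as np

def cross_entropy(pred, target_idx, eps=1e-12):
    """Cross-entropy of one softmax prediction against a target index."""
    return -np.log(pred[target_idx] + eps)

# Hypothetical softmax outputs (values are illustrative assumptions)
category_pred = np.array([0.7, 0.2, 0.1])   # category prediction channel
attention_pred = np.array([0.1, 0.6, 0.3])  # attention weight vector head

first_loss = cross_entropy(category_pred, 0)
second_loss = cross_entropy(attention_pred, 1)

# The language translation model's loss function is the sum of the two losses
total_loss = first_loss + second_loss
print(round(float(total_loss), 4))  # 0.8675
```

The configured optimization algorithm (e.g., gradient descent) would then minimize `total_loss`.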
8. A legal knowledge graph-based query device, the device comprising:
the query unit is used for querying a matrix of the query statement from a first preset dictionary when the query statement is received;
a linking unit for linking, based on an attention mechanism, to a weight vector of the query statement from an element list of a legal knowledge graph;
the calculation unit is used for calculating the product of the matrix of the query statement and the weight vector to obtain a first matrix corresponding to the query statement;
the query unit is further used for querying a second matrix of the SQL statement from a second preset dictionary and querying a third matrix of the element list from a third preset dictionary;
the splicing unit is used for splicing the first matrix, the second matrix and the third matrix to obtain a feature matrix;
the input unit is used for inputting the feature matrix into an encoder to obtain output data of the encoder, wherein the encoder comprises two BiGRU networks;
the processing unit is used for processing the output data of the encoder by using a decoder to obtain a machine query language, wherein the decoder comprises four BiGRU networks;
and the execution unit is used for executing the machine query language in the database and outputting a query result.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement a legal knowledge graph-based query method as recited in any one of claims 1 to 7.
10. A computer-readable storage medium characterized by: the computer-readable storage medium has stored therein at least one instruction that is executable by a processor in an electronic device to implement the legal knowledge graph-based query method of any one of claims 1-7.
CN202010334998.4A 2020-04-24 2020-04-24 Query method and device based on legal knowledge graph, electronic equipment and medium Active CN111639153B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010334998.4A CN111639153B (en) 2020-04-24 2020-04-24 Query method and device based on legal knowledge graph, electronic equipment and medium
PCT/CN2020/104968 WO2021212683A1 (en) 2020-04-24 2020-07-27 Law knowledge map-based query method and apparatus, and electronic device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010334998.4A CN111639153B (en) 2020-04-24 2020-04-24 Query method and device based on legal knowledge graph, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN111639153A true CN111639153A (en) 2020-09-08
CN111639153B CN111639153B (en) 2024-07-02

Family

ID=72333231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010334998.4A Active CN111639153B (en) 2020-04-24 2020-04-24 Query method and device based on legal knowledge graph, electronic equipment and medium

Country Status (2)

Country Link
CN (1) CN111639153B (en)
WO (1) WO2021212683A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112328960A (en) * 2020-11-02 2021-02-05 中国平安财产保险股份有限公司 Data operation optimization method and device, electronic equipment and storage medium
CN113221975A (en) * 2021-04-26 2021-08-06 中国科学技术大学先进技术研究院 Working condition construction method based on improved Markov analysis method and storage medium
CN115455149A (en) * 2022-09-20 2022-12-09 城云科技(中国)有限公司 Database construction method based on coding query mode and application thereof

Families Citing this family (6)

Publication number Priority date Publication date Assignee Title
CN115063666A (en) * 2022-07-06 2022-09-16 京东科技信息技术有限公司 Decoder training method, target detection method, device and storage medium
CN115658926B (en) * 2022-11-21 2023-05-05 中国科学院自动化研究所 Element estimation method and device of knowledge graph, electronic equipment and storage medium
CN116258521A (en) * 2022-12-02 2023-06-13 东莞盟大集团有限公司 Secondary node identification application integral management method based on blockchain technology
CN115983379B (en) * 2023-03-20 2023-10-10 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Reachable path query method and system of MDTA knowledge graph
CN116225973B (en) * 2023-05-10 2023-06-30 贵州轻工职业技术学院 Chip code testing method and device based on embedded implementation electronic equipment
CN117743590B (en) * 2023-11-30 2024-07-26 北京汉勃科技有限公司 Legal assistance method and system based on large language model

Citations (5)

Publication number Priority date Publication date Assignee Title
US20180285740A1 (en) * 2017-04-03 2018-10-04 Royal Bank Of Canada Systems and methods for malicious code detection
CN109766355A (en) * 2018-12-28 2019-05-17 上海汇付数据服务有限公司 A kind of data query method and system for supporting natural language
CN109977200A (en) * 2019-01-25 2019-07-05 上海凯岸信息科技有限公司 Speech polling assistant based on SQL Auto
CN110489102A (en) * 2019-07-29 2019-11-22 东北大学 A method of Python code is automatically generated from natural language
CN110945495A (en) * 2017-05-18 2020-03-31 易享信息技术有限公司 Conversion of natural language queries to database queries based on neural networks

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN107291871B (en) * 2017-06-15 2021-02-19 北京百度网讯科技有限公司 Matching degree evaluation method, device and medium for multi-domain information based on artificial intelligence
CN107943874B (en) * 2017-11-13 2019-08-23 平安科技(深圳)有限公司 Knowledge mapping processing method, device, computer equipment and storage medium
CN109656952B (en) * 2018-10-31 2021-04-13 北京百度网讯科技有限公司 Query processing method and device and electronic equipment
CN110990536A (en) * 2019-12-06 2020-04-10 重庆邮电大学 CQL generation method based on BERT and knowledge graph perception

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US20180285740A1 (en) * 2017-04-03 2018-10-04 Royal Bank Of Canada Systems and methods for malicious code detection
CN110945495A (en) * 2017-05-18 2020-03-31 易享信息技术有限公司 Conversion of natural language queries to database queries based on neural networks
CN109766355A (en) * 2018-12-28 2019-05-17 上海汇付数据服务有限公司 A kind of data query method and system for supporting natural language
CN109977200A (en) * 2019-01-25 2019-07-05 上海凯岸信息科技有限公司 Speech polling assistant based on SQL Auto
CN110489102A (en) * 2019-07-29 2019-11-22 东北大学 A method of Python code is automatically generated from natural language

Non-Patent Citations (2)

Title
Ben Bogin et al., "Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing", arXiv:1905.06241v2 [cs.CL], 3 June 2019, pages 1-7 *
Dongjun Lee et al., "Clause-Wise and Recursive Decoding for Complex and Cross-Domain Text-to-SQL Generation", arXiv:1904.08835v2 [cs.CL], 19 August 2019, pages 1-7 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN112328960A (en) * 2020-11-02 2021-02-05 中国平安财产保险股份有限公司 Data operation optimization method and device, electronic equipment and storage medium
CN112328960B (en) * 2020-11-02 2023-09-19 中国平安财产保险股份有限公司 Optimization method and device for data operation, electronic equipment and storage medium
CN113221975A (en) * 2021-04-26 2021-08-06 中国科学技术大学先进技术研究院 Working condition construction method based on improved Markov analysis method and storage medium
CN115455149A (en) * 2022-09-20 2022-12-09 城云科技(中国)有限公司 Database construction method based on coding query mode and application thereof
CN115455149B (en) * 2022-09-20 2023-05-30 城云科技(中国)有限公司 Database construction method based on coding query mode and application thereof

Also Published As

Publication number Publication date
WO2021212683A1 (en) 2021-10-28
CN111639153B (en) 2024-07-02

Similar Documents

Publication Publication Date Title
CN111639153A (en) Query method and device based on legal knowledge graph, electronic equipment and medium
CN112541338A (en) Similar text matching method and device, electronic equipment and computer storage medium
CN111460797B (en) Keyword extraction method and device, electronic equipment and readable storage medium
CN113821622B (en) Answer retrieval method and device based on artificial intelligence, electronic equipment and medium
CN112883190A (en) Text classification method and device, electronic equipment and storage medium
CN114461777B (en) Intelligent question-answering method, device, equipment and storage medium
CN112507663A (en) Text-based judgment question generation method and device, electronic equipment and storage medium
CN112559687A (en) Question identification and query method and device, electronic equipment and storage medium
CN113807973B (en) Text error correction method, apparatus, electronic device and computer readable storage medium
CN113887941B (en) Business process generation method, device, electronic equipment and medium
CN112528013A (en) Text abstract extraction method and device, electronic equipment and storage medium
CN116821373A (en) Map-based prompt recommendation method, device, equipment and medium
CN113627160B (en) Text error correction method and device, electronic equipment and storage medium
CN113344125B (en) Long text matching recognition method and device, electronic equipment and storage medium
CN116521867A (en) Text clustering method and device, electronic equipment and storage medium
CN112347739A (en) Application rule analysis method and device, electronic equipment and storage medium
CN114548114B (en) Text emotion recognition method, device, equipment and storage medium
CN111414452B (en) Search word matching method and device, electronic equipment and readable storage medium
CN113449037B (en) AI-based SQL engine calling method, device, equipment and medium
CN113221578B (en) Disease entity retrieval method, device, equipment and medium
CN115346095A (en) Visual question answering method, device, equipment and storage medium
CN115146064A (en) Intention recognition model optimization method, device, equipment and storage medium
CN113704616A (en) Information pushing method and device, electronic equipment and readable storage medium
CN112214594A (en) Text briefing generation method and device, electronic equipment and readable storage medium
CN115221875B (en) Word weight generation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant