CN115618019A - Knowledge graph construction method and device and terminal equipment - Google Patents

Knowledge graph construction method and device and terminal equipment

Info

Publication number
CN115618019A
CN115618019A (application number CN202211394380.2A)
Authority
CN
China
Prior art keywords
sequence
feature
layer
knowledge
adopting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211394380.2A
Other languages
Chinese (zh)
Inventor
陈曦
张鹏飞
孙思思
姜丹
李刚
卢艳艳
路欣
刘明硕
刘汝坤
尹晓宇
李涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
North China Electric Power University
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
North China Electric Power University
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, North China Electric Power University, and Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Priority to CN202211394380.2A
Publication of CN115618019A
Legal status: Pending (current)


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q 40/12 Accounting
    • G06Q 40/125 Finance or payroll

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Finance (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Business, Economics & Management (AREA)
  • Animal Behavior & Ethology (AREA)
  • Technology Law (AREA)
  • Databases & Information Systems (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Machine Translation (AREA)

Abstract

The application is applicable to the technical field of data processing, and provides a knowledge graph construction method, a knowledge graph construction device and terminal equipment. The method comprises the following steps: acquiring knowledge data, and encoding a feature vector of the knowledge data by adopting an embedding layer, wherein the feature vector is an identifier obtained after the knowledge data is encoded; extracting a first sequence feature in the feature vector by using a first sequence feature extraction model, and extracting a second sequence feature in the feature vector by using a second sequence feature extraction model; splicing the first sequence feature and the second sequence feature to obtain a third sequence feature; and selecting a third sequence feature conforming to the expression logic by adopting a CRF layer to obtain a triple sequence, wherein the triple sequence is composed of an entity-relation-entity, and a plurality of triple sequences form a complete knowledge graph. The method and the device can reduce the influence of interfering entity information and enhance the knowledge graph feature extraction capability, thereby completing entity identification and relationship extraction of audit rules and further constructing a knowledge graph conforming to financial audit rules.

Description

Knowledge graph construction method and device and terminal equipment
Technical Field
The application belongs to the technical field of data processing, and particularly relates to a knowledge graph construction method and device and terminal equipment.
Background
Financial auditing is an important component of the economic supervision system. By extracting the entity relations involved in the audit process, such as business rules, regulations and financial statements, abstract semantic data in audit rules can be converted into intuitive knowledge graph information. This not only makes it convenient for auditors to learn, analyze and judge problem types, problem frequency and problem expression forms, but also enables reasonable fusion and cross-domain analysis of audit information, providing valuable reference for digital auditing.
Traditional knowledge graph mining and construction usually adopts model construction based on features or model construction based on kernel functions. A feature-based model usually requires designing a large number of grammatical, semantic and syntactic features, which are then fed into a classifier such as a Support Vector Machine (SVM) for classification; building applicable features consumes a large amount of time and effort. A kernel-function-based model does not require constructing various features, but designing and selecting a proper kernel function is very difficult. As a result, the constructed knowledge graph has low identification accuracy and poor analysis capability.
Therefore, a knowledge graph construction method is needed to reduce the influence of interfering entity information and enhance the knowledge graph feature extraction capability, so as to complete entity identification and relationship extraction of the audit rule, and further construct a knowledge graph conforming to the financial audit rule.
Disclosure of Invention
In order to overcome the problems in the related art, embodiments of the present application provide a method, an apparatus, and a terminal device for constructing a knowledge graph, so as to reduce the influence of interfering entity information and enhance the feature extraction capability of the knowledge graph, thereby completing entity identification and relationship extraction of an audit rule, and further constructing a knowledge graph conforming to a financial audit rule.
The application is realized by the following technical scheme:
in a first aspect, an embodiment of the present application provides a method for constructing a knowledge graph, including: acquiring knowledge data, and coding a feature vector of the knowledge data by adopting an embedded layer, wherein the feature vector is an identifier obtained after the knowledge data is coded; extracting a first sequence feature in the feature vector by adopting a first sequence feature extraction model, and extracting a second sequence feature in the feature vector by adopting a second sequence feature extraction model; splicing the first sequence feature and the second sequence feature to obtain a third sequence feature; and selecting a third sequence characteristic which accords with the expression logic by adopting a CRF layer to obtain a triple sequence, wherein the triple sequence is composed of an entity-relation-entity, and a plurality of triple sequences form a complete knowledge graph.
In a possible implementation manner of the first aspect, the first sequence feature extraction model includes a first BiGRU layer and a second BiGRU layer, and the first BiGRU layer and the second BiGRU layer are connected in a stacking manner, where the first BiGRU layer and the second BiGRU layer are each formed by two layers of GRU units, the input sequence of the GRU unit in the first layer is a forward GRU, and the input sequence of the GRU unit in the second layer is a reverse GRU. The feature vector is firstly subjected to feature extraction through the first BiGRU layer, and then is subjected to feature extraction through the second BiGRU layer, and first sequence features are output.
In a possible implementation manner of the first aspect, the second sequence feature extraction model includes a BiLSTM layer, and the BiLSTM layer is used for performing semantic feature extraction on character-level feature vectors; the BiLSTM layer receives the feature vectors and outputs character-level feature vectors.
In a possible implementation manner of the first aspect, the second sequence feature extraction model further includes a Self-Attention layer, and the Self-Attention layer is used for assigning Attention weights to the character-level feature vectors, wherein the Attention weights characterize the importance degree of each character-level feature vector to the construction of the knowledge graph. The Self-Attention layer receives the character-level feature vectors output by the BiLSTM layer, and outputs a second sequence feature.
In a possible implementation manner of the first aspect, the concatenating the first sequence feature and the second sequence feature to obtain a third sequence feature includes: splicing the first sequence characteristic and the second sequence characteristic by using a Concat function to obtain a third sequence characteristic, wherein the expression of the third sequence characteristic is as follows:
H(x)=concat[F(x)+G(x)]
wherein H (x) represents the third sequence feature, F (x) represents the first sequence feature, and G (x) represents the second sequence feature.
In one possible implementation manner of the first aspect, before the CRF layer is used to select the third sequence feature that conforms to the expression logic, the method includes: and coding the third sequence feature by adopting a full connection layer, and integrating a plurality of dimensions of the third sequence feature into one dimension.
In one possible implementation form of the first aspect, the knowledge data includes picture data and text data; after acquiring the knowledge data and before encoding the feature vectors of the knowledge data by using the embedding layer, the map construction method further comprises the following steps: recognizing characters in the image data by adopting an OCR recognition model; adopting a Chinese grammar segmentation model to perform word segmentation on the text data according to the Chinese grammar; and constructing a triple feature template, wherein the triple feature template is in the form of an entity, a relation and an entity.
In a second aspect, an embodiment of the present application provides a knowledge graph constructing apparatus, including: the data acquisition module is used for acquiring knowledge data and encoding the feature vectors of the knowledge data by adopting the embedded layer; the characteristic extraction module is used for extracting first sequence characteristics in the characteristic vectors by adopting a first sequence characteristic extraction model and is also used for extracting second sequence characteristics in the characteristic vectors by adopting a second sequence characteristic extraction model; the characteristic splicing module is used for splicing the first sequence characteristic and the second sequence characteristic to obtain a third sequence characteristic; and the decoding output module is used for selecting the third sequence characteristics which accord with the expression logic by adopting a CRF layer to obtain a triple sequence, wherein the triple sequence is composed of an entity-relation-entity, and a plurality of triple sequences form a complete knowledge graph.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory and a processor, where the memory stores a computer program that is executable on the processor, and the processor implements the method for constructing a knowledge graph according to any one of the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, and when executed by a processor, the computer program implements the method for constructing a knowledge graph according to any one of the first aspect.
In a fifth aspect, the present application provides a computer program product, which when run on a terminal device, causes the terminal device to execute the method for constructing a knowledge graph according to any one of the first aspect.
Compared with the prior art, the embodiment of the application has the advantages that:
according to the embodiment of the application, after the first sequence feature extraction model extracts the first sequence features in the feature vectors and the second sequence feature extraction model extracts the second sequence features in the feature vectors, the first sequence features and the second sequence features are spliced to obtain the third sequence features, so that the first sequence feature extraction model and the second sequence feature extraction model perform entity extraction on the same feature vectors, more entity information can be identified, and the accuracy of relation identification is higher.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the specification.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a schematic flow chart diagram of a method for constructing a knowledge graph according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a first sequence feature extraction model provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a second sequence feature extraction model provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a concatenation of a first sequence feature and a second sequence feature provided in an embodiment of the present application;
FIG. 5 is a knowledge-graph of some financial audit rules provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a knowledge graph constructing apparatus provided in an embodiment of the present application;
fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [described condition or event]" or "in response to detecting [described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather mean "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Existing knowledge graph construction methods generally adopt entity co-occurrence, model construction based on feature vectors, or model construction based on kernel functions, which limits both accuracy and efficiency. The three approaches have the following specific defects:
(1) Entity co-occurrence means that when two entities occur simultaneously, the two entities are considered associated; co-occurrence frequency scoring is adopted to eliminate misidentification caused by accidental co-occurrence.
(2) A model constructed based on feature vectors mainly relies on building the feature vectors: the more feature values there are, the higher the accuracy of relation extraction, but also the greater the computational complexity and the weaker the feature generalization capability, so corresponding features need to be redesigned for different fields.
(3) Although the model constructed based on the kernel function does not need to build huge feature engineering, how to design and select the proper kernel function is very difficult.
Based on the above problems, the embodiment of the application provides a knowledge graph construction method. Starting from the traditional recurrent neural network, a residual recurrent neural network entity learning model integrating an attention mechanism is studied, and a recurrent neural network with residual properties is designed for knowledge graph construction, which reduces the influence of negative samples on the network and increases the network's entity extraction capability. An attention mechanism is introduced into the residual part to capture long-time-step dependencies, reduce the interference of non-entity information, and enhance the network's feature extraction capability, so as to complete entity identification and relationship extraction of audit rules and further construct the knowledge graph.
The application takes financial audit as an example to detail the technical scheme applied for protection.
In order to make the objects, technical solutions and advantages of the present application more clear and more obvious, the present application is described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the following description of specific embodiments is intended to be illustrative only and is not intended to be in any way limiting.
Fig. 1 is a schematic flow chart of a method for constructing a knowledge graph according to an embodiment of the present application, and referring to fig. 1, the method may be implemented by performing steps 101 to 104, which are detailed as follows:
in step 101, knowledge data is acquired and feature vectors of the knowledge data are encoded using the embedding layer.
In some embodiments, the relevant knowledge data may be obtained through multiple knowledge bases of financial auditing.
For example, the knowledge base may be related financial audit files on the network, related legal regulations for financial audit, or audit rule specifications in a unit, etc.
The method for acquiring knowledge data from the knowledge base is also various, and the knowledge data can be acquired by crawling from a network through a Python technology, scanning files in a unit and the like. The present application is not further limited.
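A minimal sketch of this acquisition step is given below. The source URLs, the local folder of scanned files, and the helper name are illustrative assumptions and are not part of the original disclosure:

```python
import os
import requests

# Hypothetical sources; the patent does not name concrete URLs or folders.
AUDIT_RULE_URLS = ["https://example.org/audit-rule-sample.html"]
LOCAL_SCAN_DIR = "scanned_audit_files"

def acquire_knowledge_data():
    """Collect raw knowledge data from the web and from scanned files in a unit."""
    documents = []
    for url in AUDIT_RULE_URLS:
        resp = requests.get(url, timeout=10)
        if resp.ok:
            documents.append({"type": "text", "content": resp.text})
    if os.path.isdir(LOCAL_SCAN_DIR):
        for name in os.listdir(LOCAL_SCAN_DIR):
            if name.lower().endswith((".png", ".jpg", ".pdf")):
                documents.append({"type": "picture", "path": os.path.join(LOCAL_SCAN_DIR, name)})
    return documents
```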
Optionally, the acquired knowledge data may include picture data and text data.
In some embodiments, after the knowledge data is obtained, the feature vectors of the knowledge data may be encoded using an embedding layer.
Illustratively, the embedding layer may encode the knowledge data into feature vectors in the form of Embedding vectors.
Optionally, the Embedding layer may include a Token Embedding vector, a Segment Embedding vector, and a Position Embedding vector. The feature vectors of the knowledge data can be encoded by combining the Token Embedding vector, the Segment Embedding vector and the Position Embedding vector; this approach preserves the order of the Chinese words.
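A minimal sketch of such an embedding layer in PyTorch is given below; the vocabulary size, maximum length and embedding dimension are illustrative assumptions, not values fixed by the application:

```python
import torch
import torch.nn as nn

class KnowledgeEmbedding(nn.Module):
    """Encode token ids into feature vectors by summing Token, Segment and Position embeddings."""
    def __init__(self, vocab_size=21128, max_len=512, num_segments=2, dim=128):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, dim)
        self.segment_embedding = nn.Embedding(num_segments, dim)
        self.position_embedding = nn.Embedding(max_len, dim)

    def forward(self, token_ids, segment_ids):
        # Position ids 0..L-1 preserve the order of the Chinese characters.
        positions = torch.arange(token_ids.size(1), device=token_ids.device).unsqueeze(0)
        return (self.token_embedding(token_ids)
                + self.segment_embedding(segment_ids)
                + self.position_embedding(positions))
```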
In step 102, a first sequence feature in the feature vector is extracted by using the first sequence feature extraction model, and a second sequence feature in the feature vector is extracted by using the second sequence feature extraction model.
In some embodiments, two feature extraction models are used, a first sequence feature extraction model and a second sequence feature extraction model. The first sequence feature extraction model is used for extracting first sequence features, namely deep sequence features, in the knowledge data feature vector, and the second sequence feature extraction model is used for extracting second sequence features, namely shallow sequence features, in the knowledge data feature vector.
The so-called deep sequence features pay more attention to global information such as semantic information, and are mainly coarse-grained feature information.
The so-called shallow sequence features pay more attention to detailed information such as character-level information, and are mainly fine-grained feature information.
In some embodiments, the first sequence feature extraction model may include a first BiGRU layer and a second BiGRU layer. Wherein the first BiGRU layer and the second BiGRU layer are connected in a stacking manner. Fig. 2 shows a schematic structural diagram of a first sequence feature extraction model provided in an embodiment of the present application.
Referring to fig. 2, the feature vector first undergoes feature extraction through the first BiGRU layer, then undergoes feature extraction through the second BiGRU layer, and the first sequence feature is output.
Optionally, the first BiGRU layer and the second BiGRU layer are each formed by two GRU units, the input sequence of the first layer of GRU unit is a forward GRU, and the input sequence of the second layer of GRU unit is a reverse GRU.
Illustratively, the feature vector x_t at time t is used as the input of the first sequence feature extraction model. Through the update gate z_t and the reset gate r_t, and according to the state information h_{t-1} of the previous moment, the current hidden state h_t is obtained.
The update gate z_t is calculated as: z_t = σ(W_z · [h_{t-1}, x_t]). In the formula, at time t the input x_t and the state information h_{t-1} of time t-1 are each multiplied by the weight W_z for a linear transformation, and the result is activated by a sigmoid layer. In this process, the update gate determines how much information from the past time step and the current time step continues to be transmitted.
The candidate state controlled by the reset gate is calculated as: h_t' = tanh(W · [r_t * h_{t-1}, x_t]). In the formula, the product of r_t and h_{t-1} represents how much previous information is retained or forgotten; after a linear transformation of x_t, the result is activated by a tanh layer. This formula is mainly used to determine how much past information to ignore.
Finally, the memory information of the current moment is obtained, where the update gate controls the inflow of new information to produce the final output, calculated as: h_t = (1 - z_t) * h_{t-1} + z_t * h_t'.
After the above calculations, richer semantic information is obtained. The output of the BiGRU at the final state t is calculated as:
h_t(forward) = GRU(x_t, h_{t-1}(forward))
h_t(backward) = GRU(x_t, h_{t+1}(backward))
h_t = [h_t(forward); h_t(backward)]
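A minimal sketch of the first sequence feature extraction model, written here with two stacked bidirectional GRU layers in PyTorch; the hidden sizes and class name are illustrative assumptions:

```python
import torch.nn as nn

class FirstSequenceExtractor(nn.Module):
    """Two BiGRU layers connected by stacking: the output of the first feeds the second."""
    def __init__(self, input_dim=128, hidden_dim=64):
        super().__init__()
        self.bigru1 = nn.GRU(input_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.bigru2 = nn.GRU(2 * hidden_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, feature_vectors):            # (batch, seq_len, input_dim)
        out1, _ = self.bigru1(feature_vectors)     # (batch, seq_len, 2 * hidden_dim)
        out2, _ = self.bigru2(out1)                # first sequence features F(x)
        return out2
```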
in some embodiments, the second sequence feature extraction model may include a BilSTM layer for semantic feature extraction on character-level feature vectors. Fig. 3 shows a schematic structural diagram of a second sequence feature extraction model provided in an embodiment of the present application.
Optionally, the BiLSTM layer receives the feature vectors and outputs character-level feature vectors.
In some embodiments, the second sequence feature extraction model further comprises a Self-orientation layer for assigning Attention weights for character-level feature vectors. Wherein the attention weight characterizes the importance of each character-level feature vector to the construction of the knowledge graph.
In the process of constructing the knowledge graph, not only the extraction of entities of word segmentation and context is important, but also the emotion in the text plays an important role. In order to make the extracted entity closer to the real intention of the user, the application introduces an Attention mechanism, and character-level feature vectors processed by the BilSTM are input into a Self-Attention layer.
In some embodiments, let the character-level feature vectors be X = {x_1, x_2, …, x_n}, where x_i denotes the i-th character. In the Self-Attention layer, k_i and f_i denote the key vector and value vector corresponding to the character x_i. The Self-Attention layer performs similarity analysis between the query vector q and the key vector k_i of each term to obtain the similarity value s_i of each character, where s_i represents the importance of that character. The obtained s_i are then normalized using the SoftMax function to obtain the weight value w_i of each word. Finally, each weight value w_i is multiplied by the corresponding vector f_i and the results are summed to obtain the attention-bearing text vector QuestionAttention, whose expression is:
QuestionAttention = Σ_{i=1}^{n} w_i · f_i
The Self-Attention layer contains three matrices, Q (Query), K (Key) and V (Value); linear transformations are applied to the Query, Key and Value, calculated as:
Q′ = Q · W_j^Q
K′ = K · W_j^K
V′ = V · W_j^V
finally, scaling dot product attention calculation is performed:
Figure BDA0003932822480000092
in the above formula, the dot product of the matrices Q and K is calculated and then divided by
Figure BDA0003932822480000093
Then, the weight of the matrix V is obtained through a SoftMax function, in order to ensure the stability of the gradient,
Figure BDA0003932822480000094
is a smoothing term. This process is repeated several times, wherein the parameters of each linear transformation are changed, and finally the layers are spliced. The process can enable important words to have higher weight, so that the semantic features can find key points, the representation of important semantic information is enhanced, and the interference of irrelevant information can be avoided.
In some embodiments, the Self-Attention layer receives the character-level feature vectors output by the BiLSTM layer, and the Self-Attention layer outputs the second sequence features.
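A minimal sketch of the second sequence feature extraction model, using a BiLSTM followed by single-head scaled dot-product self-attention; the dimensions and class name are illustrative assumptions:

```python
import math
import torch.nn as nn
import torch.nn.functional as F

class SecondSequenceExtractor(nn.Module):
    """BiLSTM for character-level features, then self-attention to weight important characters."""
    def __init__(self, input_dim=128, hidden_dim=64):
        super().__init__()
        self.bilstm = nn.LSTM(input_dim, hidden_dim, batch_first=True, bidirectional=True)
        d = 2 * hidden_dim
        self.w_q = nn.Linear(d, d)   # Q' = Q * W^Q
        self.w_k = nn.Linear(d, d)   # K' = K * W^K
        self.w_v = nn.Linear(d, d)   # V' = V * W^V

    def forward(self, feature_vectors):
        char_feats, _ = self.bilstm(feature_vectors)               # character-level feature vectors
        q, k, v = self.w_q(char_feats), self.w_k(char_feats), self.w_v(char_feats)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))   # scaled dot product
        weights = F.softmax(scores, dim=-1)                        # attention weight per character
        return weights @ v                                         # second sequence features G(x)
```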
In this step, the first sequence feature extraction model and the second sequence feature extraction model share the feature vectors of the embedding layer, so that the two models learn from the same data. This prepares for the subsequent steps and avoids the problem of splicing features of different data at the output layer, which can arise from out-of-order inputs to the neural network during training.
In step 103, the first sequence feature and the second sequence feature are concatenated to obtain a third sequence feature.
Fig. 4 is a schematic diagram of a concatenation of a first sequence feature and a second sequence feature provided in an embodiment of the present application. Refer to fig. 4.
In some embodiments, the first sequence feature and the second sequence feature may be spliced by using a Concat function to obtain a third sequence feature, where the expression of the third sequence feature is:
H(x)=concat[F(x)+G(x)]
wherein H (x) represents the third sequence feature, F (x) represents the first sequence feature, and G (x) represents the second sequence feature.
In this step, splicing the first sequence features and the second sequence features preserves the sequence features obtained by both parts of the neural network training, and alleviates the problem that a recurrent neural network has insufficient capability to capture features of long text data.
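A minimal sketch of the splicing step, using torch.cat as the Concat function over the last (feature) dimension:

```python
import torch

def splice_features(first_seq_feat, second_seq_feat):
    """H(x) = concat[F(x), G(x)]: join the two feature streams along the feature dimension."""
    return torch.cat([first_seq_feat, second_seq_feat], dim=-1)
```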
In step 104, a third sequence feature conforming to the expression logic is selected by using the CRF layer to obtain a triple sequence.
The third sequence features output in step 103 still carry confused tags, so the CRF layer is needed to select the best tag sequence, i.e., to sort all the entities.
In some embodiments, for a given observation sequence x = {x_1, x_2, …, x_n}, the predicted tag sequence is y = {y_1, y_2, …, y_n}, and the score of sequence x with sequence y is expressed as:
S(x, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}
wherein A is the transition matrix, A_{i,j} is the score for transferring from tag i to tag j, P is an n × k matrix, and P_{i,j} is the score of the j-th tag for the i-th word.
The probability of the sequence y is then obtained through the SoftMax function:
P(y | x) = exp(S(x, y)) / Σ_{ỹ ∈ Y_x} exp(S(x, ỹ))
In the formula, y is the true tag sequence and Y_x is the set of all possible tag sequences.
Then, the log-likelihood of the correct tag sequence maximized during training is:
log(P(y | x)) = S(x, y) − log( Σ_{ỹ ∈ Y_x} exp(S(x, ỹ)) )
Finally, the Viterbi algorithm is used to obtain the best predicted tag sequence:
y* = argmax_{ỹ ∈ Y_x} S(x, ỹ)
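A minimal hand-rolled sketch of the CRF scoring and Viterbi decoding described by the formulas above, for a single sentence; the start/stop transitions and batching used in a full implementation are omitted, and the function names are illustrative assumptions:

```python
import torch

def crf_score(emissions, tags, transitions):
    """S(x, y): sum of emission scores P[i, y_i] and transition scores A[y_{i-1}, y_i]."""
    score = emissions[0, tags[0]]
    for i in range(1, emissions.size(0)):
        score = score + transitions[tags[i - 1], tags[i]] + emissions[i, tags[i]]
    return score

def viterbi_decode(emissions, transitions):
    """Return y* = argmax S(x, y) for an n x k emission matrix P and k x k transition matrix A."""
    n, k = emissions.shape
    score = emissions[0]                       # best score ending in each tag at position 0
    backpointers = []
    for i in range(1, n):
        # total[j, y] = best score ending in tag j at i-1, plus A[j, y], plus P[i, y]
        total = score.unsqueeze(1) + transitions + emissions[i].unsqueeze(0)
        score, best_prev = total.max(dim=0)
        backpointers.append(best_prev)
    best_tag = int(score.argmax())
    path = [best_tag]
    for best_prev in reversed(backpointers):
        best_tag = int(best_prev[best_tag])
        path.append(best_tag)
    return list(reversed(path))
```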
in some embodiments, a CRF layer is used to select a third sequence feature that matches the expression logic to obtain a triple sequence. Wherein the sequence of triples includes at least one triplet, each triplet being representative of a single knowledge-graph.
Illustratively, in the process of decomposing the text of audit doubt points into <entity, relation, entity> triples, different types of events such as reimbursement subject matter and fund use are involved; each triple represents a single audit rule, and together they form a structured representation of the whole audit rule base.
In some embodiments, based on the embodiment shown in fig. 1, before selecting the third sequence feature according to the expression logic by using the CRF layer, the method for constructing a knowledge graph may further include:
and coding the third sequence characteristics by adopting a full connection layer, and integrating a plurality of dimensions of the third sequence characteristics into one dimension so as to reduce the influence of the position of the third sequence characteristics on the classification result and improve the robustness of the whole neural network.
In some embodiments, after acquiring the knowledge data and before encoding the feature vectors of the knowledge data by using the embedding layer, the knowledge graph construction method may further include preprocessing the knowledge data. The preprocessing may perform the following operations, as sketched after the list below:
optionally, characters contained in the picture data are recognized by adopting an OCR recognition model;
adopting a Chinese grammar segmentation model to perform word segmentation on the text data according to the Chinese grammar;
and constructing a triple feature template, wherein the form of the triple feature template is < entity, relation and entity >.
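A minimal sketch of this preprocessing, assuming the pytesseract OCR binding and the jieba segmenter as stand-ins for the OCR recognition model and the word segmentation model named above; both library choices are assumptions rather than part of the original disclosure:

```python
import jieba                 # Chinese word segmentation
import pytesseract           # OCR wrapper around Tesseract
from PIL import Image

def preprocess(picture_paths, texts):
    """OCR the pictures, segment all text by Chinese grammar, and build an empty triple template."""
    for path in picture_paths:
        texts.append(pytesseract.image_to_string(Image.open(path), lang="chi_sim"))
    segmented = [list(jieba.cut(t)) for t in texts]
    triple_template = {"head_entity": None, "relation": None, "tail_entity": None}
    return segmented, triple_template
```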
In order to verify the beneficial effect and credibility of the technical scheme, a group of comparison cases is also provided. Compared with traditional knowledge graphs based on a recurrent neural network and a relation detection algorithm, the knowledge graph construction method disclosed in the application identifies more information and achieves higher relation identification accuracy. Table 1 shows the amount of information identified by knowledge graphs constructed with different algorithms, and Table 2 shows the identification accuracy for different numbers of financial statement relations.
TABLE 1 Information identification (the table content is provided as an image in the original publication)
TABLE 2 Financial audit rule knowledge graph recognition accuracy (the table content is provided as an image in the original publication)
FIG. 5 illustrates a knowledge-graph of partial financial audit rules provided by an embodiment of the present application.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 6 shows a block diagram of a knowledge graph constructing apparatus 200 provided in the embodiment of the present application, which corresponds to the knowledge graph constructing method described in the above embodiment, and only shows the relevant parts in the embodiment of the present application for convenience of description.
Referring to fig. 6, the knowledge graph constructing apparatus in the embodiment of the present application may include a data obtaining module 201, a feature extracting module 202, a feature splicing module 203, and a decoding output module 204.
And the data acquisition module 201 is configured to acquire knowledge data and encode feature vectors of the knowledge data by using the embedded layer.
The feature extraction module 202 is configured to extract a first sequence feature in the feature vector by using a first sequence feature extraction model.
Optionally, the first sequence feature extraction model includes a first BiGRU layer and a second BiGRU layer, and the first BiGRU layer and the second BiGRU layer are connected in a stacking manner. The first BiGRU layer and the second BiGRU layer are each formed by two layers of GRU units; the input sequence of the GRU unit in the first layer is a forward GRU, and the input sequence of the GRU unit in the second layer is a reverse GRU. The feature vector first undergoes feature extraction through the first BiGRU layer, then undergoes feature extraction through the second BiGRU layer, and the first sequence feature is output.
The feature extraction module 202 is further configured to extract a second sequence feature in the feature vector by using a second sequence feature extraction model.
Optionally, the second sequence feature extraction model includes a BiLSTM layer, and the BiLSTM layer is used for performing semantic feature extraction on character-level feature vectors; the BiLSTM layer receives the feature vectors and outputs character-level feature vectors.
Optionally, the second sequence feature extraction model further includes a Self-Attention layer, and the Self-Attention layer is configured to assign Attention weights to the character-level feature vectors, where the Attention weights characterize the importance of each character-level feature vector to the construction of the knowledge graph. The Self-Attention layer receives the character-level feature vectors output by the BiLSTM layer, and outputs the second sequence feature.
And the feature splicing module 203 is configured to splice the first sequence feature and the second sequence feature to obtain a third sequence feature.
Optionally, the first sequence feature and the second sequence feature are spliced by using a Concat function to obtain a third sequence feature, where an expression of the third sequence feature is:
H(x)=concat[F(x)+G(x)]
wherein H (x) represents the third sequence feature, F (x) represents the first sequence feature, and G (x) represents the second sequence feature.
And the decoding output module 204 is used for selecting the third sequence features which accord with the expression logic by adopting a CRF layer to obtain the triple sequence.
Wherein the triple sequence is composed of an entity-relation-entity, and a plurality of triple sequences form a complete knowledge graph.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present application further provides a terminal device, and referring to fig. 7, the terminal device 300 may include: at least one processor 310, a memory 320, and a computer program 321 stored in the memory 320 and capable of running on the at least one processor 310, wherein the processor 310 executes the computer program 321 to implement the steps in any of the method embodiments, such as steps 101 to 104 in the embodiment shown in fig. 1. Alternatively, the processor 310, when executing the computer program 321, implements the functions of each module/unit in each device embodiment described above, for example, the functions of the modules 201 to 204 shown in fig. 6.
Illustratively, the computer program 321 may be partitioned into one or more modules/units, which are stored in the memory 320 and executed by the processor 310 to accomplish the present application. One or more modules/units may be a series of computer program segments capable of performing certain functions, the program segments being used to describe the execution of the computer program in the terminal device 300.
Those skilled in the art will appreciate that fig. 7 is merely an example of a terminal device and is not limiting; the terminal device may include more or fewer components than shown, some components may be combined, or different components such as input/output devices, network access devices, buses, etc. may be used.
The Processor 310 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 320 may be an internal storage unit of the terminal device 300, or may be an external storage device of the terminal device 300, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital Card (SD), a Flash memory Card (Flash Card), and the like. The memory 320 is used for storing the computer program 321 and other programs and data required by the terminal device 300. The memory 320 may also be used to temporarily store data that has been output or is to be output.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.
The knowledge graph construction method provided by the embodiment of the application can be applied to terminal devices such as computers, wearable devices, vehicle-mounted devices, tablet computers, notebook computers, netbooks, personal Digital Assistants (PDAs), augmented Reality (AR)/Virtual Reality (VR) devices and mobile phones, and the embodiment of the application does not limit the specific types of the terminal devices at all.
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps in the embodiments of the knowledge graph construction method may be implemented.
The embodiment of the application provides a computer program product, and when the computer program product runs on a mobile terminal, the steps in each embodiment of the knowledge graph construction method can be realized when the mobile terminal executes the computer program product.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the method of the embodiments described above can be implemented by a computer program, which can be stored in a computer readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal device, recording medium, computer Memory, read-Only Memory (ROM), random Access Memory (RAM), electrical carrier wave signals, telecommunication signals, and software distribution medium. Such as a usb-disk, a removable hard disk, a magnetic or optical disk, etc. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislative and patent practice.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the above-described apparatus/network device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A knowledge graph construction method is characterized by comprising the following steps:
acquiring knowledge data, and encoding a feature vector of the knowledge data by adopting an embedded layer, wherein the feature vector is an identifier obtained after encoding the knowledge data;
extracting a first sequence feature in the feature vector by adopting a first sequence feature extraction model, and extracting a second sequence feature in the feature vector by adopting a second sequence feature extraction model;
splicing the first sequence feature and the second sequence feature to obtain a third sequence feature;
and selecting the third sequence characteristics conforming to the expression logic by adopting a CRF layer to obtain a triple sequence, wherein the triple sequence consists of an entity-relation-entity, and a plurality of triple sequences form a complete knowledge graph.
2. The method of knowledge-graph construction according to claim 1, wherein the first sequence feature extraction model comprises a first BiGRU layer and a second BiGRU layer, the first BiGRU layer and the second BiGRU layer being connected in a stacking manner;
the first BiGRU layer and the second BiGRU layer are respectively composed of two GRU units, the input sequence of the GRU unit of the first layer is forward GRU, and the input sequence of the GRU unit of the second layer is reverse GRU;
and the feature vector firstly performs feature extraction through the first BiGRU layer, and then performs feature extraction through the second BiGRU layer, and outputs the first sequence feature.
3. The method of knowledge-graph construction according to claim 1, wherein the second sequence feature extraction model comprises a BiLSTM layer for semantic feature extraction of character-level feature vectors;
and the BiLSTM layer receives the feature vectors and outputs the character-level feature vectors.
4. The method of knowledge-graph construction according to claim 3, wherein the second sequence feature extraction model further comprises a Self-Attention layer for assigning Attention weights to the character-level feature vectors, wherein the Attention weights characterize the degree of importance of each of the character-level feature vectors to knowledge-graph construction;
the Self-Attention layer receives the character-level feature vectors output by the BiLSTM layer, and the Self-Attention layer outputs the second sequence feature.
5. The method of knowledge-graph construction according to claim 1, wherein said concatenating said first sequence features and said second sequence features to obtain third sequence features comprises:
splicing the first sequence feature and the second sequence feature by using a Concat function to obtain a third sequence feature, wherein an expression of the third sequence feature is as follows:
H(x)=concat[F(x)+G(x)]
wherein H (x) represents the third sequence feature, F (x) represents the first sequence feature, and G (x) represents the second sequence feature.
6. The method of knowledge-graph construction according to claim 1, wherein before said selecting said third sequence features according to expression logic using a CRF layer, said method comprises:
and coding the third sequence feature by adopting a full connection layer, and integrating a plurality of dimensions of the third sequence feature into one dimension.
7. The knowledge-graph constructing method according to claim 6, wherein the knowledge data includes picture data and text data;
after the acquiring of the knowledge data, before encoding the feature vectors of the knowledge data by using the embedding layer, the method further includes:
recognizing characters in the picture data by adopting an OCR recognition model;
performing word division on the text data according to Chinese grammar by adopting a jieba word segmentation model;
and constructing a triple feature template, wherein the triple feature template is in the form of entity-relation-entity.
8. A knowledge-graph constructing apparatus for implementing the knowledge-graph constructing method according to any one of claims 1 to 7, the knowledge-graph constructing apparatus comprising:
the data acquisition module is used for acquiring knowledge data and encoding the characteristic vector of the knowledge data by adopting an embedded layer;
the characteristic extraction module is used for extracting first sequence characteristics in the characteristic vector by adopting a first sequence characteristic extraction model and extracting second sequence characteristics in the characteristic vector by adopting a second sequence characteristic extraction model;
the feature splicing module is used for splicing the first sequence features and the second sequence features to obtain third sequence features;
and the decoding output module is used for selecting the third sequence characteristics which accord with the expression logic by adopting a CRF layer to obtain a triple sequence, wherein the triple sequence is composed of an entity-relation-entity, and a plurality of triple sequences form a complete knowledge graph.
9. A terminal device comprising a memory and a processor, the memory having stored therein a computer program operable on the processor, wherein the processor, when executing the computer program, implements the method of any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202211394380.2A 2022-11-08 2022-11-08 Knowledge graph construction method and device and terminal equipment Pending CN115618019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211394380.2A CN115618019A (en) 2022-11-08 2022-11-08 Knowledge graph construction method and device and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211394380.2A CN115618019A (en) 2022-11-08 2022-11-08 Knowledge graph construction method and device and terminal equipment

Publications (1)

Publication Number Publication Date
CN115618019A true CN115618019A (en) 2023-01-17

Family

ID=84878114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211394380.2A Pending CN115618019A (en) 2022-11-08 2022-11-08 Knowledge graph construction method and device and terminal equipment

Country Status (1)

Country Link
CN (1) CN115618019A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination