CN116644192A - Knowledge graph construction method based on reliability of aircraft parts - Google Patents
- Publication number
- CN116644192A CN202310625264.5A
- Authority
- CN
- China
- Prior art keywords
- entity
- knowledge
- knowledge graph
- extracting
- reliability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention relates to the technical field of knowledge graph construction, and in particular to a knowledge graph construction method based on the reliability of aircraft parts, comprising the following steps: 1. constructing the ontology of the knowledge graph, namely defining the entities, relations, attribute types and definitions that the knowledge graph requires from the original data; 2. extracting entities, namely training a bidirectional long short-term memory network (Bi-LSTM) on a BMEO-labeled entity extraction corpus according to the defined entity types, and using the Bi-LSTM to extract entities from unstructured and semi-structured text; 3. extracting relations, namely extracting the semantic relations among entities from the text data; 4. performing entity alignment; 5. constructing a knowledge graph for the aircraft equipment from the extracted entities and relations; 6. using the knowledge graph to implement intelligent question answering and fault diagnosis analysis. The invention can construct the knowledge graph more effectively and thereby support better fault diagnosis.
Description
Technical Field
The invention relates to the technical field of knowledge graph construction, in particular to a knowledge graph construction method based on reliability of aircraft parts.
Background
Currently, aircraft fault diagnosis is mostly carried out on a single aircraft system; for example, fault diagnosis evaluation and analysis are performed only on the hydraulic system.
As technology develops, aircraft designs become increasingly complex, and the threat that each aircraft system poses to flight safety grows accordingly, so faults and the health state of the aircraft need to be judged quickly and accurately. Conventional data-driven fault diagnosis analysis cannot make use of expert knowledge, and its results are hard to interpret, which makes it inconvenient in practical use.
Disclosure of Invention
The object of the present invention is to provide a knowledge graph construction method based on the reliability of aircraft parts that can overcome certain drawbacks of the prior art.
According to the invention, the knowledge graph construction method based on the reliability of aircraft parts comprises the following steps:
step one, constructing the ontology of the knowledge graph, namely defining the entities, relations, attribute types and definitions that the knowledge graph requires from the original data;
step two, extracting entities, namely training a bidirectional long short-term memory network (Bi-LSTM) on a BMEO-labeled entity extraction corpus according to the defined entity types, and using the Bi-LSTM to extract entities from unstructured and semi-structured text;
step three, extracting relations, namely extracting the semantic relations among entities from the text data;
step four, performing entity alignment;
step five, constructing a knowledge graph for the aircraft equipment from the extracted entities and relations, using PyTorch and a Neo4j graph database;
step six, using the knowledge graph to implement intelligent question answering and fault diagnosis analysis.
In step one, the data is cleaned by deleting redundant and erroneous information; expert knowledge from the aviation field is then used to unify the text descriptions into standardized structured text, and the internal connections among entities, relations and attributes in the preprocessed text are identified. Defining the ontology includes: (1) determining the professional field and scope of the domain ontology; (2) considering the reuse of existing ontologies; (3) listing the important terms of the field to which the ontology belongs; (4) defining the classes and the class hierarchy; (5) defining the relations between classes.
Preferably, in step one, the ontology construction follows a seven-step method: (1) using expert knowledge to define the key indicators, attributes and dimensions required for aircraft part reliability, and preliminarily delimiting the scope; (2) collecting ontologies from other fields and checking for overlap with them, reusing any overlapping ontology directly so that a unified form of expression is easier to reach; (3) extracting entity attributes according to the relevant civil aviation reliability requirements; (4) classifying the entities to obtain an entity hierarchy chart; (5) defining the attributes of the classes; (6) defining the classification of the attributes; (7) creating the ontology.
Preferably, in step two, the specific steps include: (1) taking the text data produced by the classification result of step one as the input layer of the BERT model; (2) constructing the BERT model and extracting text features; (3) passing the extracted text features to the Bi-LSTM layer, training with BMEO labeling, extracting the keywords of the training text, and completing the entity labeling task; (5) taking the labeled text, after the weights have been set, as the input layer of the CRF, correcting labeling errors, and completing the entity extraction task.
Preferably, in step three, the specific steps include: (1) feeding the text data produced by entity extraction into an embedding layer to generate word vectors; (2) refining the word vectors through the Bi-LSTM layer; (3) generating weight vectors from the refined word vectors in the self-attention layer and merging them into a sentence-level feature vector; (4) finally, extracting the relation from the sentence-level feature vector.
Preferably, in step four, the specific steps include: (1) computing the similarity between words in the text data with an edit-distance-based similarity measure; (2) setting a threshold and treating entities and relations whose similarity falls within the threshold range as the same entity or relation.
Preferably, in step six, the specific steps include:
1) The crew inputs the faulty equipment;
2) Text similarity matching is performed;
3) It is judged whether the similarity meets a threshold; if so, the next step is carried out; if not, step 7) is carried out;
4) The input is matched to an ATA100 equipment number;
5) Scheme matching is carried out using the newly matched faulty equipment;
6) A diagnostic scheme is recommended;
7) Entities are extracted from the input faulty equipment description;
8) The knowledge graph is searched to perform entity matching;
9) The faulty equipment with the highest score is calculated and returned, and step 4) is then carried out.
The beneficial effects of the invention are as follows:
(1) Based on the BERT+Bi-LSTM+CRF entity recognition model and the self-attention relation extraction model, knowledge can be extracted automatically from system fault diagnosis text.
(2) Using the framework and technology provided by the invention, a system fault diagnosis knowledge graph of a certain data scale is constructed; it can provide knowledge support for applications such as intelligent question answering and attribution analysis, and to some extent gives fault diagnosis an explainable basis.
(3) Applying the system fault diagnosis knowledge graph achieves structured storage of unstructured fault diagnosis knowledge, effectively integrates domain knowledge and aggregates expert experience, helps improve the utilization of historical fault diagnosis knowledge and the efficiency of fault diagnosis, and reduces dependence on the experience of key technical personnel.
With the continued development of new technologies such as artificial intelligence, big data and cloud computing, knowledge graphs have broad application prospects in aviation fault diagnosis management, quality control, safety supervision and related areas. Future work can further refine the knowledge extraction algorithm, widen the coverage of the knowledge graph, and explore knowledge engineering for intelligent terminals such as images, audio, video and sensors.
Drawings
FIG. 1 is a flow chart of a knowledge-graph construction method based on reliability of an aircraft component, in an embodiment;
FIG. 2 is a flow chart of ontology construction in an embodiment;
FIG. 3 is a basic flow diagram of entity extraction in an embodiment;
FIG. 4 is a diagram of basic steps of relation extraction in an embodiment;
FIG. 5 is a flow chart of the intelligent question answering and fault diagnosis analysis in an embodiment;
FIG. 6 is a schematic diagram of a system fault diagnosis entity identification model in an embodiment;
FIG. 7 is a schematic diagram of a relationship extraction model in an embodiment.
Detailed Description
For a further understanding of the present invention, the present invention will be described in detail with reference to the drawings and examples. It is to be understood that the examples are illustrative of the present invention and are not intended to be limiting.
Examples
As shown in FIG. 1, the present embodiment provides a knowledge graph construction method based on the reliability of aircraft parts, which includes the following steps:
Step one, constructing the ontology of the knowledge graph, namely defining the entities, relations, attribute types and definitions that the knowledge graph requires from the original data. In the aviation manufacturing industry the data is multi-modal, heterogeneous and strongly correlated. Redundant and erroneous information is therefore cleaned out, expert knowledge from the aviation field is used to unify the text descriptions into standardized structured text, and the internal connections among entities, relations and attributes in the preprocessed text are identified, laying a foundation for the subsequent entity and relation extraction, as shown in FIG. 2. Defining the ontology includes: (1) determining the professional field and scope of the domain ontology; (2) considering the reuse of existing ontologies; (3) listing the important terms of the field to which the ontology belongs; (4) defining the classes and the class hierarchy; (5) defining the relations between classes.
The ontology construction follows a seven-step method, which has the advantage of effectively avoiding interference from redundant information and making the defined entities more objective. The seven steps are: (1) using expert knowledge to define the key indicators, attributes, dimensions and the like required for aircraft part reliability, and preliminarily delimiting the scope; (2) collecting ontologies from other fields and checking for overlap with them, reusing any overlapping ontology directly so that a unified form of expression is easier to reach; (3) extracting entity attributes according to the relevant civil aviation reliability requirements; (4) classifying the entities to obtain an entity hierarchy chart; (5) defining the attributes of the classes; (6) defining the classification of the attributes; (7) creating the ontology.
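The outcome of the seven steps (a class hierarchy, class attributes, and relations between classes) can be pictured as a small schema object. The following is a minimal sketch only: the `Ontology` class, the class names `Equipment`, `Pump` and `Fault`, and the relation `has_fault` are hypothetical illustrations, not names taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Ontology:
    """Toy ontology: class hierarchy, class attributes, typed relations."""
    parents: Dict[str, Optional[str]] = field(default_factory=dict)
    attributes: Dict[str, List[str]] = field(default_factory=dict)
    relations: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def add_class(self, name: str, parent: Optional[str] = None,
                  attributes: Tuple[str, ...] = ()) -> None:
        # A parent must be declared before any of its subclasses.
        if parent is not None and parent not in self.parents:
            raise ValueError(f"unknown parent class: {parent}")
        self.parents[name] = parent
        self.attributes[name] = list(attributes)

    def add_relation(self, name: str, domain: str, range_: str) -> None:
        # A relation links a head class (domain) to a tail class (range).
        self.relations[name] = (domain, range_)

    def ancestors(self, name: str) -> List[str]:
        # Walk up the hierarchy, starting with the class itself.
        chain: List[str] = []
        current: Optional[str] = name
        while current is not None:
            chain.append(current)
            current = self.parents[current]
        return chain

    def allows(self, head_class: str, relation: str, tail_class: str) -> bool:
        # A triple is valid if its classes inherit from the relation's types.
        domain, range_ = self.relations[relation]
        return (domain in self.ancestors(head_class)
                and range_ in self.ancestors(tail_class))


# Hypothetical example classes and relation.
onto = Ontology()
onto.add_class("Equipment", attributes=("ata_number",))
onto.add_class("Pump", parent="Equipment")
onto.add_class("Fault", attributes=("description",))
onto.add_relation("has_fault", "Equipment", "Fault")
```

A schema like this lets later extraction steps reject triples whose entity types do not fit the declared relation.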
Step two, extracting entities, that is, the named entity recognition (NER) task: according to the defined entity types, a bidirectional long short-term memory network (Bi-LSTM) is trained on a BMEO-labeled entity extraction corpus and then used to extract entities from unstructured and semi-structured text, as shown in FIG. 3. The specific steps include: (1) taking the text data produced by the classification result of step one as the input layer of the BERT model; (2) constructing the BERT model and extracting text features; (3) passing the extracted text features to the Bi-LSTM layer, training with BMEO labeling, extracting the keywords of the training text, and completing the entity labeling task; (5) taking the labeled text, after the weights have been set, as the input layer of the CRF, correcting labeling errors, and completing the entity extraction task.
Feature extraction serves two main purposes: (1) the input data set is text, and a model cannot work directly with characters or words, so feature extraction converts them into word vectors; (2) a good text representation improves training efficiency.
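To make the BMEO labeling used in step two concrete, here is a minimal sketch that converts entity spans into per-token tags (B = begin, M = middle, E = end, O = outside). The entity type names `EQUIP` and `FAULT` and the choice to tag a single-token entity with `B` are assumptions for illustration; the patent does not specify them.

```python
from typing import List, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bmeo_tags(tokens: List[str], spans: List[Span]) -> List[str]:
    """Return one BMEO tag per token for the given entity spans."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        if end - start == 1:
            # Assumption: BMEO has no single-token tag, so "B" is used alone.
            continue
        for i in range(start + 1, end - 1):
            tags[i] = f"M-{etype}"
        tags[end - 1] = f"E-{etype}"
    return tags


tokens = ["left", "main", "landing", "gear", "leaks", "oil", "badly"]
spans = [(0, 4, "EQUIP"), (4, 6, "FAULT")]
print(list(zip(tokens, bmeo_tags(tokens, spans))))
```

Sequences tagged this way are the training targets for the Bi-LSTM layer; the CRF layer then corrects invalid tag transitions such as an `E` tag without a preceding `B`.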
Step three, extracting relations, namely extracting the semantic relations between entities from the text data; relation classification means finding the semantic relation between a pair of entities. The advantage of Bi-LSTM is that relations can be extracted by training on text alone, without external semantic resources or features produced by NLP systems. Therefore, building on step two, a relation-labeled relation extraction corpus is used to train an attention-based bidirectional long short-term memory network. The relation labels also follow the BMEO scheme, and the relations among entities are extracted from the text data labeled in the entity extraction stage. The purpose of this model is to capture the important semantic information in the text and extract the relations between entities, as shown in FIG. 4. The specific steps are: (1) feeding the text data produced by entity extraction into an embedding layer to generate word vectors; (2) refining the word vectors through the Bi-LSTM layer; (3) generating weight vectors from the refined word vectors in the self-attention layer and merging them into a sentence-level feature vector; (4) finally, extracting the relation from the sentence-level feature vector.
Step four, performing entity alignment. Entity alignment is a key technology of knowledge graph fusion: constructing the knowledge graph requires combining third-party data to complete dynamic updates, and entity alignment aims to cluster words with the same meaning into the same entity or relation. The specific steps are: (1) computing the similarity between words in the text data with an edit-distance-based similarity measure; (2) setting a threshold and treating entities and relations whose similarity falls within the threshold range as the same entity or relation.
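The two alignment steps can be sketched as follows. This is an illustrative sketch, not the patented implementation: the normalization of edit distance into a similarity score, the greedy clustering strategy, and the threshold value 0.8 are all assumptions.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def align_entities(names: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Greedy clustering: a name joins the first cluster whose
    representative (first member) is at least `threshold` similar."""
    clusters: List[List[str]] = []
    for name in names:
        for cluster in clusters:
            if edit_similarity(name, cluster[0]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


print(align_entities(["hydraulic pump", "hydraulic pumps", "fuel valve"]))
```

Each resulting cluster would be treated as one entity; a stricter threshold produces more, smaller clusters.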
Step five, constructing a knowledge graph for the aircraft equipment from the extracted entities and relations, using PyTorch and a Neo4j graph database;
Step six, using the knowledge graph to implement intelligent question answering and fault diagnosis analysis; as shown in FIG. 5, the specific steps include:
1) The crew inputs the faulty equipment;
2) Text similarity matching is performed;
3) It is judged whether the similarity meets a threshold; if so, the next step is carried out; if not, step 7) is carried out;
4) The input is matched to an ATA100 equipment number;
5) Scheme matching is carried out using the newly matched faulty equipment;
6) A diagnostic scheme is recommended;
7) Entities are extracted from the input faulty equipment description;
8) The knowledge graph is searched to perform entity matching;
9) The faulty equipment with the highest score is calculated and returned, and step 4) is then carried out.
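The control flow of steps 1) to 9) can be sketched in a few lines. Everything concrete here is a stand-in: the equipment table, the scheme texts, the use of `difflib.SequenceMatcher` for text similarity, and the token-overlap score that replaces the trained entity extractor and knowledge graph search are all assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Tuple

# Hypothetical lookup tables: equipment name -> ATA chapter, chapter -> scheme.
EQUIPMENT_ATA = {"hydraulic pump": "29", "landing gear": "32", "fuel pump": "28"}
SCHEMES = {
    "29": "Inspect hydraulic lines and pump seals.",
    "32": "Check gear actuators and uplocks.",
    "28": "Check fuel pump pressure and filters.",
}


def text_similarity(a: str, b: str) -> float:
    """Step 2: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_overlap(a: str, b: str) -> float:
    """Steps 7-8 stand-in: Jaccard overlap of word sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def best_match(query: str, candidates: Iterable[str],
               score: Callable[[str, str], float]) -> Tuple[str, float]:
    """Return the candidate with the highest score for the query."""
    return max(((name, score(query, name)) for name in candidates),
               key=lambda pair: pair[1])


def recommend(query: str, threshold: float = 0.8) -> Tuple[str, str, str]:
    name, sim = best_match(query, EQUIPMENT_ATA, text_similarity)  # steps 2-3
    if sim < threshold:
        # Steps 7-9: fall back to entity-level matching, keep the top score.
        name, _ = best_match(query, EQUIPMENT_ATA, token_overlap)
    ata = EQUIPMENT_ATA[name]          # step 4: match to an ATA number
    return name, ata, SCHEMES[ata]     # steps 5-6: match and recommend scheme


print(recommend("the left landing gear will not retract"))
```

Both branches rejoin at step 4), which mirrors the loop back from step 9) in the flow above.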
Based on the BERT+Bi-LSTM+CRF entity recognition model and the self-attention relation extraction model, this embodiment can extract knowledge automatically from system fault diagnosis text. Using the framework and technology provided by this embodiment, a system fault diagnosis knowledge graph of a certain data scale is constructed; it can provide knowledge support for applications such as intelligent question answering and attribution analysis, and to some extent gives fault diagnosis an explainable basis. Applying the system fault diagnosis knowledge graph of this embodiment achieves structured storage of unstructured fault diagnosis knowledge, effectively integrates domain knowledge and aggregates expert experience, helps improve the utilization of historical fault diagnosis knowledge and the efficiency of fault diagnosis, and reduces dependence on the experience of key technical personnel.
Example
In this example, the operation and maintenance records of an airline's ARJ-900 aircraft are taken as the original data. A fault diagnosis knowledge graph of the aircraft equipment is constructed and applied intelligently as a case verification of the proposed construction method, and the entity extraction and relation extraction results are analyzed with metrics such as accuracy and recall to demonstrate the effectiveness of the method. The working process is as follows:
a. Knowledge schema layer construction (ontology construction)
Step1. Before the ontology is built, domain experts and the relevant crew members jointly discuss the business requirements of aircraft fault diagnosis and analysis, and clarify the purpose for which the ontology is being built;
Step2. Check whether other fields (such as medical diagnosis or power grid fault diagnosis and handling) have reusable ontologies;
Step3. Construct the concept system of the system fault diagnosis field, and define the core concepts of fault diagnosis knowledge;
Step4. Standardize the concepts related to fault diagnosis by referring to the definitions of technical terms in national standards, national military standards, industry standards and similar documents relevant to the aviation field;
Step5. Return to Step3 to improve and optimize the fault diagnosis knowledge model. After multiple iterations, a fault diagnosis ontology model that is both practical and general is finally constructed.
b. Knowledge graph data layer construction (entity and relation extraction)
Named entity recognition is the core technology of the knowledge graph construction process; the data layer is built mainly through entity recognition and relation extraction.
Step1. Entity recognition uses a method based on a bidirectional long short-term memory network (Bi-LSTM) and a conditional random field (CRF), with a BERT pre-trained model added to carry out the entity recognition task. The entity recognition model for the aircraft system is shown in FIG. 6.
Step2. Relation extraction. The entities in the text of aircraft fault diagnosis knowledge are associated with one another, and extracting these relations from unstructured fault diagnosis records is the difficult part of knowledge extraction. Deep-learning-based relation extraction has clear advantages in relation classification efficiency and extraction accuracy, so relation extraction is implemented with a bidirectional long short-term memory network based on a self-attention mechanism, as shown in FIG. 7.
The gate equations of the bidirectional long short-term memory network are as follows:
$$f(t) = \sigma(W_f h_{t-1} + U_f x_t + b_f)$$
$$i(t) = \sigma(W_i h_{t-1} + U_i x_t + b_i)$$
$$o(t) = \sigma(W_o h_{t-1} + U_o x_t + b_o)$$
where $x_t$ is the input at time $t$ and $h_{t-1}$ is the hidden state at time $t-1$; $W_f$, $W_i$, $W_o$ are the weight coefficients of $h_{t-1}$ for the forget gate, input gate and output gate in the feature extraction process; $U_f$, $U_i$, $U_o$ are the corresponding weight coefficients of $x_t$; and $b_f$, $b_i$, $b_o$ are the bias values.
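All three gates share the form σ(W h_{t-1} + U x_t + b), so a single helper covers them. The sketch below uses plain Python lists rather than a tensor library to stay self-contained; the function names and the toy dimensions are illustrative assumptions.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    """Logistic function, the sigma in the gate equations."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product with one output per matrix row."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def gate(W: Matrix, U: Matrix, b: Vector, h_prev: Vector, x_t: Vector) -> Vector:
    """Compute sigma(W h_{t-1} + U x_t + b) elementwise.

    The same function serves the forget, input and output gates;
    only the parameters (W_f/U_f/b_f, W_i/U_i/b_i, W_o/U_o/b_o) differ.
    """
    wh = matvec(W, h_prev)
    ux = matvec(U, x_t)
    return [sigmoid(p + q + r) for p, q, r in zip(wh, ux, b)]


# Toy sizes: hidden size 2, input size 3, with all-zero parameters.
h_prev = [1.0, 2.0]
x_t = [1.0, 2.0, 3.0]
W_f = [[0.0, 0.0], [0.0, 0.0]]
U_f = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b_f = [0.0, 0.0]
print(gate(W_f, U_f, b_f, h_prev, x_t))
```

Each gate value lies strictly between 0 and 1, which is what lets it act as a soft switch on the information flowing through the cell.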
Step3. In the relation extraction model based on the self-attention mechanism, the calculation of the self-attention weight vector $y$ and of the entity relation probability distribution $P$ is the key to relation extraction. The formulas are as follows:
$$y = \mathrm{softmax}\left(b_{att} \cdot [\tanh(H)]^T\right)$$
$$P = \mathrm{softmax}\left(W \cdot [\tanh(y \cdot H)]^T + k^T\right)$$
$$H = (h_1, h_2, h_3, \ldots, h_n)^T$$
where $b_{att}$ is a weight vector of dimension $2u$, with $u$ the size of the Bi-LSTM hidden layer; $H$ is the feature matrix obtained after Bi-LSTM encoding, of dimension $n \times 2u$; $W$ is a weight matrix representing the importance of words, of dimension $c \times 2u$, with $c$ the number of relation types to be output; $k$ is a bias parameter vector with a dimension of $n$; $\tanh(\cdot)$ is the hyperbolic tangent activation function; and $\mathrm{softmax}(\cdot)$ is the normalized exponential function.
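The two attention formulas can be traced numerically with a small sketch in plain Python. One assumption is made loudly: for the sum W·[tanh(y·H)]^T + k^T to be well defined, the sketch takes the bias `k` to have one entry per relation type (length c), which differs from the dimension stated in the text. The function name and toy values are illustrative.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def softmax(v: Vector) -> Vector:
    """Numerically stable normalized exponential."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def attention_relation_probs(H: Matrix, b_att: Vector,
                             W: Matrix, k: Vector) -> Tuple[Vector, Vector]:
    """Return (y, P) for an n x 2u feature matrix H.

    Assumption: k has length c (one bias per relation type).
    """
    # y = softmax(b_att . tanh(H)^T): one attention weight per word.
    scores = [sum(b * math.tanh(x) for b, x in zip(b_att, row)) for row in H]
    y = softmax(scores)
    # y . H: attention-weighted sum of word features (length 2u).
    dim = len(H[0])
    sentence = [sum(y[i] * H[i][j] for i in range(len(H))) for j in range(dim)]
    activated = [math.tanh(s) for s in sentence]
    # P = softmax(W . tanh(y . H)^T + k): distribution over c relations.
    logits = [sum(w * a for w, a in zip(row, activated)) + bias
              for row, bias in zip(W, k)]
    return y, softmax(logits)


# Toy sizes: n = 2 words, 2u = 2 features, c = 3 relation types.
H = [[0.2, -0.1], [0.9, 0.4]]
y, P = attention_relation_probs(H, [1.0, 1.0],
                                [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                                [0.0, 0.0, 0.0])
print(y, P)
```

The predicted relation is the index of the largest entry of `P`; words with larger attention weight in `y` contribute more to the sentence-level feature vector.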
c. Entity alignment
After the entity and relation extraction above, the text may contain the same entity under different expressions, for example two differently worded phrases that both mean "landing gear failure", so fusion processing is required. Likewise, the same "oil leak" may refer to the "lubricating oil system" or to the "fuel system", which also needs to be handled. After the entities, relations and entity attributes have been obtained from the original data through information extraction, knowledge fusion is needed to assign the data logically and filter out redundancy and errors. This requires two processes: entity linking and knowledge merging.
Step1. Following the flow above, the knowledge extracted from unstructured aircraft fault diagnosis records may contain a large amount of ambiguous and duplicated data. The purpose of knowledge fusion is to merge and unify this knowledge effectively, thereby improving the quality of the knowledge graph database. The knowledge fusion task mainly comprises entity disambiguation and coreference resolution;
Step2. Entity disambiguation addresses the problem of identically named entities referring to different real-world things. For example, "oil leak" refers in some texts to an oil leak of the lubricating oil system and in others to an oil leak of the fuel system, so the semantics of the surrounding context are needed to determine the correct meaning. Coreference resolution addresses the problem of several expressions corresponding to the same entity object. For example, "engine accessory case" and "accessory case" correspond to a single entity, namely the "engine accessory case". Irregular wording is especially common in manually written fault diagnosis records and troubleshooting notes, so unified, standard entity names are required. In this embodiment, the similarity between system entities is calculated by combining the cosine distance with the Jaccard coefficient, and a threshold is set to judge whether the entities to be aligned match, thereby achieving knowledge fusion. The corresponding calculation formula is as follows:
where $s_1$ and $s_2$ are two system entities and $A(s)$ is the attribute string of entity $s$. The larger the Similarity value, the higher the semantic similarity between the two.
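A minimal sketch of threshold-based entity matching that combines a cosine measure with a Jaccard coefficient over attribute strings. The source does not reproduce how the two measures are combined, so the linear mixing weight `alpha`, the character-level features, and the default threshold are all assumptions made for illustration, as are the function names.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the character-count vectors of two strings."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[ch] * vb[ch] for ch in va.keys() & vb.keys())
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard coefficient between the character sets of two strings."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def combined_similarity(a: str, b: str, alpha: float = 0.5) -> float:
    """Weighted mix of the two measures; alpha is an assumed parameter."""
    return alpha * cosine_similarity(a, b) + (1 - alpha) * jaccard_similarity(a, b)


def is_same_entity(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two attribute strings as the same entity above an assumed threshold."""
    return combined_similarity(a, b) >= threshold
```

Character-level features are a natural fit for Chinese entity names; for English text, word-level tokens would likely be a better choice. The threshold would need to be tuned on labeled pairs.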
d. Knowledge storage
Through the above process, the multi-source heterogeneous data is converted into structured knowledge. Knowledge storage must hold the various kinds of knowledge as "entity-relation-entity/attribute" triples so that large-scale graph data can be managed and computed effectively. Because fault diagnosis records have a clear structure and rich relations between entities, this embodiment uses PyTorch and the Neo4j graph database as the storage system, which allows the aircraft fault diagnosis knowledge graph to be displayed along multiple dimensions.
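As an illustration of storing "entity-relation-entity" triples, the sketch below turns one triple into a parameterized Cypher `MERGE` statement. The node label `Entity`, the `name` property, and the helper name `triple_to_cypher` are hypothetical; the source does not describe its schema. The function only builds the query text and parameters, and executing it against a Neo4j instance is left to whatever driver is in use.

```python
import re


def triple_to_cypher(head: str, relation: str, tail: str):
    """Return (query, params) that upserts one (head, relation, tail) triple.

    Relationship types cannot be passed as Cypher parameters, so the relation
    name is sanitized and embedded in the query text; entity names are passed
    as parameters.
    """
    rel_type = re.sub(r"\W+", "_", relation.strip()).strip("_").upper()
    if not rel_type:
        raise ValueError("relation must contain at least one word character")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    return query, {"head": head, "tail": tail}


query, params = triple_to_cypher("fuel system", "has fault", "oil leak")
```

Using `MERGE` rather than `CREATE` keeps repeated imports of the same triple from duplicating nodes or relationships.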
e. Intelligent application based on knowledge graph
Step1: Intelligent question answering based on the knowledge graph.
The knowledge-graph-based question answering system (KBQA, knowledge graph based question answering) is a question answering mechanism that performs semantic analysis and intent recognition on the question entered by the user, converts it into a computer query language, and searches the constructed knowledge graph for the answer. The KBQA framework comprises question classification, knowledge retrieval, answer generation, and related techniques. Question classification is the key to realizing KBQA: its core task is to identify the intent of the user's question, which provides useful information for constructing the answer generation strategy.
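The step from a recognized intent to a graph query can be sketched as a lookup from question category to a query template. The category names, relationship types, and the `Entity` label below are hypothetical placeholders, not taken from the source; the sketch only shows the shape of the mapping.

```python
# Hypothetical question categories mapped to parameterized Cypher templates.
QUERY_TEMPLATES = {
    "fault_cause": (
        "MATCH (c:Entity)-[:CAUSES]->(f:Entity {name: $name}) RETURN c.name"
    ),
    "repair_method": (
        "MATCH (f:Entity {name: $name})-[:REPAIRED_BY]->(m:Entity) RETURN m.name"
    ),
}


def build_query(category: str, entity_name: str):
    """Return (query, params) for a classified question about one entity."""
    try:
        template = QUERY_TEMPLATES[category]
    except KeyError:
        raise ValueError(f"no template for question category: {category!r}")
    return template, {"name": entity_name}


query, params = build_query("fault_cause", "oil leak")
```

Keeping the entity name as a query parameter avoids embedding user input directly in the query text.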
Step1-1: Naive Bayes classifier. This classification model, based on probability statistics, is currently widely used for question classification in intelligent question answering in industries such as power grids, healthcare, and subways. Feature words are extracted from the segmented user question to classify the question, and the resulting class is mapped to the corresponding question-answer template.
Step1-2: Using this classification method, a classification model is built with the question feature words obtained from Jieba word segmentation and the custom question categories as training samples, so as to recognize the intent of a question; the classification result is then mapped to the corresponding question-answer template. Let $x = (x_1, x_2, \ldots, x_n)$ denote the vector of $n$ feature words obtained from segmenting the question, and let $a = (a_1, a_2, \ldots, a_m)$ denote the $m$ custom categories. Under the assumption that the feature words are mutually independent within each class, naive Bayes classification follows the maximum a posteriori criterion and selects the class with the largest posterior probability as the class label of the feature word vector $x$ of an unknown question, that is:

$$a^{*} = \arg\max_{a_j} P(a_j) \prod_{i=1}^{n} P(x_i \mid a_j)$$

where $P(a_j)$ is the prior probability of category $a_j$, which can be estimated from the frequency with which class $a_j$ occurs in the samples; $P(x_i \mid a_j)$ is the probability that $x_i$ occurs in class $a_j$, $j = 1, 2, \ldots, m$.
$P(x_i \mid a_j)$ is generally calculated from word frequencies, that is:
It can be seen that the numerator of the formula has a great influence on the overall result. Moreover, classification reveals certain core words: special keywords that are often representative of a category. They may occur only rarely, yet they strongly affect question classification. If $P(x_i \mid a_j)$ for a core word $x_i$ is still calculated with the formula above, the classification accuracy of the model is poor. To solve this problem, the formula must be corrected. This embodiment corrects the model with a weighting method; the corrected formula is:
where $k$ is the importance of the core word in influencing question classification, $k > 1$. The core words of a question category can be selected using category keywords and expert experience.
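A minimal sketch of a naive Bayes question classifier with core-word weighting. The source's exact word-frequency and correction formulas are not reproduced in the text, so this sketch assumes that the count of a core word is multiplied by `k` and that add-one (Laplace) smoothing is applied; the class name and parameters are illustrative. Questions are assumed to be already segmented into words (for example by Jieba).

```python
from collections import Counter, defaultdict
from math import log


class WeightedNaiveBayes:
    """Naive Bayes over word lists, with counts of core words scaled by k."""

    def __init__(self, core_words=None, k=2.0, smoothing=1.0):
        self.core_words = set(core_words or [])
        self.k = k                      # assumed weighting factor, k > 1
        self.smoothing = smoothing      # assumed Laplace smoothing constant
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, samples):
        """samples: iterable of (list_of_words, category_label) pairs."""
        for words, label in samples:
            self.class_counts[label] += 1
            for w in words:
                self.word_counts[label][w] += 1
                self.vocab.add(w)
        return self

    def _weighted_count(self, word, label):
        count = self.word_counts[label][word]
        return self.k * count if word in self.core_words else count

    def _log_posterior(self, words, label):
        total = sum(self._weighted_count(w, label) for w in self.vocab)
        denom = total + self.smoothing * len(self.vocab)
        score = log(self.class_counts[label] / sum(self.class_counts.values()))
        for w in words:
            if w in self.vocab:  # words never seen in training are ignored
                score += log((self._weighted_count(w, label) + self.smoothing) / denom)
        return score

    def predict(self, words):
        """Return the category with the largest (unnormalized) log posterior."""
        return max(self.class_counts, key=lambda c: self._log_posterior(words, c))
```

Working in log space avoids underflow when many word probabilities are multiplied, and scaling the numerator and denominator together keeps each class distribution normalized.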
Step2: Fault diagnosis analysis based on the knowledge graph.
The knowledge graph supports the construction of a Bayesian network for fault diagnosis, which in turn enables effective real-time monitoring of aircraft faults. Using the earlier fault localization results for the aircraft equipment, a fault diagnosis subgraph of the knowledge graph is obtained and used as the structural basis of the Bayesian network, and a network relation graph is constructed containing the ATA100 chapter number, the maintained component, and the maintenance action. With the faulty component as the core, the faulty component or system is inferred forward from ATA100. The Bayesian network structure is then built from the ATA100 chapter maintenance information and the association structure between maintained components.
Fault diagnosis analysis combines the component fault information in the aircraft record data with fault frequency statistics and expert consultation to assign probability parameter values to the Bayesian network. These comprise the fault probability of each faulty component, the prior probability of each fault phenomenon, and the conditional probabilities among the 3 kinds of nodes.
Step2-1: The fault probability of a component (the prior probability of its node) can be determined with reference to fault diagnosis theory. Assume that the service life of the component obeys an exponential distribution and that the component has two states, normal and fail. Let $t_i$ denote the service life of component $i$, with $t_i \sim E(\lambda_i)$. The probability of a fault by time $t$ is then $P(t_i \le t) = 1 - e^{-\lambda_i t}$. The corresponding parameter values are obtained by statistical analysis of data such as the fault record ledger.
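A small sketch of the exponential lifetime model, using the standard exponential cumulative distribution $1 - e^{-\lambda t}$ for the probability of a fault by time $t$. Estimating the rate as the number of observed lifetimes divided by their sum (the maximum likelihood estimate) is an assumption here; the source only says the parameters come from statistical analysis of the fault records. Function names are illustrative.

```python
from math import expm1


def failure_probability(rate: float, t: float) -> float:
    """P(lifetime <= t) for an exponential lifetime with the given failure rate."""
    if rate <= 0 or t < 0:
        raise ValueError("rate must be positive and t non-negative")
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return -expm1(-rate * t)


def estimate_rate(lifetimes) -> float:
    """Maximum likelihood estimate of the rate from observed lifetimes (assumed method)."""
    lifetimes = list(lifetimes)
    if not lifetimes or sum(lifetimes) <= 0:
        raise ValueError("need at least one positive lifetime")
    return len(lifetimes) / sum(lifetimes)


# Illustrative values only: three observed lifetimes in flight hours.
rate = estimate_rate([100.0, 200.0, 300.0])
prior_at_150h = failure_probability(rate, 150.0)
```

The resulting value can serve as the prior probability of the component node at a given service time.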
Step2-2: The prior probability of a fault phenomenon, that is, the probability of occurrence that must first be determined under normal conditions, can be obtained by analysing historical data. The prior probabilities of all fault nodes are assumed to be the same, so that the posterior probability is driven by new observations: when no new fault phenomenon has been observed, a fault (cause) is taken to be absent with probability 98% and present with probability 2%. The conditional probabilities between nodes are assigned from historical statistics combined with expert interviews.
Step2-3: Quantitative analysis based on the Bayesian network proceeds in two directions, forward (prediction) and reverse (diagnosis). In forward analysis, the probability of any node is calculated from the prior probabilities of the root nodes and the conditional probabilities of each node. Examples are the probability of a fault phenomenon that may appear given that a certain fault has occurred, or the probability of the corresponding fault cause given the service time of a component. In reverse analysis (diagnostic reasoning), once the probability of a child node has been updated, the posterior probability of its parent node can be updated with Bayes' formula, which is the actual reasoning process of fault diagnosis. In general, the larger the difference between the prior and posterior probability of a fault, the more likely the corresponding fault is to have occurred. This method provides the likelihood of a fault but cannot give a definite diagnosis. Therefore, based on engineering experience, a threshold on the fault occurrence probability is generally set to assist the judgment, and this embodiment defines two judgment rules to output a fault diagnosis evaluation level.
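Forward and reverse analysis can be illustrated on the smallest possible network, one fault node with one fault-phenomenon child. The 2% prior follows the figure quoted in the text; the two conditional probabilities are made-up values for illustration, and the function names are hypothetical. A full implementation would run inference over the whole network rather than a single edge.

```python
def forward_probability(prior_fault: float,
                        p_symptom_given_fault: float,
                        p_symptom_given_ok: float) -> float:
    """Forward (predictive) analysis: marginal probability of the symptom."""
    return (p_symptom_given_fault * prior_fault
            + p_symptom_given_ok * (1.0 - prior_fault))


def posterior_fault(prior_fault: float,
                    p_symptom_given_fault: float,
                    p_symptom_given_ok: float) -> float:
    """Reverse (diagnostic) analysis: P(fault | symptom observed) by Bayes' formula."""
    evidence = forward_probability(prior_fault, p_symptom_given_fault, p_symptom_given_ok)
    if evidence == 0.0:
        raise ValueError("symptom has zero probability under the model")
    return p_symptom_given_fault * prior_fault / evidence


prior = 0.02                                   # prior fault probability from the text
posterior = posterior_fault(prior, 0.9, 0.05)  # 0.9 and 0.05 are assumed values
shift = posterior - prior                      # a larger shift suggests a likelier fault
```

Comparing `posterior` against an engineering threshold is then a separate decision step.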
The invention and its embodiments have been described above schematically and without limitation. What is shown in the drawings is only one embodiment of the invention, and the actual structure is not limited thereto. Therefore, if a person of ordinary skill in the art, enlightened by this disclosure, devises structures and embodiments similar to this technical solution without creative effort and without departing from the gist of the invention, they shall fall within the scope of protection of the invention.
Claims (7)
1. A knowledge graph construction method based on the reliability of aircraft parts, characterized by comprising the following steps:
step one, constructing the ontology of the knowledge graph, namely defining the types and definitions of the entities, relations, and attributes required by the knowledge graph in the original data;
step two, extracting entities: according to the defined entity types, training a bidirectional long short-term memory network (Bi-LSTM) on an entity extraction corpus annotated with BMEO labels, and using the Bi-LSTM to extract entities from unstructured and semi-structured text;
step three, extracting relations, namely extracting the semantic relations among entities from the text data;
step four, performing entity alignment;
step five, constructing a knowledge graph for the aircraft equipment from the extracted entities and relations using PyTorch and the Neo4j graph database;
step six, using the knowledge graph to realize intelligent question answering and fault diagnosis analysis applications.
2. The knowledge graph construction method based on the reliability of aircraft parts according to claim 1, wherein: in step one, the data features are cleaned, redundant and erroneous information is deleted, text descriptions are unified using expert knowledge of the aviation field to form standardized structured text, and the internal relationships among entities, relations, and attributes in the preprocessed text are identified; defining the ontology includes: (1) determining the professional field and scope of the domain ontology; (2) considering reuse of existing ontologies; (3) listing the important terms of the domain to which the ontology belongs; (4) defining the classification concepts and the concept hierarchy; (5) defining the relationships between concepts.
3. The knowledge graph construction method based on the reliability of aircraft parts according to claim 2, wherein: in step one, the ontology is constructed using a seven-step method comprising: (1) using expert knowledge to define the key indicators, attributes, and dimensions required for aircraft part reliability, and preliminarily delimiting the scope; (2) collecting ontologies from other fields and checking for overlap, and reusing any overlapping ontology directly, which facilitates a unified form of expression; (3) extracting the attributes of the entities according to the relevant civil aviation reliability requirements; (4) classifying the entities to obtain a hierarchy chart of the entities; (5) defining the attributes of the classes; (6) defining the classification of the attributes; (7) creating the ontology.
4. The knowledge graph construction method based on the reliability of aircraft parts according to claim 3, wherein: in step two, the specific steps include: (1) taking the text data generated from the classification result of step one as the input layer of the BERT model; (2) constructing the BERT model and extracting text features; (3) passing the extracted text features to the Bi-LSTM layer, training with BMEO labeling, extracting the keywords of the training text, and completing the entity labeling task; (5) taking the text labeled after the weights are set as the input layer of the CRF, correcting labeling errors, and completing the entity extraction task.
5. The knowledge graph construction method based on the reliability of aircraft parts according to claim 4, wherein: in step three, the specific steps include: (1) inputting the text data from entity extraction into an embedding layer to generate word vectors; (2) optimizing the word vectors through the Bi-LSTM layer; (3) generating weight vectors from the optimized word vectors in the self-attention layer and merging them into a sentence-level feature vector; (4) finally, extracting relations using the sentence-level feature vector.
6. The knowledge graph construction method based on the reliability of aircraft parts according to claim 5, wherein: in step four, the specific steps include: (1) calculating the similarity between words in the text data using an edit-distance-based similarity method; (2) setting a threshold and identifying entities and relations within the threshold range as the same.
7. The knowledge-graph construction method based on the reliability of the aircraft parts according to claim 6, wherein: in the sixth step, the specific steps include:
1) The crew inputs the fault equipment;
2) Matching text similarity;
3) Judging whether the similarity meets a threshold value, if so, carrying out the next step; if not, carrying out the step 7);
4) Matching to the ATA100 device number;
5) Carrying out scheme matching by using new fault equipment;
6) Recommending a diagnostic regimen;
7) Extracting an entity of the input fault equipment;
8) Searching the knowledge graph to perform entity matching;
9) Calculating and returning the fault equipment with the highest score, and then performing step 4).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310625264.5A CN116644192B (en) | 2023-05-30 | 2023-05-30 | Knowledge graph construction method based on reliability of aircraft parts |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116644192A (en) | 2023-08-25 |
CN116644192B (en) | 2024-10-01 |
Family
ID=87643052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310625264.5A Active CN116644192B (en) | 2023-05-30 | 2023-05-30 | Knowledge graph construction method based on reliability of aircraft parts |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116644192B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118333411A (en) * | 2024-06-14 | 2024-07-12 | 福州无比欢信息科技有限公司 | Safety production accident hidden danger early warning system based on artificial intelligence |
CN118535947A (en) * | 2024-04-23 | 2024-08-23 | 中国民用航空飞行学院 | KG-BN based aircraft equipment fault diagnosis method |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109033284A (en) * | 2018-07-12 | 2018-12-18 | 国网福建省电力有限公司 | The power information operational system database construction method of knowledge based map |
CN109919585A (en) * | 2019-05-14 | 2019-06-21 | 上海市浦东新区行政服务中心(上海市浦东新区市民中心) | Artificial intelligence auxiliary administrative examination and approval method, system and the terminal of knowledge based map |
CN112613314A (en) * | 2020-12-29 | 2021-04-06 | 国网江苏省电力有限公司信息通信分公司 | Electric power communication network knowledge graph construction method based on BERT model |
WO2021196520A1 (en) * | 2020-03-30 | 2021-10-07 | 西安交通大学 | Tax field-oriented knowledge map construction method and system |
CN114817454A (en) * | 2022-02-18 | 2022-07-29 | 北京邮电大学 | NLP knowledge graph construction method combining information content and BERT-BilSTM-CRF |
CN115688919A (en) * | 2021-07-29 | 2023-02-03 | 北京航空航天大学 | Method for constructing and applying fault diagnosis knowledge graph of airplane power supply system |
CN115712732A (en) * | 2022-09-13 | 2023-02-24 | 中国电力科学研究院有限公司 | Method, system, equipment and medium for constructing knowledge graph ontology of power equipment |
CN115718802A (en) * | 2022-11-14 | 2023-02-28 | 长城汽车股份有限公司 | Fault diagnosis method, system, equipment and storage medium |
CN115858807A (en) * | 2022-11-30 | 2023-03-28 | 中国人民解放军空军工程大学 | Question-answering system based on aviation equipment fault knowledge map |
Non-Patent Citations (4)
Title |
---|
梁乙凯: "基于本体的教育资源元数据验证模型的研究与设计", 《中国优秀硕士学位论文全文数据库电子期刊网》, 15 January 2013 (2013-01-15), pages 14 - 15 * |
童昭: "基于预训练模型的军事领域命令实体识别研究", 《数据与计算发展前沿》, 20 October 2022 (2022-10-20), pages 120 - 123 * |
聂同攀: "面向飞机电源系统故障诊断的知识图谱构建技术及应用", 《航空学报》, 26 August 2021 (2021-08-26), pages 1 - 15 * |
郑闯: "电网智能客服问答系统设计与实现", 《中国优秀硕士学位论文全文数据库电子期刊网》, 15 February 2023 (2023-02-15), pages 41 - 42 * |
Also Published As
Publication number | Publication date |
---|---|
CN116644192B (en) | 2024-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116644192B (en) | Knowledge graph construction method based on reliability of aircraft parts | |
Rajpathak | An ontology based text mining system for knowledge discovery from the diagnosis data in the automotive domain | |
US20230169309A1 (en) | Knowledge graph construction method for ethylene oxide derivatives production process | |
CN111581396A (en) | Event graph construction system and method based on multi-dimensional feature fusion and dependency syntax | |
Kaza et al. | Evaluating ontology mapping techniques: An experiment in public safety information sharing | |
Al-Arfaj et al. | Ontology construction from text: challenges and trends | |
Pence et al. | Data-theoretic approach for socio-technical risk analysis: Text mining licensee event reports of US nuclear power plants | |
Xu et al. | Data-driven causal knowledge graph construction for root cause analysis in quality problem solving | |
CN113487211A (en) | Nuclear power equipment quality tracing method and system, computer equipment and medium | |
CN116010619A (en) | Knowledge extraction method in complex equipment knowledge graph construction process | |
Jang et al. | TechWordNet: Development of semantic relation for technology information analysis using F-term and natural language processing | |
CN115858807A (en) | Question-answering system based on aviation equipment fault knowledge map | |
Qu et al. | Knowledge-driven recognition methodology for electricity safety hazard scenarios | |
Liu et al. | Repairing and reasoning with inconsistent and uncertain ontologies | |
Brito et al. | Subjective machines: Probabilistic risk assessment based on deep learning of soft information | |
Wang et al. | Ids-kg: An industrial dataspace-based knowledge graph construction approach for smart maintenance | |
CN117687824A (en) | Satellite fault diagnosis system based on quality problem knowledge graph | |
Ahaggach et al. | Information extraction from automotive reports for ontology population | |
Wang et al. | Transh-ra: A learning model of knowledge representation by hyperplane projection and relational attributes | |
Goossens et al. | GPT-3 for Decision Logic Modeling. | |
CN112559741A (en) | Nuclear power equipment defect recording text classification method, system, medium and electronic equipment | |
Wan et al. | Evaluation model of power operation and maintenance based on text emotion analysis | |
CN113988083B (en) | Factual information coding and evaluating method for generating shipping news abstract | |
Hu et al. | A classification model of power operation inspection defect texts based on graph convolutional network | |
Pang et al. | A New Fault Diagnosis Method for Quality Control of Electromagnet Based on T–S Fault Tree and Grey Relation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||