CN113987212A - Knowledge graph construction method for process data in numerical control machining field - Google Patents

Info

Publication number
CN113987212A
Authority
CN
China
Prior art keywords
data
class
word
processing
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111361153.5A
Other languages
Chinese (zh)
Inventor
萧筝
王继业
田野
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN202111361153.5A
Publication of CN113987212A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/904 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Machine Translation (AREA)

Abstract

A knowledge graph construction method for process data in the field of numerical control machining comprises the following steps: S1: establishing an original data database; S2: constructing a knowledge graph mode layer, in which process knowledge is manually extracted from the original data database and an ontology model is established in combination with expert knowledge of numerical control machining processes; S3: constructing a knowledge graph data layer, in which a data extraction model extracts the entities, attributes and relations in unstructured data as triples; S4: visual display, in which the mode layer and the data layer are combined to construct the knowledge graph, which is stored and visually displayed. The method allows process data to be managed intuitively and comprehensively, and supports deeper exploitation of the data on the basis of deep learning and reasoning.

Description

Knowledge graph construction method for process data in numerical control machining field
Technical Field
The invention relates to a knowledge graph construction method of process data in the field of numerical control machining, which is particularly suitable for visually managing the process data.
Background
Numerical control machining is carried out with a numerical control machine tool or a numerical control machining center as an independent unit. Each machine tool machines a single part, and the blank passes through different machine tools until it is machined into its final shape. Process data are the data used and generated during the process design of a mechanical product, and comprise standardized static data and dynamic data generated during machining. From part design to acceptance of the machined part, a large amount of data is generated; the data types are complex and the formats varied, covering structured, semi-structured and unstructured data. Dynamic data from part machining are collected by the numerical control machining equipment and stored in a database, whereas the standard documents related to part design are mostly unstructured. Part drawings differ in format depending on the modeling software, and the machining process of a part is presented differently depending on the machining conditions and the enterprise; for example, process card formats are not uniform, and not every machining process is expressed with a process card. By content, process data can be divided into resource data, rule data and procedure data. Structured and semi-structured data are relatively easy to associate with a knowledge graph, while unstructured data are harder to represent with one. Unstructured data are rich in content, but their value is difficult to mine and exploit further. A method for constructing a knowledge graph from the unstructured process data in the field of numerical control machining is therefore provided.
A knowledge graph is a structured representation of knowledge that describes objective facts and their relationships in the form of a graph. In essence, a knowledge graph is a large semantic network that describes entities, entity attributes and entity relationships, in which nodes represent entities and attributes and edges represent relationships. Knowledge is mainly represented as triples: a knowledge graph is fully described once all of its <entity, relationship, entity> and <entity, attribute, attribute value> triples are described. The key to knowledge graph construction is therefore the extraction of triple information. Knowledge graphs can be divided into general knowledge graphs and vertical domain knowledge graphs; a vertical domain knowledge graph is more specialized, more precise and has a more complete knowledge system. Knowledge graphs have the advantages of being intuitive and easy to learn from.
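As a minimal sketch of the triple representation described above, the following Python fragment stores both kinds of triple in one structure and builds a simple node-to-edges view. The `Triple` type, its field names and the sample facts are illustrative assumptions, not part of the patent.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str       # an entity
    predicate: str  # a relation name or an attribute name
    tail: str       # an entity or an attribute value


# Hypothetical facts, invented for illustration only.
triples = [
    Triple("drive shaft", "contain", "keyway"),     # <entity, relationship, entity>
    Triple("drive shaft", "material", "45 steel"),  # <entity, attribute, attribute value>
]

# Each node maps to its outgoing labelled edges.
graph = defaultdict(list)
for t in triples:
    graph[t.head].append((t.predicate, t.tail))

print(dict(graph))
```

Both triple forms share one shape here, which is why the extraction of triples is enough to populate the graph.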
Process data in the field of numerical control machining are complex and varied. Some studies manage and integrate process data on the basis of the BOM, some organize and manage process data with an EBOP process structure tree, and some develop a process database system to store and use process data. These management methods are not intuitive enough, make limited use of the knowledge, and make it difficult to mine the deeper value of the data.
Disclosure of Invention
The invention aims to solve the problem that the process data management in the prior art is not visual and comprehensive enough, and provides a method for constructing a knowledge graph of process data in the field of numerical control machining, which is used for visually and comprehensively managing the process data.
In order to achieve the above purpose, the technical solution of the invention is as follows:
A knowledge graph construction method for process data in the field of numerical control machining comprises the following steps:
S1: establishing an original data database: searching for data information related to the numerical control machining field and establishing the original data database;
S2: constructing a knowledge graph mode layer: manually extracting process knowledge from the original data database, establishing an ontology model in combination with expert knowledge of numerical control machining processes, and constructing the knowledge graph mode layer;
S3: constructing a knowledge graph data layer: extracting triples, that is, extracting the entities, attributes and relations in unstructured data with a data extraction model, and constructing the knowledge graph data layer; because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires that this information be extracted and the unstructured data be converted into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module;
S4: visual display: combining the mode layer and the data layer to construct the knowledge graph, and storing and visually displaying the knowledge graph.
In S1, the data information related to the numerical control machining field that is collected into the original data database comprises: part drawing data, specification manuals, machining process cards, data on numerical control machine tools and machining centers, data on fixtures, measuring tools and positioning equipment, process documents, procedure cards, professional books and process manuals; the collected data information is stored on a local hard disk as the original data database.
In S2, the knowledge graph mode layer is constructed by manually extracting process knowledge from the original data database and establishing an ontology model in combination with expert knowledge of numerical control machining processes:
S2.1: the extracted process data are analyzed and summarized in combination with expert knowledge; process knowledge is extracted manually from the unstructured data in the original data, and an ontology model is established in combination with expert knowledge to construct the knowledge graph mode layer; the ontology model is the model of the mode layer, and the ontology language OWL is the language used to describe it;
S2.2: the machining object of numerical control machining equipment is a part; analysis of the part information and manufacturing information contained in the CAD model and the numerical control machining process documents of the part shows that the process data can be summarized into six types: processing object data, processing device data, processing method data, processing equipment data, processing feature data and semantic relation data;
the processing object data are the basis of all the process data; they describe textual information such as the number and name of the machining object, together with the material, dimensions and requirements of the mechanical model; the processing device data mainly refer to the numerical control machine tool or machining center required to machine the part, such as a numerical control milling machine or a numerical control lathe; the processing method data refer to the methods and operations by which the material is turned from a blank into the part, specifically the procedures and work steps; the processing equipment data refer to the process equipment, including the positioning and clamping devices, the cutting tools on the machine tool, and the measuring tools used as aids during machining; the processing feature data refer to the specific objects machined and the requirements to be met in the procedures completed on the machine tool, the objects being for example surfaces, holes and threads, and the requirements including precision requirements and material treatment requirements; the semantic relation data refer to the relations among the preceding data and represent the relations between different types of data;
S2.3: to keep the mode layer accurate and complete, the six types of data are abstracted into six classes; the processing object data, processing device data, processing method data, processing equipment data, processing feature data and semantic relation data correspond, in order, to the part class, device class, process class, equipment class, feature class and relation class; the process class has a procedure subclass and a work step subclass; the equipment class has cutter, clamp, measuring tool and auxiliary tool subclasses; the feature class has shape feature, accuracy feature, technical feature, management feature and material feature subclasses, of which the accuracy feature class in turn has size accuracy, shape accuracy, position accuracy and surface roughness subclasses; and the relation class has a contain subclass and an order subclass;
attributes comprise class attributes and instance attributes; a class attribute is shared by a class and its subclasses, and every instance shares the class attributes of its class, whereas an instance attribute belongs only to the individual instance; the part class, device class, process class, equipment class, feature class and relation class each have a name class attribute, and further instance attributes are added according to the particular entity;
S2.4: the mode layer of the knowledge graph describes entity classes and the relations among them; the mode layer is established from the entity classes abstracted in S2.3 and described by a formal expression; an ontology model is built on the abstracted entity classes and relations, constructed and displayed with the Protégé software, and used to express the knowledge of the mode layer; the ontology model can be described in the OWL language;
the formal expression of the ontology model of the knowledge graph mode layer is as follows:
$$\mathrm{KGPattern} = \{\mathrm{Entity} \cup \mathrm{Relation}\}$$
wherein:
$$\mathrm{Entity} = \{P \cup M \cup O \cup E \cup F\}$$
$$\mathrm{Relation} = \{R\}$$
$$R = \{(P_i, \mathrm{contain}, F_j) \cup (O_i, \mathrm{contain}, M_j) \cup (O_i, \mathrm{contain}, O_j) \cup (O_i, \mathrm{contain}, E_j) \cup (O_i, \mathrm{contain}, F_j) \cup (O_i, \mathrm{order}, O_j)\}$$
in the above formulas: KGPattern is the formal expression model of the knowledge graph mode layer, Entity is the set of entities described by the mode layer, and Relation is the set of relations described by the mode layer; $P$ denotes the part class, $M$ the device class, $O$ the process class, $E$ the equipment class, $F$ the feature class and $R$ the relation class. The semantic relation $R$ comprises the contain relation and the order relation. The contain relation is defined as the inclusion relation between the part class and the feature class, between the process class and the device class, equipment class and feature class, and between process classes; it is expressed and described as follows: $(P_i, \mathrm{contain}, F_j)$ indicates that part $P_i$ contains feature $F_j$; $(O_i, \mathrm{contain}, M_j)$ indicates that process $O_i$ contains device $M_j$; $(O_i, \mathrm{contain}, O_j)$ indicates that process $O_i$ contains process $O_j$; $(O_i, \mathrm{contain}, E_j)$ indicates that process $O_i$ contains equipment $E_j$; $(O_i, \mathrm{contain}, F_j)$ indicates that process $O_i$ contains feature $F_j$. The order relation is defined as the sequence relation between procedures or between work steps in the process class; it is expressed and described as follows: $(O_i, \mathrm{order}, O_j)$ indicates that procedure (work step) $O_i$ is carried out first and procedure (work step) $O_j$ afterwards; as shown in Table 1;
TABLE 1 semantic relationship representation and description
[Table 1 appears as images in the original publication: BDA0003359323200000041, BDA0003359323200000051]
At this point, construction of the knowledge graph mode layer is complete, and the method proceeds to S3.
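The class hierarchy of S2.3 and the relation set R of S2.4 can be sketched as plain Python data, with a check that a candidate triple fits the mode layer. This is only an illustration under stated assumptions: the English identifiers (`Procedure`, `WorkStep`, `Cutter` and so on) and the helper names `root_class` and `is_valid` are hypothetical, and the patent itself builds the ontology in Protégé and OWL rather than in code.

```python
# Subclass -> parent class, following the hierarchy listed in S2.3.
# Identifiers are illustrative English names, not taken from the source.
PARENT = {
    "Procedure": "Process", "WorkStep": "Process",
    "Cutter": "Equipment", "Clamp": "Equipment",
    "MeasuringTool": "Equipment", "AuxiliaryTool": "Equipment",
    "ShapeFeature": "Feature", "AccuracyFeature": "Feature",
    "TechnicalFeature": "Feature", "ManagementFeature": "Feature",
    "MaterialFeature": "Feature",
    "SizeAccuracy": "AccuracyFeature", "ShapeAccuracy": "AccuracyFeature",
    "PositionAccuracy": "AccuracyFeature", "SurfaceRoughness": "AccuracyFeature",
}

# Allowed (head class, relation, tail class) patterns, one per member of R.
ALLOWED = {
    ("Part", "contain", "Feature"),
    ("Process", "contain", "Device"),
    ("Process", "contain", "Process"),
    ("Process", "contain", "Equipment"),
    ("Process", "contain", "Feature"),
    ("Process", "order", "Process"),
}


def root_class(cls: str) -> str:
    """Walk up the hierarchy until a top-level class is reached."""
    while cls in PARENT:
        cls = PARENT[cls]
    return cls


def is_valid(head_cls: str, relation: str, tail_cls: str) -> bool:
    """Check a candidate triple against the mode-layer patterns."""
    return (root_class(head_cls), relation, root_class(tail_cls)) in ALLOWED


print(is_valid("WorkStep", "contain", "Cutter"))  # a work step may contain a cutter
print(is_valid("Part", "order", "Part"))          # parts are not ordered in R
```

Resolving subclasses to their top-level class keeps the pattern table as small as the set R itself.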
In S3, the knowledge graph data layer is constructed by extracting triples: a data extraction model extracts the entities, attributes and relations in the unstructured data;
because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires that this information be extracted and the unstructured data be converted into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module; the specific operation flow is as follows:
S3.1: the unstructured data to be processed are converted to plain text: the data are read in by a program and saved as a file in txt format;
S3.2: word segmentation is performed with the process sentence word segmentation module, which is implemented with the Bi-LSTM-CRF algorithm; the plain text data obtained in S3.1 are input into the trained module, which outputs a text file with the word segmentation labels "B", "I", "W" and "S"; text word segmentation is then complete;
S3.3: the words in the segmented text are classified with the process word classification module, which is implemented with the word2vec algorithm and represents words as word vectors; each word in the labeled text file obtained in S3.2 is taken as a word to be classified, the cosine distance between that word and each word in the word classification corpus is calculated, and the word is assigned the class of the corpus sample at the shortest distance; a text file with word class labels is output, and word classification is then complete;
S3.4: semantic relations are constructed: in the text file with word class labels obtained in S3.3, words are identified according to the entity types defined by the mode layer, the relations between words are identified according to the relations between entities defined by the mode layer and stored as triples, and the relations between entities and attributes are identified according to the attribute labels of the words and stored as triples; the stored triples are output in the forms <entity, relation, entity> and <entity, attribute, attribute value>.
In S3.4, the semantic relations are constructed as follows: non-category placeholder words are removed from the text with word class labels obtained in S3.3 to obtain a category word text set, and the words in this set are processed in sequence, producing triples as output, until all words have been processed; the processing is as follows:
S3.4.1: if the label of the former word is "p" and the label of the next word is "f", the triple <former word, contain, latter word> is added and processing moves on to the next word; otherwise, S3.4.2 is entered;
S3.4.2: if the label of the former word is any one of "p", "m", "e", "f", "d" and "s" and the label of the next word is "a", the triple <former word, latter word> is added and processing moves on to the next word; otherwise, S3.4.3 is entered;
S3.4.3: if the label of the former word is "d" or "s" and the label of the next word is "m", "e" or "f", the triple <former word, contain, latter word> is added and processing moves on to the next word; otherwise, S3.4.4 is entered;
S3.4.4: if the label of the former word is "d" and the label of the next word is "s", the triple <former word, contain, latter word> is added and processing moves on to the next word; otherwise, S3.4.5 is entered;
S3.4.5: if the labels of the former word and the next word are both "d" or both "s", the triple <former word, order, latter word> is added and processing moves on to the next word; otherwise, processing moves directly on to the next word;
after all the words have been processed, the triples are stored in a csv file; semantic relation construction is then complete, and the method proceeds to S4.
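The rule cascade S3.4.1 to S3.4.5 can be sketched in Python as a pass over adjacent word pairs. Several points are assumptions made for illustration: the rules are read as applying to each pair of neighbouring words after "x" words are dropped, the function names `build_triples` and `to_csv` are hypothetical, and because the text gives no middle element for the triple in S3.4.2, the placeholder name "attribute" is used there. The sample words are invented.

```python
import csv
import io

# Labels that may be followed by an attribute word (rule S3.4.2).
ENTITY_LABELS = {"p", "m", "e", "f", "d", "s"}


def build_triples(tagged_words):
    """Apply rules S3.4.1 to S3.4.5 to each adjacent (word, label) pair."""
    # Drop non-category words (label "x") first, as described in S3.4.
    words = [(w, t) for w, t in tagged_words if t != "x"]
    triples = []
    for (prev, p_tag), (nxt, n_tag) in zip(words, words[1:]):
        if p_tag == "p" and n_tag == "f":                        # S3.4.1
            triples.append((prev, "contain", nxt))
        elif p_tag in ENTITY_LABELS and n_tag == "a":            # S3.4.2
            # ASSUMPTION: "attribute" is a placeholder; the source does not
            # name the middle element of this triple.
            triples.append((prev, "attribute", nxt))
        elif p_tag in {"d", "s"} and n_tag in {"m", "e", "f"}:   # S3.4.3
            triples.append((prev, "contain", nxt))
        elif p_tag == "d" and n_tag == "s":                      # S3.4.4
            triples.append((prev, "contain", nxt))
        elif p_tag == n_tag and p_tag in {"d", "s"}:             # S3.4.5
            triples.append((prev, "order", nxt))
    return triples


def to_csv(triples):
    """Serialise triples as CSV text, one triple per row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(triples)
    return buffer.getvalue()


# Hypothetical labelled words, invented for illustration.
tagged = [
    ("drive shaft", "p"), ("outer circle", "f"), ("diameter 40", "a"),
    ("and", "x"), ("turning", "d"), ("rough turning", "s"),
    ("finish turning", "s"), ("lathe tool", "e"),
]
triples = build_triples(tagged)
print(to_csv(triples))
```

Pairs that match no rule, such as an attribute followed by a procedure, simply produce no triple, which mirrors the final "otherwise" branch of S3.4.5.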
The process sentence word segmentation module in S3.2 is trained as follows:
S3.2.1: the sentences in the plain text file are preliminarily segmented with the jieba word segmentation package;
S3.2.2: the segmentation result obtained in S3.2.1 is checked and corrected manually and labeled with "B", "I", "W" and "S", where "B" marks the first Chinese character of a word, "I" a middle character of a word, "W" the last character of a word, and "S" a word consisting of a single Chinese character;
S3.2.3: the text with segmentation labels obtained in S3.2.2 is stored on a local hard disk as the process sentence word segmentation corpus;
S3.2.4: the process sentence word segmentation module is trained: the corpus obtained in S3.2.3 is imported into the module, and the segmentation program is run iteratively; training is complete when the program finishes running.
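The labeling scheme of S3.2.2 can be illustrated with a short Python sketch that converts a list of already segmented words (for example, jieba output after manual correction) into per-character "B", "I", "W", "S" labels, and back again. The function names and the sample words are hypothetical; the Bi-LSTM-CRF model itself is not shown.

```python
def words_to_labels(words):
    """Turn segmented words into (character, label) pairs using B/I/W/S."""
    labelled = []
    for word in words:
        if len(word) == 1:
            labelled.append((word, "S"))            # single-character word
        else:
            labelled.append((word[0], "B"))         # first character
            labelled.extend((ch, "I") for ch in word[1:-1])  # middle characters
            labelled.append((word[-1], "W"))        # last character
    return labelled


def labels_to_words(labelled_chars):
    """Rebuild words from (character, label) pairs; a word ends at W or S."""
    words, current = [], ""
    for ch, label in labelled_chars:
        current += ch
        if label in ("W", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate a truncated final word
        words.append(current)
    return words


# Hypothetical segmented process sentence, invented for illustration.
segmented = ["粗车", "外圆柱面", "和", "端面"]
labelled = words_to_labels(segmented)
print(labelled)
print(labels_to_words(labelled))
```

The second function is the decoding step that would be applied to the label sequence output by the trained segmentation module in S3.2.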
The process word classification module in S3.3 is trained as follows:
S3.3.1: the words in the segmented text are manually labeled with eight categories, "part", "device", "equipment", "feature", "relationship", "procedure", "step" and "attribute", represented by "p", "m", "e", "f", "r", "d", "s" and "a" respectively, and non-category words are labeled "x"; the seven categories "part", "device", "equipment", "feature", "relationship", "procedure" and "step" correspond to the six types of data in S2.2, and "attribute" corresponds to the attributes of the classes in S2.3;
the processing object data correspond to "part"; the processing device data correspond to "device"; the processing method data correspond to "procedure" and "step", since machining a part comprises several procedures and each procedure comprises several steps; the processing equipment data correspond to "equipment"; the processing feature data correspond to "feature"; the semantic relation data correspond to "relationship"; an "attribute" contains a name and a value;
S3.3.2: the text with classification labels obtained in S3.3.1 is stored on a local hard disk as the process word classification corpus;
S3.3.3: the process word classification module is trained: the corpus obtained in S3.3.2 is imported into the module, and the classification program is run iteratively; training is complete when the program finishes running.
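Once such a labeled corpus exists, the nearest-sample rule of S3.3 can be sketched as follows. The three-dimensional vectors are toy values invented for illustration; in the method they would be word2vec vectors. The function names `cosine_distance` and `classify` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, defined as 1 minus cosine similarity (non-zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def classify(vector, corpus):
    """Return the label of the corpus sample closest to `vector`."""
    _, label, _ = min(corpus, key=lambda item: cosine_distance(vector, item[2]))
    return label


# (word, label, vector) samples; toy vectors, not real word2vec output.
corpus = [
    ("lathe", "m", [0.9, 0.1, 0.0]),
    ("fixture", "e", [0.1, 0.9, 0.1]),
    ("keyway", "f", [0.0, 0.2, 0.9]),
]
print(classify([0.8, 0.2, 0.1], corpus))  # closest to "lathe", so label "m"
```

This is a one-nearest-neighbour decision; the quality of the result depends entirely on how well the word vectors separate the categories.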
S4: visual display: the mode layer and the data layer are combined to construct the knowledge graph; the extracted triples are imported into a Neo4j graph database, and the knowledge graph is stored and visually displayed.
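One possible way to load the extracted triples into Neo4j is to turn each triple into a Cypher `MERGE` statement. The sketch below only builds the statement text and its parameters, so it runs without a database. The node label `Entity`, the `name` property and the function name `triple_to_cypher` are assumptions for illustration; the patent does not specify how the import is performed.

```python
import re


def triple_to_cypher(head, relation, tail):
    """Return a (statement, parameters) pair that upserts one triple."""
    # Relationship types cannot be passed as Cypher parameters, so the type
    # is validated and then embedded in the statement text.
    rel_type = relation.upper()
    if not re.fullmatch(r"[A-Z_][A-Z0-9_]*", rel_type):
        raise ValueError(f"unsafe relationship type: {relation!r}")
    statement = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    return statement, {"head": head, "tail": tail}


# Hypothetical triple, invented for illustration.
statement, parameters = triple_to_cypher("turning", "contain", "rough turning")
print(statement)
print(parameters)
```

With the official Neo4j Python driver, such a pair could be passed to `session.run(statement, parameters)`. Using `MERGE` rather than `CREATE` keeps repeated imports from duplicating nodes and edges.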
Compared with the prior art, the invention has the following beneficial effects:
1. The invention provides a method for managing, integrating and storing process data in the numerical control machining field that is more intuitive than conventional storage in a relational database. The process data of the machining process are fused and integrated in the form of a graph, so that graph algorithms can be used for inference, deeper value can be obtained from the process data, and the data can be put to use. A knowledge graph constructed by this method can support more intelligent functions: a large number of knowledge graphs can be stored and combined into a process data knowledge graph library, and process route generation and process optimization can be realized through data mining and reasoning.
2. In the method for constructing a knowledge graph of process data in the numerical control machining field, the data are classified at the mode layer, and three-dimensional machining is abstracted into a two-dimensional knowledge graph. When the mode layer is constructed, the process data in the numerical control machining field are sorted and summarized; the many kinds of complex data are grouped into six types of data and abstracted into six classes of entities of the ontology model in the mode layer. This reasonable classification makes the category composition of the process data clear at the macroscopic level and allows the process data to be understood from the top down.
3. In the method for constructing a knowledge graph of process data in the numerical control machining field, the data layer is constructed under the constraint of the mode layer, which reduces redundant extracted information and keeps the extracted information consistent with the original process data. At present a large amount of process data exists in unstructured text, which hinders future mining of the process data and intelligent generation of process routes; the method therefore expresses the process data in a more regular form. The invention mines the effective process data in the text, converts them into semantic relation data, and uses the semantic relation data to construct a knowledge graph that is stored in a graph database.
Drawings
FIG. 1 is an overall flow chart of the knowledge-graph construction method of the present invention.
FIG. 2 is a flow chart of the construction of the mode layer of the present invention.
FIG. 3 is a schematic diagram of a mode layer of the present invention.
FIG. 4 is a flow chart of the construction of the data layer of the present invention.
FIG. 5 is a specific flow chart of the invention for training a Bi-LSTM-CRF based process sentence word segmentation model.
FIG. 6 is a detailed flow chart of the present invention for training a word2vec-based process word classification model.
FIG. 7 is a detailed flow chart of semantic relationship construction of the present invention.
FIG. 8 is a knowledge graph of numerical control machining process data of the present invention, using a drive shaft as an example.
Detailed Description
The present invention will be described in further detail with reference to the following description and embodiments in conjunction with the accompanying drawings.
Referring to fig. 1 to 7, a method for constructing a knowledge graph of process data in the field of numerical control machining includes the following steps:
S1: establishing an original data database: data information related to the numerical control machining field is searched for and collected to establish the original data database;
S2: constructing a knowledge graph mode layer: process knowledge in the original data database is extracted manually, an ontology model is established in combination with expert knowledge in the field of numerical control machining processes, and the knowledge graph mode layer is constructed;
S3: establishing a knowledge graph data layer by triple extraction: a data extraction model is used to extract the entities, attributes and relations in the unstructured data and establish the knowledge graph data layer; because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires extracting that information and converting the unstructured data into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module;
S4: visual display: the mode layer and the data layer are combined to construct the knowledge graph, which is stored and visually displayed.
In S1, the data information related to the numerical control machining field that is collected in the original data database comprises: part drawing data, specification manuals, machining process cards, data on numerical control machine tools and machining centers, data on fixtures, measuring tools and positioning equipment, process files, procedure cards, professional books and process manuals; the collected data information is stored on a local hard disk as the original data database.
In S2, the knowledge graph mode layer is constructed: process knowledge is extracted manually from the original data database, and an ontology model is established in combination with expert knowledge in the field of numerical control machining processes to construct the knowledge graph mode layer;
S2.1: the extracted process data are analyzed and summarized in combination with expert knowledge; the knowledge graph mode layer is constructed as an ontology model, and the ontology model is described in the ontology language OWL; process knowledge is extracted manually from the unstructured data in the original data, and the ontology model is established in combination with expert knowledge to construct the knowledge graph mode layer; the ontology model is the model of the mode layer, and the ontology language is the language used to describe the mode layer;
S2.2: the machining object of numerical control machining equipment is a part; by analyzing the part information and the machining and manufacturing information contained in the CAD model and the numerical control machining process documents of the part, the process data can be summarized into six types: processing object data, processing device data, processing method data, processing equipment data, processing feature data and semantic relation data;
the processing object data is the basis of the whole process data; it describes text information such as the number and name of the processing object, as well as the material, the size and the requirements of the mechanical model; the processing device data mainly refers to the numerical control machine tool or machining center required to machine the part, such as a numerical control milling machine or a numerical control lathe; the processing method data refers to the methods and operations for machining the part, that is, the steps that take the material from blank to part, specifically the working procedures and working steps; the processing equipment data refers to the process equipment, including the positioning and clamping devices used for positioning, and the tools, measuring tools and other aids used on the machine tool during machining; the processing feature data refers to the specific machining objects and requirements in the procedures completed by the machine tool, where the machining objects include surfaces, holes and threads and the machining requirements include precision requirements and material treatment requirements; the semantic relation data refers to the relations among the preceding data and represents the relations between different types of data;
S2.3: in order to maintain the accuracy and completeness of the mode layer, the above six types of data are abstracted into six classes: the processing object data, processing device data, processing method data, processing equipment data, processing feature data and semantic relation data correspond, in order, to the part class, device class, process class, equipment class, feature class and relation class; the process class has a working-procedure subclass and a working-step subclass; the equipment class has cutter, clamp, measuring tool and auxiliary tool subclasses; the feature class has shape feature, accuracy feature, technical feature, management feature and material feature subclasses, of which the accuracy feature class in turn has size accuracy, shape accuracy, position accuracy and surface roughness subclasses; and the relation class has a contain subclass and an order subclass;
the attributes of a class comprise class attributes and instance attributes; class attributes are shared by a class and its subclasses, and all instances share the corresponding class attributes, whereas instance attributes are owned only by individual instances; the part class, device class, process class, equipment class, feature class and relation class each have a name class attribute, and other instance attributes are added according to the different entities;
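The class hierarchy of S2.3 can be sketched as a small lookup table. This is only an illustration, not the OWL ontology itself: the English identifiers, the names `SUBCLASSES`, `PARENT` and `ancestors`, and the choice of a dictionary are all assumptions. The machine-tool class is called `device` and the tooling class `equipment`, following the M and E classes of the formal expression, and the relation subclasses are written as `contain` and `order`, following the relation names used there.

```python
# Hypothetical encoding of the S2.3 class hierarchy (names are illustrative).
TOP_LEVEL = ["part", "device", "process", "equipment", "feature", "relation"]

SUBCLASSES = {
    "process": ["working_procedure", "working_step"],
    "equipment": ["cutter", "clamp", "measuring_tool", "auxiliary_tool"],
    "feature": ["shape_feature", "accuracy_feature", "technical_feature",
                "management_feature", "material_feature"],
    "accuracy_feature": ["size_accuracy", "shape_accuracy",
                         "position_accuracy", "surface_roughness"],
    "relation": ["contain", "order"],
}

# Invert the table so that each subclass points to its direct parent.
PARENT = {child: parent
          for parent, children in SUBCLASSES.items()
          for child in children}


def ancestors(cls):
    """Return the chain of parent classes, nearest parent first."""
    chain = []
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain


print(ancestors("surface_roughness"))
```

Because class attributes are shared with subclasses, a lookup such as `ancestors` is enough to decide which classes contribute a `name` attribute to a given subclass.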
S2.4: the mode layer of the knowledge graph describes entity classes and the relations among them; the mode layer is established from the entity classes abstracted in S2.3 and is described by a formal expression; an ontology model is built on the abstracted entity classes and relations, and is created and displayed with the Protégé software; the knowledge of the mode layer is expressed through the ontology model, which can be described in the OWL language;
the formal expression of the ontology model of the knowledge graph mode layer is as follows:
KGPattern={Entity∪Relation}
wherein:
Entity={P∪M∪O∪E∪F}
Relation={R}
R={(P_i,contain,F_j)∪(O_i,contain,M_j)∪(O_i,contain,O_j)∪(O_i,contain,E_j)∪(O_i,contain,F_j)∪(O_i,order,O_j)}
in the above formula: KGPattern refers to the formal expression model of the knowledge graph mode layer, Entity refers to the entity set described by the mode layer, and Relation refers to the relation set described by the mode layer; P represents the part class, M the device class, O the process class, E the equipment class, F the feature class, and R the relation class. The semantic relation R comprises a contain relation and an order relation. The contain relation is defined as the inclusion relation between the part class and the feature class, between the process class and the device class, equipment class and feature class, and between process classes; it is expressed and described as follows: (P_i,contain,F_j) indicates that part P_i includes feature F_j; (O_i,contain,M_j) indicates that process O_i includes device M_j; (O_i,contain,O_j) indicates that process O_i includes process O_j; (O_i,contain,E_j) indicates that process O_i includes equipment E_j; (O_i,contain,F_j) indicates that process O_i includes feature F_j. The order relation is defined as the sequence relation between working procedures or between working steps in the process class; it is expressed and described as follows: (O_i,order,O_j) indicates that working procedure (working step) O_i is carried out first and working procedure (working step) O_j afterwards; as shown in Table 1;
Table 1. Semantic relationship representation and description
(table reproduced as images in the original publication: Figure BDA0003359323200000101, Figure BDA0003359323200000111)
At this point, construction of the knowledge graph mode layer is complete, and the method proceeds to step S3.
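As a rough illustration of how the relation set R constrains the data layer, the six relation patterns of the formal expression can be stored as a set of (head class, relation, tail class) tuples and used to check candidate triples. The names `ALLOWED_PATTERNS` and `is_allowed` are hypothetical, and the single-letter class codes are the ones defined for the formula (P, M, O, E, F).

```python
# The six patterns of R from the formal expression, as (head, relation, tail).
ALLOWED_PATTERNS = {
    ("P", "contain", "F"),  # part includes feature
    ("O", "contain", "M"),  # process includes device
    ("O", "contain", "O"),  # process includes process
    ("O", "contain", "E"),  # process includes equipment
    ("O", "contain", "F"),  # process includes feature
    ("O", "order", "O"),    # process precedes process
}


def is_allowed(head_class, relation, tail_class):
    """Check a candidate triple against the mode-layer relation set."""
    return (head_class, relation, tail_class) in ALLOWED_PATTERNS


print(is_allowed("P", "contain", "F"))
print(is_allowed("P", "contain", "M"))
```

A check of this kind is one way to express the constraint of the mode layer on the data layer; the source itself realizes the constraint through the label rules of S3.4.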
In S3, the knowledge graph data layer is established by triple extraction: a data extraction model is used to extract the entities, attributes and relations in the unstructured data and establish the knowledge graph data layer;
because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires extracting that information and converting the unstructured data into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module; the specific operation flow is as follows:
S3.1: the unstructured data to be processed are converted into plain text: the data are read in by a program and saved as a file in txt format;
S3.2: word segmentation with the process sentence word segmentation module, which is implemented based on the Bi-LSTM-CRF algorithm: the plain text data obtained in S3.1 are input into the trained process sentence word segmentation module, which outputs a text file with the word segmentation labels "B", "I", "W" and "S"; text word segmentation is then complete;
S3.3: word classification with the process word classification module, which is implemented based on the word2vec algorithm: the words in the segmented text are converted into word vectors; each word in the text file with word segmentation labels obtained in S3.2 is taken as a word to be classified, the cosine distance between it and the words in the word classification corpus is calculated, and the class of the sample with the shortest distance is assigned to the word to be classified; a text file with word class labels is finally output, and word classification is then complete;
S3.4: semantic relation construction: in the text file with word class labels obtained in S3.3, the words are identified according to the entity classes defined by the mode layer; the relations between words are identified according to the relations between entities defined by the mode layer and stored as triples; the relations between entities and attributes are identified according to the word attribute labels and stored as triples; the stored triples are output as <entity, relation, entity> and <entity, attribute, attribute value> triples.
In S3.4, the semantic relations are constructed: the non-category placeholder words are removed from the text with word class labels obtained in S3.3 to obtain a category word text set, and the words in the category word text set are processed in sequence to output triples until all words have been processed; the processing is as follows:
S3.4.1: if the label of the former word is "p" and the label of the next word is "f", the triple <former word, contain, latter word> is added and processing moves on to the next word; otherwise S3.4.2 is entered;
S3.4.2: if the label of the former word is any one of "p", "m", "e", "f", "d" and "s" and the label of the next word is "a", the triple <former word, latter word> is added and processing moves on to the next word; otherwise S3.4.3 is entered;
S3.4.3: if the label of the former word is "d" or "s" and the label of the next word is "m", "e" or "f", the triple <former word, contain, latter word> is added and processing moves on to the next word; otherwise S3.4.4 is entered;
S3.4.4: if the label of the former word is "d" and the label of the next word is "s", the triple <former word, contain, latter word> is added and processing moves on to the next word; otherwise S3.4.5 is entered;
S3.4.5: if the labels of the former word and the next word are both "d" or both "s", the triple <former word, order, latter word> is added and processing moves on to the next word; otherwise processing moves directly on to the next word;
after all the words have been processed, the triples are stored in a csv file; semantic relation construction is then complete, and the method proceeds to S4.
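The rule cascade S3.4.1 to S3.4.5 can be sketched as a single pass over adjacent word pairs. This is a minimal sketch under stated assumptions: the input is taken to be a list of (word, label) pairs, "former word" and "next word" are read as consecutive entries after the "x" words are removed, and the function name `build_triples` and the sample sentence are hypothetical. The source does not name the relation used in the S3.4.2 attribute triple, so `has_attribute` is a placeholder.

```python
ENTITY_TAGS = {"p", "m", "e", "f", "d", "s"}


def build_triples(tagged_words, attribute_relation="has_attribute"):
    """Apply rules S3.4.1 to S3.4.5, in order, to consecutive word pairs."""
    # Remove non-category placeholder words (label "x") first.
    words = [(w, t) for w, t in tagged_words if t != "x"]
    triples = []
    for (prev, prev_tag), (nxt, nxt_tag) in zip(words, words[1:]):
        if prev_tag == "p" and nxt_tag == "f":                       # S3.4.1
            triples.append((prev, "contain", nxt))
        elif prev_tag in ENTITY_TAGS and nxt_tag == "a":             # S3.4.2
            # Placeholder relation name: not specified in the source.
            triples.append((prev, attribute_relation, nxt))
        elif prev_tag in {"d", "s"} and nxt_tag in {"m", "e", "f"}:  # S3.4.3
            triples.append((prev, "contain", nxt))
        elif prev_tag == "d" and nxt_tag == "s":                     # S3.4.4
            triples.append((prev, "contain", nxt))
        elif prev_tag == nxt_tag and prev_tag in {"d", "s"}:         # S3.4.5
            triples.append((prev, "order", nxt))
        # No rule matched: move on to the next pair.
    return triples


# Hypothetical labeled sentence fragments for illustration only.
sample = [
    ("drive shaft", "p"), ("has", "x"), ("keyway", "f"),
    ("rough turning", "d"), ("lathe", "m"),
    ("rough turning", "d"), ("finish turning", "d"),
]
for triple in build_triples(sample):
    print(triple)
```

The resulting tuples could then be written out row by row with Python's standard `csv.writer`, which matches the csv storage described above.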
The training process of the process sentence word segmentation module in S3.2 is as follows:
S3.2.1: preliminary word segmentation is performed on the sentences in the plain text file using the word segmentation package jieba;
S3.2.2: the word segmentation result obtained in S3.2.1 is checked and corrected manually and labeled with "B", "I", "W" and "S", wherein "B" marks the first Chinese character of a word, "I" marks a middle Chinese character of a word, "W" marks the last Chinese character of a word, and "S" marks a word consisting of a single Chinese character;
S3.2.3: the text with word segmentation labels obtained in S3.2.2 is stored on a local hard disk as the process sentence word segmentation corpus;
S3.2.4: training the process sentence word segmentation module: the process sentence word segmentation corpus obtained in S3.2.3 is imported into the process sentence word segmentation module, the segmentation module program is read in and run iteratively, and training of the process sentence word segmentation module is complete when the program finishes running.
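The "B", "I", "W", "S" labeling scheme of S3.2.2 can be illustrated with two small helpers: one that turns a list of segmented words into per-character labels, and one that rebuilds the words from characters and labels. The function names and the Chinese example sentence are hypothetical. In practice the preliminary word list would come from jieba (for example `jieba.lcut(sentence)`) and then be corrected by hand, but the sketch below takes the word list as given so that it runs without third-party packages.

```python
def tag_words(words):
    """Return one label string per word: 'S' or 'B' + 'I'* + 'W'."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")            # single-character word
        else:
            tags.append("B" + "I" * (len(word) - 2) + "W")
    return tags


def decode(chars, tags):
    """Rebuild words from a character sequence and its per-character labels."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("W", "S"):           # a word ends on 'W' or 'S'
            words.append(current)
            current = ""
    if current:                         # tolerate a truncated final word
        words.append(current)
    return words


# Hypothetical manually corrected segmentation of one process sentence.
words = ["用", "插齿机", "加工", "齿轮"]
labels = tag_words(words)
print(labels)
print(decode("".join(words), "".join(labels)))
```

`decode` is the inverse operation that a trained sequence labeler needs at prediction time: the Bi-LSTM-CRF model outputs one label per character, and the words are recovered by cutting after every "W" or "S".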
The training process of the process word classification module in S3.3 is as follows:
S3.3.1: the words in the text processed above are manually labeled with eight categories, "part", "device", "equipment", "feature", "relation", "procedure", "step" and "attribute", represented by "p", "m", "e", "f", "r", "d", "s" and "a" respectively, and non-category words are labeled "x"; the seven categories "part", "device", "equipment", "feature", "relation", "procedure" and "step" correspond to the six types of data in S2.2, and "attribute" corresponds to the attributes of the classes in S2.3;
the processing object data corresponds to "part"; the processing device data corresponds to "device"; the processing method data corresponds to "procedure" and "step", since machining a part comprises a number of procedures and each procedure comprises a number of steps; the processing equipment data corresponds to "equipment"; the processing feature data corresponds to "feature"; the semantic relation data corresponds to "relation"; an "attribute" contains a value and a name;
S3.3.2: the text with classification labels obtained in S3.3.1 is stored on a local hard disk as the process word classification corpus;
S3.3.3: training the process word classification module: the process word classification corpus obtained in S3.3.2 is imported into the process word classification module, the classification module program is read in and run iteratively, and training of the process word classification module is complete when the program finishes running.
S4: visual display, namely combining the mode layer and the data layer to construct the knowledge graph, importing the extracted triples into a Neo4j graph database, and storing the knowledge graph for visual display.
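One possible way to load the extracted triples into Neo4j is to turn each triple into a parameterized Cypher `MERGE` statement. The sketch below only builds the query text and parameters, so it runs without a database. The node label `Entity`, the property `name`, and the function name `to_cypher` are assumptions, not part of the source. Relationship types are interpolated into the query text rather than passed as parameters, so they are checked against a whitelist first.

```python
# Relation names defined by the mode layer; used as a whitelist.
ALLOWED_RELATIONS = {"contain", "order"}


def to_cypher(head, relation, tail):
    """Build a parameterized Cypher MERGE statement for one triple."""
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unknown relation: {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{relation.upper()}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Hypothetical triple for illustration only.
query, params = to_cypher("drive shaft", "contain", "keyway")
print(query)
print(params)
```

With the official Neo4j Python driver, each (query, params) pair could be passed to `session.run(query, params)`; `MERGE` keeps repeated entities from being created twice when the same word appears in several triples.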
The principle of the invention is illustrated as follows:
the data required above are downloaded from websites or obtained from within an enterprise, and are stored on a local hard disk as the original data database for later use.
The knowledge graph is composed of a mode layer and a data layer. The mode layer is knowledge obtained through induction and extraction. Several kinds of model can be used to construct the mode layer, and the ontology model is one of them; likewise, there are several description languages for ontology models, and OWL is only one of them. The data layer is an instantiation of the mode layer: the mode layer contains only the names of classes, such as a "TV series" class, while the data layer contains specific instances, such as "Journey to the West".
In addition, the "attribute" label is added. An attribute is not an entity in the mode layer; according to the last paragraph of S2.3, an attribute contains a value and a name. This patent does not treat attributes as entities. For example, if a device is an entity, the device name is an attribute of that device entity and is not classified as an entity in the mode layer.
In S3.2.3, the labels in the word segmentation corpus give the position of each character within its word; they show which characters of the text together form a word, but not what kind of word it is. The process word classification corpus of S3.3.2 is used to train the classification model; its text is divided by words, each of which may belong to one of the categories specified in this patent. For example, the sentence "machining a gear with gear shaping" appears in the corpus of S3.2.3 as "machining gear with gear shaping, S BIW BW", and in the classification corpus as "machining gears with gear shaping, x m x p". The text in both corpora is organized by sentence.
S4: visual display, namely combining the mode layer and the data layer to construct the knowledge graph. The combination refers to the constraint that the mode layer places on the data layer: because both the word categories and the relations between words in the data layer are specified in the mode layer, the two layers are already combined when the word categories and semantic relations are constructed in the preceding steps.
Bi-LSTM-CRF stands for the Bidirectional Long Short-Term Memory Conditional Random Field algorithm.
jieba is a word segmentation package for the Python language.
word2vec is a two-layer neural network algorithm used in natural language processing to convert words into dense vectors. Before word2vec, words were represented by one-hot encoding, which is too sparse, leads to the curse of dimensionality, and ignores the relations among words. word2vec represents words as dense vectors whose distances reflect the relations between words. Once training is complete, the word2vec model maps input words to vectors.
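The nearest-sample classification of S3.3 can be illustrated with plain Python. This is a sketch under assumptions: the three-dimensional vectors are toy values standing in for trained word2vec embeddings, the function names are hypothetical, and "shortest cosine distance" is implemented as highest cosine similarity, since cosine distance is one minus cosine similarity.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(vector, labeled_vectors):
    """Return the label of the corpus sample closest to `vector`."""
    best_vector, best_label = max(
        labeled_vectors,
        key=lambda item: cosine_similarity(vector, item[0]),
    )
    return best_label


# Toy stand-ins for embeddings of labeled words in the classification corpus.
corpus = [
    ([1.0, 0.0, 0.1], "m"),  # a device word
    ([0.0, 1.0, 0.1], "p"),  # a part word
    ([0.1, 0.1, 1.0], "d"),  # a procedure word
]
print(classify([0.9, 0.1, 0.0], corpus))
```

In a real pipeline the vectors would be looked up from the trained word2vec model for both the corpus words and the word to be classified; the comparison step itself stays the same.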
Example 1:
a knowledge graph construction method of process data in the field of numerical control machining comprises the following steps:
S1: establishing an original data database: data information related to the numerical control machining field is searched for and collected to establish the original data database;
S2: constructing a knowledge graph mode layer: process knowledge in the original data database is extracted manually, an ontology model is established in combination with expert knowledge in the field of numerical control machining processes, and the knowledge graph mode layer is constructed;
S3: establishing a knowledge graph data layer by triple extraction: a data extraction model is used to extract the entities, attributes and relations in the unstructured data and establish the knowledge graph data layer; because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires extracting that information and converting the unstructured data into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module;
S4: visual display: the mode layer and the data layer are combined to construct the knowledge graph, which is stored and visually displayed.
Example 2:
example 2 is substantially the same as example 1 except that:
in S1, the data information related to the numerical control machining field that is collected in the original data database comprises: part drawing data, specification manuals, machining process cards, data on numerical control machine tools and machining centers, data on fixtures, measuring tools and positioning equipment, process files, procedure cards, professional books and process manuals; the collected data information is stored on a local hard disk as the original data database.
In S2, the knowledge graph mode layer is constructed: process knowledge is extracted manually from the original data database, and an ontology model is established in combination with expert knowledge in the field of numerical control machining processes to construct the knowledge graph mode layer;
S2.1: the extracted process data are analyzed and summarized in combination with expert knowledge; the knowledge graph mode layer is constructed as an ontology model, and the ontology model is described in the ontology language OWL; process knowledge is extracted manually from the unstructured data in the original data, and the ontology model is established in combination with expert knowledge to construct the knowledge graph mode layer; the ontology model is the model of the mode layer, and the ontology language is the language used to describe the mode layer;
S2.2: the machining object of numerical control machining equipment is a part; by analyzing the part information and the machining and manufacturing information contained in the CAD model and the numerical control machining process documents of the part, the process data can be summarized into six types: processing object data, processing device data, processing method data, processing equipment data, processing feature data and semantic relation data;
the processing object data is the basis of the whole process data; it describes text information such as the number and name of the processing object, as well as the material, the size and the requirements of the mechanical model; the processing device data mainly refers to the numerical control machine tool or machining center required to machine the part, such as a numerical control milling machine or a numerical control lathe; the processing method data refers to the methods and operations for machining the part, that is, the steps that take the material from blank to part, specifically the working procedures and working steps; the processing equipment data refers to the process equipment, including the positioning and clamping devices used for positioning, and the tools, measuring tools and other aids used on the machine tool during machining; the processing feature data refers to the specific machining objects and requirements in the procedures completed by the machine tool, where the machining objects include surfaces, holes and threads and the machining requirements include precision requirements and material treatment requirements; the semantic relation data refers to the relations among the preceding data and represents the relations between different types of data;
S2.3: in order to maintain the accuracy and completeness of the mode layer, the above six types of data are abstracted into six classes: the processing object data, processing device data, processing method data, processing equipment data, processing feature data and semantic relation data correspond, in order, to the part class, device class, process class, equipment class, feature class and relation class; the process class has a working-procedure subclass and a working-step subclass; the equipment class has cutter, clamp, measuring tool and auxiliary tool subclasses; the feature class has shape feature, accuracy feature, technical feature, management feature and material feature subclasses, of which the accuracy feature class in turn has size accuracy, shape accuracy, position accuracy and surface roughness subclasses; and the relation class has a contain subclass and an order subclass;
the attributes of a class comprise class attributes and instance attributes; class attributes are shared by a class and its subclasses, and all instances share the corresponding class attributes, whereas instance attributes are owned only by individual instances; the part class, device class, process class, equipment class, feature class and relation class each have a name class attribute, and other instance attributes are added according to the different entities;
S2.4: the mode layer of the knowledge graph describes entity classes and the relations among them; the mode layer is established from the entity classes abstracted in S2.3 and is described by a formal expression; an ontology model is built on the abstracted entity classes and relations, and is created and displayed with the Protégé software; the knowledge of the mode layer is expressed through the ontology model, which can be described in the OWL language;
the formal expression of the ontology model of the knowledge graph mode layer is as follows:
KGPattern={Entity∪Relation}
wherein:
Entity={P∪M∪O∪E∪F}
Relation={R}
R={(P_i,contain,F_j)∪(O_i,contain,M_j)∪(O_i,contain,O_j)∪(O_i,contain,E_j)∪(O_i,contain,F_j)∪(O_i,order,O_j)}
in the above formula: KGPattern refers to the formal expression model of the knowledge graph mode layer, Entity refers to the entity set described by the mode layer, and Relation refers to the relation set described by the mode layer; P represents the part class, M the device class, O the process class, E the equipment class, F the feature class, and R the relation class. The semantic relation R comprises a contain relation and an order relation. The contain relation is defined as the inclusion relation between the part class and the feature class, between the process class and the device class, equipment class and feature class, and between process classes; it is expressed and described as follows: (P_i,contain,F_j) indicates that part P_i includes feature F_j; (O_i,contain,M_j) indicates that process O_i includes device M_j; (O_i,contain,O_j) indicates that process O_i includes process O_j; (O_i,contain,E_j) indicates that process O_i includes equipment E_j; (O_i,contain,F_j) indicates that process O_i includes feature F_j. The order relation is defined as the sequence relation between working procedures or between working steps in the process class; it is expressed and described as follows: (O_i,order,O_j) indicates that working procedure (working step) O_i is carried out first and working procedure (working step) O_j afterwards; as shown in Table 1;
Table 1. Semantic relationship representation and description
(table reproduced as an image in the original publication: Figure BDA0003359323200000171)
At this point, construction of the knowledge graph mode layer is complete, and the method proceeds to step S3.
In S3, the knowledge graph data layer is established by triple extraction: a data extraction model is used to extract the entities, attributes and relations in the unstructured data and establish the knowledge graph data layer;
because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires extracting that information and converting the unstructured data into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module; the specific operation flow is as follows:
S3.1: the unstructured data to be processed are converted into plain text: the data are read in by a program and saved as a file in txt format;
S3.2: word segmentation with the process sentence word segmentation module, which is implemented based on the Bi-LSTM-CRF algorithm: the plain text data obtained in S3.1 are input into the trained process sentence word segmentation module, which outputs a text file with the word segmentation labels "B", "I", "W" and "S"; text word segmentation is then complete;
S3.3: word classification with the process word classification module, which is implemented based on the word2vec algorithm: the words in the segmented text are converted into word vectors; each word in the text file with word segmentation labels obtained in S3.2 is taken as a word to be classified, the cosine distance between it and the words in the word classification corpus is calculated, and the class of the sample with the shortest distance is assigned to the word to be classified; a text file with word class labels is finally output, and word classification is then complete;
S3.4: semantic relation construction: in the text file with word class labels obtained in S3.3, the words are identified according to the entity classes defined by the mode layer; the relations between words are identified according to the relations between entities defined by the mode layer and stored as triples; the relations between entities and attributes are identified according to the word attribute labels and stored as triples; the stored triples are output as <entity, relation, entity> and <entity, attribute, attribute value> triples.
The S3.4: Semantic relation construction. Non-category placeholder words are removed from the text with word category labels obtained in S3.3 to obtain a category-word text set, and the words in this set are processed in sequence, emitting triples, until all words have been processed. The processing is as follows:
S3.4.1: if the former word label is "p" and the next word label is "f", adding the triple <the former word, contain, the latter word> and then processing the next word; if not, proceeding to S3.4.2;
S3.4.2: if the former word label is any one of "p", "m", "e", "f", "d" and "s" and the next word label is "a", adding the triple <the former word, the latter word> and then processing the next word; if not, proceeding to S3.4.3;
S3.4.3: if the former word label is "d" or "s" and the next word label is "m", "e" or "f", adding the triple <the former word, contain, the latter word> and then processing the next word; if not, proceeding to S3.4.4;
S3.4.4: if the former word label is "d" and the next word label is "s", adding the triple <the former word, contain, the latter word> and then processing the next word; if not, proceeding to S3.4.5;
S3.4.5: if the former word label and the next word label are both "d", or are both "s", adding the triple <the former word, order, the latter word> and then processing the next word; if not, proceeding directly to the next word;
After all the words have been processed, the triples are stored in a csv file. The semantic relation construction is then complete, and the method proceeds to S4.
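As a non-limiting illustration, the processing of S3.4.1 to S3.4.5 may be sketched in Python as follows. The function names, the sample words and the `ATTRIBUTE` relation name are assumptions made for this sketch only; the text above pairs the two words in S3.4.2 without naming a relation.

```python
import csv
import io

CONTAIN, ORDER = "contain", "order"
# Hypothetical relation name: S3.4.2 in the text only pairs the two words.
ATTRIBUTE = "attribute"

ENTITY_TAGS = {"p", "m", "e", "f", "d", "s"}


def build_triples(tagged_words):
    """Apply rules S3.4.1 to S3.4.5 to each adjacent pair of (word, tag)."""
    # Remove non-category placeholder words (tag "x") first.
    words = [(w, t) for w, t in tagged_words if t != "x"]
    triples = []
    for (prev_w, prev_t), (next_w, next_t) in zip(words, words[1:]):
        if prev_t == "p" and next_t == "f":                       # S3.4.1
            triples.append((prev_w, CONTAIN, next_w))
        elif prev_t in ENTITY_TAGS and next_t == "a":             # S3.4.2
            triples.append((prev_w, ATTRIBUTE, next_w))
        elif prev_t in {"d", "s"} and next_t in {"m", "e", "f"}:  # S3.4.3
            triples.append((prev_w, CONTAIN, next_w))
        elif prev_t == "d" and next_t == "s":                     # S3.4.4
            triples.append((prev_w, CONTAIN, next_w))
        elif prev_t == next_t and prev_t in {"d", "s"}:           # S3.4.5
            triples.append((prev_w, ORDER, next_w))
        # Otherwise no triple is added and the next pair is processed.
    return triples


def triples_to_csv(triples):
    """Serialise the triples as csv text, one triple per row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(triples)
    return buffer.getvalue()


# Illustrative input only; real input would come from S3.3.
sample = [
    ("shaft", "p"), ("of", "x"), ("outer circle", "f"),
    ("rough turning", "d"), ("finish turning", "d"), ("CNC lathe", "m"),
]
triples = build_triples(sample)
```

Because only adjacent category words are compared, the rules capture a relation only when the two words follow each other once the "x" words have been removed.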
Example 3:
Example 3 is substantially the same as Example 2, except that:
The training process of the process sentence word segmentation module in S3.2 is as follows:
S3.2.1: performing preliminary word segmentation on the sentences in the plain-text file using the word segmentation package jieba;
S3.2.2: manually checking and correcting the word segmentation result obtained in S3.2.1 and labeling each character with "B", "I", "W" or "S", wherein "B" denotes the first Chinese character of a word, "I" a middle Chinese character of a word, "W" the last Chinese character of a word, and "S" a word consisting of a single Chinese character;
S3.2.3: storing the text with word segmentation labels obtained in S3.2.2 on a local hard disk as the process sentence word segmentation corpus;
S3.2.4: training the process sentence word segmentation module: the corpus obtained in S3.2.3 is imported into the module, the segmentation module program is read in and run iteratively, and training is complete when the program finishes running.
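As a non-limiting illustration of the labeling scheme in S3.2.2, the following Python sketch converts a list of already segmented words into per-character "B", "I", "W" and "S" labels and restores the words from those labels. The function names and the sample words are assumptions made for this sketch; in the method above, the preliminary segmentation comes from jieba and is corrected manually.

```python
def tag_word(word):
    """Return the B/I/W/S labels for the characters of one word."""
    if len(word) == 1:
        return ["S"]                       # single-character word
    # First character, any middle characters, last character.
    return ["B"] + ["I"] * (len(word) - 2) + ["W"]


def tag_sentence(words):
    """Label every character of a segmented sentence."""
    return [(ch, tag) for word in words for ch, tag in zip(word, tag_word(word))]


def decode(labeled_chars):
    """Rebuild the words from (character, label) pairs."""
    words, current = [], ""
    for ch, tag in labeled_chars:
        current += ch
        if tag in ("W", "S"):              # a word ends on "W" or "S"
            words.append(current)
            current = ""
    if current:                            # tolerate a truncated final word
        words.append(current)
    return words


# Illustrative segmented sentence (assumed example words).
segmented = ["数控车床", "粗车", "轴"]
labeled = tag_sentence(segmented)
```

A corpus in this form gives one label per character, which is the form of input a sequence labeling model such as Bi-LSTM-CRF is trained on.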
The training process of the process word classification module in S3.3 is as follows:
S3.3.1: the words in the text processed above are manually labeled with eight categories, "part", "device", "equipment", "feature", "relationship", "process", "step" and "attribute", denoted "p", "m", "e", "f", "r", "d", "s" and "a" respectively; a non-category word is denoted "x"; the seven categories "part", "device", "equipment", "feature", "relationship", "process" and "step" correspond to the six types of data in S2.2, and "attribute" corresponds to the class attributes in S2.3;
The processing object data corresponds to "part"; the machining equipment data (machine tools and machining centers) corresponds to "device"; the processing method data corresponds to "process" and "step", since machining a part comprises a plurality of procedures and each procedure comprises a plurality of steps; the process equipment data (tooling) corresponds to "equipment"; the processing feature data corresponds to "feature"; the semantic relation data corresponds to "relationship"; an "attribute" contains a name and a value;
S3.3.2: storing the text with classification labels obtained in S3.3.1 on a local hard disk as the process word classification corpus;
S3.3.3: training the process word classification module: the corpus obtained in S3.3.2 is imported into the module, the classification module program is read in and run iteratively, and training is complete when the program finishes running.
S4: Visual display. The mode layer and the data layer are combined to construct the knowledge graph, the extracted triples are input into a Neo4j graph database, and the knowledge graph is stored and visually displayed.
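As a non-limiting illustration of how extracted triples might be prepared for input into Neo4j, the following Python sketch turns each <entity, relation, entity> triple into a parameterised Cypher `MERGE` statement. The node label `Entity`, the property `name` and the function name are assumptions made for this sketch; the text above does not specify the graph schema or the import mechanism.

```python
import re

# Cypher relationship types cannot be passed as parameters, so the
# relation name is validated before being placed in the query text.
_REL_TYPE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def triple_to_cypher(head, relation, tail):
    """Return (query, parameters) that upserts one triple into the graph."""
    if not _REL_TYPE.match(relation):
        raise ValueError(f"unsafe relation name: {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "      # assumed label and property
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{relation.upper()}]->(t)"
    )
    return query, {"head": head, "tail": tail}


query, params = triple_to_cypher("rough turning", "order", "finish turning")
```

Using `MERGE` rather than `CREATE` keeps an entity that appears in several triples as a single node. The resulting pairs could then be executed with a Neo4j client, for example through a session's `run(query, params)` call in the official Python driver.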

Claims (8)

1. A knowledge graph construction method of process data in the field of numerical control machining is characterized by comprising the following steps:
the method for constructing the knowledge graph of the process data comprises the following steps:
S1: establishing an original data database: searching for data information related to the numerical control machining field and establishing the original data database;
S2: constructing a knowledge graph mode layer: manually extracting process knowledge from the original data database, establishing an ontology model in combination with expert knowledge in the field of numerical control machining processes, and constructing the knowledge graph mode layer;
S3: constructing a knowledge graph data layer by triple extraction: extracting the entities, attributes and relations in unstructured data using a data extraction model, and constructing the knowledge graph data layer; because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires that this information be extracted and the unstructured data converted into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module;
S4: visual display: combining the mode layer and the data layer to construct the knowledge graph, and storing and visually displaying the knowledge graph.
2. The method for constructing the knowledge graph of the process data in the numerical control machining field according to claim 1, characterized in that:
the S1: the data information related to the numerical control machining field from which the original data database is established comprises: part drawing data, specification manuals, machining process cards, data on numerical control machine tools and machining centers, data on fixtures, measuring tools and positioning equipment, process documents, procedure cards, professional books and process manuals; the retrieved data information related to the numerical control machining field is stored on a local hard disk as the original data database.
3. The method for constructing the knowledge graph of the process data in the numerical control machining field according to claim 2, characterized in that:
the S2: constructing a knowledge graph mode layer, namely manually extracting process knowledge from the original data database, establishing an ontology model in combination with expert knowledge in the field of numerical control machining processes, and constructing the knowledge graph mode layer;
S2.1: analyzing the extracted process data and summarizing it in combination with expert knowledge; the knowledge graph mode layer is constructed as an ontology model, and the ontology model is described in the ontology language OWL; process knowledge is extracted manually from the unstructured data in the original data, and the ontology model is established in combination with expert knowledge to construct the knowledge graph mode layer; the ontology model is the model of the mode layer, and the ontology language is the language that describes the mode layer;
S2.2: the machining object of numerical control machining equipment is a part; by analyzing the part information and the machining and manufacturing information contained in the CAD model and the numerical control machining process documents of the part, the process data can be summarized into six types: processing object data, machining equipment data, processing method data, process equipment data, processing feature data and semantic relation data;
the processing object data is the basis of the whole process data; it describes text information such as the number and name of the machining object, as well as its material, dimensions and the requirements of the mechanical model; the machining equipment data mainly refers to the numerical control machine tool or machining center required to machine the part, such as a numerical control milling machine or a numerical control lathe; the processing method data refers to the methods and operations for machining the part, that is, the steps that take the material from blank to part, specifically the working procedures and working steps; the process equipment data refers to tooling, including the positioning and clamping devices used for positioning, and the tools and measuring tools used on the machine tool as aids during machining; the processing feature data refers to the specific machining objects and requirements in the procedures performed by the machine tool, such as surfaces, holes and threads, the machining requirements including precision requirements and material processing requirements; the semantic relation data refers to the relations among the preceding data and represents the relations between the different types of data;
S2.3: in order to maintain the accuracy and completeness of the mode layer, the above six types of data are abstracted into six classes; the processing object data, machining equipment data, processing method data, process equipment data, processing feature data and semantic relation data correspond in sequence to the part class, device class, process class, equipment class, feature class and relation class; the process class has a procedure subclass and a step subclass; the equipment class has a cutter subclass, a fixture subclass, a measuring tool subclass and an auxiliary tool subclass; the feature class has a shape feature subclass, an accuracy feature subclass, a technical feature subclass, a management feature subclass and a material feature subclass, the accuracy feature subclass in turn having a dimensional accuracy subclass, a shape accuracy subclass, a position accuracy subclass and a surface roughness subclass; and the relation class has a contain subclass and an order subclass;
the attributes of a class comprise class attributes and instance attributes; class attributes are shared by a class and its subclasses, and all instances share the corresponding class attributes, whereas instance attributes belong only to individual instances; the part class, device class, process class, equipment class, feature class and relation class each have a name class attribute, and other instance attributes are added according to the specific entity;
S2.4: the mode layer of the knowledge graph describes entity classes and the relations among them; the mode layer is established from the entity classes abstracted in S2.3 and is described by a formal expression; an ontology model is built on the abstracted entity classes and relations and is created and displayed with the Protégé software; the knowledge of the mode layer is expressed by the ontology model, which can be described in the OWL language;
the ontology model formalized expression of the mode layer of the knowledge graph is as follows:
KGPattern={Entity∪Relation}
wherein:
Entity={P∪M∪O∪E∪F}
Relation={R}
R={(P_i,contain,F_j)∪(O_i,contain,M_j)∪(O_i,contain,O_j)∪(O_i,contain,E_j)∪(O_i,contain,F_j)∪(O_i,order,O_j)}
in the above formulas: KGPattern is the formal expression model of the knowledge graph mode layer, Entity is the set of entities described by the mode layer, and Relation is the set of relations described by the mode layer; P denotes the part class, M the device (machining equipment) class, O the process class, E the equipment (process equipment) class, F the feature class and R the relation class; the semantic relation R comprises contain and order; the contain relation is defined as the inclusion relation between the part class and the feature class, between the process class and the device, equipment and feature classes, and between process classes, and is expressed as follows: (P_i,contain,F_j) indicates that part P_i includes feature F_j; (O_i,contain,M_j) indicates that process O_i includes device M_j; (O_i,contain,O_j) indicates that process O_i includes process O_j; (O_i,contain,E_j) indicates that process O_i includes equipment E_j; (O_i,contain,F_j) indicates that process O_i includes feature F_j; the order relation is defined as the sequence relation between procedures or steps within the process class, and is expressed as follows: (O_i,order,O_j) indicates that procedure (step) O_i is carried out before procedure (step) O_j; as shown in Table 1;
TABLE 1 Semantic relationship representation and description
[Table 1 is reproduced as an image in the original publication.]
At this point, construction of the knowledge graph mode layer is complete, and the method proceeds to S3.
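As a non-limiting illustration, the relation set R of the mode layer can be written as a set of permitted (head class, relation, tail class) patterns against which candidate triples are checked. The Python names below are assumptions made for this sketch; the class letters follow the definitions of P, M, O, E and F given above.

```python
# Entity classes of the mode layer: part, device, process, equipment, feature.
ENTITY_CLASSES = {"P", "M", "O", "E", "F"}

# Relation patterns taken from the formal expression of R.
RELATION_PATTERNS = {
    ("P", "contain", "F"),
    ("O", "contain", "M"),
    ("O", "contain", "O"),
    ("O", "contain", "E"),
    ("O", "contain", "F"),
    ("O", "order", "O"),
}


def conforms(head_class, relation, tail_class):
    """Return True if the class-level triple is permitted by the mode layer."""
    return (head_class, relation, tail_class) in RELATION_PATTERNS


# Every pattern only mentions classes that belong to Entity.
assert all(h in ENTITY_CLASSES and t in ENTITY_CLASSES
           for h, _, t in RELATION_PATTERNS)
```

A check of this kind is one way to confirm that triples produced for the data layer stay within the relations that the mode layer defines.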
4. The method for constructing the knowledge graph of the process data in the numerical control machining field according to claim 3, characterized in that:
the S3: constructing a knowledge graph data layer by triple extraction, namely extracting the entities, attributes and relations in unstructured data using a data extraction model, and constructing the knowledge graph data layer;
because the data in the original data database are all unstructured process data containing a large amount of information, automatic construction of the knowledge graph requires that this information be extracted and the unstructured data converted into semantic relation data; a data extraction model is therefore established to automatically extract the entities, attributes and relations in the unstructured data and construct the knowledge graph data layer; the data extraction model comprises a process sentence word segmentation module, a process word classification module and a semantic relation construction module; the specific operation flow is as follows:
S3.1: converting the unstructured data to be processed into plain text: the data is read in by a program and stored as a file in txt format;
S3.2: word segmentation using the process sentence word segmentation module, which is implemented based on the Bi-LSTM-CRF algorithm: the plain-text data obtained in S3.1 is input into the trained process sentence word segmentation module, which outputs a text file with the word segmentation labels "B", "I", "W" and "S"; text word segmentation is then complete;
S3.3: classifying the words in the segmented text: the process word classification module is implemented based on the word2vec algorithm, which forms word vectors; each word in the text file with word segmentation labels obtained in S3.2 is taken as a word to be classified, the cosine distance between it and the words in the word classification corpus is calculated, and the category of the sample at the shortest distance is assigned to the word to be classified; the module finally outputs a text file with word category labels, and word classification is then complete;
S3.4: semantic relation construction: words in the text file with word category labels obtained in S3.3 are identified according to the entity types defined by the mode layer, and the relations between words are identified according to the inter-entity relations defined by the mode layer and stored as triples; relations between entities and attributes are identified from the word attribute labels and likewise stored as triples; the output triples take the forms <entity, relation, entity> and <entity, attribute, attribute value>.
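As a non-limiting illustration of the classification rule in S3.3, the following Python sketch assigns to a word the category of the corpus sample at the shortest cosine distance. The three-dimensional vectors and the function names are assumptions made for this sketch; in the method above the vectors are produced by word2vec.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(u, v), between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def classify(vector, corpus):
    """Return the label of the corpus sample nearest to `vector`."""
    _, label = min(corpus, key=lambda sample: cosine_distance(vector, sample[0]))
    return label


# Toy labeled corpus of (word vector, category label); values are made up.
corpus = [
    ([1.0, 0.0, 0.1], "m"),   # a device word
    ([0.0, 1.0, 0.1], "f"),   # a feature word
    ([0.1, 0.1, 1.0], "d"),   # a process word
]
label = classify([0.9, 0.1, 0.0], corpus)
```

The shortest cosine distance corresponds to the highest cosine similarity, so the rule amounts to a one-nearest-neighbour classifier over the labeled corpus.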
5. The method for constructing the knowledge graph of the process data in the numerical control machining field according to claim 4, characterized in that:
the S3.4: semantic relation construction, namely removing non-category placeholder words from the text with word category labels obtained in S3.3 to obtain a category-word text set, and processing the words in the category-word text set in sequence, emitting triples, until all words have been processed; the processing is as follows:
S3.4.1: if the former word label is "p" and the next word label is "f", adding the triple <the former word, contain, the latter word> and then processing the next word; if not, proceeding to S3.4.2;
S3.4.2: if the former word label is any one of "p", "m", "e", "f", "d" and "s" and the next word label is "a", adding the triple <the former word, the latter word> and then processing the next word; if not, proceeding to S3.4.3;
S3.4.3: if the former word label is "d" or "s" and the next word label is "m", "e" or "f", adding the triple <the former word, contain, the latter word> and then processing the next word; if not, proceeding to S3.4.4;
S3.4.4: if the former word label is "d" and the next word label is "s", adding the triple <the former word, contain, the latter word> and then processing the next word; if not, proceeding to S3.4.5;
S3.4.5: if the former word label and the next word label are both "d", or are both "s", adding the triple <the former word, order, the latter word> and then processing the next word; if not, proceeding directly to the next word;
after all the words have been processed, the triples are stored in a csv file; the semantic relation construction is then complete, and the method proceeds to S4.
6. The method for constructing the knowledge graph of the process data in the numerical control machining field according to claim 4, characterized in that:
the training process of the process sentence word segmentation module in S3.2 is as follows:
S3.2.1: performing preliminary word segmentation on the sentences in the plain-text file using the word segmentation package jieba;
S3.2.2: manually checking and correcting the word segmentation result obtained in S3.2.1 and labeling each character with "B", "I", "W" or "S", wherein "B" denotes the first Chinese character of a word, "I" a middle Chinese character of a word, "W" the last Chinese character of a word, and "S" a word consisting of a single Chinese character;
S3.2.3: storing the text with word segmentation labels obtained in S3.2.2 on a local hard disk as the process sentence word segmentation corpus;
S3.2.4: training the process sentence word segmentation module: the corpus obtained in S3.2.3 is imported into the module, the segmentation module program is read in and run iteratively, and training is complete when the program finishes running.
7. The method for constructing the knowledge graph of the process data in the numerical control machining field according to claim 4, characterized in that:
the training process of the process word classification module in S3.3 is as follows:
S3.3.1: the words in the text processed above are manually labeled with eight categories, "part", "device", "equipment", "feature", "relationship", "process", "step" and "attribute", denoted "p", "m", "e", "f", "r", "d", "s" and "a" respectively; a non-category word is denoted "x"; the seven categories "part", "device", "equipment", "feature", "relationship", "process" and "step" correspond to the six types of data in S2.2, and "attribute" corresponds to the class attributes in S2.3;
the processing object data corresponds to "part"; the machining equipment data (machine tools and machining centers) corresponds to "device"; the processing method data corresponds to "process" and "step", since machining a part comprises a plurality of procedures and each procedure comprises a plurality of steps; the process equipment data (tooling) corresponds to "equipment"; the processing feature data corresponds to "feature"; the semantic relation data corresponds to "relationship"; an "attribute" contains a name and a value;
S3.3.2: storing the text with classification labels obtained in S3.3.1 on a local hard disk as the process word classification corpus;
S3.3.3: training the process word classification module: the corpus obtained in S3.3.2 is imported into the module, the classification module program is read in and run iteratively, and training is complete when the program finishes running.
8. The method for constructing the knowledge graph of the process data in the field of numerical control machining according to any one of claims 4 to 7, wherein the method comprises the following steps:
S4: visual display, namely combining the mode layer and the data layer to construct the knowledge graph, inputting the extracted triples into a Neo4j graph database, and storing and visually displaying the knowledge graph.
CN202111361153.5A 2021-11-17 2021-11-17 Knowledge graph construction method for process data in numerical control machining field Pending CN113987212A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111361153.5A CN113987212A (en) 2021-11-17 2021-11-17 Knowledge graph construction method for process data in numerical control machining field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111361153.5A CN113987212A (en) 2021-11-17 2021-11-17 Knowledge graph construction method for process data in numerical control machining field

Publications (1)

Publication Number Publication Date
CN113987212A true CN113987212A (en) 2022-01-28

Family

ID=79749031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111361153.5A Pending CN113987212A (en) 2021-11-17 2021-11-17 Knowledge graph construction method for process data in numerical control machining field

Country Status (1)

Country Link
CN (1) CN113987212A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113962098A (en) * 2021-10-26 2022-01-21 重庆忽米网络科技有限公司 Model construction method based on business process industrial mechanism
CN114186759A (en) * 2022-02-16 2022-03-15 杭州杰牌传动科技有限公司 Material scheduling control method and system based on reducer knowledge graph
CN114722158A (en) * 2022-06-01 2022-07-08 中科航迈数控软件(深圳)有限公司 Method and system for matching numerical control machine tool manufacturing process based on subject word clustering
CN115168606A (en) * 2022-07-01 2022-10-11 北京理工大学 Mapping template knowledge extraction method for semi-structured process data
CN115309912A (en) * 2022-08-08 2022-11-08 重庆大学 Knowledge graph construction method, intelligent reasoning method and rapid design method of integrated electric drive structure
CN115455192A (en) * 2022-08-16 2022-12-09 广州极点三维信息科技有限公司 Data processing method and system based on customized cabinet process knowledge map
CN115640758A (en) * 2022-12-23 2023-01-24 南京维拓科技股份有限公司 Three-dimensional model digital quality inspection method based on knowledge construction
CN115981615A (en) * 2023-03-20 2023-04-18 中科航迈数控软件(深圳)有限公司 G code generation method fusing language model and knowledge graph and related equipment
CN116028571A (en) * 2023-03-31 2023-04-28 南京航空航天大学 Knowledge graph construction method and system based on thin-wall part
CN116258438A (en) * 2022-09-09 2023-06-13 武汉理工大学 Workshop logistics knowledge graph construction and logistics equipment path planning method
CN116431818A (en) * 2022-11-15 2023-07-14 电子科技大学 Automatic knowledge graph construction method for hot working process design
CN117236446A (en) * 2023-09-26 2023-12-15 中国科学院沈阳自动化研究所 Method and system for reasoning 3D model structure by using rational atlas
CN117236432A (en) * 2023-09-26 2023-12-15 中国科学院沈阳自动化研究所 Multi-mode data-oriented manufacturing process knowledge graph construction method and system
CN117235929A (en) * 2023-09-26 2023-12-15 中国科学院沈阳自动化研究所 Three-dimensional CAD (computer aided design) generation type design method based on knowledge graph and machine learning
CN117252201A (en) * 2023-11-17 2023-12-19 山东山大华天软件有限公司 Knowledge-graph-oriented discrete manufacturing industry process data extraction method and system
CN117656082A (en) * 2024-01-29 2024-03-08 青岛创新奇智科技集团股份有限公司 Industrial robot control method and device based on multi-mode large model
CN117972813A (en) * 2024-03-28 2024-05-03 山东山大华天软件有限公司 Intelligent process method, system, equipment and medium for machining parts

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110032648A (en) * 2019-03-19 2019-07-19 微医云(杭州)控股有限公司 A kind of case history structuring analytic method based on medical domain entity
CN110362660A (en) * 2019-07-23 2019-10-22 重庆邮电大学 A kind of Quality of electronic products automatic testing method of knowledge based map
CN111444351A (en) * 2020-03-24 2020-07-24 清华苏州环境创新研究院 Method and device for constructing knowledge graph in industrial process field
CN111898371A (en) * 2020-07-10 2020-11-06 中国标准化研究院 Ontology construction method and device for rational design knowledge and computer storage medium

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113962098A (en) * 2021-10-26 2022-01-21 重庆忽米网络科技有限公司 Model construction method based on business process industrial mechanism
CN114186759A (en) * 2022-02-16 2022-03-15 杭州杰牌传动科技有限公司 Material scheduling control method and system based on reducer knowledge graph
CN114722158A (en) * 2022-06-01 2022-07-08 中科航迈数控软件(深圳)有限公司 Method and system for matching numerical control machine tool manufacturing process based on subject word clustering
CN115168606A (en) * 2022-07-01 2022-10-11 北京理工大学 Mapping template knowledge extraction method for semi-structured process data
CN115168606B (en) * 2022-07-01 2024-05-24 北京理工大学 Mapping template knowledge extraction method for semi-structured process data
CN115309912A (en) * 2022-08-08 2022-11-08 重庆大学 Knowledge graph construction method, intelligent reasoning method and rapid design method of integrated electric drive structure
CN115455192A (en) * 2022-08-16 2022-12-09 广州极点三维信息科技有限公司 Data processing method and system based on customized cabinet process knowledge map
CN115455192B (en) * 2022-08-16 2023-06-16 广州极点三维信息科技有限公司 Data processing method and system based on customized cabinet process knowledge graph
CN116258438B (en) * 2022-09-09 2024-08-20 武汉理工大学 Workshop logistics knowledge graph construction and logistics equipment path planning method
CN116258438A (en) * 2022-09-09 2023-06-13 武汉理工大学 Workshop logistics knowledge graph construction and logistics equipment path planning method
CN116431818A (en) * 2022-11-15 2023-07-14 电子科技大学 Automatic knowledge graph construction method for hot working process design
CN116431818B (en) * 2022-11-15 2023-12-05 电子科技大学 Automatic knowledge graph construction method for hot working process design
CN115640758A (en) * 2022-12-23 2023-01-24 南京维拓科技股份有限公司 Three-dimensional model digital quality inspection method based on knowledge construction
CN115981615B (en) * 2023-03-20 2023-06-30 中科航迈数控软件(深圳)有限公司 G code generation method integrating language model and knowledge graph and related equipment
CN115981615A (en) * 2023-03-20 2023-04-18 中科航迈数控软件(深圳)有限公司 G code generation method fusing language model and knowledge graph and related equipment
CN116028571A (en) * 2023-03-31 2023-04-28 南京航空航天大学 Knowledge graph construction method and system based on thin-wall part
CN117235929A (en) * 2023-09-26 2023-12-15 中国科学院沈阳自动化研究所 Three-dimensional CAD (computer aided design) generation type design method based on knowledge graph and machine learning
CN117236432A (en) * 2023-09-26 2023-12-15 中国科学院沈阳自动化研究所 Multi-mode data-oriented manufacturing process knowledge graph construction method and system
CN117235929B (en) * 2023-09-26 2024-06-04 中国科学院沈阳自动化研究所 Three-dimensional CAD (computer aided design) generation type design method based on knowledge graph and machine learning
CN117236446B (en) * 2023-09-26 2024-06-07 中国科学院沈阳自动化研究所 Method and system for reasoning 3D model structure by utilizing logic atlas
CN117236432B (en) * 2023-09-26 2024-07-02 中国科学院沈阳自动化研究所 Multi-mode data-oriented manufacturing process knowledge graph construction method and system
CN117236446A (en) * 2023-09-26 2023-12-15 中国科学院沈阳自动化研究所 Method and system for reasoning 3D model structure by using rational atlas
CN117252201A (en) * 2023-11-17 2023-12-19 山东山大华天软件有限公司 Knowledge-graph-oriented discrete manufacturing industry process data extraction method and system
CN117252201B (en) * 2023-11-17 2024-02-27 山东山大华天软件有限公司 Knowledge-graph-oriented discrete manufacturing industry process data extraction method and system
CN117656082A (en) * 2024-01-29 2024-03-08 青岛创新奇智科技集团股份有限公司 Industrial robot control method and device based on multi-mode large model
CN117656082B (en) * 2024-01-29 2024-05-14 青岛创新奇智科技集团股份有限公司 Industrial robot control method and device based on multi-mode large model
CN117972813A (en) * 2024-03-28 2024-05-03 山东山大华天软件有限公司 Intelligent process method, system, equipment and medium for machining parts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination