CN117236435B - Knowledge fusion method, device and storage medium of design rationality knowledge network - Google Patents

Info

Publication number
CN117236435B
Authority
CN
China
Prior art keywords
knowledge
information
nodes
sentence
same
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311475638.6A
Other languages
Chinese (zh)
Other versions
CN117236435A (en)
Inventor
岳高峰
李文武
王淑敏
温娜
高亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China National Institute of Standardization
Original Assignee
China National Institute of Standardization
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China National Institute of Standardization
Priority to CN202311475638.6A
Publication of CN117236435A
Application granted
Publication of CN117236435B
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a knowledge fusion method, device and storage medium for a design rationality knowledge network, relates to the technical field of text mining, and addresses the problem of knowledge fusion across a plurality of technical documents. The method comprises: extracting knowledge nodes from a plurality of technical documents in the design rationality knowledge network; for the design rationality knowledge extracted from a single technical document, merging identical knowledge nodes according to a knowledge fusion rule based on the document structure level; and traversing the knowledge nodes of each technical document, performing semantic similarity calculation based on a deep learning method on knowledge nodes of the same category in the plurality of technical documents, determining the similarity relation between knowledge nodes of the same category, and merging knowledge nodes that are the same or similar. Based on the semantic similarity analysis result, redundancy among knowledge nodes from different sources can be eliminated, thereby achieving knowledge fusion among a plurality of technical documents.

Description

Knowledge fusion method, device and storage medium of design rationality knowledge network
Technical Field
The present disclosure relates to the field of text mining technologies, and in particular to a knowledge fusion method, device and storage medium for a design rationality knowledge network.
Background
Knowledge fusion can be used to solve the problem of acquiring knowledge from different sources. To handle the multidimensional, heterogeneous and time-series nature of collected data, knowledge fusion can disambiguate, integrate and infer knowledge, yielding high-quality knowledge once redundant and erroneous information has been eliminated. Knowledge fusion, which includes entity alignment, that is, the matching of entities, relationships and attributes, is one of the basic techniques for constructing a knowledge network. The techniques and methods of knowledge fusion differ across application scenarios. Open-network knowledge fusion mainly addresses the association and merging of knowledge acquired from the Internet with knowledge already in a knowledge base, whereas knowledge fusion of multiple knowledge bases mainly addresses merging several knowledge bases into one.
Technical documents often express the same entities and relationships repeatedly. Such repetition makes the text cumbersome and redundant and can introduce ambiguity in automated computer interpretation, leading to errors. For knowledge fusion across a plurality of documents, the difficulty and the amount of computation grow geometrically, and the accuracy of the fusion is hard to guarantee.
Disclosure of Invention
To address the above technical deficiencies, embodiments of the application provide a knowledge fusion method, device and storage medium for a design rationality knowledge network.
An embodiment of the application provides a knowledge fusion method for a design rationality knowledge network, comprising the following steps:
acquiring knowledge nodes of a plurality of target technical documents;
merging identical knowledge nodes within each target technical document according to a knowledge fusion rule based on the document structure level;
traversing the merged knowledge nodes of each target technical document, performing pairwise semantic similarity calculation based on a deep learning method on knowledge nodes of the same category in the plurality of target technical documents, determining the similarity relation between knowledge nodes of the same category, and fusing knowledge nodes that are the same or similar, wherein the similarity relation between knowledge nodes of the same category is one of same, similar or irrelevant;
and establishing a design rationality knowledge network according to the association relations among the knowledge nodes of the fused plurality of target technical documents.
Optionally, the knowledge nodes include at least one of: technical document information, artifact information, design problem information, design intent information, design argument information, advantage and disadvantage information, standpoint information, design scheme information, and alternative scheme information.
Optionally, merging identical knowledge nodes of each target technical document according to the document-structure-based knowledge fusion rule of the deep residual shrinkage network comprises:
for artifact information or advantage and disadvantage information, performing lexical semantic comparison or thesaurus comparison on knowledge nodes in the same document structure or in different document structures of the target technical document, and merging knowledge nodes with the same or similar meaning;
for design problem information or standpoint information, analyzing the positive and negative descriptions in the same document structure or in different document structures of the target technical document, and merging knowledge nodes with the same or similar meaning;
and for design intent information, design argument information, design scheme information or alternative scheme information, merging knowledge nodes in the same document structure or in different document structures of the target technical document according to preset rules of the deep residual shrinkage network.
Optionally, performing the pairwise semantic similarity calculation based on the deep learning method on knowledge nodes of the same category in the plurality of target technical documents comprises:
for a first knowledge node A1 and a second knowledge node A2 that belong to different target technical documents and to the same category, extracting text from the first knowledge node A1 and the second knowledge node A2 to obtain a first text T1 and a second text T2; processing the first text T1 and the second text T2 separately with a pre-trained language representation model, and pooling each processed output to obtain a first sentence embedding vector V_t1 and a second sentence embedding vector V_t2 of a preset length; calculating the cosine similarity of V_t1 and V_t2; and comparing the cosine similarity with a preset threshold to determine the similarity relation between the first knowledge node A1 and the second knowledge node A2.
Optionally, calculating the cosine similarity of the first sentence embedding vector V_t1 and the second sentence embedding vector V_t2 comprises:
calculating the cosine similarity of V_t1 and V_t2 based on vector embedding features by the following expression:
$$\cos\theta = \frac{\sum_{i=1}^{n} V_{t1i}\, V_{t2i}}{\lVert V_{t1} \rVert \, \lVert V_{t2} \rVert}$$
where θ is the angle between V_t1 and V_t2, V_t1i and V_t2i are the i-th elements of V_t1 and V_t2 respectively, n is the number of elements in a sentence embedding vector, and ||V_t1|| and ||V_t2|| are the norms (moduli) of V_t1 and V_t2 respectively.
Optionally, the pre-trained language representation model is trained as follows:
obtaining a preset number of technical documents; performing pairwise comparative analysis on knowledge nodes of the same category in different technical documents; converting the same-category knowledge node pairs into embedding vector pairs using a Sentence Transformer neural network algorithm based on the pre-trained language representation model; and determining the similarity coefficients of the pre-trained language representation model by cosine similarity calculation of text semantics.
Optionally, the similarity relation between the first knowledge node A1 and the second knowledge node A2 of the same category is determined by the value range into which their cosine similarity falls; in the embodiment, a similarity coefficient greater than 0.8 is treated as the same, and a coefficient greater than 0.6 and less than 0.8 as similar.
In a second aspect, the present invention further provides a knowledge fusion device for a design rationality knowledge network, comprising:
an extraction module, configured to acquire knowledge nodes of a plurality of target technical documents;
an internal fusion module, configured to merge identical knowledge nodes within each target technical document according to a knowledge fusion rule based on the document structure level;
an external fusion module, configured to traverse the merged knowledge nodes of each target technical document, perform pairwise semantic similarity calculation based on a deep learning method on knowledge nodes of the same category in the plurality of target technical documents, determine the similarity relation between knowledge nodes of the same category, and fuse knowledge nodes that are the same or similar, wherein the similarity relation between knowledge nodes of the same category is one of same, similar or irrelevant;
and a construction module, configured to establish a design rationality knowledge network according to the association relations among the knowledge nodes of the fused plurality of target technical documents.
In a third aspect, the present invention also provides a computing device comprising:
at least one processor and a memory storing program instructions;
the program instructions, when read and executed by the processor, cause the computing device to perform the knowledge fusion method for a design rationality knowledge network described above.
In a fourth aspect, the present invention also provides a readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the knowledge fusion method for a design rationality knowledge network described above.
The knowledge fusion method, device and storage medium for a design rationality knowledge network provided by the embodiments of the application first merge knowledge nodes that come from the same technical document, which eliminates redundancy, removes ambiguity that may arise from different modes of expression, and avoids errors caused by word ambiguity. Second, based on the semantic similarity analysis results, knowledge nodes from different sources that stand in an 'equal' or 'similar' relation are fused, which eliminates redundancy across sources, achieves knowledge fusion among a plurality of technical documents, and allows a design rationality knowledge network to be established from the knowledge nodes of the fused technical documents.
Drawings
FIG. 1 is a flowchart of a knowledge fusion method for a design rationality knowledge network according to an embodiment of the present application;
FIG. 2 is a schematic diagram of calculating the similarity of text vectors using the cosine of the included angle according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a semantic similarity calculation process according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a knowledge fusion device for a design rationality knowledge network according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions and advantages of the embodiments of the present application more apparent, the following detailed description of exemplary embodiments of the present application is given with reference to the accompanying drawings, and it is apparent that the described embodiments are only some of the embodiments of the present application and not exhaustive of all the embodiments. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other.
As shown in FIG. 1, an embodiment of the present invention provides a knowledge fusion method for a design rationality knowledge network, including steps S101 to S104:
S101, acquiring knowledge nodes of a plurality of target technical documents;
S102, merging identical knowledge nodes within each target technical document according to a knowledge fusion rule based on the document structure level;
S103, traversing the merged knowledge nodes of each target technical document, performing pairwise semantic similarity calculation based on a deep learning method on knowledge nodes of the same category in the plurality of target technical documents, determining the similarity relation between knowledge nodes of the same category, and fusing knowledge nodes that are the same or similar, wherein the similarity relation between knowledge nodes of the same category is one of same, similar or irrelevant;
S104, establishing a design rationality knowledge network according to the association relations among the knowledge nodes of the fused plurality of target technical documents.
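The four steps can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the names `fuse_documents`, `merge_within` and `relation_between`, and the `(category, text)` representation of a knowledge node, are hypothetical. The intra-document merge rule and the similarity classifier are passed in as parameters because the text defines them separately.

```python
from itertools import combinations


def fuse_documents(docs, merge_within, relation_between):
    """Sketch of S101-S104.

    docs: {doc_id: [(category, text), ...]}, the knowledge nodes per document (S101).
    merge_within: callable that merges duplicate nodes inside one document (S102).
    relation_between: callable returning "same", "similar" or "irrelevant" (S103).
    """
    # S102: merge identical nodes inside each document.
    merged = {doc_id: merge_within(nodes) for doc_id, nodes in docs.items()}

    # S103: compare same-category nodes from different documents, pair by pair.
    edges = []
    for (doc_a, nodes_a), (doc_b, nodes_b) in combinations(merged.items(), 2):
        for cat_a, text_a in nodes_a:
            for cat_b, text_b in nodes_b:
                if cat_a != cat_b:
                    continue  # only nodes of the same category are compared
                relation = relation_between(text_a, text_b)
                if relation in ("same", "similar"):
                    edges.append(((doc_a, text_a), (doc_b, text_b), relation))

    # S104: the network consists of the merged nodes plus the cross-document relations.
    return merged, edges
```

In this sketch the network is returned as plain Python structures; a graph database or graph library could hold the same nodes and edges.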
In an embodiment of the present invention, the knowledge nodes include at least one of: technical document information, artifact information, design problem information, design intent information, design argument information, advantage and disadvantage information, standpoint information, design scheme information, and alternative scheme information, wherein the technical document information includes name information, number information, creator organization information, and the like.
In the embodiment of the invention, knowledge nodes from the same technical document are merged first. The same technical document often expresses the same entities and relationships repeatedly, using different wording, sentence patterns, grammar or synonyms to convey the same meaning. This diversity of natural language expression does not introduce contradictory or conflicting knowledge at the semantic level, nor does it confuse designers conceptually; it only makes the text cumbersome and redundant. For automated computer interpretation, however, it can introduce ambiguity and thus cause errors.
To solve this problem, in step S102 of the embodiment of the present invention, merging identical knowledge nodes according to the knowledge fusion rule based on the document structure level includes:
for artifact information or advantage and disadvantage information, performing lexical semantic comparison or thesaurus comparison on knowledge nodes in the same document structure or in different document structures of the target technical document, and merging knowledge nodes with the same or similar meaning;
for design problem information or standpoint information, analyzing the positive and negative descriptions in the same document structure or in different document structures of the target technical document, and merging knowledge nodes with the same or similar meaning;
and for design intent information, design argument information, design scheme information or alternative scheme information, merging knowledge nodes in the same document structure or in different document structures of the target technical document according to preset rules of the deep residual shrinkage network.
In the embodiment of the invention, identical entities and relations, such as design intent information, design argument information, design scheme information or alternative scheme information, are merged according to the document-structure-based knowledge fusion rule of the deep residual shrinkage network (DRSN), which eliminates knowledge redundancy and ambiguity. For example, in technical documents such as patent documents, the same semantic content often appears several times in one document, so knowledge nodes extracted from the same technical document are also redundant. Entities that may be redundant in patent documents include problems (sentences), intents (phrases, clauses or sentences), schemes (sentences), advantages and disadvantages (words), artifacts (words), and the like. In the embodiment of the invention, an 'equal' relation is established between repeated entities, thereby merging the knowledge nodes. For knowledge nodes such as design intent information, design argument information, design scheme information or alternative scheme information in the same technical document, whether two knowledge nodes are redundant is judged according to the knowledge fusion rule of the DRSN.
For the fusion of design intent information, design argument information, design scheme information or alternative scheme information, according to the predefined rules of the DRSN, if sentence T1 and sentence T2 in the same technical document contain the same element and the same design intent, sentence T1 is equivalent to sentence T2, and an 'equal' relation is established between the two sentences.
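The equality rule for two sentences can be illustrated with a short sketch. The text does not say how elements and intents are represented, so everything here is an assumption made for illustration: the `Sentence` type, the reading of "the same element" as "at least one shared element", and intents compared as normalized labels.

```python
from typing import NamedTuple


class Sentence(NamedTuple):
    text: str
    elements: frozenset  # artifact elements mentioned in the sentence (assumed representation)
    intent: str          # normalized design intent label (assumed representation)


def is_equal(s1: Sentence, s2: Sentence) -> bool:
    """Assumed reading of the rule: a shared element AND an identical design intent."""
    return bool(s1.elements & s2.elements) and s1.intent == s2.intent


def group_equal(sentences):
    """Place each sentence in the first group whose representative it equals."""
    groups = []
    for sentence in sentences:
        for group in groups:
            if is_equal(group[0], sentence):
                group.append(sentence)
                break
        else:
            groups.append([sentence])  # no matching group: start a new one
    return groups
```

Each resulting group corresponds to one merged knowledge node; sentences inside a group would be linked by the 'equal' relation.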
For the fusion of design problem information and standpoint information, a given section or paragraph of a technical document typically describes only one core problem. Once the corresponding sentence has been extracted, a negative description can be regarded as a detailed elaboration of that sentence with respect to the design problem and the standpoint. Positive or negative statements that appear in the same document structure or in different document structures may therefore be considered 'equal'.
For the fusion of artifact information (including artifact entities and artifact elements) and advantage and disadvantage information, the artifact entities and artifact elements within the same document structure and across different document structures of the same technical document are merged, with knowledge fusion achieved by lexical semantic comparison or by means of a thesaurus.
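Thesaurus-based merging of artifact terms can be sketched as follows. This is a hypothetical illustration: the function name `merge_by_thesaurus` and the thesaurus format (a mapping from a variant term to its preferred term) are assumptions, and the sample entries are invented.

```python
def merge_by_thesaurus(terms, thesaurus):
    """Group terms that share the same preferred thesaurus term.

    thesaurus: {variant term (lowercase): preferred term}, hypothetical data.
    Terms not listed in the thesaurus are their own preferred term.
    """
    groups = {}
    for term in terms:
        key = thesaurus.get(term.lower(), term.lower())
        groups.setdefault(key, []).append(term)
    return groups
```

For example, with a thesaurus that maps "gear box" and "transmission" to "gearbox", the three surface forms collapse into one artifact node while unrelated terms stay separate.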
By fusing the knowledge nodes within the same technical document, the embodiment of the invention reduces redundant information and makes the knowledge easier to understand and use.
In the embodiment of the present invention, in step S103, performing pairwise semantic similarity calculation based on a deep learning method on knowledge nodes of the same category in the plurality of target technical documents includes:
for a first knowledge node A1 and a second knowledge node A2 that belong to different target technical documents and to the same category, extracting text from the first knowledge node A1 and the second knowledge node A2 to obtain a first text T1 and a second text T2; processing the first text T1 and the second text T2 separately with a pre-trained language representation model, and pooling each processed output to obtain a first sentence embedding vector V_t1 and a second sentence embedding vector V_t2 of a preset length; calculating the cosine similarity of V_t1 and V_t2; and comparing the cosine similarity with a preset threshold to determine the similarity relation between the first knowledge node A1 and the second knowledge node A2.
In the embodiment of the invention, different knowledge fragments need to be integrated for knowledge nodes from different sources. Whether two knowledge nodes are duplicates is analyzed by semantic similarity calculation. Semantic similarity analysis is a research branch of natural language processing that aims to determine how similar the semantics expressed by two words, phrases, sentences or paragraphs are. In the embodiment of the invention, the objects of knowledge fusion are mainly sentences, short texts and words, and the semantic similarity of short texts is analyzed by semantic similarity calculation based on deep learning. For example, in technical documents such as patent documents, the objects of knowledge fusion include artifact sentences, argument sentences, intent sentences, problem sentences, scheme sentences, and the like, each of which is composed of phrase nodes or word nodes (for example, artifact information includes artifact entities and artifact elements, alongside advantage and disadvantage information and intent information). By calculating the semantic similarity of the corresponding knowledge nodes, it can be determined whether two patent documents address the same technical problem, and whether the proposed technical schemes concern the same artifact or have the same or similar design intent, thereby achieving knowledge fusion of patent documents. This is done by analyzing, pair by pair, whether an association exists between knowledge nodes of the same category.
In the embodiment of the invention, calculating the cosine similarity of the first sentence embedding vector V_t1 and the second sentence embedding vector V_t2 includes:
calculating the cosine similarity of V_t1 and V_t2 based on vector embedding features by the following expression:
$$\cos\theta = \frac{\sum_{i=1}^{n} V_{t1i}\, V_{t2i}}{\lVert V_{t1} \rVert \, \lVert V_{t2} \rVert}$$
where θ is the angle between V_t1 and V_t2, V_t1i and V_t2i are the i-th elements of V_t1 and V_t2 respectively, n is the number of elements in a sentence embedding vector, and ||V_t1|| and ||V_t2|| are the norms (moduli) of V_t1 and V_t2 respectively.
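The cosine similarity of two sentence embedding vectors, that is, their dot product divided by the product of their norms, can be computed directly from the definitions of the symbols above. The sketch below uses only the Python standard library; the function name `cosine_similarity` and the error handling for mismatched lengths and zero vectors are choices made for illustration and are not part of the source.

```python
import math


def cosine_similarity(v1, v2):
    """Return cos(theta) for two equal-length vectors given as sequences of floats."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same number of elements n")
    dot = sum(a * b for a, b in zip(v1, v2))   # sum over i of V_t1i * V_t2i
    norm1 = math.sqrt(sum(a * a for a in v1))  # ||V_t1||
    norm2 = math.sqrt(sum(b * b for b in v2))  # ||V_t2||
    if norm1 == 0 or norm2 == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm1 * norm2)
```

Vectors pointing in the same direction give a value of 1 regardless of their length, and orthogonal vectors give 0, which is why the measure suits comparing embeddings of texts of different lengths.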
In the embodiment of the invention, the pre-trained language representation model is trained as follows:
obtaining a preset number of technical documents; performing pairwise comparative analysis on knowledge nodes of the same category in different technical documents; converting the same-category knowledge node pairs into embedding vector pairs using a Sentence Transformer neural network algorithm based on the pre-trained language representation model; and determining the similarity coefficients of the pre-trained language representation model by cosine similarity calculation of text semantics.
In the embodiment of the invention, the similarity relation between the first knowledge node A1 and the second knowledge node A2 of the same category is determined by the value range into which their cosine similarity falls: a similarity coefficient greater than 0.8 is treated as the same, and a coefficient greater than 0.6 and less than 0.8 as similar.
As shown in FIG. 2, in the embodiment of the present invention, a deep neural network model is used to calculate the cosine similarity of two text vectors. The two texts to be aligned are denoted T1 and T2, and the text pair (T1, T2) is represented by the corresponding text embedding vectors (V_t1, V_t2). The semantic similarity of each text pair (T1, T2) is analyzed by cosine similarity calculation.
In FIG. 2, V_t1 and V_t2 are the embedding vectors of texts T1 and T2 in the vector space, and the semantic similarity is represented by the angle θ between the two vectors: the smaller the angle, the higher the similarity of the two texts. Text similarity is calculated as the cosine similarity of the vector embedding features using the following expression:
$$\cos\theta = \frac{\sum_{i=1}^{n} V_{t1i}\, V_{t2i}}{\lVert V_{t1} \rVert \, \lVert V_{t2} \rVert}$$
where V_t1i and V_t2i are the i-th elements of the vectors V_t1 and V_t2 respectively.
The embodiment of the invention uses a deep neural network model based on the BERT model to calculate the semantic similarity of two texts, as shown in FIG. 3. In FIG. 3, the input texts T1 and T2 are processed by BERT and pooled to generate fixed-size sentence embedding vectors V_t1 and V_t2. The cosine similarity of the two sentence embedding vectors is then calculated and compared with a preset threshold. The output is one of 'same', 'similar' or 'irrelevant'.
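The pooling step turns the per-token output of the language model into one fixed-size vector per sentence. The text does not say which pooling operation is used; the sketch below assumes mean pooling over non-padding tokens, a common choice for sentence-transformer models. The function name `mean_pool` and the list-based inputs are hypothetical stand-ins for real model outputs.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding tokens are skipped).

    token_vectors: list of equal-length lists of floats, one per token.
    attention_mask: list of 0/1 flags, one per token.
    """
    kept = [vec for vec, flag in zip(token_vectors, attention_mask) if flag]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    # The output length depends only on the model dimension, not on the token count.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]
```

Because the result always has the model's hidden dimension, texts of any length produce vectors that can be compared by cosine similarity.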
Through semantic similarity analysis, the embodiment of the invention trains and verifies the performance and effect of knowledge fusion based on a deep neural network algorithm. The training process is divided into data preparation, embedded text vector construction, similarity calculation, knowledge fusion analysis, establishment of equal/similar relations, and so on.
1. Data preparation
Knowledge nodes extracted from a preset number of patent documents are taken as the raw data. From the extracted knowledge nodes, a design rationality knowledge network across different documents is initially constructed according to the metadata of the patent documents, including association relations such as the same applicant and the same inventor. The raw data contains knowledge nodes such as artifacts, relations and problems.
2. Embedded text vector construction
For the 'problem' knowledge nodes, the nodes are paired two by two in a nested loop and compared one by one. A BERT-based Sentence Transformer (SBERT) neural network algorithm, more specifically the 'all-MiniLM-L12-v2' model, converts each extracted problem pair (T1, T2) into an embedding vector pair (V_t1, V_t2). This model is fine-tuned from Microsoft's pre-trained model microsoft/MiniLM-L12-H384-uncased and maps sentences and paragraphs to a 384-dimensional dense vector space, which makes it well suited to cosine similarity calculation of text semantics.
3. Similarity calculation and analysis
The cosine similarity calculation module of SBERT (Sentence Transformer) is used. A similarity coefficient greater than 0.8 is treated as an equal relation, and an 'issueIdentical' relation is established between the two problems; a similarity coefficient greater than 0.6 and less than 0.8 is treated as a similar relation, and an 'issueSimilarTo' relation is established between the two problems. The comparison is carried out pair by pair, and knowledge fusion is performed accordingly.
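The thresholding and the pairwise comparison of problems can be sketched as follows. Several details are assumptions: the text does not say how a coefficient of exactly 0.8 or 0.6 is handled (here 0.8 counts as similar and 0.6 as irrelevant), the relation labels `identical` and `similar_to` are neutral stand-ins for the relation names in the text, and `similarity` is any callable returning a cosine similarity for two texts, for example one backed by a sentence embedding model.

```python
from itertools import combinations

SAME_THRESHOLD = 0.8     # coefficient above this: treated as the same problem
SIMILAR_THRESHOLD = 0.6  # coefficient above this (up to 0.8): treated as similar


def classify(score):
    """Map a cosine similarity coefficient to 'same', 'similar' or 'irrelevant'."""
    if score > SAME_THRESHOLD:
        return "same"
    if score > SIMILAR_THRESHOLD:
        return "similar"  # assumption: exactly 0.8 falls here
    return "irrelevant"


def link_issues(issues, similarity):
    """Compare problems from different documents pair by pair.

    issues: list of (doc_id, text) tuples.
    similarity: callable (text_a, text_b) -> float, supplied by the caller.
    """
    relations = []
    for (doc_a, text_a), (doc_b, text_b) in combinations(issues, 2):
        if doc_a == doc_b:
            continue  # intra-document duplicates are handled by the earlier merge step
        label = classify(similarity(text_a, text_b))
        if label == "same":
            relations.append((text_a, "identical", text_b))
        elif label == "similar":
            relations.append((text_a, "similar_to", text_b))
    return relations
```

Pairwise comparison over N problems costs on the order of N squared similarity calls, so in practice the embeddings would be computed once per problem and reused across pairs.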
The similarity analysis of the embodiment of the invention can be used for patent navigation. By searching for a given class of similar problems, the distribution of the relevant applicants in a technical field can be analyzed, and, through similar patents, the companies or institutions focusing on that field, together with the other technical problems those companies or institutions are concerned with, can be found quickly.
As shown in FIG. 4, the embodiment of the present invention further provides a knowledge fusion device for a design rationality knowledge network, including:
an extraction module 401, configured to acquire knowledge nodes of a plurality of target technical documents;
an internal fusion module 402, configured to merge identical knowledge nodes within each target technical document according to a knowledge fusion rule based on the document structure level;
an external fusion module 403, configured to traverse the merged knowledge nodes of each target technical document, perform pairwise semantic similarity calculation based on a deep learning method on knowledge nodes of the same category in the plurality of target technical documents, determine the similarity relation between knowledge nodes of the same category, and fuse knowledge nodes that are the same or similar, wherein the similarity relation between knowledge nodes of the same category is one of same, similar or irrelevant;
and a construction module 404, configured to establish a design rationality knowledge network according to the association relations among the knowledge nodes of the fused plurality of target technical documents.
The embodiment of the invention also provides a computing device, which comprises:
at least one processor and a memory storing program instructions;
the program instructions, when read and executed by the processor, cause the computing device to perform the knowledge fusion method for a design rationality knowledge network described above.
The embodiment of the invention also provides a readable storage medium storing program instructions which, when read and executed by a computing device, cause the computing device to perform the knowledge fusion method for a design rationality knowledge network described above.
The various techniques described herein may be implemented in hardware, in software, or in a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, USB flash drives, floppy diskettes, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to perform the method of the invention in accordance with instructions in said program code stored in the memory.
By way of example, and not limitation, readable media include readable storage media and communication media. The readable storage medium stores information such as computer readable instructions, data structures, program modules, or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with examples of the invention. The structure required to construct such a system is apparent from the description above. In addition, the present invention is not directed to any particular programming language. It should be appreciated that the teachings of the present invention as described herein may be implemented in a variety of programming languages, and that any foregoing description of a specific language is provided to disclose preferred embodiments of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment, or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into a plurality of sub-modules.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments.
Furthermore, some of the embodiments are described herein as methods or combinations of method elements that may be implemented by a processor of a computer system or by other means of performing the functions. Thus, a processor with the necessary instructions for implementing such a method or method element forms a means for implementing the method or method element. Furthermore, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinal terms "first," "second," "third," etc., to describe a general object merely denotes different instances of like objects, and is not intended to imply that the objects so described must have a given order, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments are contemplated within the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter.

Claims (6)

1. A knowledge fusion method for a design rationality knowledge network, comprising:
acquiring knowledge nodes of multiple target technical documents;
merging the same knowledge nodes within each target technical document according to knowledge fusion rules based on document structure levels;
traversing each merged knowledge node of each target technical document, performing pairwise semantic similarity calculation based on a deep learning method on knowledge nodes of the same category across the multiple target technical documents, determining the similarity relationship between the knowledge nodes of the same category, and fusing knowledge nodes that are the same or similar, wherein the similarity relationship between knowledge nodes of the same category is one of: the same, similar, or unrelated;
establishing a design rationality knowledge network according to the association relation among knowledge nodes of the multiple fused target technical documents;
the knowledge nodes comprise at least one of: technical document information, artifact information, design problem information, design intent information, design demonstration information, advantage and disadvantage information, held position information, design scheme information, and alternative scheme information;
merging the same knowledge nodes according to the knowledge fusion rules based on document structure levels comprises:
for artifact information or advantage and disadvantage information, performing lexical semantic comparison or thesaurus comparison on knowledge nodes in the same document structure or in different document structures of the target technical document, and merging knowledge nodes with the same or similar meaning;
for design problem information or held position information, analyzing positive description information and negative description information in the same document structure or in different document structures of the target technical document, and merging knowledge nodes with the same or similar meaning;
for design intent information, design demonstration information, design scheme information or alternative scheme information, merging knowledge nodes in the same document structure or in different document structures of the target technical document according to preset rules of a deep residual shrinkage network;
performing pairwise semantic similarity calculation based on a deep learning method on knowledge nodes of the same category across the multiple target technical documents comprises:
for a first knowledge node A1 and a second knowledge node A2 that are in different target technical documents and belong to the same category: extracting text from the first knowledge node A1 and the second knowledge node A2 to obtain a first text T1 and a second text T2; processing the first text T1 and the second text T2 separately with a pre-trained language characterization model; performing a pooling operation on each processed output to obtain a first sentence embedding vector $V_{t1}$ and a second sentence embedding vector $V_{t2}$ of a preset length; performing cosine similarity calculation on the first sentence embedding vector $V_{t1}$ and the second sentence embedding vector $V_{t2}$; and comparing the cosine similarity result with a preset threshold to determine the similarity relationship between the first knowledge node A1 and the second knowledge node A2 of the same category.
2. The knowledge fusion method of claim 1, wherein performing cosine similarity calculation on the first sentence embedding vector $V_{t1}$ and the second sentence embedding vector $V_{t2}$ comprises:
performing cosine similarity calculation based on vector embedding features on the first sentence embedding vector $V_{t1}$ and the second sentence embedding vector $V_{t2}$ by the following expression:
$$\cos\theta = \frac{\sum_{i=1}^{n} V_{t1i}\, V_{t2i}}{\lVert V_{t1} \rVert \, \lVert V_{t2} \rVert}$$
wherein $\theta$ is the angle between the first sentence embedding vector $V_{t1}$ and the second sentence embedding vector $V_{t2}$, $V_{t1i}$ and $V_{t2i}$ are the $i$-th elements of the first sentence embedding vector $V_{t1}$ and the second sentence embedding vector $V_{t2}$ respectively, $n$ is the number of elements in a sentence embedding vector, and $\lVert V_{t1} \rVert$ and $\lVert V_{t2} \rVert$ are the moduli (norms) of the first sentence embedding vector $V_{t1}$ and the second sentence embedding vector $V_{t2}$ respectively.
3. The knowledge fusion method of claim 1, wherein the pre-trained language characterization model is trained by:
obtaining a preset number of technical documents; performing pairwise comparative analysis on knowledge nodes of the same category in different technical documents; using a sentence-transformer neural network algorithm based on the pre-trained language characterization model to convert the same-category knowledge node pairs in the technical documents into embedding vector pairs; and determining the similarity coefficients of the pre-trained language characterization model through cosine similarity calculation of text semantics.
4. The knowledge fusion method according to claim 2, wherein the similarity relationship between the first knowledge node A1 and the second knowledge node A2 of the same category is determined according to the following similarity value range:
5. A computing device, comprising:
at least one processor and a memory storing program instructions;
the program instructions, when read and executed by the processor, cause the computing device to perform the knowledge fusion method of the design rationality knowledge network of any of claims 1-4.
6. A readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the knowledge fusion method of the design rationality knowledge network of any of claims 1-4.
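By way of example only, the pooling, cosine similarity and threshold comparison recited in claims 1 and 2 can be sketched in Python as follows. The sketch assumes that per-token vectors have already been produced by a pre-trained language model (plain lists are used here as stand-ins), that the pooling operation is mean pooling (the claims do not specify the pooling type), and that the thresholds 0.9 and 0.7 are hypothetical placeholders, since the value ranges of claim 4 are not reproduced in this text. All function names are illustrative.

```python
import math


def mean_pool(token_vectors):
    """Average per-token vectors into one fixed-length sentence vector (assumed pooling)."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(tok[i] for tok in token_vectors) / n for i in range(dim)]


def cosine_similarity(v1, v2):
    """cos(theta) = sum_i(v1_i * v2_i) / (||v1|| * ||v2||)."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm1 * norm2)


def classify(similarity, same_threshold=0.9, similar_threshold=0.7):
    """Map a similarity value to a relation; both thresholds are hypothetical."""
    if similarity >= same_threshold:
        return "same"
    if similarity >= similar_threshold:
        return "similar"
    return "unrelated"


# Stand-in token vectors for texts T1 and T2 (not real model output).
tokens_t1 = [[1.0, 0.0], [0.0, 1.0]]
tokens_t2 = [[2.0, 2.0]]

v_t1 = mean_pool(tokens_t1)
v_t2 = mean_pool(tokens_t2)
relation = classify(cosine_similarity(v_t1, v_t2))
print(relation)
```

In this toy case the two pooled vectors point in the same direction, so the pair is classified as the same; orthogonal vectors would yield a similarity of zero and be classified as unrelated.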
CN202311475638.6A 2023-11-08 2023-11-08 Knowledge fusion method, device and storage medium of design rationality knowledge network Active CN117236435B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311475638.6A CN117236435B (en) 2023-11-08 2023-11-08 Knowledge fusion method, device and storage medium of design rationality knowledge network

Publications (2)

Publication Number Publication Date
CN117236435A (en) 2023-12-15
CN117236435B (en) 2024-01-30

Family

ID=89089612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311475638.6A Active CN117236435B (en) 2023-11-08 2023-11-08 Knowledge fusion method, device and storage medium of design rationality knowledge network

Country Status (1)

Country Link
CN (1) CN117236435B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117725555B (en) * 2024-02-08 2024-06-11 暗物智能科技(广州)有限公司 Multi-source knowledge tree association fusion method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109739994A (en) * 2018-12-14 2019-05-10 复旦大学 A kind of API knowledge mapping construction method based on reference documents
CN112200317A (en) * 2020-09-28 2021-01-08 西南电子技术研究所(中国电子科技集团公司第十研究所) Multi-modal knowledge graph construction method
CN112463980A (en) * 2020-11-25 2021-03-09 南京摄星智能科技有限公司 Intelligent plan recommendation method based on knowledge graph
US11164153B1 (en) * 2021-04-27 2021-11-02 Skyhive Technologies Inc. Generating skill data through machine learning
CN115455935A (en) * 2022-09-14 2022-12-09 华东师范大学 Intelligent text information processing system
WO2023284991A1 (en) * 2021-07-14 2023-01-19 NEC Laboratories Europe GmbH Method and system for a semantic textual similarity search
CN116821371A (en) * 2023-06-30 2023-09-29 电子科技大学 Method for generating scientific abstracts of multiple documents by combining and enhancing topic knowledge graphs
CN116975068A (en) * 2023-09-25 2023-10-31 中国标准化研究院 Metadata-based patent document data storage method, device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220253729A1 (en) * 2021-02-01 2022-08-11 Otsuka Pharmaceutical Development & Commercialization, Inc. Scalable knowledge database generation and transactions processing

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A semantic-based knowledge fusion model for solution-oriented information network development: a case study in intrusion detection field;Yu Zhang等;Scientometrics;第857-886页 *
一种基于知识图谱技术的智能制造数据标准数字化转型方法;岳高峰等;学术研讨;第45-51、73页 *
基于深度学习的智能运维知识库系统;赵迪;中国优秀硕士学位论文全文数据库 工程科技Ⅰ辑;第2-4章 *
融合知识网络嵌入特征的高价值专利预测;任海英等;北京工业大学学报(社会科学版);第138-152页 *

Also Published As

Publication number Publication date
CN117236435A (en) 2023-12-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant