CN117150050A - Knowledge graph construction method and system based on large language model - Google Patents

Knowledge graph construction method and system based on large language model

Info

Publication number
CN117150050A
CN117150050A (Application No. CN202311423122.7A)
Authority
CN
China
Prior art keywords
knowledge
text
language model
entity
cot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311423122.7A
Other languages
Chinese (zh)
Other versions
CN117150050B (en)
Inventor
赵策
王亚
屠静
苏岳
万晶晶
李伟伟
孙岩
颉彬
周勤民
张玥
潘亮亮
刘岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuoshi Future Beijing technology Co ltd
Original Assignee
Zhuoshi Future Beijing technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuoshi Future Beijing technology Co ltd filed Critical Zhuoshi Future Beijing technology Co ltd
Priority to CN202311423122.7A, granted as CN117150050B
Publication of CN117150050A
Application granted
Publication of CN117150050B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a knowledge graph construction method and system based on a large language model, belonging to the technical field of text processing. The method comprises the following steps: performing text clustering on knowledge text data to obtain a knowledge text data set T containing a plurality of different text types; submitting the knowledge text data set T to a first HDFS for distributed file storage; extracting the knowledge text types from the first HDFS in order of their length, and performing knowledge entity recognition on the extracted knowledge text types with a preset large language model CoT to obtain the association information of each knowledge entity; submitting the association information of each knowledge entity to a second HDFS for distributed file storage; and constructing graph node links among the knowledge entities according to the association information of the knowledge entities stored in the second HDFS to obtain a knowledge graph. The invention can adapt to the language processing and storage requirements of massive knowledge text data, and can handle knowledge graph construction for large-scale sets of text types.

Description

Knowledge graph construction method and system based on large language model
Technical Field
The invention relates to the technical field of text processing, in particular to a knowledge graph construction method and system based on a large language model.
Background
A Knowledge Graph is a framework that visually displays the core structure, development history, frontier fields and overall knowledge of a discipline. It presents a complex knowledge domain through data mining, information processing, knowledge measurement and graph drawing, reveals the dynamic development laws of the knowledge domain, and provides a practical and valuable reference for discipline research. In library and information science, the knowledge graph is also called knowledge domain visualization or knowledge domain mapping: a family of graphs showing the development process and structural relationships of knowledge, in which visualization techniques are used to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the interrelationships between items of knowledge.
The basic building block of a knowledge graph is the entity-relation-entity triple; entities and their attribute-value pairs are connected to one another through relations to form a networked knowledge structure. The general flow is as follows:
extract entities and entity relations from the knowledge text data, and establish a knowledge network graph between the entities according to the extracted relations.
However, the traditional knowledge graph construction process mainly handles a single text, or only two or three texts, and is therefore only suitable for extraction over small-scale data sets. Processing a data set containing more than two types of text data is very laborious, and such a process cannot quickly adapt to or handle knowledge graph construction over large-scale sets of text types. For a large-scale data set containing multiple text types, constructing a knowledge graph with the traditional single-entity extraction method is slow, can only process one type at a time, and results in a long knowledge graph generation cycle. It is therefore ill-suited to the current demands of big data development.
Moreover, when facing a large text data set, the traditional knowledge graph construction method lacks the data storage capacity required for a big-data graph, which easily leads to insufficient memory and machine lock-up.
Disclosure of Invention
The embodiment of the invention provides a knowledge graph construction method and a knowledge graph construction system based on a large language model, which can adapt to language processing and storage functions of massive knowledge text data and process knowledge graph construction of large-scale text types. The technical scheme is as follows:
in one aspect, a knowledge graph construction method based on a large language model is provided, and the method is applied to electronic equipment and comprises the following steps:
acquiring knowledge text data for constructing a knowledge graph and preprocessing the knowledge text data;
carrying out text clustering on the preprocessed knowledge text data to obtain a knowledge text data set T containing a plurality of different text types; wherein T = {knowledge text type 1, knowledge text type 2, knowledge text type 3, ...}.
Submitting the knowledge text data set T to a first HDFS for distributed file storage; wherein HDFS denotes the Hadoop Distributed File System;
according to the length of the knowledge text type, extracting corresponding knowledge text types from the first HDFS in order, and carrying out knowledge entity identification on the extracted knowledge text types by adopting a preset large language model CoT to obtain associated information of each knowledge entity;
submitting the associated information of each knowledge entity to a second HDFS, and storing distributed files;
submitting the associated information of each knowledge entity to a knowledge graph construction module, and constructing graph node links among the knowledge entities by the knowledge graph construction module according to the associated information of each knowledge entity stored in the second HDFS to obtain a knowledge graph.
Further, the text clustering of the preprocessed knowledge text data to obtain a knowledge text data set T containing a plurality of different text types includes:
constructing a support vector machine, and deploying the support vector machine on a background server;
the preprocessed knowledge text data is sent to the background server to serve as a text clustering sample, and the background server forwards the text clustering sample to the support vector machine to perform text clustering;
the support vector machine performs text structure recognition and clustering processing on the samples by using a support vector clustering algorithm to obtain a plurality of knowledge text types with different text types and outputs the knowledge text types;
and the background server gathers the knowledge text types of a plurality of different text types to obtain the knowledge text data set T.
Further, submitting the knowledge text data set T to a first HDFS, and performing distributed file storage includes:
calculating the text type length of each item of the knowledge text type in the knowledge text data set T, and marking the calculated length value on each item of the knowledge text type;
arranging each knowledge text type in the knowledge text data set T in descending order of its length value, and rearranging the knowledge text data set T accordingly;
traversing all storage nodes of a first HDFS, checking available storage nodes, and sequentially storing all knowledge text types in the rearranged knowledge text data set T in the storage nodes of the first HDFS according to a rearrangement sequence;
and sending the storage addresses of the knowledge text data blocks to a background server.
Further, the sequentially extracting the corresponding knowledge text types from the first HDFS according to the length of the knowledge text types includes:
retrieving each knowledge text type in order from the rearranged knowledge text data set T according to its length value, and sending it to the large language model CoT.
Further, the constructing step of the large language model CoT includes:
acquiring training data for training a large language model CoT, wherein the training data comprises text data of different text types/structures;
selecting a GPT natural language processing model, and learning and training a knowledge entity, an association relation of the knowledge entity and an attribute of the knowledge entity in the training data;
when training reaches a preset optimization iteration training condition, stopping training, and generating the large language model CoT;
testing the large language model CoT by using the obtained test set, and judging whether the prediction accuracy of the large language model CoT meets the standard;
and after reaching the standard, optimally training the large language model CoT by using a real-time knowledge text sample, and deploying the large language model CoT on a background server after the optimization training is finished.
Further, the step of performing knowledge entity recognition on the extracted knowledge text type by using a preset large language model CoT, and obtaining association information of each knowledge entity includes:
inputting the knowledge text types which are sequentially called into the large language model CoT;
carrying out knowledge entity identification on the knowledge text type by using the large language model CoT to obtain each knowledge entity contained in the knowledge text type;
extracting the association relation between the knowledge entities according to the context of the knowledge entities, and extracting the attribute information of the knowledge entities;
and outputting the association information formed by the association relation and the attribute information of each knowledge entity.
Further, after knowledge entity identification is performed on the extracted knowledge text type by adopting a preset large language model CoT to obtain associated information of each knowledge entity, the method further comprises:
and carrying out knowledge entity identification on the knowledge text type by using a graph neural network GNN to obtain the associated information of each knowledge entity, and carrying out contrast verification and result correction on the associated information obtained by using the graph neural network GNN and the associated information obtained by using the large language model CoT.
Further, submitting the association information of each knowledge entity to a knowledge graph construction module, wherein the knowledge graph construction module constructs graph node links between each knowledge entity according to the association information of each knowledge entity stored in the second HDFS, and obtaining a knowledge graph includes:
a knowledge graph construction tool TopBraid Composer is deployed in advance on the knowledge graph construction module;
transmitting the association information of the knowledge entities to the TopBraid Composer, and reading the association relation between the knowledge entities and the attribute information of the knowledge entities in the association information by the TopBraid Composer;
distributing corresponding map nodes for the knowledge entities, and establishing map links among the map nodes according to the read association relation among the knowledge entities and the attribute information of the knowledge entities to obtain a knowledge map;
and storing the knowledge graph into a dynamic database Nosql for supporting dynamic application of the knowledge graph.
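The graph-construction step above (allocate a graph node per knowledge entity, then link nodes according to the read association relations and attributes) can be sketched in miniature. The patent delegates this work to TopBraid Composer and stores the result in a NoSQL database; the dict-based graph below is only an illustration of the node/edge bookkeeping, not that tool's API, and the entity names used are hypothetical.

```python
# Illustrative sketch of graph node linking: one node per knowledge
# entity (carrying its attributes), one edge per association relation.
# This is a stand-in for the TopBraid Composer step, not its real API.
def build_graph(association_info):
    graph = {"nodes": {}, "edges": []}
    for item in association_info:
        # allocate map nodes and attach attribute information
        for entity, attrs in item["attributes"].items():
            graph["nodes"].setdefault(entity, {}).update(attrs)
        # establish map links from the association relations
        for head, rel, tail in item["relations"]:
            graph["nodes"].setdefault(head, {})
            graph["nodes"].setdefault(tail, {})
            graph["edges"].append((head, rel, tail))
    return graph

# Hypothetical association-information packet for one knowledge text type
kg = build_graph([{
    "attributes": {"Alice": {"role": "engineer"}},
    "relations": [("Alice", "works_for", "AcmeCorp")],
}])
```

In a real deployment the resulting graph would then be persisted to the NoSQL store so that dynamic applications of the knowledge graph can query it.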
In one aspect, a knowledge graph construction system based on a large language model is provided, including:
the acquisition module is used for acquiring knowledge text data for constructing a knowledge graph and preprocessing the knowledge text data;
the clustering module is used for carrying out text clustering on the preprocessed knowledge text data to obtain a knowledge text data set T containing a plurality of different text types; wherein T = {knowledge text type 1, knowledge text type 2, knowledge text type 3, ...}.
The first storage module is used for carrying out distributed file storage on the knowledge text data set T;
the recognition module is used for orderly extracting the corresponding knowledge text types from the first storage module according to the length of the knowledge text types, and carrying out knowledge entity recognition on the extracted knowledge text types by adopting a preset large language model CoT to obtain the associated information of each knowledge entity;
the second storage module is used for carrying out distributed file storage on the associated information of each knowledge entity;
and the construction module is used for constructing graph node links among the knowledge entities according to the associated information of the knowledge entities stored in the second storage module to obtain a knowledge graph.
In one aspect, an electronic device is provided, where the electronic device includes a processor and a memory, where the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the knowledge graph construction method based on a large language model.
In one aspect, a computer readable storage medium is provided, in which at least one instruction is stored, and the at least one instruction is loaded and executed by a processor to implement the knowledge graph construction method based on a large language model.
The technical scheme provided by the embodiment of the invention has the beneficial effects that at least:
1) The knowledge graph construction method based on the large language model CoT uses strong natural language processing capability to convert knowledge text data into knowledge graph form, so that knowledge can be better understood and organized. It is expected to provide a more powerful tool for knowledge management and application in various fields, and can quickly and efficiently generate high-capacity, wide-coverage knowledge graphs with a shortened generation cycle;
2) Clustering and distributed storage of the knowledge text data are used to store massive knowledge text data in clusters, and the association information of each knowledge entity extracted by the large language model CoT is stored in a distributed manner. This solves the problem that the prior art cannot adapt to the language processing and storage requirements of massive knowledge samples, settles the storage and data-processing flow for knowledge samples, adapts to the storage and retrieval of association information for massive knowledge samples, and quickly adapts to and handles knowledge graph construction for large-scale sets of text types;
drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a knowledge graph construction method based on a large language model according to an embodiment of the present invention;
FIG. 2 is a detailed flow chart of a knowledge graph construction method based on a large language model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a distributed file storage flow according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without creative efforts, based on the described embodiments of the present invention fall within the protection scope of the present invention.
As shown in fig. 1 and fig. 2, an embodiment of the present invention provides a knowledge graph construction method based on a large language model, where the method may be implemented by an electronic device, and the electronic device may be a terminal or a server, and the method includes:
s101, acquiring knowledge text data for constructing a knowledge graph and preprocessing the knowledge text data;
in this embodiment, a large amount of knowledge text data may be collected as a sample for constructing a knowledge graph from a plurality of data sources, such as text documents, web pages, databases, log files, social media, and other channels, using tools such as web crawlers or APIs, where the knowledge text data includes: entity, relationship, and attribute information of the knowledge graph.
In this embodiment, the preprocessing is mainly data cleaning: duplicate, invalid or erroneous data is removed, data with inconsistent formats is processed, and format standardization is performed to ensure data quality.
S102, carrying out text clustering on the preprocessed knowledge text data to obtain a knowledge text data set T of different text types (or structures); wherein T = {knowledge text type 1, knowledge text type 2, knowledge text type 3, ...}. The method specifically comprises the following steps:
a1, constructing a support vector machine (SVC), and deploying the support vector machine in a background server;
a2, the preprocessed knowledge text data is sent to the background server to serve as a text clustering sample, and the background server forwards the knowledge text data to the support vector machine to perform text clustering;
a3, the support vector machine utilizes a support vector clustering algorithm to perform text structure recognition and clustering processing on the samples to obtain a plurality of knowledge text types with different text types and output the knowledge text types;
in this embodiment, in order to improve the efficiency of text storage and recognition of the samples, the samples may be classified. If the samples come from an enterprise, they may be collected from the enterprise's log database, document database, technical document repository, etc., and the collected samples are sent to the background server for clustering.
In this embodiment, a supervised learning approach may be used to classify the samples, for example using a generalized linear classifier to perform nonlinear classification of massive knowledge text data. Specifically, a support vector machine employing a support vector clustering algorithm is used: a support vector machine corresponding to the clustering algorithm is constructed and deployed on the background server to perform the clustering operation.
In this embodiment, the support vector machine performs text structure recognition and clustering processing on the samples, and performs sample classification, so as to classify the input knowledge text data into sample sets of different text types, i.e., knowledge text types of independent text types.
And A4, the background server gathers the outputted knowledge text types of a plurality of different text types to obtain the knowledge text data set T.
In this embodiment, the support vector machine outputs each sample set, and the background server gathers the sample sets to form the knowledge text data set T. At this point, the order of the knowledge text types within the knowledge text data set T is arbitrary, so to facilitate distributed storage, ordered distributed storage is performed using the HDFS.
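The clustering flow of steps A1-A4 can be sketched in miniature. The patent deploys a support vector machine running a support vector clustering algorithm on a background server; as a hedged stand-in, the sketch below groups texts by a simple bag-of-words cosine threshold, purely to illustrate how knowledge text data is partitioned into the knowledge text types that are then gathered into T. The threshold value and the sample texts are illustrative assumptions.

```python
# Minimal stand-in for the clustering step (S102): partition knowledge
# texts into type groups by bag-of-words cosine similarity. A real
# implementation would use the support vector clustering algorithm.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_texts(texts, threshold=0.5):
    """Assign each text to the first cluster whose centroid is similar enough."""
    clusters = []  # each item: {"centroid": Counter, "members": [str]}
    for text in texts:
        vec = Counter(text.lower().split())
        for c in clusters:
            if cosine(vec, c["centroid"]) >= threshold:
                c["members"].append(text)
                c["centroid"].update(vec)
                break
        else:
            clusters.append({"centroid": Counter(vec), "members": [text]})
    # the gathered result plays the role of T = {type 1, type 2, ...}
    return [c["members"] for c in clusters]

T = cluster_texts([
    "server log error disk full",
    "server log error memory low",
    "contract clause party obligations",
])
```

Here the two log-like texts fall into one knowledge text type and the contract text into another, which is the shape of the set T that the background server hands to the first HDFS.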
S103, submitting the knowledge text data set T to a first HDFS for distributed file storage; wherein HDFS denotes the Hadoop Distributed File System; as shown in fig. 3, this specifically includes the following steps:
b1, calculating the text type length of each item of the knowledge text type in the knowledge text data set T, and marking the calculated length value on each item of the knowledge text type;
in this embodiment, the text type length of a knowledge text type refers to the number of characters it can store (the capacity of one knowledge text type). A certain storage space is allocated in advance for each knowledge text type according to its length value (the storage capacities of the storage nodes in the HDFS may differ), which avoids the time cost of capacity matching during storage. Accordingly, when designing database tables or data models, field-length limitations need to be considered for reasonable storage of the text.
B2, arranging all the knowledge text types in the knowledge text data set T in descending order of length value, and rearranging the knowledge text data set T;
in this embodiment, the text type length of each knowledge text type in the knowledge text data set T may be read automatically by the background server from the file attributes of the knowledge text type; the knowledge text types are then arranged in descending order of length value, and the knowledge text data set T is reordered. After rearrangement, the item with the largest text type length in the original knowledge text data set T, say knowledge text type 3, is placed in the first position, replacing knowledge text type 1; the other knowledge text types are ranked in the same way. If two length values are equal, their relative order does not matter.
In this embodiment, the larger the length value of the text type, the more preferentially processed, the preferentially distributed stored and preferentially used for the subsequent CoT identification.
B3, traversing all storage nodes of the first HDFS, checking available storage nodes, and sequentially storing all knowledge text types in the rearranged knowledge text data set T in the storage nodes of the first HDFS according to a rearrangement sequence;
in this embodiment, in order to improve text storage efficiency and allow the knowledge text types to be retrieved and processed from the first HDFS in order, the knowledge text types in the knowledge text data set T are stored in an ordered, distributed manner. Specifically: after sorting, each storage node of the first HDFS is traversed, available storage nodes are checked, and all knowledge text types in the rearranged knowledge text data set T are stored, in the rearranged order, on storage nodes of the first HDFS whose storage capacity has been matched in advance.
In this embodiment, the first HDFS may perform distributed storage processing on each item of the knowledge text type in the knowledge text data set T by using a batch processing function, such as Apache Spark, and store each item of the knowledge text type onto an idle storage node.
And B4, transmitting the storage addresses of the knowledge text data blocks to a background server.
In this embodiment, the storage addresses of the knowledge text data blocks are sent to the background server, so that the background server can conveniently call the data of each knowledge text type according to the address response.
S104, sequentially extracting the corresponding knowledge text types from the first HDFS according to the length of the knowledge text types, and carrying out knowledge entity identification on the extracted knowledge text types by adopting a preset large language model CoT to obtain the associated information of each knowledge entity;
in this embodiment, the knowledge text types are fetched in order from the rearranged knowledge text data set T according to their length values and sent to the large language model CoT.
In this embodiment, the step of constructing the large language model CoT includes:
c1, acquiring training data for training a large language model CoT, wherein the training data comprises text data of different text types/structures;
in this embodiment, the training data may be structured, semi-structured, or unstructured data, such as text, audio, video, graphics, and the like.
C2, selecting a GPT natural language processing model, and learning and training the knowledge entity, the association relation of the knowledge entity and the attribute of the knowledge entity in the training data;
in this embodiment, the training of the large language model CoT may follow the training procedures of existing deep learning techniques, such as CNNs.
In this embodiment, a large language model suitable for CoT, such as GPT4, is selected to learn and identify the knowledge entities in the knowledge text data, the association relations of the knowledge entities, and the attributes of the knowledge entities, extracting the knowledge entities, their attributes, and the association relations contained in the text data.
C3, stopping training when training reaches preset optimization iteration training conditions, and generating the large language model CoT;
c4, testing the large language model CoT by using the obtained test set, and judging whether the prediction accuracy of the large language model CoT meets the standard or not;
in this embodiment, the Accuracy metric may be used to evaluate the accuracy of the large language model CoT generated by training; if the accuracy reaches 0.95, training is stopped.
And C5, optimally training the large language model CoT by using a real-time knowledge text sample after reaching the standard, and deploying the large language model CoT on a background server after the optimization training is finished.
In this embodiment, on-site knowledge text data may also be used to perform real-time optimization training of the large language model CoT, so that it learns the text features of the deployment site in real time. The optimization data set is determined according to the knowledge sources and the type of knowledge graph required by the user.
In this embodiment, performing knowledge entity identification on the extracted knowledge text types with the preset large language model CoT to obtain the associated information of each knowledge entity may specifically include the following steps:
D1, inputting the sequentially retrieved knowledge text types into the large language model CoT;
D2, performing knowledge entity identification on each knowledge text type with the large language model CoT to obtain the knowledge entities contained in that knowledge text type;
D3, extracting the association relations among the knowledge entities from their context, and extracting the attribute information of each knowledge entity;
D4, outputting the associated information formed by the association relations and the attribute information of each knowledge entity.
In this embodiment, for each extracted knowledge text type, the large language model CoT outputs the identified knowledge entities, together with each entity's attribute information and the association relations among the entities, as the associated information corresponding to that knowledge text type.
In this embodiment, the large language model CoT can identify the entities, relations, and attributes contained in the text of each knowledge text type, convert the text data into nodes and edges of the knowledge graph, map the concepts in the text into the graph structure, and determine the relations between entities from the text context.
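Steps D1 through D4 can be sketched as a prompt-and-parse routine. `call_llm` below is a hypothetical callable wrapping the deployed CoT model — the patent does not fix this interface — and the JSON reply shape is likewise an assumption:

```python
import json

PROMPT_TEMPLATE = (
    "Identify every knowledge entity in the text below, each entity's attributes, "
    "and the relations between entities. Answer as JSON with the keys "
    "'entities', 'relations', and 'attributes'.\n\nText: {text}"
)


def extract_association_info(knowledge_text, call_llm):
    """D1-D4: feed one knowledge text type to the model and parse its association info."""
    reply = call_llm(PROMPT_TEMPLATE.format(text=knowledge_text))  # D1/D2
    data = json.loads(reply)
    return {
        "entities":   data.get("entities", []),    # D2: entities found in the text
        "relations":  data.get("relations", []),   # D3: e.g. ["A", "founder_of", "B"]
        "attributes": data.get("attributes", {}),  # D3: entity -> {attribute: value}
    }
```

A stub such as `lambda prompt: '{"entities": ["A"], "relations": [], "attributes": {}}'` exercises the parsing path without a live model.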
In this embodiment, the output may be packaged: a data packet containing the associated information is produced for each knowledge text type, and the data packet for each knowledge entity is submitted to the second HDFS for distributed storage.
In this embodiment, after the preset large language model CoT has performed knowledge entity identification on the extracted knowledge text types and the associated information of each knowledge entity has been obtained, the method further includes:
performing knowledge entity identification on the knowledge text types with a graph neural network GNN to obtain the associated information of each knowledge entity, and performing contrast verification and result correction between the associated information obtained by the graph neural network GNN and the associated information obtained by the large language model CoT.
As shown in fig. 2, to further improve the accuracy of the text recognition result, in this embodiment the graph neural network GNN on the background server may also perform knowledge entity identification on the knowledge text types, and the GNN recognition result is used to verify the accuracy of the large language model CoT's recognition of each knowledge text type.
In this embodiment, the GNN's knowledge entity recognition of a knowledge text type may follow the text-recognition process of the large language model CoT. The background server compares the outputs of the two models to determine whether the recognition result of the large language model CoT differs greatly from that of the graph neural network GNN; if it does, the GNN's recognition result may be used to correct the associated information output by the large language model CoT; otherwise no correction is made. An administrator may decide whether to intervene in the correction as needed.
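The comparison in this step can be made concrete by measuring the overlap between the two models' triple sets. The 0.8 threshold and the merge-as-correction policy below are illustrative choices, not prescribed by the patent:

```python
def jaccard(a, b):
    """Overlap of two triple collections; defined as 1.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cross_verify(cot_triples, gnn_triples, threshold=0.8):
    """Keep the CoT output when the two models agree closely; otherwise merge in
    the GNN result as a correction and flag it for possible administrator review."""
    if jaccard(cot_triples, gnn_triples) >= threshold:
        return sorted(set(cot_triples)), False               # no large difference
    return sorted(set(cot_triples) | set(gnn_triples)), True  # corrected, flagged
```

The boolean flag models the administrator-intervention decision point described above.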
S105, submitting the associated information of each knowledge entity to a second HDFS for distributed file storage;
In this embodiment, the specific storage manner of "submitting the associated information of each knowledge entity to the second HDFS for distributed file storage" may follow the above scheme of "submitting the knowledge text data set T to the first HDFS for distributed file storage".
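The length-ordered placement referenced here (and spelled out for the first HDFS) can be sketched as follows; the round-robin node assignment is only a stand-in for HDFS's actual block-placement policy, and the address format is illustrative:

```python
def store_by_length(text_types, storage_nodes):
    """Tag each knowledge text type with its length, arrange longest-first,
    and assign each item to an available storage node in rearranged order.
    Returns {name: (node, rank)} as the recorded storage addresses."""
    ranked = sorted(text_types.items(), key=lambda kv: len(kv[1]), reverse=True)
    return {
        name: (storage_nodes[rank % len(storage_nodes)], rank)  # round-robin placement
        for rank, (name, _text) in enumerate(ranked)
    }
```

The returned address map corresponds to the storage addresses sent back to the background server, from which the recognition module later retrieves the text types in the same length order.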
S106, submitting the associated information of each knowledge entity to a knowledge graph construction module, and constructing graph node links among the knowledge entities by the knowledge graph construction module according to the associated information of each knowledge entity stored in the second HDFS to obtain a knowledge graph; the method specifically comprises the following steps:
E1, pre-deploying the knowledge graph construction tool TopBraid Composer on the knowledge graph construction module;
In this embodiment, TopBraid Composer is a knowledge graph construction tool based on Semantic Web technology. It can allocate a graph node to each knowledge entity, attach each entity's attributes to its node, and establish graph links between the corresponding nodes according to the relations among the knowledge entities, linking the nodes into a graph network.
E2, sending the associated information of the knowledge entities to TopBraid Composer, which reads the association relations among the knowledge entities and the attribute information of each knowledge entity from the associated information;
E3, allocating a corresponding graph node to each knowledge entity, and establishing graph links among the nodes according to the read association relations and attribute information, thereby obtaining the knowledge graph;
E4, storing the knowledge graph in a dynamic NoSQL database to support dynamic applications of the knowledge graph.
In this embodiment, constructing the knowledge graph mainly includes:
Entity identification: identifying entities (such as person names, place names, and organization names) from the knowledge text data with the large language model CoT, taking the identified entities as graph nodes, and assigning each entity a unique identifier;
Relation extraction: identifying relations between entities from the textual context, e.g. "A is the founder of B", and defining the relation type;
Attribute extraction: extracting the entities' attribute information from the text, such as a person's birthday or a place's longitude and latitude, and adding information such as description, category, and attribute values to the entities.
In this embodiment, a NoSQL database offers high-performance reads and writes and supports multiple data models, so it can efficiently handle the various types of textual knowledge in the knowledge graph and serve dynamic information; a NoSQL database is therefore selected for storage.
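Storing such a graph in a document-oriented NoSQL database reduces to serializing nodes and edges as JSON documents. The document shapes below are illustrative, and persisting them with a real client (e.g. pymongo against a MongoDB instance) is left as an assumption:

```python
import json


def to_documents(graph):
    """Flatten a {'nodes': ..., 'edges': ...} graph into JSON-safe documents
    of the kind a document database would store."""
    docs = [{"_id": nid, "kind": "node", **attrs} for nid, attrs in graph["nodes"].items()]
    docs += [{"kind": "edge", "src": s, "rel": r, "dst": d} for s, r, d in graph["edges"]]
    return [json.loads(json.dumps(d)) for d in docs]  # round-trip guarantees JSON-safety
```

Keeping nodes and edges as separate document kinds lets the dynamic applications mentioned above query either independently.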
The knowledge graph construction method based on a large language model has the following advantages:
1) By exploiting the strong natural language processing capability of the large language model CoT, knowledge text data can be converted into knowledge graph form, so that knowledge is better understood and organized. The method promises a stronger tool for knowledge management and application in many fields, can quickly and efficiently generate high-capacity, wide-coverage knowledge graphs, and shortens the construction cycle;
2) Clustering and distributed storage of knowledge text data allow massive knowledge text data to be stored by cluster, while the associated information of each knowledge entity extracted by the large language model CoT is stored in distributed fashion. This overcomes the inability of the prior art to handle the language processing and storage of massive knowledge samples, resolves the storage and data-processing flow for knowledge samples, accommodates the storage and retrieval of associated information for massive knowledge samples, and quickly adapts to knowledge graph construction over large-scale text types;
The knowledge graph construction method based on a large language model provided by this embodiment can be customized to specific requirements and application scenarios, helping organizations better organize and understand large amounts of information and knowledge, and supporting various intelligent applications.
The invention also provides a specific implementation of a knowledge graph construction system based on a large language model. Because this system corresponds to the specific implementation of the knowledge graph construction method described above, and can achieve the purpose of the invention by executing the flow steps of that method, the explanations given for the method implementation also apply to the system implementation provided by the invention and are not repeated below.
The embodiment of the invention also provides a knowledge graph construction system based on the large language model, which comprises the following steps:
the acquisition module is used for acquiring knowledge text data for constructing a knowledge graph and preprocessing the knowledge text data;
the clustering module is used for performing text clustering on the preprocessed knowledge text data to obtain a plurality of knowledge text data sets T with different text types; wherein T = {knowledge text type 1, knowledge text type 2, knowledge text type 3, …};
The first storage module is used for carrying out distributed file storage on the knowledge text data set T; the first storage module is a first HDFS, which represents a Hadoop distributed file system;
the recognition module is used for orderly extracting the corresponding knowledge text types from the first storage module according to the length of the knowledge text types, and carrying out knowledge entity recognition on the extracted knowledge text types by adopting a preset large language model CoT to obtain the associated information of each knowledge entity;
the second storage module is used for carrying out distributed file storage on the associated information of each knowledge entity; the second storage module is a second HDFS;
and the construction module is used for constructing graph node links among the knowledge entities according to the associated information of the knowledge entities stored in the second storage module to obtain a knowledge graph.
The knowledge graph construction system based on the large language model provided by the embodiment of the invention has at least the following beneficial effects:
1) By exploiting the strong natural language processing capability of the large language model CoT, knowledge text data can be converted into knowledge graph form, so that knowledge is better understood and organized. The system promises a stronger tool for knowledge management and application in many fields, can quickly and efficiently generate high-capacity, wide-coverage knowledge graphs, and shortens the construction cycle;
2) Clustering and distributed storage of knowledge text data allow massive knowledge text data to be stored by cluster, while the associated information of each knowledge entity extracted by the large language model CoT is stored in distributed fashion. This overcomes the inability of the prior art to handle the language processing and storage of massive knowledge samples, resolves the storage and data-processing flow for knowledge samples, accommodates the storage and retrieval of associated information for massive knowledge samples, and quickly adapts to knowledge graph construction over large-scale text types;
The knowledge graph construction system based on a large language model provided by this embodiment can be customized to specific requirements and application scenarios, helping organizations better organize and understand large amounts of information and knowledge, and supporting various intelligent applications.
Fig. 4 is a schematic structural diagram of an electronic device 600 according to an embodiment of the present invention. The electronic device 600 may vary considerably in configuration or performance, and may include one or more processors (central processing units, CPUs) 601 and one or more memories 602, where the memory 602 stores at least one instruction that is loaded and executed by the processor 601 to implement the above knowledge graph construction method based on a large language model.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory including instructions executable by a processor in a terminal to perform the above knowledge graph construction method based on a large language model. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or terminal device comprising that element.
References in the specification to "one embodiment," "an example embodiment," "some embodiments," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the relevant art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
The invention is intended to cover any alternatives, modifications, equivalents, and variations that fall within its spirit and scope. In the preceding description of preferred embodiments of the invention, specific details are set forth to provide a thorough understanding of the invention; however, the invention can be fully understood by those skilled in the art without some of these details. In other instances, well-known methods, procedures, flows, components, circuits, and the like have not been described in detail so as not to unnecessarily obscure aspects of the present invention.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in implementing the methods of the embodiments described above may be implemented by a program that instructs associated hardware, and the program may be stored on a computer readable storage medium, such as: ROM/RAM, magnetic disks, optical disks, etc.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (9)

1. The knowledge graph construction method based on the large language model is characterized by comprising the following steps of:
acquiring knowledge text data for constructing a knowledge graph and preprocessing the knowledge text data;
carrying out text clustering on the preprocessed knowledge text data to obtain a plurality of knowledge text data sets T with different text types; wherein T = {knowledge text type 1, knowledge text type 2, knowledge text type 3, …};
Submitting the knowledge text data set T to a first HDFS, and storing distributed files; wherein, HDFS represents a Hadoop distributed file system;
according to the length of the knowledge text types, sequentially extracting the corresponding knowledge text types from the first HDFS, and carrying out knowledge entity identification on the extracted knowledge text types by adopting a preset large language model CoT to obtain the associated information of each knowledge entity;
submitting the associated information of each knowledge entity to a second HDFS, and storing distributed files;
submitting the associated information of each knowledge entity to a knowledge graph construction module, and constructing graph node links among the knowledge entities by the knowledge graph construction module according to the associated information of each knowledge entity stored in the second HDFS to obtain a knowledge graph.
2. The knowledge graph construction method based on the large language model according to claim 1, wherein the text clustering the preprocessed knowledge text data to obtain a plurality of knowledge text data sets T of different text types includes:
constructing a support vector machine, and deploying the support vector machine on a background server;
the preprocessed knowledge text data is sent to the background server to serve as a text clustering sample, and the background server forwards the text clustering sample to the support vector machine to perform text clustering;
the support vector machine performs text structure recognition and clustering processing on the samples by using a support vector clustering algorithm to obtain a plurality of knowledge text types with different text types and outputs the knowledge text types;
and the background server gathers the knowledge text types of a plurality of different text types to obtain the knowledge text data set T.
3. The knowledge graph construction method based on a large language model according to claim 1, wherein submitting the knowledge text data set T to a first HDFS for distributed file storage comprises:
calculating the text type length of each item of the knowledge text type in the knowledge text data set T, and marking the calculated length value on each item of the knowledge text type;
sequentially arranging the knowledge text types of each item in the knowledge text data set T according to the sequence from large to small by the length values, and rearranging the knowledge text data set T;
traversing all storage nodes of a first HDFS, checking available storage nodes, and sequentially storing all knowledge text types in the rearranged knowledge text data set T in the storage nodes of the first HDFS according to a rearrangement sequence;
and sending the storage addresses of the knowledge text data blocks to a background server.
4. The knowledge graph construction method based on a large language model according to claim 3, wherein the sequentially extracting the corresponding knowledge text types from the first HDFS according to the lengths of the knowledge text types comprises:
and sequentially and orderly retrieving each item of knowledge text type from the rearranged knowledge text data set T according to the length value of the knowledge text type, and sending the knowledge text type to the large language model CoT.
5. The knowledge graph construction method based on the large language model according to claim 1, wherein the large language model CoT construction step includes:
acquiring training data for training a large language model CoT, wherein the training data comprises text data of different text types/structures;
selecting a GPT natural language processing model, and learning and training a knowledge entity, an association relation of the knowledge entity and an attribute of the knowledge entity in the training data;
when training reaches a preset optimization iteration training condition, stopping training, and generating the large language model CoT;
testing the large language model CoT by using the obtained test set, and judging whether the prediction accuracy of the large language model CoT meets the standard;
and after reaching the standard, optimally training the large language model CoT by using a real-time knowledge text sample, and deploying the large language model CoT on a background server after the optimization training is finished.
6. The knowledge graph construction method based on a large language model according to claim 1, wherein the knowledge entity identification is performed on the extracted knowledge text type by using a preset large language model CoT, and obtaining the associated information of each knowledge entity comprises:
inputting the knowledge text types which are sequentially called into the large language model CoT;
carrying out knowledge entity identification on the knowledge text type by using the large language model CoT to obtain each knowledge entity contained in the knowledge text type;
extracting the association relation between the knowledge entities according to the context of the knowledge entities, and extracting the attribute information of the knowledge entities;
and outputting the association information formed by the association relation and the attribute information of each knowledge entity.
7. The knowledge graph construction method based on a large language model according to claim 1, wherein after knowledge entity identification is performed on the extracted knowledge text type by using a preset large language model CoT, and associated information of each knowledge entity is obtained, the method further comprises:
and carrying out knowledge entity identification on the knowledge text type by using a graph neural network GNN to obtain the associated information of each knowledge entity, and carrying out contrast verification and result correction on the associated information obtained by using the graph neural network GNN and the associated information obtained by using the large language model CoT.
8. The knowledge graph construction method based on the large language model according to claim 1, wherein submitting the associated information of each knowledge entity to a knowledge graph construction module, the knowledge graph construction module constructing graph node links between each knowledge entity according to the associated information of each knowledge entity stored in the second HDFS, and obtaining a knowledge graph comprises:
a knowledge graph construction tool TopBraid Composer is deployed in advance on the knowledge graph construction module;
transmitting the association information of the knowledge entities to the TopBraid Composer, and reading the association relation between the knowledge entities and the attribute information of the knowledge entities in the association information by the TopBraid Composer;
distributing corresponding map nodes for the knowledge entities, and establishing map links among the map nodes according to the read association relation among the knowledge entities and the attribute information of the knowledge entities to obtain a knowledge map;
and storing the knowledge graph into a dynamic database Nosql for supporting dynamic application of the knowledge graph.
9. A knowledge graph construction system based on a large language model, comprising:
the acquisition module is used for acquiring knowledge text data for constructing a knowledge graph and preprocessing the knowledge text data;
the clustering module is used for carrying out text clustering on the preprocessed knowledge text data to obtain a plurality of knowledge text data sets T with different text types; wherein T = {knowledge text type 1, knowledge text type 2, knowledge text type 3, …};
The first storage module is used for carrying out distributed file storage on the knowledge text data set T;
the recognition module is used for orderly extracting the corresponding knowledge text types from the first storage module according to the length of the knowledge text types, and carrying out knowledge entity recognition on the extracted knowledge text types by adopting a preset large language model CoT to obtain the associated information of each knowledge entity;
the second storage module is used for carrying out distributed file storage on the associated information of each knowledge entity;
and the construction module is used for constructing graph node links among the knowledge entities according to the associated information of the knowledge entities stored in the second storage module to obtain a knowledge graph.
CN202311423122.7A 2023-10-31 2023-10-31 Knowledge graph construction method and system based on large language model Active CN117150050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311423122.7A CN117150050B (en) 2023-10-31 2023-10-31 Knowledge graph construction method and system based on large language model

Publications (2)

Publication Number Publication Date
CN117150050A true CN117150050A (en) 2023-12-01
CN117150050B CN117150050B (en) 2024-01-26

Family

ID=88912455


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117391192A (en) * 2023-12-08 2024-01-12 杭州悦数科技有限公司 Method and device for constructing knowledge graph from PDF by using LLM based on graph database
CN117494806A (en) * 2023-12-28 2024-02-02 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Relation extraction method, system and medium based on knowledge graph and large language model
CN117592562A (en) * 2024-01-18 2024-02-23 卓世未来(天津)科技有限公司 Knowledge base automatic construction method based on natural language processing

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106168965A (en) * 2016-07-01 2016-11-30 竹间智能科技(上海)有限公司 Knowledge mapping constructing system
CN106815307A (en) * 2016-12-16 2017-06-09 中国科学院自动化研究所 Public Culture knowledge mapping platform and its use method
CN107480125A (en) * 2017-07-05 2017-12-15 重庆邮电大学 A kind of relational links method of knowledge based collection of illustrative plates
CN110162639A (en) * 2019-04-16 2019-08-23 深圳壹账通智能科技有限公司 Knowledge figure knows the method, apparatus, equipment and storage medium of meaning
CN113268609A (en) * 2021-06-22 2021-08-17 中国平安人寿保险股份有限公司 Dialog content recommendation method, device, equipment and medium based on knowledge graph
CN113468107A (en) * 2021-09-02 2021-10-01 阿里云计算有限公司 Data processing method, device, storage medium and system
CN114399006A (en) * 2022-03-24 2022-04-26 山东省计算中心(国家超级计算济南中心) Multi-source abnormal composition image data fusion method and system based on super-calculation
WO2022116417A1 (en) * 2020-12-03 2022-06-09 平安科技(深圳)有限公司 Triple information extraction method, apparatus, and device, and computer-readable storage medium
CN115344712A (en) * 2022-08-17 2022-11-15 河北工业大学 Carbon standard knowledge graph construction method based on fusion text
WO2023004807A1 (en) * 2021-07-30 2023-02-02 西门子股份公司 Knowledge management system, method and apparatus, electronic device, and storage medium
WO2023093355A1 (en) * 2021-11-25 2023-06-01 支付宝(杭州)信息技术有限公司 Data fusion method and apparatus for distributed graph learning
CN116795973A (en) * 2023-08-16 2023-09-22 腾讯科技(深圳)有限公司 Text processing method and device based on artificial intelligence, electronic equipment and medium
CN116932733A (en) * 2023-04-07 2023-10-24 北京百度网讯科技有限公司 Information recommendation method and related device based on large language model
CN116955857A (en) * 2022-11-16 2023-10-27 腾讯科技(深圳)有限公司 Data processing method, device, medium and electronic equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant