CN114969441A - Knowledge mining engine system based on graph database - Google Patents

Knowledge mining engine system based on graph database

Info

Publication number
CN114969441A
Authority
CN
China
Prior art keywords
graph
layer
database
data
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210303691.7A
Other languages
Chinese (zh)
Inventor
杨波
王冬冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Langxin Data Technology Co ltd
Original Assignee
Langxin Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Langxin Data Technology Co ltd filed Critical Langxin Data Technology Co ltd
Priority to CN202210303691.7A priority Critical patent/CN114969441A/en
Publication of CN114969441A publication Critical patent/CN114969441A/en
Pending legal-status Critical Current

Classifications

    • G06F16/9024 Graphs; Linked lists
    • G06F16/2462 Approximate or statistical queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G06F16/252 Integrating or interfacing between a Database Management System and a front-end application
    • G06F16/256 Integrating or interfacing systems in federated or virtual databases
    • G06F16/26 Visual data mining; Browsing structured data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/29 Geographical information databases
    • G06F16/903 Querying
    • G06F16/904 Browsing; Visualisation therefor
    • G06N5/027 Frames (Knowledge representation; Symbolic representation)
    • G06F2216/03 Data mining
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a knowledge mining engine system based on a graph database, comprising a graph data storage layer, a data persistence layer, a graph policy management layer, a graph data service layer, an interface layer and an application layer connected in sequence from bottom to top. Built on these six layers, the system forms a graph engine platform that integrates computation, query, storage and visualization. It can access data sources from multiple users and multiple industry scenarios, adapt computing power to those sources, and generate knowledge graphs for multiple applications, thereby improving the availability and extensibility of the system, further improving knowledge mining efficiency across scenarios, and saving cost.

Description

Knowledge mining engine system based on graph database
Technical Field
The invention relates to the technical field of big data, in particular to a knowledge mining engine system based on a graph database.
Background
With the advent of the big data era, the ability to organize massive data and to connect multi-source, multi-domain data for associative value mining has become increasingly important.
A knowledge graph is a graph formed of nodes and relations. It is a basic and universal form of expression that can faithfully represent the various complex relations and structures of the objective world, and it can describe each business scenario in the real world more intuitively and efficiently without intermediate conversion and processing; such conversion and processing often complicates the problem or loses much valuable information.
In traditional knowledge mining, reading and cross-table analysis are generally carried out over tables, fields and the like to mine the relations between data. However, for data from different fields or different business scenarios, it is difficult to connect data that appears unrelated and to integrate discrete data. The traditional knowledge mining approach therefore adapts poorly to different industries and has weak mining capability.
Disclosure of Invention
The invention provides a knowledge mining engine system based on a graph database, which is used for overcoming the defect in the prior art that traditional knowledge mining methods based on graph data are limited to a single scenario, and for realizing efficient and highly extensible knowledge mining on the basis of accessing data sources from multiple industries.
The invention provides a knowledge mining engine system based on a graph database, which comprises a graph data storage layer, a data persistence layer, a graph policy management layer, a graph data service layer, an interface layer and an application layer which are sequentially connected from bottom to top;
the graph data storage layer is used for storing various types of databases in the target field;
the data persistence layer is used for managing and adapting each database;
the graph policy management layer is used for generating a business policy so that the graph data service layer can execute the business policy to generate a knowledge graph;
the graph data service layer is used for establishing a target service according to task requirements, integrating user rights management and monitoring management capabilities, executing the business policy and acquiring the knowledge graph;
the interface layer is used for providing a service interface for third-party service access and establishing standardized data interaction;
the application layer is used for realizing the visualization of a graph database and the display of the knowledge graph;
wherein the target field comprises one or more fields.
According to the knowledge mining engine system based on the graph database provided by the invention, the database at least comprises one of an in-memory database, an embedded database, a relational database, a NoSQL database and a distributed database.
According to the knowledge mining engine system based on the graph database provided by the invention, the data persistence layer comprises:
and the characteristic management module is used for acquiring the characteristic information of the database so as to determine the graph data retrieval mode corresponding to the database.
According to the knowledge mining engine system based on the graph database provided by the invention, the graph policy management layer comprises:
the policy management module is used for determining a corresponding business policy according to the feature information of the database;
and the algorithm library is used for storing graph computation algorithms to be called by the graph data service layer to generate the knowledge graph.
According to the knowledge mining engine system based on the graph database provided by the invention, the business policies at least comprise a scenario policy, a primary key policy, an index policy and a cache policy.
According to the knowledge mining engine system based on the graph database provided by the invention, the graph data service layer comprises:
the rights management module is used for configuring the rights corresponding to the target service;
and the graph management module is used for calling the algorithm library based on the rights to generate the knowledge graph.
According to the knowledge mining engine system based on the graph database provided by the invention, the graph management module further comprises:
and the studying and judging unit is used for recording the analysis and studying and judging process of the data.
According to the knowledge mining engine system based on the graph database provided by the invention, the interface layer comprises:
the interface management module is used for managing the service interface;
a specification module for standardizing each of the service interfaces;
wherein the service interface comprises at least one of a graph operation interface, a graph query interface, and a third party interface.
According to the knowledge mining engine system based on the graph database provided by the invention, the interface management module comprises an external interface and an internal interface, so that users of different roles can make calls and shared collaboration is realized.
According to the knowledge mining engine system based on the graph database provided by the invention, the application layer comprises:
the access module is used for receiving data in various formats;
a tool module for storing components associated with the data analysis tool for front-end display;
and the plug-in module is used for accessing the plug-in of the required service application according to the task requirement.
The knowledge mining engine system based on a graph database provided by the invention forms, on the basis of the graph data storage layer, the data persistence layer, the graph policy management layer, the graph data service layer, the interface layer and the application layer, a graph engine platform that integrates computation, query, storage and visualization. It can access data sources from multiple users and multiple industry scenarios and adapt computing power to them, so as to generate knowledge graphs for multiple applications, which improves the availability and extensibility of the system, further improves knowledge mining efficiency across scenarios, and saves cost.
Drawings
In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a first schematic structural diagram of the knowledge mining engine system based on a graph database provided by the present invention;
FIG. 2 is a second schematic diagram of a knowledge mining engine system based on a graph database according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises" and "comprising" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
FIG. 1 is a schematic diagram of a knowledge mining engine system based on a graph database according to the present invention. As shown in FIG. 1, the knowledge mining engine system based on graph database provided by the embodiment of the invention comprises a graph data storage layer 110, a data persistence layer 120, a graph policy management layer 130, a graph data service layer 140, an interface layer 150 and an application layer 160 which are connected in sequence from bottom to top.
Specifically, the knowledge mining engine system based on a graph database includes at least, from bottom to top, a graph data storage layer 110, a data persistence layer 120, a graph policy management layer 130, a graph data service layer 140, an interface layer 150 and an application layer 160.
During use of the system, the user inputs a knowledge mining request according to the actual task requirements, and the request is received by the top application layer 160, which establishes the interaction with the user. Requests are passed downwards layer by layer, each layer making its corresponding response, until the databases stored in the graph data storage layer 110 at the bottom are mined; once a knowledge graph is generated, it is fed back to the application layer 160 for front-end display.
And a graph data storage layer 110 for storing various types of databases in the target domain.
Wherein the target field comprises one or more fields.
The target field refers to a subdivided industry field. The embodiment of the invention does not specifically limit this; the target field and its application scenarios include, but are not limited to, decision systems, recommendation systems and intelligent question answering in fields such as e-commerce, finance, law, medical care and smart home.
Specifically, the graph data storage layer 110 may access data sources in various target fields and store them in the form of databases, so that an upper layer can read the data for transaction processing.
The structure of the data storage layer 110 is not particularly limited in the embodiment of the present invention.
Preferably, the data storage layer 110 may store schema data related to the knowledge graph and Resource Description Framework (RDF) data in a data warehouse, so that the graph database engine may perform adaptation in different degrees according to databases with different characteristics and corresponding application scenarios, so as to implement heterogeneity, extensibility, and high flexibility of the graph database engine.
Illustratively, for general demonstration requirements, an in-memory database or an embedded database may be preferred, offering high performance, simple deployment and strong heterogeneous capability.
For example, professional application scenarios such as knowledge analysis require a simple QA question-answering capability; management with a relational database can be considered, which facilitates maintenance and troubleshooting.
For example, in application scenarios of massive data processing (for example, in the field of intelligent risk control), storage and analysis may be performed on a distributed database to meet the requirements of knowledge reasoning over massive data.
It will be appreciated that the hierarchical storage separates the raw data information in the database from the refined data information, allowing each component to have a different scope and responsibility for easier location and understanding when in use. The embodiment of the present invention does not specifically limit the layering.
Illustratively, the graph data storage layer 110 is divided into at least an original data layer, an ontology layer, a graph data layer, a statistical index layer and a DML layer, for hierarchical storage of data in the data warehouse. Through standard data layering, a complex task is decomposed into multiple steps, each layer handling only a single step, which reduces repeated computation. When a problem occurs in the data, it is not necessary to repair all of the data; repair can start from the problematic step alone. This ensures the efficiency of querying, storing and computing all kinds of data, and provides architectural support and security guarantees for later expansion of data forms and for data security.
And the data persistence layer 120 is used for managing and adapting each database.
Specifically, the data persistence layer 120 may adopt measures such as a unified database driver version, a unified database coding specification and unified database connection configuration management, so as to manage and adapt different types of databases, and may encapsulate the configuration corresponding to each kind of data in the form of function handles, so that an upper layer reads the corresponding data in the correct way when executing business logic.
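As a minimal sketch of this function-handle idea (not part of the patent text), the snippet below registers per-database connector functions in a single registry so that upper layers look up a handle instead of hard-coding drivers; the backend names and the returned configuration fields are assumptions chosen only for illustration.

```python
# Illustrative sketch: per-database configuration exposed as function handles.
# Backend names and configuration fields are assumptions, not from the patent.
from typing import Callable, Dict


def _mysql_connector() -> dict:
    # a real implementation would return a pooled connection object
    return {"driver": "mysql", "charset": "utf8mb4"}


def _hbase_connector() -> dict:
    return {"driver": "hbase", "scan_mode": "rowkey_prefix"}


# unified registry: upper layers look up a handle instead of hard-coding drivers
_CONNECTORS: Dict[str, Callable[[], dict]] = {
    "mysql": _mysql_connector,
    "hbase": _hbase_connector,
}


def get_connector(db_type: str) -> Callable[[], dict]:
    """Return the function handle that opens the requested database type."""
    return _CONNECTORS[db_type]


if __name__ == "__main__":
    handle = get_connector("hbase")
    print(handle())  # {'driver': 'hbase', 'scan_mode': 'rowkey_prefix'}
```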
And the graph policy management layer 130 is used for generating business policies, so that the graph data service layer executes the business policies to generate the knowledge graph.
Specifically, after the graph policy management layer 130 performs basic adaptation for a given database, it carries out a series of logical processing, configures a corresponding business policy for that database, and encapsulates the business policy in the form of a function handle, so that an upper layer uses the correct policy for data processing when executing business logic.
The business policy at least comprises an analysis policy for massive data, a data processing policy and graph computation tools, and is used to provide the actual business logic with a processing mode of reasonable computational cost.
And the graph data service layer 140 is used for establishing a target service according to task requirements, integrating user rights management and monitoring management capabilities, executing the business policy and acquiring the knowledge graph.
It should be noted that the top level of the knowledge mining engine system based on the graph database interacts with the user, and the user inputs a service request to the system based on the task requirements of the actual knowledge mining.
Specifically, the graph data service layer 140 may receive a service request transmitted by an upper layer, create a corresponding target service in response to the service request, and execute a corresponding service policy after acquiring the service policy from the graph policy management layer 130 by calling an API, thereby generating a knowledge graph.
And the interface layer 150 is used for providing a service interface for third-party service access and establishing standardized data interaction.
Specifically, the interface layer 150 decouples the various service interfaces, respectively, and may unify the standard specifications of the various service interfaces so as to quickly interface with the database engine service when applied.
The standard specification related to the service interface includes, but is not limited to, a service request specification, a transport format standard specification, an abnormal data dictionary specification, and the like, which is not specifically limited in this embodiment of the present invention.
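As one possible reading of the transport-format standard specification mentioned above, the sketch below wraps every reply from a service interface in a unified response envelope; the field names (code, message, data) are assumptions for illustration and are not defined by the patent.

```python
# Hypothetical unified response envelope for the interface layer.
import json
from dataclasses import dataclass, asdict
from typing import Any


@dataclass
class ApiResponse:
    code: int = 0          # 0 = success; non-zero values map to the error dictionary
    message: str = "ok"
    data: Any = None

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)


if __name__ == "__main__":
    print(ApiResponse(data={"nodes": 3, "edges": 2}).to_json())
```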
And the application layer 160 is used for realizing the visualization of the graph database and the display of the knowledge graph.
Specifically, the application layer 160 may receive a service request input by the user according to the actual task requirements of knowledge mining, and may synchronize data updates in real time to complete real-time graph refreshing.
The embodiment of the present invention does not specifically limit the data format accessed in the service request.
Optionally, the service request may include a text element, a picture element, or a relationship between elements that the user determines according to task requirements.
Optionally, the service request may include the accessed data, and the data formats of the service request include, but are not limited to, MySQL, PostgreSQL and the like, which is not specifically limited in this embodiment of the present invention.
Based on the graph data storage layer, the data persistence layer, the graph policy management layer, the graph data service layer, the interface layer and the application layer, the embodiment of the invention forms a graph engine platform that integrates computation, query, storage and visualization. It can access data sources from multiple users and multiple industry scenarios and adapt computing power to them, so as to generate knowledge graphs for multiple applications, which improves the availability and extensibility of the system, further improves knowledge mining efficiency across scenarios, and saves cost.
FIG. 2 is a second schematic diagram of a knowledge mining engine system based on a graph database according to the present invention. Based on the content of any of the above embodiments, as shown in FIG. 2, the database includes at least one of an in-memory database 211, an embedded database 212, a relational database 213, a NoSQL database 214, and a distributed database 215.
Specifically, the graph data storage layer 210 may support multiple types of databases in multiple industry scenarios, and the embodiment of the present invention does not specifically limit the included databases and the number thereof.
Optionally, the graph data storage layer 210 may include an in-memory database 211, which is a database management system that relies mainly on memory to store data, minimizing disk access and improving data read and write speed.
Optionally, the graph data storage layer 210 may contain an embedded database 212, which can run embedded in a process without the need for a separate engine. Illustratively, the embedded database 212 includes, but is not limited to, RocksDB, ScyllaDB, and the like.
The in-memory database 211 and the embedded database 212 are convenient to set up rapidly in a demonstration or test environment, reduce dependence on other middleware, and enable rapid deployment of services.
Optionally, the graph data storage layer 210 can contain a relational database 213, which employs a relational model to organize data and stores and associates data in rows and columns. The two-dimensional table structure is very close to the real world, easy to understand and convenient to operate. Illustratively, the relational database 213 includes, but is not limited to, SQL Server, Oracle, MySQL, PostgreSQL, and the like.
Optionally, the graph data storage layer 210 may contain a NoSQL database 214, which refers to a non-relational, distributed data storage system that generally does not guarantee ACID. A NoSQL database 214 stores data as key-value pairs and has no fixed structure: each tuple can have different fields rather than being limited to a fixed schema, which can reduce some time and space overhead. The NoSQL database 214 includes, but is not limited to, non-relational databases such as MongoDB.
The relational database 213 and the NoSQL database 214 are convenient for unified retrieval and maintenance in application scenarios with small volumes of business data, such as popular-science knowledge applications like garbage sorting, while keeping the technology selection consistent with the business application.
Optionally, the graph data storage layer 210 may include a distributed database 215, which integrates projects and frameworks such as HBase, HDFS and Spark, supports both batch processing and stream processing, accelerates processing through parallelism, processes data in an efficient and scalable manner, and provides storage and computation for massive data.
The embodiment of the invention forms the graph data storage layer from the in-memory database, the embedded database, the relational database, the NoSQL database and the distributed database, so a suitable knowledge storage mode can be flexibly selected according to the requirements of different business application scenarios; data formats of multiple dimensions can then be fused, facilitating higher-dimensional and deeper data analysis on the graph.
Based on the contents of any of the above embodiments, the data persistence layer 220 includes: the feature management module 221 is configured to obtain feature information of the database to determine a graph data retrieval method corresponding to the database.
Specifically, if uniform management and adaptation are required in the data persistence layer 220, at least a feature management module 221 is provided, which can obtain feature information of the database to adapt corresponding basic settings according to characteristics of different databases.
The embodiment of the present invention does not specifically limit the basic configuration of the database.
Illustratively, the base settings may include a graph data retrieval approach.
According to the characteristics of different databases, service performance can be greatly improved in certain application scenarios. For example, in knowledge reasoning over massive data, taking the HBase database as an example: its field attributes do not support data types, but prefix-based interval retrieval is supported during graph data retrieval, so the graph data retrieval method corresponding to the HBase database can be configured as prefix interval retrieval over the rowkey, which greatly improves retrieval performance.
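A minimal sketch of such rowkey prefix-interval retrieval is given below, assuming the happybase HBase client and a hypothetical table and rowkey layout (edges keyed by source vertex id) invented purely for illustration.

```python
# Sketch of prefix-interval retrieval over the HBase rowkey (happybase client).
# Host, table name and rowkey layout are hypothetical.
import happybase

connection = happybase.Connection("hbase-host")   # hypothetical host
table = connection.table("graph_edges")           # hypothetical table

# Scan only the rows whose key starts with the source vertex id: the prefix
# defines the interval, so no full-table scan is needed.
for row_key, columns in table.scan(row_prefix=b"vertex:1024|"):
    print(row_key, columns)

connection.close()
```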
It is understood that the data persistence layer 220 may include other modules to adapt other basic settings besides supporting feature management, which is not specifically limited by the embodiment of the present invention.
Optionally, the data persistence layer 220 may include a connection pool management module 222, which may reasonably adjust parameters of a database connection pool according to different service scenarios, where the parameters include information such as an initialized connection pool size, a maximum active number, a maximum connection waiting time, and the like, and performance of database operation may be significantly improved through parameter optimization.
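As a hedged illustration of the pool parameters just listed (initial pool size, maximum active connections, maximum wait time), the snippet below tunes a SQLAlchemy engine; the DSN and the concrete values are assumptions, not settings prescribed by the patent.

```python
# Connection-pool tuning sketch with SQLAlchemy; DSN and values are illustrative.
from sqlalchemy import create_engine, text

engine = create_engine(
    "mysql+pymysql://user:pass@127.0.0.1:3306/graph_meta",  # hypothetical DSN
    pool_size=10,        # connections kept open (initial pool size)
    max_overflow=20,     # extra connections allowed under burst load
    pool_timeout=30,     # seconds to wait for a free connection
    pool_recycle=3600,   # recycle connections to avoid server-side timeouts
)

with engine.connect() as conn:
    print(conn.execute(text("SELECT 1")).scalar())
```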
Optionally, the data persistence layer 220 may include a meta information management module 223, so that a user can clearly see which storage medium the current graph database engine uses, including information such as the medium version, the deployment IP address and the storage space size, as well as index information such as the number of stored nodes, the total number of attributes and the total number of relationships between nodes. The usage of the current knowledge graph can be tracked through this index information, which facilitates monitoring.
Optionally, the data persistence layer 220 may include a transaction management module 224, which may ensure atomicity and serializability in business applications, automatically control the database by multiple transactions, and restore the database to a state that existed before the transaction failure, so as to meet the requirement of mass knowledge storage.
Optionally, the data persistence layer 220 may include a monitoring management module 225, which can monitor the various data operation metrics generated by the business service and supports building a unified monitoring center with projects and frameworks such as Prometheus, Grafana and VictoriaMetrics.
Optionally, the data persistence layer 220 may include a serialization management module 226, which can greatly improve the storage and retrieval performance of graph data; the serialization manner includes, but is not limited to, JDK serialization, JSON serialization and the like.
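The sketch below shows one way such a serialization manager could switch strategies; the binary and JSON strategies mirror the JDK/JSON alternatives named above, mapped here to pickle and json purely for a Python illustration.

```python
# Pluggable serialization manager sketch; strategy names are illustrative.
import json
import pickle

_SERIALIZERS = {
    "binary": (pickle.dumps, pickle.loads),   # compact and fast, not human-readable
    "json": (
        lambda obj: json.dumps(obj).encode("utf-8"),
        lambda raw: json.loads(raw.decode("utf-8")),
    ),
}


def dump(obj, fmt: str = "json") -> bytes:
    return _SERIALIZERS[fmt][0](obj)


def load(raw: bytes, fmt: str = "json"):
    return _SERIALIZERS[fmt][1](raw)


if __name__ == "__main__":
    blob = dump({"node": "n1", "degree": 3}, fmt="binary")
    print(load(blob, fmt="binary"))
```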
The embodiment of the invention performs management and adaptation based on the feature management module according to the feature information of the database, provides both general-purpose feature capabilities and feature capabilities specific to different data sources, and can thereby improve performance.
Based on the content of any of the above embodiments, the graph policy management layer 230 includes: a policy management module, used for determining the corresponding business policy according to the feature information of the database.
Specifically, the graph policy management layer 230 is provided with a policy management module 231, which can configure the non-logical part involved in the business according to the feature information of a given database, so as to determine the corresponding business policy for that non-logical part.
And the algorithm library 232 is used for storing graph computation algorithms, to be called by the graph data service layer to generate the knowledge graph.
Specifically, an algorithm library 232 is further provided in the graph policy management layer 230; the algorithm library 232 may store the relevant graph computation algorithms, providing graph-based machine learning computation capability, realizing knowledge computation and mining over the graph, and helping the user make assisted decisions.
The graph calculation algorithm includes, but is not limited to, PageRank, LPA, clustering coefficients, graph diameter, connected components, subgraph matching, inference calculation, and the like, which is not specifically limited in this embodiment of the present invention.
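As a hedged sketch of the kind of routines such an algorithm library could expose, the snippet below runs PageRank, label propagation (LPA) and connected components with NetworkX; the library choice and the sample graph are assumptions, not components named by the patent.

```python
# Graph-computation sketch using NetworkX (illustrative library choice).
import networkx as nx

G = nx.karate_club_graph()  # small built-in sample graph

ranks = nx.pagerank(G)                                                         # PageRank
communities = list(nx.algorithms.community.label_propagation_communities(G))   # LPA
components = list(nx.connected_components(G))                                  # connected components

print(max(ranks, key=ranks.get), len(communities), len(components))
```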
Based on the cooperation of the policy management module and the algorithm library, the embodiment of the invention obtains complete policy management and configures the required computation tools under the premise of optimal computing power, which saves the computational cost of implementing graph-related systems and further improves knowledge mining efficiency.
Based on the content of any of the above embodiments, the business policy at least includes a scenario policy, a primary key policy, an index policy, and a cache policy.
Specifically, the business policies uniformly managed in the policy management module 231 may be set and extended according to different task requirements. The embodiment of the present invention is not particularly limited thereto.
Optionally, the business policy may include a scenario policy to switch between Online Analytical Processing (OLAP) and Online Transaction Processing (OLTP) according to different task requirements.
OLTP can perform transaction processing on the massive data of conventional relational databases.
OLAP can perform batch and streaming processing operations on the massive data of the data warehouse.
Optionally, the business policy may include a primary key policy: primary keys are generated by default with a Universally Unique Identifier (UUID), and autonomous key policies, distributed ID generation policies and the like are also supported, so that multiple primary key generation policies are available and can be flexibly adjusted for different business scenarios to ensure data consistency.
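A minimal sketch of two such generation policies follows: the default UUID policy and a simplified snowflake-style distributed ID. The bit layout (timestamp, worker id, sequence) is an assumption chosen for illustration, not the scheme of the patent.

```python
# Primary-key policy sketch: UUID by default, plus a simplified distributed ID.
import time
import uuid


def uuid_key() -> str:
    """Default policy: a random UUID as the primary key."""
    return uuid.uuid4().hex


def snowflake_like_key(worker_id: int, sequence: int) -> int:
    """Illustrative distributed ID: 41-bit millisecond timestamp, 10-bit worker id, 12-bit sequence."""
    millis = int(time.time() * 1000)
    return (millis << 22) | ((worker_id & 0x3FF) << 12) | (sequence & 0xFFF)


if __name__ == "__main__":
    print(uuid_key())
    print(snowflake_like_key(worker_id=1, sequence=0))
```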
Optionally, the business policy may include an index policy: indexes are created on nodes, attributes and relationships, which speeds up retrieval for knowledge applications, enables fast multi-degree relationship analysis and supports knowledge reasoning capability.
Optionally, the business policy may include cache policy management: multiple cache expiration policies are available for query results, such as the common First In First Out (FIFO) expiration policy, the Least Recently Used (LRU) expiration policy and the Least Frequently Used (LFU) expiration policy, and the cache policy can be flexibly configured according to different business requirements.
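The snippet below is a minimal sketch of one of these policies, an LRU cache built on OrderedDict; the capacity and the cached query keys are illustrative assumptions.

```python
# LRU cache-expiration sketch for query results; capacity and keys are illustrative.
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int = 128):
        self.capacity = capacity
        self._store: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)           # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:   # evict the least recently used entry
            self._store.popitem(last=False)


cache = LRUCache(capacity=2)
cache.put("q1", "result-1")
cache.put("q2", "result-2")
cache.get("q1")
cache.put("q3", "result-3")                    # evicts "q2"
print(cache.get("q2"), cache.get("q1"))        # None result-1
```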
The embodiment of the invention describes the non-logical configuration of the business in terms of the scenario policy, the primary key policy, the index policy and the cache policy, and can improve the efficiency of online and offline parallel computation.
Based on the content of any of the above embodiments, the graph data service layer 240 includes: and the right management module 241 is configured to configure a right corresponding to the target service.
Specifically, a right management module 241 is disposed in the graph data service layer 240, and may configure corresponding rights for the target service according to the service attribute of the target service. For example, the user right, the service type, and the like are used, which is not specifically limited in this embodiment of the present invention.
Rights management module 241 may include, but is not limited to, cluster management, authentication management, task management, schema management, paging management, version management, and the like.
Cluster management can uniformly manage all machines and computing resources: tasks are issued to the machines, and after a task finishes, its resources are reclaimed so that other tasks can be allocated later. For example, large-volume knowledge mining may require all machines and computing resources to be brought in for processing; when the machines and computing resources are idle, they can be rented to other companies or individuals, yielding a certain economic benefit.
And authentication management can determine the identity of the user, and corresponding authority can be given to the user after authentication.
And task management, which can receive multiple services from multiple users and monitor each state that a task may be in.
Schema management: a schema corresponds one-to-one to a database and is the collection of objects (tables, indexes, views, stored procedures, operators) within that database. Schema management can determine whether a user has the right to use a certain database. If the user has not created any schema, objects are created in the public schema; by default, all database roles (users) have CREATE and USAGE privileges on the public schema.
Paging management can divide a memory into equal small partitions, and then split a process into small parts according to the size of the partitions, so that the memory utilization rate is improved.
Version management is the process of recording, tracking, maintaining and controlling the changes across a series of products or systems produced by locally improving and modifying the same product or system to meet different requirements.
And the graph management module 242 is used for calling the algorithm library based on the authority to generate the knowledge graph.
Specifically, the graph data service layer 240 is provided with a graph management module 242 which, after the rights are defined, can interface with the policy management module 231, obtain the function handles provided by the policy management module 231, execute the corresponding business policy, and call the corresponding graph computation algorithm to search the underlying database and generate the knowledge graph.
Based on the rights management module and the graph management module, the embodiment of the invention can grant corresponding rights according to the attributes of different business scenarios to generate the knowledge graph, which separates business from technology and improves the flexibility of the engine.
Based on the content of any of the above embodiments, the graph management module 242 further includes: a research and judgment unit, used for recording the process of analyzing and judging the data.
Specifically, the graph management module 242 is further provided with a research and judgment unit, which can losslessly store the computation results and the lines of reasoning of the data service; the stored information can be used for report display and subsequent analysis.
Based on the research and judgment unit, the embodiment of the invention can preserve the research and judgment process and ensure that each step of the judgment is traceable.
Based on the contents of any of the above embodiments, the interface layer 250 includes: and an interface management module 251, configured to manage the service interfaces.
The service interface at least comprises one of a graph operation interface, a graph query interface and a third party interface.
Specifically, an interface management module 251 is provided in the interface layer 250, which can decouple the corresponding service interfaces for the different functions registered in the system. The embodiment of the present invention does not specifically limit the number of service interfaces.
Alternatively, the service interface may be a Graph operation interface (Graph API) to which requests may be sent to access and manipulate Graph resources in the system after registering an application and obtaining an authentication token for a user or service.
Alternatively, the service interface may be a Graph query interface (Search API) to which a request may be sent to retrieve Graph resources in the system after registering the application and obtaining the authentication token for the user or service.
Alternatively, the service interface may be a third party interface (Open API) to which requests may be sent to access and manipulate other resources in the system after registering an application and obtaining an authentication token for a user or service.
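As a hedged sketch of how a registered third party might call these interfaces with its authentication token, the snippet below issues one graph-operation request and one graph-query request; the base URL, resource paths and header layout are assumptions invented for illustration, not interfaces defined by the patent.

```python
# Hypothetical calls against the graph operation / graph query interfaces.
import requests

BASE = "https://graph-engine.example.com"       # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <token>"}   # token obtained after registration

# graph operation interface: create a node
requests.post(f"{BASE}/graph/nodes",
              json={"label": "Company", "name": "ACME"},
              headers=HEADERS, timeout=10)

# graph query interface: neighbours of a node within two hops
resp = requests.get(f"{BASE}/search/neighbors",
                    params={"id": "n-1", "depth": 2},
                    headers=HEADERS, timeout=10)
print(resp.status_code)
```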
It is understood that the embodiment of the present invention may open up corresponding interfaces in the system according to other functions.
And a specification module 252, for standardizing each service interface.
Specifically, a specification module 252 is provided in the interface layer 250, and can specify various service interfaces according to a uniform standard. The embodiment of the present invention does not specifically limit the specification items of the specification module 252.
Optionally, the specification items of the specification module 252 may include a data dictionary specification, i.e., unified definitions and descriptions of the data items, data structures, data flows, data stores, processing logic and the like for the data flowing through each service interface.
Optionally, the specification items of the specification module 252 may include an exception standard specification: a custom exception system can be defined for standardized exception management, so that outer-layer callers all work with derived objects inheriting from one specification system, and normalization can be achieved by catching the base class.
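The idea of such a custom exception system can be sketched as follows: every interface-level error derives from one base class, so callers normalize handling by catching only the base; the class names and codes are illustrative assumptions.

```python
# Exception-hierarchy sketch for the interface layer; names and codes are illustrative.
class GraphEngineError(Exception):
    """Base class for all errors raised through the service interfaces."""
    code = 500


class QueryTimeoutError(GraphEngineError):
    code = 504


class PermissionDeniedError(GraphEngineError):
    code = 403


def call_interface():
    raise QueryTimeoutError("graph traversal exceeded the time budget")


try:
    call_interface()
except GraphEngineError as err:                 # one catch covers every derived error
    print(err.code, err)
```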
Optionally, the specification items of the specification module 252 may include specifications of transmission formats, and a set of communication protocols may be customized to perform uniform encapsulation and parsing on the streaming data.
Optionally, the specification items of the specification module 252 may include architectural constraints. For example, the interaction between a client and a server over the network may be built in the Representational State Transfer (REST) style: in the REST API (REST-style network interface) provided by the server, URLs use only nouns to specify resources and never verbs, so as to meet the integration requirements of the system.
Optionally, the specification items of the specification module 252 may include constraints on the inter-process communication mechanism. Illustratively, gRPC, the RPC framework proposed and developed by Google, may be employed; gRPC provides a set of mechanisms for communication between applications, lowers the barrier to use for developers by shielding the network protocol, and lets the remote interface be called just like a local function.
Based on the interface management module and the specification module, the embodiment of the invention can standardize every service interface in the system, ensure transaction consistency across the databases of different business domains during cross-database data operations, consolidate multiple calls into a single operation, and reduce resource waste as much as possible.
Based on the content of any of the above embodiments, the interface management module includes an external interface and an internal interface, so that users of different roles can call them and shared collaboration is realized.
Specifically, an external interface and an internal interface may be provided in the interface management module 251.
Wherein the internal interface may be open to developers or users authenticated to the system to provide interaction with that type of user and the system.
For developers, the expansion of the functional modules of the system, the addition, deletion, modification, check and other operations of the data source can be realized through the internal interface.
For the users authenticated by the system, the functions corresponding to the authorities in the system can be called to perform deep-level and wide-range knowledge mining according to different user authorities through the internal interface.
And an external interface which can be opened for users who are not authenticated by the system temporarily so as to provide interaction between the users and the system.
For users who are not authenticated by the system, simple knowledge mining can be performed by using basic functions provided by the system through an external interface.
The embodiment of the invention can establish the interaction between users with different roles and the system based on the internal interface and the external interface arranged in the interface management module, and can improve the usability and reusability of the system.
Based on the content of any of the above embodiments, the application layer 260 includes: an access module 261, configured to receive data in various formats.
Specifically, the application layer 260 may act as the client end of the software architecture and interact with the lower layers. The application layer 260 is provided with an access module 261, which can receive data input by the user as well as data fed back from the lower layers. The data content is not particularly limited in the embodiments of the present invention.
Optionally, the embodiment of the present invention does not particularly limit the data input by the user, which includes but is not limited to text and images related to the mining topics and descriptions of the logical relationships between mining topics.
Optionally, the data fed back from the lower layers includes, but is not limited to, multi-dimensional data formats such as associated graph data, geospatial data, spatiotemporal data, time series data, report data, and traditional row-column data, which is not specifically limited in this embodiment of the present invention.
A tool module 262, for storing components associated with data analysis tools for front-end display.
Specifically, a tool module 262 is provided in the application layer 260, which can combine various built-in analysis tools to realize multi-angle data exploration.
The analysis tools include, but are not limited to, a mathematical statistics tool, a statistical chart tool, a timeline tool, and a Geographic Information System (GIS), which is not limited in this embodiment of the present invention.
And a plug-in module 263, used for accessing plug-ins of the required business applications according to task requirements.
Specifically, a plug-in module 263 is provided in the application layer 260, which can provide a complete client SDK plug-in and integrate the plug-in capabilities required for platform access, so that business applications can be accessed more conveniently.
With the access module, the tool module and the plug-in module arranged in the application layer, the embodiment of the invention can accept simple instructions from the user and thereby interact with the system, forming an analysis graph platform that is easy to use, interactive, analytical, collaborative and extensible. Graphs can be generated within minutes, saving enterprise and industry users the time and labour of research, coding and graph-function development, and greatly improving resource utilization efficiency.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A knowledge mining engine system based on a graph database is characterized by comprising a graph data storage layer, a data persistence layer, a graph policy management layer, a graph data service layer, an interface layer and an application layer which are sequentially connected from bottom to top;
the graph data storage layer is used for storing various types of databases in the target field;
the data persistence layer is used for managing and adapting each database;
the graph policy management layer is used for generating a business policy so that the graph data service layer can execute the business policy to generate a knowledge graph;
the graph data service layer is used for establishing a target service according to task requirements, integrating user rights management and monitoring management capabilities, executing the business policy and acquiring the knowledge graph;
the interface layer is used for providing a service interface for third-party service access and establishing standardized data interaction;
the application layer is used for realizing the visualization of a graph database and the display of the knowledge graph;
wherein the target field comprises one or more fields.
2. The system of graph-database-based knowledge mining engine of claim 1, wherein the database comprises at least one of an in-memory database, an embedded database, a relational database, a NoSQL database, and a distributed database.
3. The system of graph database-based knowledge mining engine of claim 2, wherein the data persistence layer comprises:
and the characteristic management module is used for acquiring the characteristic information of the database so as to determine the graph data retrieval mode corresponding to the database.
4. The system of graph database-based knowledge mining engine of claim 1, wherein the graph policy management layer comprises:
the policy management module is used for determining a corresponding business policy according to the feature information of the database;
and the algorithm library is used for storing graph computation algorithms to be called by the graph data service layer to generate the knowledge graph.
5. The system of graph database-based knowledge mining engine of claim 4, wherein the business policies include at least a scenario policy, a primary key policy, an index policy, and a cache policy.
6. The system of graph database-based knowledge mining engine of claim 1, wherein the graph data service layer comprises:
the rights management module is used for configuring the rights corresponding to the target service;
and the graph management module is used for calling the algorithm library based on the rights to generate the knowledge graph.
7. The system of graph database-based knowledge mining engine of claim 6, wherein said graph management module further comprises:
and the studying and judging unit is used for recording the analysis and studying and judging process of the data.
8. The system of graph database-based knowledge mining engine of claim 1, wherein said interface layer comprises:
the interface management module is used for managing the service interface;
a specification module for standardizing each of the service interfaces;
wherein the service interface comprises at least one of a graph operation interface, a graph query interface and a third party interface.
9. The knowledge mining engine system based on a graph database as claimed in claim 8, wherein said interface management module comprises an external interface and an internal interface, so that users of different roles can make calls and shared collaboration is realized.
10. The system of graph database-based knowledge mining engine of claim 1, wherein said application layer comprises:
the access module is used for receiving data in various formats;
a tool module for storing components associated with the data analysis tool for front-end display;
and the plug-in module is used for accessing the plug-in of the required service application according to the task requirement.
CN202210303691.7A 2022-03-24 2022-03-24 Knowledge mining engine system based on graph database Pending CN114969441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210303691.7A CN114969441A (en) 2022-03-24 2022-03-24 Knowledge mining engine system based on graph database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210303691.7A CN114969441A (en) 2022-03-24 2022-03-24 Knowledge mining engine system based on graph database

Publications (1)

Publication Number Publication Date
CN114969441A true CN114969441A (en) 2022-08-30

Family

ID=82975998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210303691.7A Pending CN114969441A (en) 2022-03-24 2022-03-24 Knowledge mining engine system based on graph database

Country Status (1)

Country Link
CN (1) CN114969441A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115203488A (en) * 2022-09-15 2022-10-18 国网智能电网研究院有限公司 Graph database management method and device and electronic equipment
CN115203488B (en) * 2022-09-15 2022-12-06 国网智能电网研究院有限公司 Graph database management method and device and electronic equipment
CN116701717A (en) * 2023-08-04 2023-09-05 杭州悦数科技有限公司 Graph database data importing method and system
CN116701717B (en) * 2023-08-04 2023-10-27 杭州悦数科技有限公司 Graph database data importing method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination