CN117194676A - Method, apparatus, electronic device and readable medium for generating knowledge graph - Google Patents

Publication number
CN117194676A
Authority
CN
China
Prior art keywords: data, knowledge, target, graph, trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311069875.2A
Other languages
Chinese (zh)
Inventor
庄文斌
尹嘉函
尹慧敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoneng Ningxia Yuanyanghu First Power Generation Co ltd
Original Assignee
Guoneng Ningxia Yuanyanghu First Power Generation Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoneng Ningxia Yuanyanghu First Power Generation Co ltd
Priority to CN202311069875.2A
Publication of CN117194676A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure disclose a method, an apparatus, an electronic device, and a computer-readable medium for generating a knowledge graph. One embodiment of the method comprises: acquiring target data; processing the target data to obtain processed data; and generating a knowledge graph according to the processed data. This embodiment enables knowledge association and reasoning: by associating the data with the knowledge graph, hidden relationships and latent patterns in the data can be discovered, and the platform can extract deeper knowledge and insight from the data through knowledge reasoning. It also improves data quality and consistency, because the knowledge graph can serve as a tool for standardizing and verifying data. By establishing correspondences between the data and the knowledge graph, the data can be better managed and maintained, and data errors and redundancy are reduced.

Description

Method, apparatus, electronic device and readable medium for generating knowledge graph
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method, an apparatus, an electronic device, and a computer-readable medium for generating a knowledge-graph.
Background
Conventional information platforms focus mainly on structured data, such as business transaction data, customer information, product information, etc., which are typically stored in tabular form in relational databases. Most information management platforms have difficulty in processing unstructured and semi-structured data because the processing of such data requires significant manpower for preprocessing and cleaning.
Such manual work is not only inefficient, but also prone to errors. For data analysis, the existing information management platform generally can only perform static analysis, but cannot perform dynamic real-time analysis on data, which severely limits timeliness and application value of the data. In addition, most of the existing platforms do not provide data interfaces outwards, which brings difficulties to data sharing and integration between the platforms.
Disclosure of Invention
The disclosure is in part intended to introduce concepts in a simplified form that are further described below in the detailed description. The disclosure is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a method, an apparatus, an electronic device, and a computer-readable medium for generating a knowledge graph, to solve the technical problems mentioned in the Background section above.
In a first aspect, some embodiments of the present disclosure provide a method for generating a knowledge graph, the method comprising: acquiring target data; processing the target data to obtain processed data; and generating a knowledge graph according to the processed data.
In a second aspect, some embodiments of the present disclosure provide an apparatus for generating a knowledge graph, the apparatus comprising: an acquisition unit configured to acquire target data; a processing unit configured to process the target data to obtain processed data; and a generation unit configured to generate a knowledge graph according to the processed data.
In a third aspect, an embodiment of the present application provides an electronic device, where the electronic device includes: one or more processors; and a storage means for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
One of the above embodiments of the present disclosure has the following advantageous effects: target data is acquired, the target data is processed to obtain processed data, and a knowledge graph is generated according to the processed data.
Knowledge association and reasoning are thereby realized: by associating the data with the knowledge graph, hidden relationships and latent patterns in the data can be discovered, and the platform can extract deeper knowledge and insight from the data through knowledge reasoning.
Data quality and consistency are improved: the knowledge graph can serve as a tool for standardizing and verifying data, which helps ensure the quality and consistency of the data. By establishing correspondences between the data and the knowledge graph, the data can be better managed and maintained, and data errors and redundancy are reduced.
Intelligent search and recommendation are enabled: the semantic representation of the knowledge graph lets the platform provide more intelligent search and recommendation functions. Through the associated knowledge graph, users can obtain information related to their queries, which improves the efficiency of discovering and using data.
Combining the knowledge graph with an industrial information management platform raises the platform's level of intelligence and its data management capability, further meeting enterprises' needs for diversified data management and intelligent data analysis.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a schematic illustration of one application scenario of a method for generating a knowledge-graph, in accordance with some embodiments of the present disclosure;
FIG. 2 is a flow chart of some embodiments of a method for generating a knowledge-graph according to the present disclosure;
FIG. 3 is a schematic structural diagram of some embodiments of an apparatus for generating a knowledge-graph, in accordance with the present disclosure;
FIG. 4 is a schematic structural diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings. Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "one" and "a plurality" in this disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 is a schematic diagram of one application scenario of a method for generating a knowledge-graph, according to some embodiments of the present disclosure.
As shown in fig. 1, a server 101 may acquire target data 102, process the target data 102 to obtain processed data 103, and generate a knowledge graph 104 according to the processed data 103.
It is to be understood that the method for generating a knowledge graph may be performed by a terminal device, or may be performed by the server 101, and the main body of the method may include a device formed by integrating the terminal device and the server 101 through a network, or may be performed by various software programs. The terminal device may be, among other things, various electronic devices with information processing capabilities including, but not limited to, smartphones, tablet computers, electronic book readers, laptop and desktop computers, and the like. The execution body may be embodied as a server 101, software, or the like. When the execution subject is software, the execution subject can be installed in the electronic device enumerated above. It may be implemented as a plurality of software or software modules, for example, for providing distributed services, or as a single software or software module. The present application is not particularly limited herein.
It should be understood that the number of servers in fig. 1 is merely illustrative. There may be any number of servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of some embodiments of a method for generating a knowledge-graph according to the present disclosure is shown. The method for generating the knowledge graph comprises the following steps:
Step 201: acquire target data.
In some embodiments, the execution body of the method for generating a knowledge graph (e.g., the server shown in fig. 1) may acquire the target data through a wired connection or a wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G connections, Wi-Fi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra-wideband) connections, and other now known or later developed wireless connection means.
Specifically, data acquisition is the first step of the data management system. It involves collecting data from different sources, which may be sensors, devices, databases, files, etc. Data collection may be performed in real time or periodically, depending on the data source and the actual requirements.
Step 202: process the target data to obtain processed data.
In some embodiments, based on the target data obtained in step 201, the executing entity (e.g., the server shown in fig. 1) may process the target data to obtain processed data.
In some optional implementations of some embodiments, the executing body may perform data cleansing on the target data to obtain cleansing data; converting the cleaning data into a target format to obtain target format data; and normalizing the target format data to obtain normalized data.
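The three processing stages above (cleaning, conversion to a target format, normalization) can be sketched as a small pipeline. This is only an illustrative sketch: the record fields (`device`, `temp_f`, `ts`), the choice of Celsius and ISO 8601 as the target format, and the upper-case naming convention are all assumptions made for the example and are not specified by the disclosure.

```python
from datetime import datetime, timezone

# Hypothetical raw records; the field names are assumptions for illustration.
RAW = [
    {"device": " Pump-01 ", "temp_f": "212", "ts": "2023-08-01 12:00:00"},
    {"device": " Pump-01 ", "temp_f": "212", "ts": "2023-08-01 12:00:00"},  # duplicate
    {"device": "pump-02", "temp_f": None, "ts": "2023-08-01 12:05:00"},      # missing value
    {"device": "PUMP-03", "temp_f": "32", "ts": "2023-08-01 12:10:00"},
]


def clean(records):
    """Data cleaning: drop records with missing values and exact duplicates."""
    seen, out = set(), []
    for rec in records:
        if any(value is None for value in rec.values()):
            continue
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        out.append(rec)
    return out


def convert(records):
    """Format conversion: Fahrenheit strings to Celsius floats, timestamps to ISO 8601."""
    out = []
    for rec in records:
        ts = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
        out.append({
            "device": rec["device"],
            "temp_c": round((float(rec["temp_f"]) - 32) * 5 / 9, 2),
            "ts": ts.isoformat(),
        })
    return out


def normalize(records):
    """Normalization: give device identifiers one consistent spelling."""
    return [{**rec, "device": rec["device"].strip().upper()} for rec in records]


standardized = normalize(convert(clean(RAW)))
```

Each stage takes and returns a plain list of dictionaries, so stages can be reordered or replaced independently; a real deployment would likely impute or flag missing values rather than simply dropping them.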
Intelligent data analysis: intelligent data analysis may draw on a variety of techniques, including machine learning, natural language processing, and image processing. It may be implemented with different algorithms and models, such as decision trees, neural networks, and support vector machines. What these alternatives have in common is that they extract patterns, associations, and insight from data through data analysis techniques.
High-efficiency production data: the ability to produce data efficiently may be achieved in different ways, such as parallel computing, distributed computing, and stream computing. Different frameworks and tools, such as Hadoop, Spark, and Flink, may be used to process large-scale data and to implement real-time data processing. What these alternatives have in common is that they provide high-performance data processing and analysis.
Step 203: generate a knowledge graph according to the processed data.
In some embodiments, the execution body may generate the knowledge graph according to the processed data.
Specifically, knowledge collected from various sources is structured and stored in a knowledge graph. The knowledge graph adopts a graph database or similar technology, so that the relation and the semantics among the knowledge can be effectively represented.
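As a rough illustration of how relations and semantics can be represented in graph form, the sketch below stores knowledge as (subject, relation, object) triples in memory. The class name `KnowledgeGraph`, its methods, and the example entities are hypothetical; the disclosure only states that a graph database or similar technology is used, and a production system would use such a database rather than this toy structure.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Minimal in-memory triple store (an illustrative stand-in for a graph database)."""

    def __init__(self):
        self.triples = set()            # all (subject, relation, object) facts
        self._out = defaultdict(set)    # subject -> {(relation, object), ...}

    def add(self, subject, relation, obj):
        """Record one fact; adding the same fact twice has no further effect."""
        self.triples.add((subject, relation, obj))
        self._out[subject].add((relation, obj))

    def neighbors(self, subject, relation=None):
        """Return objects linked from `subject`, optionally filtered by relation."""
        edges = self._out.get(subject, ())
        return sorted(o for r, o in edges if relation is None or r == relation)


kg = KnowledgeGraph()
kg.add("PUMP-01", "located_in", "Unit-1")
kg.add("PUMP-01", "monitored_by", "Sensor-7")
kg.add("PUMP-01", "located_in", "Unit-1")  # repeated fact is ignored
```

Keeping facts as triples makes the relation itself a first-class piece of data, which is what allows relation-specific queries such as `kg.neighbors("PUMP-01", "located_in")`.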
Data cleaning: the collected data often contains noise, errors, and missing values. Data cleaning is a critical step for ensuring data quality. It includes removing duplicate data, handling missing values, and correcting erroneous data, so that subsequent analysis and processing are accurate and reliable.
Data conversion: in a data management system, data conversion turns raw data into a format suitable for analysis and application. This may involve data format conversion, unit conversion, timestamp processing, etc., to ensure that the data is compatible with other components of the platform.
Data normalization: industrial information management platforms often involve multiple data sources and data from multiple departments. Data normalization ensures that data has a consistent structure and format throughout the platform, allowing for easier integration and analysis.
Data storage: the data management system requires an efficient data storage mechanism for persisting the cleaned and converted data. This may involve database systems, data warehouses, cloud storage, etc.
Data security and rights management: in a data management system, the security of data is critical. This includes ensuring that data is not subject to unauthorized access, protecting sensitive data, and assigning appropriate access rights to different users and roles.
Data quality monitoring: a data quality monitoring mechanism is established to periodically check the accuracy, integrity, and consistency of the data. Data quality problems need to be found and resolved in a timely manner.
Data backup and recovery: a data backup and recovery strategy is established to ensure that data is not lost in the event of unexpected faults or disasters and can be recovered quickly.
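Of the items above, the data quality monitoring check is the easiest to make concrete. The sketch below computes two simple indicators over a batch of records, completeness of required fields and the number of exact duplicates. The function name `quality_report`, the chosen indicators, and the sample records are assumptions for illustration; the disclosure does not define specific metrics.

```python
def quality_report(records, required_fields):
    """Return simple quality indicators for a batch of dictionary records."""
    total = len(records)
    # A record is complete when every required field is present and non-empty.
    complete = sum(
        all(rec.get(field) not in (None, "") for field in required_fields)
        for rec in records
    )
    # Exact duplicates are detected by comparing a canonical form of each record.
    unique = len({tuple(sorted((k, str(v)) for k, v in rec.items())) for rec in records})
    return {
        "total": total,
        "completeness": complete / total if total else 1.0,
        "duplicate_count": total - unique,
    }


# Hypothetical batch: one duplicate and one record with a missing value.
batch = [
    {"id": 1, "value": 2.5},
    {"id": 1, "value": 2.5},
    {"id": 2, "value": None},
    {"id": 3, "value": 4.0},
]
report = quality_report(batch, required_fields=["id", "value"])
```

A periodic job could run such a check and raise an alert when, for example, `completeness` drops below an agreed threshold.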
In some optional implementations of some embodiments, the execution body may recognize the standardized data using a pre-trained recognition model to obtain a recognition result, and, according to the recognition result, associate the standardized data with the data in the target platform to obtain the knowledge graph.
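A minimal sketch of this recognize-then-associate step follows. Everything concrete in it is an assumption made for illustration: the rule-based `recognize` function merely stands in for a pre-trained model, and the `PLATFORM` dictionary, the record fields, and the relation names are hypothetical.

```python
# Hypothetical platform data, keyed by device identifier.
PLATFORM = {
    "PUMP-01": {"location": "Unit-1"},
    "VALVE-07": {"location": "Unit-2"},
}


def recognize(record):
    """Stand-in for a pre-trained recognition model: infer an entity type from the name."""
    prefix = record["device"].split("-")[0]
    return {"PUMP": "Pump", "VALVE": "Valve"}.get(prefix, "Unknown")


def associate(records, platform):
    """Link each standardized record to its recognized type and to matching platform data."""
    triples = set()
    for rec in records:
        entity = rec["device"]
        triples.add((entity, "is_a", recognize(rec)))
        triples.add((entity, "has_temp_c", rec["temp_c"]))
        match = platform.get(entity)
        if match is not None:
            triples.add((entity, "located_in", match["location"]))
    return triples


graph_triples = associate(
    [{"device": "PUMP-01", "temp_c": 100.0}, {"device": "FAN-02", "temp_c": 40.0}],
    PLATFORM,
)
```

Records without a platform match (here `FAN-02`) still enter the graph with their recognized type, so they remain available for later association.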
Specifically, natural language processing and machine learning algorithms may be used to extract knowledge from text, documents, or other unstructured data and to draw inferences from existing knowledge, thereby continuously enriching the content of the knowledge graph.
Natural language processing techniques also allow users to query in natural language rather than only by keyword search. Such intelligent retrieval better understands the user's intent and improves query accuracy and efficiency.
The knowledge graph may be implemented with different technologies and methods, such as semantic networks, ontologies, and graph databases. What these alternatives have in common is that the data is represented and organized as a graph structure, and that associations and inferences are made using graph algorithms and query languages.
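One simple form of inference over a graph structure is deriving implied facts from a transitive relation. The sketch below repeatedly joins edges of one relation until no new fact appears. The relation name `part_of` and the example entities are hypothetical, and the disclosure does not say which inference method is used; this is just one common technique.

```python
def infer_transitive(triples, relation):
    """Return the input facts plus all facts implied by transitivity of `relation`."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current edges so the set can grow while we iterate.
        edges = [(s, o) for s, r, o in inferred if r == relation]
        for a, b in edges:
            for c, d in edges:
                if b == c and (a, relation, d) not in inferred:
                    inferred.add((a, relation, d))
                    changed = True
    return inferred


facts = {
    ("Sensor-1", "part_of", "Pump-01"),
    ("Pump-01", "part_of", "Unit-1"),
    ("Unit-1", "part_of", "Plant"),
    ("Pump-01", "made_by", "VendorA"),  # other relations are left untouched
}
closure = infer_transitive(facts, "part_of")
```

The nested loop is quadratic in the number of edges per pass, which is fine for a sketch; graph databases typically offer variable-length path queries for the same purpose.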
The knowledge graph serves as a knowledge association system for representing and organizing structured and semi-structured knowledge. Entities, attributes, and relationships in the knowledge graph may be associated with data in the platform.
For data management and storage, different database systems or storage techniques may be used. For example, a relational database, a document database, a graph database, or a distributed file system may be selected for storing and managing data. What these alternatives have in common is that they provide persistent storage, efficient querying, and data reliability.
In some alternative implementations of some embodiments, the recognition model is trained according to the following steps: acquiring a training sample set, where the training sample set comprises sample standardized data and sample recognition results corresponding to the sample standardized data; inputting the sample standardized data into a model to be trained to obtain a recognition result; comparing the recognition result with the sample recognition result to obtain a comparison result; determining, according to the comparison result, whether training of the model to be trained is complete; and in response to determining that training of the model to be trained is complete, determining the model to be trained as the recognition model.
Here, the recognition model characterizes the correspondence between standardized data and recognition results. As one example, the recognition model may be a correspondence table compiled by researchers from a plurality of sample standardized data and their corresponding sample recognition results. As another example, the recognition model may be a neural network model.
In some optional implementations of some embodiments, the execution body may adjust relevant parameters in the model to be trained in response to determining that the model to be trained has not completed training.
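The training procedure above (predict, compare with the sample result, decide whether training is complete, otherwise adjust parameters) can be sketched with a very small model. The perceptron used here, the two-feature samples, the "zero mismatches in one pass" completion criterion, and the learning rate are all assumptions chosen to keep the example short; the disclosure does not fix the model type or the stopping rule.

```python
# Hypothetical training set: (sample standardized data, sample recognition result).
SAMPLES = [
    ((0.0, 0.0), 0),
    ((0.0, 1.0), 0),
    ((1.0, 0.0), 0),
    ((1.0, 1.0), 1),
]


def predict(params, x):
    """Recognition result of the model for input x (1 if the weighted sum is positive)."""
    weights, bias = params
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train(samples, max_epochs=100, lr=1.0):
    """Train until one full pass produces no mismatch, or until max_epochs is reached."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(max_epochs):
        mismatches = 0
        for x, expected in samples:
            # Compare the recognition result with the sample recognition result.
            error = expected - predict((weights, bias), x)
            if error != 0:
                mismatches += 1
                # Training not complete: adjust the relevant parameters.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mismatches == 0:
            return (weights, bias), True   # training complete
    return (weights, bias), False          # gave up before converging


model, training_complete = train(SAMPLES)
```

In practice the completion check would more likely be a loss or accuracy threshold on held-out data rather than a perfect pass over the training set.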
In some optional implementations of some embodiments, the executing body may perform statistical analysis on the data in the knowledge graph to obtain a statistical analysis result; displaying a first display interface, wherein the first display interface comprises the knowledge graph and the statistical analysis result.
Specifically, statistical analysis and visual display are carried out on the data in the knowledge graph. This helps to understand knowledge distribution, relationship density, etc. information in the knowledge graph and to mine valuable insight therefrom. The behavior of the user in the knowledge query process is analyzed, including query frequency, query habit and the like. Such information is useful for improving knowledge queries and personalized recommendations.
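As a rough sketch of such a statistical analysis, the function below summarizes a set of triples by entity count, relation count, a simple density figure, and the distribution of relation types. The function name, the choice of metrics, and the example triples are assumptions for illustration; density is computed here as relations divided by the number of ordered entity pairs, which is only one possible definition.

```python
from collections import Counter


def graph_statistics(triples):
    """Summarize a collection of (subject, relation, object) triples."""
    triples = list(triples)
    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}
    n, m = len(entities), len(triples)
    return {
        "entities": n,
        "relations": m,
        # Relations per ordered pair of distinct entities; 0.0 for trivial graphs.
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "relation_types": dict(Counter(r for _, r, _ in triples)),
    }


stats = graph_statistics([
    ("PUMP-01", "feeds", "TANK-02"),
    ("TANK-02", "feeds", "BOILER-01"),
    ("PUMP-01", "located_in", "Unit-1"),
])
```

The resulting dictionary could be rendered next to the graph itself on the display interface described above.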
Through the measures above, the industrial information construction management platform can better process unstructured and semi-structured data, provide dynamic real-time data analysis, expose data interfaces and enable data sharing, strengthen data analysis by adopting advanced data storage and processing technology, and ensure data security and privacy protection. These improvements substantially increase the platform's functionality and application value, meet users' needs for efficient, accurate, real-time data management and analysis, and promote development in the field of industrial information management.
The application provides an industrial information construction management platform with the following characteristics.
Management of various data structures: the scheme can effectively manage and utilize data of various structures, including structured, unstructured, and semi-structured data. This enables enterprises to collect, store, and analyze various types of data more fully, obtaining more information and insight.
Intelligent data analysis: through a built-in intelligent analysis engine, the scheme can intelligently analyze and mine various data. The analysis techniques, including machine learning, natural language processing, and image recognition, can provide deep insight and valuable information. Enterprises may discover latent patterns in the data, predict trends, and make more accurate decisions.
High-efficiency production data: the scheme has high-performance data processing capability and can process large-scale data and real-time data. Enterprises can process data rapidly and obtain insight and analysis results in real time, so that they can respond to changes promptly, make quick and accurate decisions, and improve production efficiency and competitiveness.
Data management and consistency: by applying knowledge graph technology, the scheme can achieve standardization, verification, and consistency of data. As a semantically associated graph structure, the knowledge graph helps enterprises establish associations between data and knowledge and improves the quality and consistency of the data. This helps reduce data errors and redundancy and improves data management efficiency and data quality.
With further reference to fig. 3, as an implementation of the method shown in the above figures, the present disclosure provides some embodiments of an apparatus for generating a knowledge-graph, which apparatus embodiments correspond to those method embodiments shown in fig. 2, and the apparatus is particularly applicable in various electronic devices.
As shown in fig. 3, an apparatus 300 for generating a knowledge-graph of some embodiments includes: an acquisition unit 301, a processing unit 302, and a generation unit 303. Wherein the acquisition unit 301 is configured to acquire target data; the processing unit 302 is configured to process the target data to obtain processed data; the generating unit 303 is configured to generate a knowledge-graph from the above-described processing data.
In some alternative implementations of some embodiments, the processing unit 302 is further configured to: performing data cleaning on the target data to obtain cleaning data; converting the cleaning data into a target format to obtain target format data; and normalizing the target format data to obtain normalized data.
In some alternative implementations of some embodiments, the processing unit 302 is further configured to: utilizing a pre-trained recognition model to recognize the standardized data to obtain a recognition result; and according to the identification result, carrying out data association on the standardized data and the data in the target platform to obtain a knowledge graph.
In some alternative implementations of some embodiments, the recognition model is trained according to the following steps: acquiring a training sample set, wherein the training sample set comprises sample standardized data and sample identification results corresponding to the sample standardized data; inputting the sample standardized data into a model to be trained to obtain a recognition result; comparing the identification result with the sample identification result to obtain a comparison result; determining whether the model to be trained is trained according to the comparison result; in response to determining that training of the model to be trained is complete, the model to be trained is determined to be an identification model.
In some optional implementations of some embodiments, the apparatus further includes an adjustment unit configured to: and adjusting relevant parameters in the model to be trained in response to determining that the model to be trained does not complete training.
In some optional implementations of some embodiments, the apparatus further includes a display unit configured to: carrying out statistical analysis on the data in the knowledge graph to obtain a statistical analysis result; displaying a first display interface, wherein the first display interface comprises the knowledge graph and the statistical analysis result.
It will be appreciated that the elements described in the apparatus 300 correspond to the various steps in the method described with reference to fig. 2. Thus, the operations, features and resulting benefits described above with respect to the method are equally applicable to the apparatus 300 and the units contained therein, and are not described in detail herein.
Referring now to fig. 4, a schematic diagram of an electronic device (e.g., server in fig. 1) 400 suitable for use in implementing some embodiments of the present disclosure is shown. The electronic device shown in fig. 4 is merely an example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 4, the electronic device 400 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 401, which may perform various suitable actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage device 408 into a Random Access Memory (RAM) 403. The RAM 403 also stores various programs and data necessary for the operation of the electronic device 400. The processing device 401, the ROM 402, and the RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
In general, the following devices may be connected to the I/O interface 405: an input device 406 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; a storage device 408 including, for example, magnetic tape, hard disk, etc.; and a communication device 409. The communication device 409 may allow the electronic device 400 to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 shows an electronic device 400 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 4 may represent one device or a plurality of devices as needed.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network via the communication device 409, or installed from the storage device 408 or from the ROM 402. The above-described functions defined in the methods of some embodiments of the present disclosure are performed when the computer program is executed by the processing device 401.
It should be noted that, in some embodiments of the present disclosure, the computer readable medium may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read Only Memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may comprise a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device, or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire target data; process the target data to obtain processed data; and generate a knowledge graph according to the processed data.
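The three operations carried by the programs can be pictured as a small pipeline. The following is a minimal Python sketch under stated assumptions, not the disclosed implementation: the record layout (dicts with `device`, `relation`, and `value` keys), the sample values, and the function names `acquire_target_data`, `process_target_data`, and `generate_knowledge_graph` are all hypothetical, and the graph is represented simply as a set of (subject, predicate, object) triples.

```python
from typing import Iterable

Record = dict[str, str]
Triple = tuple[str, str, str]


def acquire_target_data() -> list[Record]:
    """Hypothetical stand-in for acquisition; a real system might read a platform export."""
    return [
        {"device": " Pump-01 ", "relation": "located_in", "value": "Unit A"},
        {"device": "Pump-01", "relation": "located_in", "value": "Unit A"},
        {"device": "Valve-07", "relation": "feeds", "value": "Pump-01"},
    ]


def process_target_data(records: Iterable[Record]) -> list[Record]:
    """Trim whitespace and drop exact duplicates, keeping first-seen order."""
    seen: set[Triple] = set()
    processed: list[Record] = []
    for record in records:
        cleaned = {key: value.strip() for key, value in record.items()}
        key = (cleaned["device"], cleaned["relation"], cleaned["value"])
        if key not in seen:
            seen.add(key)
            processed.append(cleaned)
    return processed


def generate_knowledge_graph(records: Iterable[Record]) -> set[Triple]:
    """Turn each processed record into one (subject, predicate, object) triple."""
    return {(r["device"], r["relation"], r["value"]) for r in records}


graph = generate_knowledge_graph(process_target_data(acquire_target_data()))
```

Keeping each stage as a separate function mirrors the acquire, process, and generate steps, so any one stage can be swapped out without touching the others.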
Computer program code for carrying out operations for some embodiments of the present disclosure may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including an acquisition unit, a processing unit, and a generation unit. In some cases the names of these units do not limit the units themselves; for example, the acquisition unit may also be described as "a unit configured to acquire target data".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the application in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, but also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the application, for example, technical solutions formed by mutually substituting the above features with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims (10)

1. A method for generating a knowledge graph, comprising:
acquiring target data;
processing the target data to obtain processed data; and
generating a knowledge graph according to the processed data.
2. The method of claim 1, wherein the processing the target data to obtain processed data comprises:
performing data cleaning on the target data to obtain cleaned data;
converting the cleaned data into a target format to obtain target format data; and
standardizing the target format data to obtain standardized data.
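The cleaning, format conversion, and standardization steps of claim 2 can be sketched as follows. This is an illustrative Python sketch only: the patent does not specify the cleaning rules, the target format, or the standardization scheme, so the choices here (dropping rows with missing values and duplicates, converting tuples to dicts, upper-casing names and expressing power in kW) and all names and sample values are assumptions.

```python
def clean(rows):
    """Data cleaning (assumed rules): trim fields, drop rows with missing values or duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        fields = tuple(field.strip() for field in row)
        if any(field == "" for field in fields) or fields in seen:
            continue
        seen.add(fields)
        cleaned.append(fields)
    return cleaned


def to_target_format(rows, columns):
    """Format conversion (assumed target format): positional tuples become keyed dicts."""
    return [dict(zip(columns, row)) for row in rows]


def standardize(records):
    """Standardization (assumed scheme): upper-case names, power as a float in kW."""
    result = []
    for record in records:
        value, unit = record["power"].split()
        factor = 1000.0 if unit.upper() == "MW" else 1.0
        result.append({"name": record["name"].upper(), "power_kw": float(value) * factor})
    return result


raw = [
    ("boiler-1 ", "330 MW"),
    ("boiler-1", "330 MW"),  # duplicate once whitespace is trimmed
    ("fan-2", ""),           # missing value, dropped by cleaning
    ("Fan-3", "450 kW"),
]
standardized = standardize(to_target_format(clean(raw), ("name", "power")))
```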
3. The method of claim 2, wherein the generating a knowledge graph according to the processed data comprises:
identifying the standardized data by utilizing a pre-trained identification model to obtain an identification result; and
associating, according to the identification result, the standardized data with data in a target platform to obtain the knowledge graph.
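One way to picture the identification and association steps of claim 3 is shown below. This is a hedged Python sketch: the prefix lookup in `identify` merely stands in for the pre-trained identification model, and the `platform_assets` mapping, the `is_a` and `belongs_to` relation names, and the sample records are hypothetical.

```python
Triple = tuple[str, str, str]

# Hypothetical prefix rules standing in for a pre-trained identification model.
TYPE_RULES = {"BOILER": "Boiler", "FAN": "Fan"}


def identify(record: dict) -> str:
    """Return an entity type for a standardized record (placeholder for the model)."""
    prefix = record["name"].split("-")[0]
    return TYPE_RULES.get(prefix, "Unknown")


def associate(standardized: list[dict], platform_assets: dict[str, dict]) -> set[Triple]:
    """Link each identified record to matching platform data, emitting graph triples."""
    triples: set[Triple] = set()
    for record in standardized:
        name = record["name"]
        triples.add((name, "is_a", identify(record)))
        asset = platform_assets.get(name)
        if asset is not None:  # only associate when the platform knows this entity
            triples.add((name, "belongs_to", asset["unit"]))
    return triples


standardized = [{"name": "BOILER-1"}, {"name": "FAN-3"}, {"name": "PUMP-9"}]
platform_assets = {"BOILER-1": {"unit": "Unit 1"}, "FAN-3": {"unit": "Unit 1"}}
graph = associate(standardized, platform_assets)
```

Records the platform does not know about (here `PUMP-9`) still receive a type triple but no association, which keeps unmatched data visible in the graph rather than silently dropping it.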
4. The method of claim 3, wherein the identification model is trained according to the following steps:
acquiring a training sample set, wherein the training sample set comprises sample standardized data and sample identification results corresponding to the sample standardized data;
inputting the sample standardized data into a model to be trained to obtain an identification result;
comparing the identification result with the sample identification result to obtain a comparison result;
determining, according to the comparison result, whether the model to be trained has completed training; and
in response to determining that the model to be trained has completed training, determining the model to be trained as the identification model.
5. The method of claim 4, wherein the method further comprises:
adjusting relevant parameters of the model to be trained in response to determining that the model to be trained has not completed training.
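The training procedure of claims 4 and 5 follows a familiar loop: run the samples through the model, compare against the labels, stop if training is complete, otherwise adjust parameters and repeat. The Python sketch below illustrates that loop under explicit assumptions: the patent does not name a model type, completion criterion, or update rule, so the one-feature threshold model, the accuracy target, and the perceptron-style update used here are substitutes chosen for brevity, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThresholdModel:
    """Toy 'model to be trained': outputs 1 when weight * x + bias > 0, else 0."""
    weight: float = 0.0
    bias: float = 0.0

    def identify(self, x: float) -> int:
        return 1 if self.weight * x + self.bias > 0 else 0


def train(samples, model, target_accuracy=1.0, learning_rate=1.0, max_rounds=100):
    """Return (trained model, number of parameter adjustments performed)."""
    for round_index in range(max_rounds):
        # Input the sample standardized data into the model to be trained.
        results = [model.identify(x) for x, _ in samples]
        # Compare identification results with the sample identification results.
        errors = [(x, label - result) for (x, label), result in zip(samples, results)]
        accuracy = sum(1 for _, error in errors if error == 0) / len(samples)
        # Determine from the comparison result whether training is complete.
        if accuracy >= target_accuracy:
            return model, round_index
        # Not complete: adjust parameters (perceptron-style update) and try again.
        for x, error in errors:
            model.weight += learning_rate * error * x
            model.bias += learning_rate * error
    raise RuntimeError("training did not complete within max_rounds")


samples = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
identification_model, adjustments = train(samples, ThresholdModel())
```

The `max_rounds` guard is an addition not found in the claims; it simply prevents an endless loop when the samples cannot be separated by this toy model.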
6. The method of claim 1, wherein the method further comprises:
performing statistical analysis on the data in the knowledge graph to obtain a statistical analysis result; and
displaying a first display interface, wherein the first display interface comprises the knowledge graph and the statistical analysis result.
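The statistical analysis and display steps of claim 6 might look like the following Python sketch. The specific statistics (entity count, triples per relation, most connected entity) and the plain-text rendering are assumptions, since the patent does not say which statistics are computed or how the interface is drawn; the sample triples are hypothetical.

```python
from collections import Counter

triples = [
    ("BOILER-1", "is_a", "Boiler"),
    ("FAN-3", "is_a", "Fan"),
    ("FAN-3", "feeds", "BOILER-1"),
    ("FAN-3", "located_in", "Unit A"),
]


def analyze(triples):
    """Compute simple graph statistics: entity count, relation counts, busiest entity."""
    relation_counts = Counter(predicate for _, predicate, _ in triples)
    degree = Counter()
    for subject, _, obj in triples:
        degree[subject] += 1
        degree[obj] += 1
    return {
        "entity_count": len(degree),
        "relation_counts": dict(relation_counts),
        "most_connected": degree.most_common(1)[0][0],
    }


def render(triples, stats):
    """Build a text-only stand-in for the first display interface."""
    lines = ["Knowledge graph:"]
    lines += [f"  {s} -[{p}]-> {o}" for s, p, o in triples]
    lines.append(f"Entities: {stats['entity_count']}")
    lines.append(f"Most connected: {stats['most_connected']}")
    return "\n".join(lines)


stats = analyze(triples)
print(render(triples, stats))
```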
7. An apparatus for generating a knowledge graph, comprising:
an acquisition unit configured to acquire target data;
a processing unit configured to process the target data to obtain processed data; and
a generation unit configured to generate a knowledge graph according to the processed data.
8. The apparatus of claim 7, wherein the processing unit is further configured to:
perform data cleaning on the target data to obtain cleaned data;
convert the cleaned data into a target format to obtain target format data; and
standardize the target format data to obtain standardized data.
9. An electronic device, comprising:
one or more processors; and
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-6.
10. A computer readable medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the method of any of claims 1-6.
CN202311069875.2A 2023-08-23 2023-08-23 Method, apparatus, electronic device and readable medium for generating knowledge graph Pending CN117194676A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311069875.2A CN117194676A (en) 2023-08-23 2023-08-23 Method, apparatus, electronic device and readable medium for generating knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311069875.2A CN117194676A (en) 2023-08-23 2023-08-23 Method, apparatus, electronic device and readable medium for generating knowledge graph

Publications (1)

Publication Number Publication Date
CN117194676A (en) 2023-12-08

Family

ID=89004433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311069875.2A Pending CN117194676A (en) 2023-08-23 2023-08-23 Method, apparatus, electronic device and readable medium for generating knowledge graph

Country Status (1)

Country Link
CN (1) CN117194676A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination