CN114416102A - Data processing method, device and equipment based on knowledge graph script and storage medium - Google Patents


Publication number
CN114416102A
Authority
CN
China
Prior art keywords
knowledge graph
data
xml data
script
knowledge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210103654.1A
Other languages
Chinese (zh)
Inventor
郑林
丁军
黄振
张渝
张涛
聂庆
贺芳
王磬音
谢秋学
马青
孙金
赵秋慧
常秀
张悦
陈添添
王昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haiyizhi Information Technology Nanjing Co ltd
INDAA MEDIA INVESTMENT HOLDINGS Ltd
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Original Assignee
Haiyizhi Information Technology Nanjing Co ltd
INDAA MEDIA INVESTMENT HOLDINGS Ltd
Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Haiyizhi Information Technology Nanjing Co ltd, INDAA MEDIA INVESTMENT HOLDINGS Ltd, Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority to CN202210103654.1A
Publication of CN114416102A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/427 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Water Supply & Treatment (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method based on a knowledge graph script, comprising the following steps: constructing a knowledge graph script based on a knowledge graph business system; after receiving xml data, placing the xml data into a specified directory, wherein the xml data is a book; processing and parsing the xml data file through the knowledge graph script to form a bookData object; and, after the data in the bookData object has been parsed, adding the entities in the data into a knowledge graph and then matching them by semantic and word similarity, so that entities with the same characteristics are associated with each other. With this method, xml data can be parsed and imported into the graph by the knowledge graph script, which keeps the operation simple, reduces the complexity of storing knowledge in the graph, and ultimately provides a reliable, well-supported means of personnel management for power-industry enterprises.

Description

Data processing method, device and equipment based on knowledge graph script and storage medium
Technical Field
The present application relates to the field of data processing, and in particular, to a data processing method, apparatus, device, and storage medium based on a knowledge-graph script.
Background
At present, most enterprises in the power industry record knowledge about their post, equipment, infrastructure, science and technology, marketing, power grid and other systems manually in historical files. Much of this knowledge has been compiled into books, but most of the experience used to classify and judge it rests with the responsible personnel in each field. When all of this information and knowledge has to be linked together for analysis and the associations between related items have to be found, querying and retrieval become very difficult, because the knowledge is not stored systematically. Meanwhile, the rapid development of the internet and information technology has produced large and complex information systems. The knowledge graph has become a powerful tool for addressing this problem: it reduces dependence on traditional experience-based guidance and on the traditional practice of resolving problems by searching through documents.
In the prior art, a knowledge graph is generally constructed by means such as knowledge fusion and D2R mapping, which associate the data with one another to obtain corresponding data streams, whose knowledge is then imported into the knowledge graph. If a user wants to perform operations such as data decomposition and storage into the graph on the basis of the constructed knowledge graph, knowledge fusion, D2R mapping and similar means are again required. These approaches, however, suffer from technical problems such as reliance on algorithmic judgment, complex knowledge mapping and difficult operation.
Therefore, how to parse xml data and import it into the graph by means of a knowledge graph script, so that operation is simple, the complexity of storing knowledge in the graph is reduced, and a reliable, well-supported means of personnel management for power-industry enterprises is ultimately formed, is a technical problem to be solved at present.
Disclosure of Invention
The invention discloses a data processing method based on a knowledge graph script. It addresses the problem that the prior art cannot parse xml data and import it into the graph by means of a knowledge graph script, and therefore cannot keep the operation simple, reduce the complexity of storing knowledge in the graph, or ultimately form a reliable, well-supported means of personnel management for power-industry enterprises. The method comprises the following steps:
constructing a knowledge graph script based on a knowledge graph business system;
after receiving xml data, placing the xml data into a specified directory, wherein the xml data is a book;
processing and parsing the xml data file through the knowledge graph script to form a bookData object;
after the data in the bookData object has been parsed, adding the entities in the data into a knowledge graph, and then matching them by semantic and word similarity so that entities with the same characteristics are associated with each other;
wherein the business system is obtained by splitting power resource information and specifically comprises a post and manpower knowledge system, an equipment knowledge system, a capital construction knowledge system, a science and technology knowledge system, a marketing knowledge system, a power grid knowledge system and a legal knowledge system.
Optionally, processing and parsing the xml data file through the knowledge graph script to form a bookData object specifically comprises:
after the xml data is put into the designated directory, the knowledge graph script processes and analyzes according to the path and the type of the xml data file;
if the type of the xml data file is in a format needing decompression, the knowledge graph script decompresses the xml data file to the specified directory and then analyzes the xml data file according to the path;
and if the type of the xml data file is in a format which does not need to be decompressed, the xml data file is directly analyzed according to the path.
Optionally, parsing the xml data file according to the path specifically includes:
reading a self-defined file name under a folder through the knowledge graph script;
forming a Document object by reading the data in a file input stream, and then reading each item of data using xpath to finally form the bookData object;
and when the data in the bookData object has been parsed, parsing of the xml data file is complete.
Optionally, the method further comprises:
defining entities, relations and attributes existing in the knowledge graph based on the knowledge graph service system;
and determining the entity type, the entity basic attribute and the relationship among the entities in the data based on the defined entities, relationships and attributes.
Optionally, the basic attributes of the entities at least include teaching materials, parts, chapters, modules, work types and posts, and the relationships between the entities at least include the relationships among teaching materials, parts, chapters, modules, work types and posts.
Optionally, after the entities in the data are added into the knowledge graph, matching of semantics and word similarity is performed, so that the entities with the same characteristics are associated with each other, specifically:
adding the entities of the books into the knowledge graph through the interface of the knowledge graph, matching the title and abs of each book against the entries, and filtering using the filter operation of java8 to establish the association between the books and the entries;
and acquiring part and chapter data from the bookData object, adding the entities in the part and chapter data into the knowledge graph, and establishing the corresponding associations through matching of titles against entries and/or matching of semantic and word similarity.
Correspondingly, the invention also discloses a data processing device based on the knowledge-graph script, which comprises:
the construction module is used for constructing a knowledge graph script based on a knowledge graph service system;
the receiving module is used for placing the xml data into a specified directory after receiving the xml data, wherein the xml data is a book;
the processing and parsing module is used for processing and parsing the xml data file through the knowledge graph script to form a bookData object;
the matching module is used for, after the data in the bookData object has been parsed, adding the entities in the data into a knowledge graph and then matching them by semantic and word similarity, so that entities with the same characteristics are associated with each other;
the business system is obtained by splitting power resource information and specifically comprises a post manpower knowledge system, an equipment knowledge system, a capital construction knowledge system, a scientific and technological knowledge system, a marketing knowledge system, a power grid knowledge system and a legal knowledge system.
Optionally, the processing and parsing module is specifically configured to:
after the xml data is put into the designated directory, the knowledge graph script processes the xml data files and parses them in a loop according to the path and the type of each xml data file;
if the type of the xml data file is in a format needing decompression, the knowledge graph script decompresses the xml data file to the specified directory and then analyzes the xml data file according to the path;
and if the type of the xml data file is in a format which does not need to be decompressed, the xml data file is directly analyzed according to the path.
In order to achieve the above object, according to yet another aspect of the present application, there is provided an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the computer program.
In order to achieve the above object, according to yet another aspect of the present application, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method.
Compared with the prior art, the invention has the following beneficial effects:
The invention discloses a data processing method based on a knowledge graph script, comprising the following steps: constructing a knowledge graph script based on a knowledge graph business system; after receiving xml data, placing the xml data into a specified directory, wherein the xml data is a book; processing and parsing the xml data file through the knowledge graph script to form a bookData object; and, after the data in the bookData object has been parsed, adding the entities in the data into a knowledge graph and then matching them by semantic and word similarity, so that entities with the same characteristics are associated with each other. With this method, xml data can be parsed and imported into the graph by the knowledge graph script, which keeps the operation simple, reduces the complexity of storing knowledge in the graph, and ultimately provides a reliable, well-supported means of personnel management for power-industry enterprises.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, serve to provide a further understanding of the application and to enable other features, objects, and advantages of the application to be more apparent. The drawings and their description illustrate the embodiments of the invention and do not limit it. In the drawings:
FIG. 1 is a flow diagram illustrating a method for data processing based on knowledge-graph scripts according to an embodiment of the present application;
FIG. 2 is a diagram illustrating a knowledge-graph business architecture page according to an embodiment of the present application;
FIG. 3 is a knowledge graph system visualization page view according to an embodiment of the present application;
FIG. 4 is a display of a knowledge-graph editing page according to an embodiment of the present application;
FIG. 5 is a display of another knowledge-graph editing page according to an embodiment of the present application;
FIG. 6 is a block diagram of a data processing apparatus based on a knowledge-graph script according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present invention is not limited to any specific combination of hardware and software.
Fig. 1 is a schematic flow chart of a data processing method based on a knowledge-graph script according to an embodiment of the present invention, where the method includes:
S201, constructing a knowledge graph script based on a knowledge graph business system.
Specifically, a global knowledge classification system for the knowledge graph is established by modeling the accumulated paper-based knowledge relating to resources in the various fields of the power industry; knowledge is extracted from data of different sources and different structures and is stored in the knowledge graph. The knowledge graph script is constructed on the basis of seven major business systems of the power industry. The knowledge graph business system is obtained by splitting power resource information and specifically comprises a post system, an equipment system, a capital construction system, a science and technology system, a marketing system, a power grid system and a legal system; a page view of the knowledge graph business system is shown in fig. 2. The knowledge graph is constructed along multiple dimensions of resource management in each major field, such as the teaching materials, parts, chapters and modules used and the related work types. Fig. 3 is a visualization page of the knowledge graph system, which is built on the knowledge graph. The knowledge graph system makes it possible to analyze and mine the value of resource management knowledge in the power industry more deeply, raise the level of intelligence of resource management in each field, and achieve the integrated application of the different knowledge data of human resource management.
S202, after receiving the xml data, putting the xml data into a specified directory, wherein the xml data is a book.
Specifically, after the book xml data is received, the prepared book xml data needs to be placed into a specified directory, for example the /upload folder on the shared server. The complete book data contains the following items: CHAPTER (chapters), COVER, EPUB (electronic version), MOBI (electronic version), PDF, SOURCE, and XML files.
It should be noted that, in the present application, the xml data is a book, that is, an xml file containing data such as a human resources book, a capital construction book or a power generation volume. The scheme of the above preferred embodiment is only one specific implementation provided by the present application; parsing data of any other form through a knowledge graph script also falls within the protection scope of the present application.
S203, processing and parsing the xml data file through the knowledge graph script to form a bookData object.
Specifically, after the book xml data has been stored in the custom folder and organized, the knowledge graph script processes and parses it according to the path and type of the xml data file, finally forming a bookData object (bookData being a custom object name).
In order to process the xml data file, in some embodiments, processing and parsing the xml data file through the knowledge graph script to form a bookData object specifically includes:
after the xml data is put into the designated directory, the knowledge graph script processes and analyzes according to the path and the type of the xml data file;
if the type of the xml data file is in a format needing decompression, the knowledge graph script decompresses the xml data file to the specified directory and then analyzes the xml data file according to the path;
and if the type of the xml data file is in a format which does not need to be decompressed, the xml data file is directly analyzed according to the path.
Specifically, after the xml data is placed into the specified directory, the knowledge graph script processes and parses it according to the path and type of the xml data file. If the file is in a format that needs decompression, for example the zip format, the script automatically reads it with ZipFile, decompresses it to the specified directory, and then parses the xml data file according to the path; if the file is in a format that does not need decompression, the script parses it directly according to the path.
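The branch on file type described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script of the embodiment: the class name XmlIntake, the method names and the decision to recognize formats by file extension are hypothetical; only the use of java.util.zip.ZipFile comes from the description.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: decides whether an uploaded file must be decompressed first.
public class XmlIntake {

    /**
     * Returns the xml files to parse for one uploaded file.
     * A .zip upload is extracted into targetDir first; an .xml upload is used as-is.
     * (Recognizing the type by extension is an assumption of this sketch.)
     */
    public static List<Path> resolveXmlFiles(Path upload, Path targetDir) throws IOException {
        String name = upload.getFileName().toString().toLowerCase(Locale.ROOT);
        List<Path> xmlFiles = new ArrayList<>();
        if (name.endsWith(".zip")) {
            for (Path extracted : unzip(upload, targetDir)) {
                if (extracted.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".xml")) {
                    xmlFiles.add(extracted);
                }
            }
        } else if (name.endsWith(".xml")) {
            xmlFiles.add(upload);
        }
        return xmlFiles;
    }

    /** Extracts every regular entry of the archive into targetDir and returns the written paths. */
    static List<Path> unzip(Path archive, Path targetDir) throws IOException {
        List<Path> extracted = new ArrayList<>();
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path out = root.resolve(entry.getName()).normalize();
                // Reject entries whose name would place them outside the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("Unsafe zip entry: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                    continue;
                }
                Files.createDirectories(out.getParent());
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                extracted.add(out);
            }
        }
        return extracted;
    }
}
```

The returned paths would then be handed to the parsing step; other entries of the archive (COVER, PDF and so on) are extracted but not returned here.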
In order to parse an xml data file, in some embodiments, parsing the xml data file according to the path includes:
reading a self-defined file name under a folder through the knowledge graph script;
forming a Document object by reading the data in a file input stream, and then reading each item of data using xpath to finally form the bookData object;
and when the data in the bookData object has been parsed, parsing of the xml data file is complete.
Specifically, the knowledge graph script reads the fileList (the file names under the folder, custom-defined on the server) of the whole folder and then parses the files in a loop. Parsing a file uses SAXReader (a stream-based xml reader) together with a FileInputStream (a file input stream): the read method of SAXReader reads the data in the FileInputStream to form a Document object, and xpath (an xml path query technology used from java) is then used to read each item of data, for example the book title, table of contents, sections and chapters. Finally, a bookData object (a custom object name) is formed. Once the data in the bookData object has been parsed, parsing of the xml data file is complete.
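As an illustration of this parsing step, the following sketch reads an input stream into a Document and extracts fields with xpath. To stay self-contained it substitutes the JDK's built-in DocumentBuilder for the SAXReader named in the description (SAXReader belongs to the dom4j library). The xml layout (book, title, abs, part, chapter elements), the BookData fields and the class names are assumptions made for the example; the description does not specify them.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class BookXmlParser {

    /** Hypothetical holder for the parsed fields; the description only names the object bookData. */
    public static class BookData {
        public String title;
        public String abs;
        public final List<String> parts = new ArrayList<>();
        public final List<String> chapters = new ArrayList<>();
    }

    /** Reads the stream into a Document, then pulls each field out with an xpath expression. */
    public static BookData parse(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Refuse DOCTYPE declarations so an uploaded file cannot pull in external entities.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(in);

        XPath xpath = XPathFactory.newInstance().newXPath();
        BookData book = new BookData();
        // The element and attribute names below are assumed for illustration only.
        book.title = xpath.evaluate("/book/title", doc).trim();
        book.abs = xpath.evaluate("/book/abs", doc).trim();
        collect(xpath, doc, "/book/part/@title", book.parts);
        collect(xpath, doc, "/book/part/chapter/title", book.chapters);
        return book;
    }

    /** Appends the trimmed text of every node matched by expr, in document order. */
    private static void collect(XPath xpath, Document doc, String expr, List<String> out)
            throws Exception {
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent().trim());
        }
    }
}
```

A caller following the description would pass a FileInputStream opened on each file of the fileList, for example BookXmlParser.parse(new FileInputStream(path.toFile())).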
S204, after the data in the bookData object has been parsed, adding the entities in the data into a knowledge graph, and then matching them by semantic and word similarity, so that entities with the same characteristics are associated with each other.
Specifically, after the data in the bookData object has been parsed, the entities in the data are added into the knowledge graph and then matched by semantic similarity and word similarity, so that entities with the same characteristics are associated with each other. Parsing the xml data and importing it into the graph through the knowledge graph script reduces the complexity of storing knowledge in the graph. Because the scripts are classified and written per business, their usage scenarios are clearer; because the script runs independently, outside the system, import into the graph is faster and operations such as knowledge fusion and the algorithmic judgment of D2R mapping are not needed; and because the script is closely tied to the business, adjusting or extending it is simpler.
In order to accurately determine the association relationship between the entities, in some embodiments, the method further includes:
defining entities, relations and attributes existing in the knowledge graph based on the knowledge graph service system;
and determining the entity type, the entity basic attribute and the relationship among the entities in the data based on the defined entities, relationships and attributes.
Specifically, based on the knowledge graph business system, the data in the knowledge graph is defined according to the provided data and the application requirements: the entities and relationships in the knowledge graph, and the attributes of those entities and relationships, are defined. Based on the defined entities, relationships and attributes, the entity types, the basic attributes of the entities and the relationships among the entities are determined for the xml data and for the data in the bookData object. Knowledge in the knowledge graph exists in the form of (head entity, relationship, tail entity) and (entity, attribute, attribute value) triples.
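The two triple forms can be made concrete with a small sketch. The class GraphTriples, the relation name hasPart and the attribute name ISBN as used here are illustrative assumptions; the description does not prescribe a data structure or a relation vocabulary.

```java
import java.util.ArrayList;
import java.util.List;

public class GraphTriples {

    /** One statement in the graph; the same shape serves both triple forms. */
    public static final class Triple {
        public final String subject;
        public final String predicate;
        public final String object;
        /** true for (entity, attribute, attribute value); false for (head entity, relationship, tail entity). */
        public final boolean isAttribute;

        Triple(String subject, String predicate, String object, boolean isAttribute) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.isAttribute = isAttribute;
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + predicate + ", " + object + ")";
        }
    }

    public static Triple relation(String head, String relationship, String tail) {
        return new Triple(head, relationship, tail, false);
    }

    public static Triple attribute(String entity, String attribute, String value) {
        return new Triple(entity, attribute, value, true);
    }

    /** Builds one attribute triple for the book and one relationship triple per part. */
    public static List<Triple> forBook(String title, String isbn, List<String> partTitles) {
        List<Triple> triples = new ArrayList<>();
        triples.add(attribute(title, "ISBN", isbn));
        for (String part : partTitles) {
            // "hasPart" is a hypothetical relation name chosen for this example.
            triples.add(relation(title, "hasPart", part));
        }
        return triples;
    }
}
```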
In order to obtain the interrelations among the data, in some embodiments the basic attributes of entities such as teaching materials, parts, chapters, modules, work types and posts are obtained, including the unified book number, the Chinese Library Classification number, the book category, the ISBN and the like; and the relationships among entities such as teaching materials, parts, chapters, modules, work types and posts are obtained, including the parts involved, related entry knowledge and the like.
In order to provide a means for managing personnel in an electric power industry enterprise, in some embodiments, after entities in the data are added into a knowledge graph, semantic and word similarity matching is performed, so that the entities with the same characteristics are associated with each other, specifically:
adding the entities of the books into the knowledge graph through the interface of the knowledge graph, matching the title and abs of each book against the entries, and filtering using the filter operation of java8 to establish the association between the books and the entries;
and acquiring part and chapter data from the bookData object, adding the entities in the part and chapter data into the knowledge graph, and establishing the corresponding associations through matching of titles against entries and/or matching of semantic and word similarity.
Specifically, the knowledge graph system interface is called over http to request all entity information under the entries, and the bookData objects are iterated over to fill in their remaining fields, such as cover and pdfPath; the cover image under the COVER directory is uploaded to FastDFS (a distributed storage system) and the returned path is filled in. Then addEntity (an api of the knowledge graph system) is called to add the book entity into the knowledge graph; the book is matched against the entries according to its title and abs, the candidates are filtered using the filter operation of java8, and the association between the book and the matching entries is established (via the addRelation method, also an api of the knowledge graph system). The part and chapter data is then obtained from the bookData object, entity data for the parts and chapters is created, and the corresponding associations are established according to the degree of match (semantic/word similarity) between the titles and the entries or attributes. In this way, the knowledge graph script parses the xml data, imports it into the graph, and finally outputs the parsed data to the knowledge graph. Figs. 4 and 5 show knowledge graph pages: through the associations among attributes, entries, parts, chapters and the like in the knowledge graph, a user can enter relevant knowledge points to search, the retrieved content is converted into data, and, according to the associations between the data, a solution that matches practical judgment can be provided, thereby offering a means of personnel management for power-industry enterprises. The display pages also give the user a clear visual presentation and provide a good retrieval and visualization platform for training new employees and for acquiring experience data.
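The matching and filtering step might look like the sketch below. It assumes that "the filter in java8" refers to Stream.filter, and it uses Jaccard similarity over character bigrams as a stand-in for the word similarity measure, which the description leaves unspecified. The class name, the threshold parameter and the sample strings are hypothetical, and the http calls to addEntity and addRelation are deliberately left out.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class EntryMatcher {

    /** Jaccard similarity over character bigrams; a placeholder for the unspecified measure. */
    public static double similarity(String a, String b) {
        Set<String> x = bigrams(a);
        Set<String> y = bigrams(b);
        Set<String> union = new HashSet<>(x);
        union.addAll(y);
        if (union.isEmpty()) {
            return 0.0; // both inputs are shorter than two characters
        }
        Set<String> intersection = new HashSet<>(x);
        intersection.retainAll(y);
        return (double) intersection.size() / union.size();
    }

    /** Lower-cases the text, removes whitespace and returns its set of two-character substrings. */
    static Set<String> bigrams(String s) {
        String t = s.toLowerCase(Locale.ROOT).replaceAll("\\s+", "");
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 1 < t.length(); i++) {
            grams.add(t.substring(i, i + 2));
        }
        return grams;
    }

    /**
     * Keeps the entries that occur literally in the title or abs, or whose
     * similarity to the title reaches the threshold.
     */
    public static List<String> matchEntries(String title, String abs,
                                            List<String> entries, double threshold) {
        String text = (title + " " + abs).toLowerCase(Locale.ROOT);
        return entries.stream()
                .filter(e -> text.contains(e.toLowerCase(Locale.ROOT))
                        || similarity(title, e) >= threshold)
                .collect(Collectors.toList());
    }
}
```

Each entry returned by matchEntries would then be linked to the book through the addRelation api mentioned above. A semantic similarity measure, if available, could replace or complement the bigram measure inside the filter predicate.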
In order to further illustrate the technical idea of the present invention, the technical solution of the present invention will now be described with reference to specific application scenarios.
Step one: parsing the unstructured xml data.
Firstly, the prepared book xml data is placed into a specified directory, for example the /upload folder on the shared server. The complete book data contains the following items: CHAPTER (chapters), COVER, EPUB (electronic version), MOBI (electronic version), PDF, SOURCE, and XML files.
After the book xml data has been stored in the custom folder and organized, the knowledge graph script processes and parses it according to the path and type of the xml data file, finally forming a bookData object (a custom object name). If the file is in a format that needs decompression, for example the zip format, the script automatically reads it with ZipFile, decompresses it to the specified directory, and then parses the xml data file according to the path; if the file is in a format that does not need decompression, the script parses it directly according to the path. The parsing process is as follows:
the knowledge graph script reads the fileList (the file names under the folder, custom-defined on the server) of the whole folder and then parses the files in a loop. Parsing a file uses SAXReader (a stream-based xml reader) together with a FileInputStream (a file input stream): the read method of SAXReader reads the data in the FileInputStream to form a Document object, and xpath (an xml path query technology used from java) is then used to read each item of data, for example the book title, table of contents, sections and chapters. Finally, a bookData object (a custom object name) is formed. Once the data in the bookData object has been parsed, parsing of the xml data file is complete.
Step two: importing the data in the parsed bookData object into the graph.
Firstly, the knowledge graph system interface is called over http to request all entity information under the entries, and the bookData objects are iterated over to fill in their remaining fields, such as cover and pdfPath; the cover image under the COVER directory is uploaded to FastDFS (a distributed storage system) and the returned path is filled in. Then addEntity (an api of the knowledge graph system) is called to add the book entity into the knowledge graph; the book is matched against the entries according to its title and abs, the candidates are filtered using the filter operation of java8, and the association between the book and the matching entries is established (via the addRelation method, also an api of the knowledge graph system). The part and chapter data is then obtained from the bookData object, entity data for the parts and chapters is created, and the corresponding associations are established according to the degree of match (semantic/word similarity) between the titles and the entries or attributes.
In order to achieve the above technical object, an embodiment of the present application further provides a data processing apparatus based on a knowledge-graph script, as shown in fig. 6, the apparatus including:
a construction module 401, configured to construct a knowledge graph script based on a knowledge graph service system;
a receiving module 402, configured to, after receiving xml data, place the xml data into a specified directory, where the xml data is a book;
a processing and parsing module 403, configured to process and parse the xml data file through the knowledge graph script to form a book data object;
a matching module 404, configured to, after the data in the book data object has been parsed, add the entities in the data to a knowledge graph and then perform semantic and word similarity matching, so that entities with the same characteristics are associated with each other;
wherein the service system is obtained by splitting power resource information and specifically comprises a post manpower knowledge system, an equipment knowledge system, a capital construction knowledge system, a scientific and technological knowledge system, a marketing knowledge system, a power grid knowledge system and a legal knowledge system.
In a specific application scenario of the present application, the processing and analyzing module is specifically configured to:
after the xml data is put into the specified directory, the knowledge graph script processes and parses it in a loop according to the path and the type of the xml data file;
if the xml data file is in a format that requires decompression, the knowledge graph script decompresses the xml data file to the specified directory and then parses the xml data file according to the path;
and if the xml data file is in a format that does not require decompression, the xml data file is parsed directly according to the path.
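The branch on file type described above might look like the following sketch, which uses java.util.zip.ZipFile as named in the description. The class name XmlSourceResolver, the rule that a .zip suffix marks a compressed input, and the check that rejects entries escaping the target directory are assumptions added for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class XmlSourceResolver {

    /**
     * Returns the files to parse: the input itself when it needs no
     * decompression, otherwise the files extracted into targetDir.
     */
    static List<Path> resolve(Path input, Path targetDir) {
        String name = input.getFileName().toString().toLowerCase();
        if (!name.endsWith(".zip")) {
            return Collections.singletonList(input); // parse directly by path
        }
        Path root = targetDir.toAbsolutePath().normalize();
        List<Path> extracted = new ArrayList<>();
        try (ZipFile zip = new ZipFile(input.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path out = root.resolve(entry.getName()).normalize();
                if (!out.startsWith(root)) {
                    throw new IOException("entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                    continue;
                }
                Files.createDirectories(out.getParent());
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                extracted.add(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return extracted;
    }
}
```

The returned paths can then be handed to whatever routine parses the xml data files, so the parsing code does not need to know whether the input arrived compressed.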
The present application further provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when executing the computer program.
According to yet another aspect of the application, there is also provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A data processing method based on a knowledge-graph script is characterized by comprising the following steps:
constructing a knowledge graph script based on a knowledge graph service system;
after receiving xml data, putting the xml data into a specified directory, wherein the xml data is a book;
processing and parsing the xml data file through the knowledge graph script to form a book data object;
after the data in the book data object is analyzed, adding the entities in the data into a knowledge graph, and then matching semantic and word similarity so as to enable the entities with the same characteristics to be mutually associated;
the service system is obtained by splitting power resource information, and specifically comprises a post system, an equipment system, a capital construction system, a scientific and technological system, a marketing system, a power grid system and a legal system.
2. The method of claim 1, wherein the xml data file is processed and parsed by the knowledge graph script to form a book data object, specifically:
after the xml data is put into the designated directory, the knowledge graph script processes and analyzes according to the path and the type of the xml data file;
if the type of the xml data file is in a format needing decompression, the knowledge graph script decompresses the xml data file to the specified directory and then analyzes the xml data file according to the path;
and if the type of the xml data file is in a format which does not need to be decompressed, the xml data file is directly analyzed according to the path.
3. The method according to claim 2, wherein parsing the xml data file according to the path specifically comprises:
reading a self-defined file name under a folder through the knowledge graph script;
after forming a Document object by reading data in a computer file input stream, reading each line of data by using xpath to finally form the book data object;
and when the data in the book data object has been parsed, parsing of the xml data file is complete.
4. The method of claim 1, further comprising:
defining entities, relations and attributes existing in the knowledge graph based on the knowledge graph service system;
and determining the entity type, the entity basic attribute and the relationship among the entities in the data based on the defined entities, relationships and attributes.
5. The method as claimed in claim 4, wherein the entity basic attributes include at least textbook, part, chapter, module, work category and post, and the relationship between the entity and the entity includes at least the relationship between the textbook, part, chapter, module, work category and post.
6. The method of claim 1, wherein the matching of semantic and word similarity is performed after the entities in the data are added to the knowledge-graph, so that the entities with the same characteristics are related to each other, specifically:
adding the entities of the books into the knowledge graph through the interface of the knowledge graph, matching the titles and abs of the books with the entries, and filtering by using the filter in java8 to establish the association relationship between the books and the entries;
and acquiring part and chapter data from the book data, adding entities in the part and chapter data into the knowledge graph, and establishing a corresponding association relationship through matching of the title and the entry or/and matching of the semantics and the word similarity.
7. A data processing apparatus based on a knowledge-graph script, the apparatus comprising:
the construction module is used for constructing a knowledge graph script based on a knowledge graph service system;
the receiving module is used for placing the xml data into a specified directory after receiving the xml data, wherein the xml data is a book;
the processing and parsing module is used for processing and parsing the xml data file through the knowledge graph script to form a book data object;
the matching module is used for adding the entities in the data into a knowledge graph and then matching the semantic and word similarity after the data in the book data object is parsed, so that the entities with the same characteristics are associated with each other;
the service system is obtained by splitting power resource information, and specifically comprises a post system, an equipment system, a capital construction system, a scientific and technological system, a marketing system, a power grid system and a legal system.
8. The apparatus of claim 7, wherein the processing and parsing module is specifically configured to:
after the xml data is put into the specified directory, the knowledge graph script processes and parses it in a loop according to the path and the type of the xml data file;
if the type of the xml data file is in a format needing decompression, the knowledge graph script decompresses the xml data file to the specified directory and then analyzes the xml data file according to the path;
and if the type of the xml data file is in a format which does not need to be decompressed, the xml data file is directly analyzed according to the path.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 6 are implemented by the processor when executing the computer program.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202210103654.1A 2022-01-27 2022-01-27 Data processing method, device and equipment based on knowledge graph script and storage medium Pending CN114416102A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210103654.1A CN114416102A (en) 2022-01-27 2022-01-27 Data processing method, device and equipment based on knowledge graph script and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210103654.1A CN114416102A (en) 2022-01-27 2022-01-27 Data processing method, device and equipment based on knowledge graph script and storage medium

Publications (1)

Publication Number Publication Date
CN114416102A true CN114416102A (en) 2022-04-29

Family

ID=81279772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210103654.1A Pending CN114416102A (en) 2022-01-27 2022-01-27 Data processing method, device and equipment based on knowledge graph script and storage medium

Country Status (1)

Country Link
CN (1) CN114416102A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination