CN113704491A - Road bureau configuration file error proofing system based on domain knowledge map - Google Patents

Publication number
CN113704491A
CN113704491A CN202110955610.7A CN202110955610A CN113704491A CN 113704491 A CN113704491 A CN 113704491A CN 202110955610 A CN202110955610 A CN 202110955610A CN 113704491 A CN113704491 A CN 113704491A
Authority
CN
China
Prior art keywords
entity
entities
configuration file
data
station
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110955610.7A
Other languages
Chinese (zh)
Inventor
盛凯
张涛
许伟
王振一
苗长俊
曾壹
李伟
赵宏涛
周晓昭
孙延浩
李智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Railway Sciences Corp Ltd CARS
Signal and Communication Research Institute of CARS
Beijing Ruichi Guotie Intelligent Transport Systems Engineering Technology Co Ltd
Beijing Huatie Information Technology Co Ltd
Original Assignee
China Academy of Railway Sciences Corp Ltd CARS
Signal and Communication Research Institute of CARS
Beijing Ruichi Guotie Intelligent Transport Systems Engineering Technology Co Ltd
Beijing Huatie Information Technology Co Ltd
Application filed by China Academy of Railway Sciences Corp Ltd CARS, Signal and Communication Research Institute of CARS, Beijing Ruichi Guotie Intelligent Transport Systems Engineering Technology Co Ltd, Beijing Huatie Information Technology Co Ltd
Priority to CN202110955610.7A
Publication of CN113704491A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a road bureau configuration file error-proofing system based on a domain knowledge graph. The system constructs a knowledge graph of static railway dispatching data using a domain knowledge graph method and queries a graph database to automatically detect erroneous content and correct the file data, which improves working efficiency and ensures the correctness of road bureau configuration files.

Description

Road bureau configuration file error proofing system based on domain knowledge map
Technical Field
The invention relates to the technical field of railway science, in particular to a road bureau configuration file error-proofing system based on a domain knowledge graph.
Background
Train Dispatching Command System (TDCS) centers use a road bureau configuration file of type cfg when they exchange station yard display data with one another. When the configuration files of adjacent road bureaus are exchanged with other TDCS/CTC vendors, the data volume is large, the format is often non-compliant, and the configuration often does not follow the ministerial standard, which makes data verification and system maintenance difficult.
At present, two schemes are mainly used for checking and correcting road bureau configuration files:
In the first scheme, the work is done manually. The road bureau configuration file defined in the ministerial protocol stores a large amount of key data for train diagram display. Each part of the data has its own strict content definition and format specification, and a content error or a non-compliant format causes the train diagram to be displayed incorrectly. In practice, because no standardized method exists, data inspection and correction are very tedious and are carried out mainly by hand.
Because the first scheme depends mainly on manual work, it has obvious disadvantages: errors are missed, the error rate is high, and the work consumes time and manpower.
In the second scheme, an automated road bureau configuration file standardization method performs the data self-check with a computer program. Custom self-checking logic checks the content and format of each item of data in the road bureau configuration file against the ministerial standard protocol and outputs error information when an error is found. When executed, the self-checking logic checks whether the data format of each row is valid (format check), checks whether the data read in are legal (segment check), and compares the data with other types of data to ensure consistency and validity (association check). The main checks include checking whether station codes in the station configuration are repeated, checking that the distance between sub-diagram switching point lines is not 0, checking whether a base diagram station name is consistent with the station name in the station configuration, checking whether a base diagram station code corresponds to a station code, and the like.
However, in the second scheme the errors still have to be corrected manually, and when the protocol or the data specification changes, the established self-checking logic no longer applies, so a developer must continually maintain the program code. In addition, queries based on a relational database tend to be inefficient when the data verification involves complex, multi-level associations.
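As an illustration of the hand-written self-checking logic that the second scheme relies on, the following is a minimal Python sketch of two of the checks named above: repeated station codes, and consistency between base diagram entries and the station configuration. The record layout (pairs of station code and station name) and the function name `check_station_records` are assumptions made for this sketch; the actual cfg format is not described here.

```python
from collections import Counter


def check_station_records(station_config, base_diagram):
    """Return error messages for hypothetical (station_code, station_name) records.

    station_config: records from the station configuration section (assumed layout).
    base_diagram:   records from the base diagram section (assumed layout).
    """
    errors = []

    # Check 1: station codes in the station configuration must not repeat.
    counts = Counter(code for code, _ in station_config)
    for code, n in sorted(counts.items()):
        if n > 1:
            errors.append(f"duplicate station code {code}")

    # Check 2: every base diagram entry must match the station configuration.
    names_by_code = dict(station_config)
    for code, name in base_diagram:
        if code not in names_by_code:
            errors.append(f"base diagram code {code} has no station entry")
        elif names_by_code[code] != name:
            errors.append(f"name mismatch for code {code}")
    return errors
```

Note that the function only reports errors; as the text points out, correcting them is still left to a person, and each rule is hard-coded against one version of the specification.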
Disclosure of Invention
The invention aims to provide a road bureau configuration file error-proofing system based on a domain knowledge graph, which can automatically check and correct the information of a road bureau configuration file, thereby improving the working efficiency.
The purpose of the invention is achieved by the following technical scheme:
A road bureau configuration file error-proofing system based on a domain knowledge graph comprises:
a data acquisition unit, used for acquiring, from a data source, the data for constructing a railway domain knowledge graph;
a railway domain knowledge graph construction and storage unit, used for obtaining the relevant data from the data acquisition unit, extracting from them the railway-related entities, the entity-related information, and the relationships among different entities, taking the entity-related information as attributes of the entities, constructing connecting edges between different entities from the relationships among them so that the entities are associated, and thereby constructing the railway domain knowledge graph and storing it in a graph database;
an information query and correction unit, used for identifying dispatching section data in the input road bureau configuration file, extracting the corresponding dispatching desk information, querying the graph database for the corresponding dispatching desk code, the stations managed by the dispatching desk, and their station codes, comparing the query result with the information in the road bureau configuration file, and, if they are inconsistent, correcting the road bureau configuration file with the query result; the specific dispatching desks, stations, sections, and station tracks to be queried are all entities.
According to the technical scheme provided by the invention, a knowledge graph of static railway dispatching data is constructed using a domain knowledge graph method, and queries against the graph database automatically detect erroneous content and correct the file data, which improves working efficiency and ensures the correctness of the road bureau configuration file.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a road bureau configuration file error-proofing system based on a domain knowledge graph according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the main concepts and relationship schema of the railway domain according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of storing the entities and relationships of the railway dispatching domain knowledge graph in a Neo4j graph database according to an embodiment of the present invention;
FIG. 4 is a flowchart of error correction of a road bureau configuration file according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The terms that may be used herein are first described as follows:
The terms "comprising," "including," "containing," "having," and other terms of similar meaning should be construed as non-exclusive inclusions. For example, including a feature (e.g., material, component, ingredient, carrier, formulation, dimension, part, mechanism, device, process, procedure, method, reaction condition, processing condition, parameter, algorithm, signal, data, product, or article of manufacture) is to be construed as including not only the particular feature explicitly listed but also other features known in the art that are not explicitly listed.
The term "consisting of" excludes any technical feature not explicitly listed. If used in a claim, the term renders the claim closed, so that it contains no technical features other than those expressly listed, apart from the conventional impurities associated with them. If the term occurs in only one clause of a claim, it limits only the elements explicitly recited in that clause, and elements recited in other clauses are not excluded from the claim as a whole.
The road bureau configuration file error-proofing system based on the domain knowledge graph provided by the invention is described in detail below. Details not described in the embodiments of the invention belong to the prior art known to those skilled in the art. Anything not specifically mentioned in the examples of the invention was carried out under the conventional conditions of the art or the conditions suggested by the manufacturer. As shown in FIG. 1, the system mainly includes:
a data acquisition unit, used for acquiring, from a data source, the data for constructing a railway domain knowledge graph;
a railway domain knowledge graph construction and storage unit, used for obtaining the relevant data from the data acquisition unit, extracting from them the railway-related entities, the entity-related information, and the relationships among different entities, taking the entity-related information as attributes of the entities, constructing connecting edges between different entities from the relationships among them so that the entities are associated, and thereby constructing the railway domain knowledge graph and storing it in a graph database;
an information query and correction unit, used for identifying dispatching section data in the input road bureau configuration file, extracting the corresponding dispatching desk information, querying the graph database for the corresponding dispatching desk code, the stations managed by the dispatching desk, and their station codes, comparing the query result with the information in the road bureau configuration file, and, if they are inconsistent, correcting the road bureau configuration file with the query result; the specific dispatching desks, stations, sections, and station tracks to be queried are all entities, and the relationships among the entities comprise: the management relationship between a dispatching desk and a station, the connection relationship between a section and its adjacent stations, and the subordination relationship between a station track and a station.
In the embodiment of the invention, dispatching desk, station, station track, section, and station diagram are all concept types, while individual dispatching desks (for example, the Jinghu desk, the Jinghuang desk, and the Jingha desk), stations (for example, Beijing South station and Beijing East station), sections (for example, the Beijing South to Wuqing section), and station tracks (for example, a particular station track in Beijing South station) are all instances. Each instance has its own attributes, and relationships are established among the instances, so all of these entities can be queried in the manner described above.
In the above scheme of the embodiment of the invention, a domain knowledge graph method is used to construct a knowledge graph of static railway dispatching data, and queries against the graph database automatically detect erroneous content and correct the file data. The scheme solves at least the following problems of the second scheme: 1) erroneous data cannot be corrected automatically; 2) the check logic is tied to the current protocol and data specification and no longer applies when they change; 3) the check logic is difficult to extend to checks of other files; 4) a relational database is inefficient when querying multi-level associated data.
In order to more clearly show the technical solutions and the technical effects provided by the present invention, a road bureau configuration file error-proofing system based on a domain knowledge graph provided by the embodiments of the present invention is described in detail with specific embodiments below.
I. Basic concepts of the railway dispatching domain knowledge graph and their application in the invention.
A knowledge graph is an application of semantic networks: nodes represent structured semantic information, the relationships among the nodes form a network graph, and the graph can be displayed visually. To construct and store the railway dispatching domain knowledge graph in a graph database, the knowledge characteristics of the railway dispatching domain are first analyzed in order to divide the domain into entity types and to establish the knowledge entities, the knowledge attributes, and the relationship types between pieces of knowledge. This completes the model design of the railway dispatching knowledge base. The graph database is then used to construct, store, and query the knowledge graph.
In the embodiment of the invention, the construction of the knowledge base is divided into 6 steps:
1) Define the domain scope and construction requirements of railway dispatching knowledge, and determine the important concepts and terms of the domain. As mentioned above, the important concepts mainly include dispatching desk, station, section, and station track; the terms include receiving and dispatching out-of-gauge vehicles, high platform, low platform, and the like.
2) Abstract the core concept set of the railway dispatching domain, and determine the classes, the entities, and the relationships among entities. The core concept set contains the types that define the important concepts and the relationships between those types, for example the dispatching desk type, the station type, and the relationship in which a dispatching desk governs a station.
3) Define the types and attributes of the entities.
4) Define other constraint relationships. For example, a station track has multiple attributes, one of which is its platform attribute, whose value is constrained to be one of no platform, low platform, or high platform.
5) Create the entities and relationships of the railway dispatching domain.
6) Correct the railway dispatching knowledge.
Those skilled in the art will appreciate that the various concepts, terms, etc. described above in connection with the above steps are all technical features of the art, and all meanings and relationships thereof are common general knowledge.
Specific implementation of each step will be described later.
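The constraint in step 4, where one attribute of a station track may take only three values (none, low, or high), can be sketched as an enumerated type. This is a minimal Python illustration, not the patented implementation; the class names `PlatformKind` and `Track` and the field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class PlatformKind(Enum):
    """The three values allowed by the step 4 constraint (names are assumptions)."""
    NONE = "none"
    LOW = "low"
    HIGH = "high"


@dataclass
class Track:
    name: str
    platform: PlatformKind

    def __post_init__(self):
        # Accept a raw string and convert it; Enum lookup raises ValueError
        # for any value outside the three permitted ones.
        if not isinstance(self.platform, PlatformKind):
            self.platform = PlatformKind(self.platform)
```

Encoding the constraint in the type means an invalid value is rejected when the entity is created, before it can enter the knowledge base.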
II. Data sources of the railway domain knowledge graph.
In the embodiment of the invention, the data for constructing the railway domain knowledge graph are obtained from the data sources, and knowledge elements are extracted from them.
In the embodiment of the invention, the data for constructing the railway domain knowledge graph mainly comprise structured data, semi-structured data, and plain text data (for example, a configuration file in Txt format is a kind of plain text data).
The structured data are two-dimensional tabular data represented and stored in a relational database; such data can be extracted directly into the knowledge graph. An example is the static train diagram data in the production database of the dispatching diagram compilation system.
The semi-structured data use tags to separate semantic elements but are not stored in database form. An example is the data specified for road bureau exchange files in Appendix C of the ministerial standard.
III. Schema of the railway dispatching domain knowledge graph.
Knowledge modeling requires determining the types of entities and relationships in the knowledge network, that is, designing the concept and relationship schema. The basic goal of the railway dispatching domain knowledge graph is to give the computer basic conceptual knowledge of the railway dispatching domain. This basic cognitive framework must: specify the basic concepts of the domain and the subclass relationships among them (as described above, dispatching desk and station are concepts, while individual dispatching desks and stations such as the Jinghu desk, the Jingha desk, Beijing South station, and Beijing station are instances); define the basic attributes of the railway domain; define the concepts to which each attribute applies; and specify the category or range of the attribute values.
Examples are as follows:
1. The "station track" attribute is defined on the concept of a station, and its valid value is a station track name.
2. The railway domain also has a large number of constraints or rules, such as whether an attribute may take multiple values:
a) The "receiving and dispatching type" attribute of a station track may take multiple values.
b) The "main line" attribute and the "side line" attribute of a station track form a mutually inverse pair of attributes.
Such metadata (namely the receiving and dispatching type, main line, and side line of a station track) are important for eliminating inconsistencies in the knowledge base and improving its quality.
In the embodiment of the invention, the railway-related entities, the entity-related information, and the relationships among different entities are extracted to form two types of triple data structures:
The first type of triple contains a relationship between different entities and is expressed as <entity 1, relationship, entity 2>, where entity 1 and entity 2 are two different entities. For example, the statement that Security station is under the jurisdiction of the Chengdu desk involves two different entities, Security station and the Chengdu desk, and expresses that the Chengdu desk manages Security station.
The second type of triple contains information about an entity and is expressed as <entity, attribute, value>. For example, the statement that the station code (stationcode) of Long child south station is 500 can be expressed as <Long child south, stationcode, 500>, and the statement that station track IG of Safety station can receive and dispatch out-of-gauge vehicles can be expressed as <IG, limit, 1>.
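The two triple forms can be represented directly as typed tuples. The sketch below is a minimal Python illustration under stated assumptions: the class names `RelationTriple` and `AttributeTriple`, the helper `attributes_of`, and the placeholder entities `desk_1` and `station_1` are hypothetical, while the two attribute triples reuse the values given in the text.

```python
from typing import NamedTuple, Union


class RelationTriple(NamedTuple):
    """First type: <entity 1, relationship, entity 2>."""
    head: str
    relation: str
    tail: str


class AttributeTriple(NamedTuple):
    """Second type: <entity, attribute, value>."""
    entity: str
    attribute: str
    value: Union[str, int]


triples = [
    RelationTriple("desk_1", "manages", "station_1"),      # placeholder entities
    AttributeTriple("Long child south", "stationcode", 500),
    AttributeTriple("IG", "limit", 1),
]


def attributes_of(entity, all_triples):
    """Collect the attribute triples of one entity into a dict."""
    return {
        t.attribute: t.value
        for t in all_triples
        if isinstance(t, AttributeTriple) and t.entity == entity
    }
```

Keeping the two forms as separate types makes it straightforward to turn relation triples into edges and attribute triples into node properties when the graph is built.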
III. Data relationships of the railway domain knowledge graph.
In the embodiment of the invention, the concepts of the railway domain and the relationships among them are analyzed with the E-R (Entity Relationship) diagram used in traditional database design, and the direction of each relationship is marked with an arrow. A knowledge graph usually forms a large network with entities as nodes. The schema of the graph (that is, its organization and structure) corresponds to a data model and describes the types (Type) contained in the domain and the properties (Property) that describe the entities of each type. A relationship between entities is an edge (Relationship), and the information carried by an entity is an attribute (Attribute). Starting from actual business requirements, the data model is abstracted from the existing data tables: dispatching desks, stations, sections, and the like are the main entities, their basic information forms the attributes, and the associations between stations and between stations and dispatching desks form the edges. Information from multiple fields is thereby associated and filled into the graph, which provides more diversified knowledge.
FIG. 2 shows the main concepts and relationship model of the railway domain. As shown in FIG. 2, the attributes of a station track (track) mainly include direction (dir), station code (statinid), platform (passer), limit (limit), name (name), and the like. The attributes of a station (station) mainly include station name (name), station code (id), dispatching desk (ddt), type (type), bureau code (bureaucode), and the like. The attributes of a section (section) mainly include station 1 (stationid1), station 2 (stationid2), section number (id), dispatching desk (ddt), direction (dir), and the like. The attributes of a node (dot) mainly include sequence number (sequence), station code (station), up-direction kilometer post (upkm), down-direction kilometer post (downkm), position (position), and the like. The attributes of a dispatching desk (dispatch) mainly include code (id), name (name), and the like. A layout is the display file of a single station. Station tracks and display files belong to a station; a station is managed by a dispatching desk; and sub-diagrams, nodes, and sections are arranged under a dispatching desk. One bureau governs a plurality of dispatching desks.
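The attribute lists given for FIG. 2 can be written down as a small schema table and used to check whether an entity record carries the expected attributes. This is a minimal Python sketch; the dictionary `SCHEMA` and the function `missing_attributes` are hypothetical names, and the attribute identifiers are copied as they appear in the text, which describes the lists as non-exhaustive.

```python
# Attribute identifiers per concept, as listed in the description of FIG. 2.
SCHEMA = {
    "track": ["dir", "statinid", "passer", "limit", "name"],
    "station": ["name", "id", "ddt", "type", "bureaucode"],
    "section": ["stationid1", "stationid2", "id", "ddt", "dir"],
    "dot": ["sequence", "station", "upkm", "downkm", "position"],
    "dispatch": ["id", "name"],
}


def missing_attributes(concept, properties):
    """Return the schema attributes that a record of the given concept lacks."""
    return [attr for attr in SCHEMA[concept] if attr not in properties]
```

For example, a station record that carries only `name` and `id` would be reported as missing `ddt`, `type`, and `bureaucode`.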
IV. Storage of the railway domain knowledge graph.
After the railway domain knowledge graph has been modeled (that is, the important concepts, classes, and relationships have been determined by the knowledge graph method) and the data sources have been determined, the next question is how to store the data. Knowledge storage means storing the acquired knowledge in a specific physical structure. Compared with a traditional table-based database, a database that uses the graph model as its storage unit is good at storing and processing large amounts of complex entity data, can effectively handle the complex relationship data of massive numbers of entities, and supports fast traversal by graph algorithms. Considering the scale of the graph, the complexity of operation, and the popularity and maturity of the product, the Neo4j database is selected as the storage system of the knowledge graph; other databases may be used in its place.
A graph database uses a graph composed of nodes and relationships as its storage data model, which gives it a natural advantage in processing the entity (node) and relationship (edge) computations of a knowledge graph. The nodes and edges of the graph correspond to the nodes and relationships of the graph database. Compared with a traditional relational database, a graph database can resolve complex relationships quickly. The data structure formed by nodes and relationships supports all the usual database operations, such as Create, Update, Read, and Delete, as well as rich relational connections, so that complex relational data can be queried relatively efficiently. Graph databases can also describe complex relationships between data more succinctly and clearly than other types of databases.
The railway domain knowledge graph uses the storage structure of the Neo4j graph database, which has two data storage forms:
Node (Node): represents an entity (Entity) as described above. A node contains several properties (Property) in the form of key-value pairs, and the type of each node is distinguished by its label.
Relationship (Relationship): contains its own properties and type label, together with the IDs of its start node and end node. Illustratively, the categories of relationship include management relationships, belonging relationships, and containment relationships. Taking the Jingha desk as an example: the Jingha desk covers 5 stations, of which the starting node is Beijing station and the ending node is Harbin station, and these stations are managed by the Jingha desk (management relationship); 10 station tracks belong to Beijing station (belonging relationship); Beijing station and Harbin station are both stations, so the station concept includes Beijing station, Harbin station, and the like (containment relationship).
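The node and relationship storage forms can be mimicked in memory to make their shape concrete. The sketch below is a minimal Python illustration and does not use Neo4j itself; the class names, the relationship type strings `manages` and `belongs_to`, and the placeholder entities are assumptions for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    label: str                                        # distinguishes the node type
    properties: dict = field(default_factory=dict)    # key-value pairs


@dataclass
class Relationship:
    rel_type: str                                     # type label of the relationship
    start_id: int                                     # ID of the start node
    end_id: int                                       # ID of the end node
    properties: dict = field(default_factory=dict)


# Placeholder data: one desk manages one station; one track belongs to it.
nodes = {
    1: Node(1, "Ddt", {"title": "desk_1"}),
    2: Node(2, "Station", {"title": "station_1"}),
    3: Node(3, "Track", {"title": "track_1"}),
}
rels = [Relationship("manages", 1, 2), Relationship("belongs_to", 3, 2)]


def targets(start_id, rel_type):
    """Follow relationships of one type outward from a start node."""
    return [nodes[r.end_id] for r in rels
            if r.start_id == start_id and r.rel_type == rel_type]
```

Because every relationship stores the IDs of both of its end nodes, following an edge is a direct lookup rather than a join, which is the property the text attributes to graph storage.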
FIG. 3 shows an example of storing the entities and relationships of the railway dispatching domain knowledge graph in the Neo4j graph database. The figure is generated automatically by a graph algorithm, and the text in it is illustrative rather than limiting. Because the computer-generated graph contains a large amount of information, FIG. 3 shows only a part of it; some of the text in the dense-arrow region on the right side of FIG. 3 is obscured, which itself indicates the richness of the test data.
V. Querying the railway domain knowledge graph.
In many fields, graph queries involve only a single hop or two to three hops of traversal, for which SQL is fully adequate. In the embodiment of the invention, however, the queries involve correlation analysis of data and require complex subgraph mining, for which the expressive power of SQL is relatively weak. To query the complexly related data of the railway dispatching domain accurately and efficiently, Cypher, the query language of the Neo4j graph database, is introduced. With Cypher, relationship subgraphs can be traversed without writing traversal code; Cypher query statements are generated from the feature words and classification labels of the query and are used to retrieve answers from Neo4j. Of course, the specific retrieval method may be replaced by other methods according to actual conditions.
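One way to generate a Cypher statement from a classification label and a set of feature values is sketched below in Python. This is an assumption about how such generation could work, not the method disclosed in the patent; the function name `build_match` is hypothetical. Values are passed as Cypher parameters (`$name` syntax) rather than concatenated into the query text.

```python
import re

# Labels and property keys cannot be parameterized in Cypher, so they are
# validated as plain identifiers before being placed in the query text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_match(label, filters):
    """Build a parameterized MATCH query for nodes with the given label.

    Returns the query text and the parameter dict to send with it.
    """
    if not _IDENT.match(label):
        raise ValueError(f"invalid label: {label!r}")
    for key in filters:
        if not _IDENT.match(key):
            raise ValueError(f"invalid property key: {key!r}")

    query = f"MATCH (n:{label})"
    if filters:
        conditions = " AND ".join(f"n.{key} = ${key}" for key in filters)
        query += f" WHERE {conditions}"
    return query + " RETURN n", dict(filters)
```

For instance, `build_match("Station", {"stationCode": 500})` yields the text `MATCH (n:Station) WHERE n.stationCode = $stationCode RETURN n` together with the parameter dictionary, which a Neo4j client could then execute.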
The operations and implementation methods for storing, updating, and querying railway dispatching knowledge with the Cypher language in this example are described in detail below.
1. Creating an entity node.
The Cypher statement in this example that creates the Long child south entity is as follows:
CREATE (s:Station {title: 'Long child south', stationCode: 500, delta: 22}) RETURN s
2. Updating an entity node.
The Cypher statement in this example that updates a property of the Long child south entity is as follows:
MATCH (s) WHERE s.stationCode = 500 SET s.stationCode = 501 RETURN s
3. Querying an entity node.
The Cypher statement in this example that queries the station whose station code is 500 is as follows:
MATCH (s {stationCode: 500}) RETURN s
4. Creating an entity relationship.
The Cypher statement in this example that creates the relationship between the Jingguang South desk and Songchai station is:
MATCH (d:Ddt), (s:Station) WHERE d.title = 'Jingguang South desk' AND s.title = 'Songchai' CREATE (d)-[r:RelationShip {title: 'management'}]->(s) RETURN r
5. Updating an entity relationship.
The relationship is first read by matching the graph pattern with MATCH, and its attribute content is then updated with SET. In this example, the Cypher statement that changes the relationship to double vision is:
MATCH (d)-[r]->(s) WHERE id(r) = 13 SET r.title = 'double vision'
6. Querying entity relationships.
In this example, the Cypher statement that queries all stations under double vision of the Kyo Guan South Tai dispatching desk is:
MATCH (d)-[r]->(s) WHERE d.title = 'Kyo Guan South Tai' AND r.title = 'double vision' RETURN s
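As a non-limiting illustration, the effect of the relationship operations in items 4 to 6 can be imitated with a small in-memory structure in Python, which may be convenient for checking query logic without a running database. TinyGraph, Node, Rel and their methods are hypothetical stand-ins and are not a Neo4j API; the entity and relationship titles are those used in the examples above.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List


@dataclass
class Node:
    label: str                     # node label, e.g. "Ddt" or "Station"
    props: Dict[str, object]       # key-value attributes of the entity


@dataclass
class Rel:
    rel_id: int                    # plays the role of id(r) in the statements above
    start: Node
    end: Node
    props: Dict[str, object] = field(default_factory=dict)


class TinyGraph:
    """In-memory stand-in for the graph database (illustration only)."""

    def __init__(self) -> None:
        self.rels: Dict[int, Rel] = {}
        self._ids = count()

    def create_rel(self, start: Node, end: Node, title: str) -> Rel:
        # Item 4: create a directed relationship carrying a title attribute.
        rel = Rel(next(self._ids), start, end, {"title": title})
        self.rels[rel.rel_id] = rel
        return rel

    def set_rel_title(self, rel_id: int, title: str) -> None:
        # Item 5: locate the relationship by its ID and overwrite its title.
        self.rels[rel_id].props["title"] = title

    def ends_of(self, start_title: str, rel_title: str) -> List[Node]:
        # Item 6: end nodes reached from a given start node over
        # relationships with a given title.
        return [
            rel.end
            for rel in self.rels.values()
            if rel.start.props.get("title") == start_title
            and rel.props.get("title") == rel_title
        ]


desk = Node("Ddt", {"title": "Kyo Guan South Tai"})
station = Node("Station", {"title": "Songchai"})
graph = TinyGraph()
rel = graph.create_rel(desk, station, "management")
graph.set_rel_title(rel.rel_id, "double vision")
print([n.props["title"] for n in graph.ends_of("Kyo Guan South Tai", "double vision")])
```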
Sixthly, correcting errors in the road bureau configuration file.
The above five parts describe how the railway domain knowledge graph is constructed, stored and queried. On that basis, the error correction process is described below. Error correction mainly refers to information query and correction, and its main steps are as follows:
Step S1, identifying the dispatching section data from the input road bureau configuration file, and extracting the corresponding entity ID, for example, the dispatching desk code (ddtCode).
Step S2, querying the graph database with the entity ID; if the corresponding entity ID is found, going to step S3, otherwise going to step S4.
Step S3, querying the entity name (title) corresponding to the entity ID; if the entity name is consistent with the road bureau configuration file, going to step S5, otherwise going to step S6. For example, if the entity ID is a dispatching desk code, the dispatching desk code is used to query the corresponding dispatching desk name.
Step S4, querying all entities managed by the entity corresponding to the entity ID and the related sub-entity IDs, and comparing them with the road bureau configuration file; if the number of managed entities and the related sub-entity IDs are consistent, correcting the entity ID in the road bureau configuration file (i.e., the entity ID in step S1); if they are inconsistent, correcting the number of managed entities and the related sub-entity IDs in the road bureau configuration file. For example, if the entity in step S1 is a dispatching desk, the entities managed by the dispatching desk are stations; such logic is common knowledge and is not described further here.
Step S5, querying all entities managed by the entity corresponding to the entity ID and the related sub-entity IDs, and comparing them with the road bureau configuration file; if the number of managed entities and the related sub-entity IDs are consistent, proceeding to the next check; if they are inconsistent, correcting the number of managed entities and the related sub-entity IDs in the road bureau configuration file.
Step S6, querying all entities managed by the dispatching desk and the related sub-entity IDs, and comparing them with the road bureau configuration file; if the managed entities and the related sub-entity IDs are consistent, correcting the entity ID in the road bureau configuration file; if they are inconsistent, correcting the number of managed entities and the related sub-entity IDs in the configuration file.
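Purely as an illustration, the branching of steps S2 and S3 and the comparison of step S5 can be sketched in Python as follows. The sketch assumes that the relevant part of the graph database has already been read into a dictionary keyed by dispatching desk code; DeskRecord, ConfigSection, next_step and check_managed are hypothetical names, and steps S4 and S6 are deliberately left out.

```python
from dataclasses import dataclass, replace
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class DeskRecord:
    """What the graph database is assumed to return for one dispatching desk."""
    ddt_code: int
    title: str
    station_codes: FrozenSet[int]


@dataclass(frozen=True)
class ConfigSection:
    """One dispatching section parsed from the road bureau configuration file."""
    ddt_code: int
    title: str
    station_codes: FrozenSet[int]


def next_step(section: ConfigSection, graph: Dict[int, DeskRecord]) -> str:
    """Steps S2 and S3: decide whether S4, S5 or S6 handles this section."""
    record = graph.get(section.ddt_code)      # S2: query by entity ID
    if record is None:
        return "S4"                           # entity ID not found
    if record.title == section.title:         # S3: compare entity names
        return "S5"                           # ID found and name consistent
    return "S6"                               # ID found but name differs


def check_managed(section: ConfigSection, record: DeskRecord) -> ConfigSection:
    """Step S5: compare the managed stations and correct them if they differ."""
    # Equal sets imply both the same count and the same station codes.
    if section.station_codes == record.station_codes:
        return section                        # consistent: go on to the next check
    # Inconsistent: take the managed station codes from the query result.
    return replace(section, station_codes=record.station_codes)
```

Returning a new ConfigSection instead of modifying the input keeps the original file content available for reporting what was changed, which is a design choice of this sketch rather than a requirement of the embodiment.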
Those skilled in the art will understand that specific entities such as dispatching desks and stations all have unique names and codes (IDs), namely desk names, desk codes, station names and station codes, so the entities described above can all be queried according to the above principle. Fig. 4 illustrates the above query and correction process using only the dispatching desk as the entity and the stations managed by the dispatching desk as child entities; sections are also managed by the dispatching desk, and a single section is formed by two adjacent stations.
According to the scheme of the embodiment of the invention, a road bureau configuration exchange file error proofing system based on a domain knowledge graph is constructed. The concept and model of the railway dispatching knowledge graph are established on the basis of existing railway knowledge, a Neo4j-based graph database is built from the existing data sources (namely the production database, ministerial protocol files and the like), and the road bureau configuration file is queried, retrieved and corrected using Cypher. This solves the problems that existing schemes can only check the format and cannot correct the content, that code must be rewritten after each subsequent protocol upgrade, and that keeping up with data updates by continually adding manpower is not sustainable. The system can also be used to query and retrieve other types of railway files.
From the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, or by software plus a necessary general-purpose hardware platform. On this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) and which includes several instructions for enabling a computer device (such as a personal computer, a server or a network device) to execute the methods according to the embodiments of the present invention.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the system is divided into different functional modules to perform all or part of the above described functions.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A road bureau configuration file error proofing system based on a domain knowledge graph, characterized by comprising:
a data acquisition unit, configured to acquire, from a data source, data for constructing a railway domain knowledge graph;
a railway domain knowledge graph construction and storage unit, configured to obtain the relevant data from the data acquisition unit, extract from it railway-related entities, entity-related information and the relationships between different entities, take the entity-related information as attributes of the entities, construct connecting edges between different entities from the relationships between them so as to associate the entities, construct the railway domain knowledge graph, and store it as a graph database;
an information query and correction unit, configured to identify dispatching section data from the input road bureau configuration file, extract the corresponding dispatching desk information, query the graph database for the corresponding dispatching desk code and for the stations managed by the dispatching desk and their station codes, compare the query result with the information in the road bureau configuration file, and, if the two are inconsistent, correct the road bureau configuration file using the query result; wherein the specific dispatching desks, stations, sections and station tracks to be queried are all entities.
2. The road bureau configuration file error proofing system based on a domain knowledge graph of claim 1, wherein the data for constructing the railway domain knowledge graph comprises: structured data, semi-structured data, and plain text data;
the structured data includes: two-dimensional tabular data represented and stored in a relational database;
the semi-structured data includes: data that uses related tags to separate semantic elements and is not in database form.
3. The system of claim 1, wherein the extracted entities related to the railway, information related to the entities, and relationships between different entities form a data structure of two types of triples:
the first type of triple contains the relationship between different entities, and is expressed as <entity 1, relationship, entity 2>, where entity 1 and entity 2 denote two different entities;
the second type of triple contains the relevant information of the entity, which is expressed as < entity, attribute, value >.
4. The system of claim 1 or 3, wherein each entity has a unique name and ID; the relationships between entities include: the management relationship between a dispatching desk and a station, the connection relationship between a section and its adjacent stations, and the subordination relationship between a station track and a station.
5. The system of claim 1, wherein the query and correction process of the information query and correction unit comprises:
step S1, identifying the dispatching section data from the input road bureau configuration file, and extracting the corresponding entity ID;
step S2, querying the graph database with the entity ID; if the corresponding entity ID is found, going to step S3, otherwise going to step S4;
step S3, querying the entity name corresponding to the entity ID; if the entity name is consistent with the road bureau configuration file, going to step S5, otherwise going to step S6;
step S4, querying all entities managed by the entity corresponding to the entity ID and their related entity IDs, and comparing them with the road bureau configuration file; if the number of managed entities and the related entity IDs are consistent, correcting the entity ID in the road bureau configuration file; if they are inconsistent, correcting the number of managed entities and the related entity IDs in the road bureau configuration file;
step S5, querying all entities managed by the entity corresponding to the entity ID and their related entity IDs, and comparing them with the road bureau configuration file; if the number of managed entities and the related entity IDs are consistent, proceeding to the next check; if they are inconsistent, correcting the number of managed entities and the related entity IDs in the road bureau configuration file;
step S6, querying all entities managed by the entity corresponding to the entity ID and their related entity IDs, and comparing them with the road bureau configuration file; if the number of managed entities and the related entity IDs are consistent, correcting the entity name in the road bureau configuration file; if they are inconsistent, correcting the number of managed entities and the related entity IDs in the configuration file.
6. The road bureau configuration file error proofing system based on a domain knowledge graph of claim 1 or 5, wherein the railway domain knowledge graph uses the storage structure of a Neo4j graph database, which comprises two data storage forms:
nodes: each entity is taken as a node, a node contains a number of attributes in key-value pair form, and the different types of nodes are distinguished by the label of each node;
relationships: each relationship contains its own attributes and type label, as well as the IDs of its start node and end node.
7. The system of claim 6, wherein the database is queried using the Cypher language.
CN202110955610.7A 2021-08-19 2021-08-19 Road bureau configuration file error proofing system based on domain knowledge map Pending CN113704491A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110955610.7A CN113704491A (en) 2021-08-19 2021-08-19 Road bureau configuration file error proofing system based on domain knowledge map

Publications (1)

Publication Number Publication Date
CN113704491A (en) 2021-11-26

Family

ID=78653643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110955610.7A Pending CN113704491A (en) 2021-08-19 2021-08-19 Road bureau configuration file error proofing system based on domain knowledge map

Country Status (1)

Country Link
CN (1) CN113704491A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115092215A (en) * 2022-05-24 2022-09-23 卡斯柯信号有限公司 Connection relation-based intersection checking method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination