CN117056304A - Method and device for constructing main database based on cloud platform and electronic equipment - Google Patents


Info

Publication number
CN117056304A
Authority
CN
China
Prior art keywords
data
main
main data
cloud platform
constructing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310869889.6A
Other languages
Chinese (zh)
Inventor
胡泓
吴海燕
徐健
贺莉娜
李成良
胡宇
戴祎
王琴琴
徐震宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing MTR Construction Administration Corp
Original Assignee
Beijing MTR Construction Administration Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing MTR Construction Administration Corp filed Critical Beijing MTR Construction Administration Corp
Priority to CN202310869889.6A priority Critical patent/CN117056304A/en
Publication of CN117056304A publication Critical patent/CN117056304A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/256 Integrating or interfacing systems involving database management systems in federated or virtual databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a device, electronic equipment and a medium for constructing a main database based on a cloud platform, wherein the method comprises the following steps: determining main data based on an application range, an application period and an application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform; classifying the main data and determining attribute information of the main data; generating codes corresponding to the main data based on attribute information of the main data and preset coding rules; and constructing a main database based on the encoded and classified main data. According to the construction method of the main database based on the cloud platform, unified access of the network-level cross-system rail traffic data is achieved through the cloud platform, the main data are accurately identified according to the application range, the application period and the application scene of the access data, and are classified and encoded, so that the main database is constructed, unified management of the main data is achieved, and the data resource utilization rate and the service level of multi-line integrated operation and maintenance are improved.

Description

Method and device for constructing main database based on cloud platform and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for constructing a main database based on a cloud platform, and an electronic device.
Background
At present, with the rapid development of domestic rail transit construction and the rapid expansion of networked operation, operation and maintenance data lacks unified access and integration because lines are individualized and differ in technology level.
Cloud platform technology, main data management technology and big data technology are gradually being applied to the informatization of the rail transit industry, but independent or duplicated construction is common. The business systems are incomplete and relatively independent of one another, and enterprise private cloud platforms do not support network-level applications across network segments. The large amount of data generated over long-term use is neglected, data standards are inconsistent, main data is not applied in a standardized way, and the substantial application value of the data is not realized.
Data standards differ across regions and enterprises in the rail transit industry, and data management is not standardized, so the operation and maintenance data of the different specialties on each line is poorly utilized.
Disclosure of Invention
Aiming at the problems existing in the prior art, the embodiment of the invention provides a method and a device for constructing a main database based on a cloud platform and electronic equipment.
The invention provides a method for constructing a main database based on a cloud platform, which comprises the following steps:
determining main data based on an application range, an application period and an application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform;
classifying the main data and determining attribute information of the main data;
generating codes corresponding to the main data based on the attribute information of the main data and a preset coding rule;
and constructing a main database based on the encoded and classified main data.
In some embodiments, the classifying the main data, determining attribute information of the main data, includes:
classifying the main data based on preset topic domains, and determining main data associated with each preset topic domain;
and determining attribute information of the main data based on the main data associated with each preset topic domain.
In some embodiments, the constructing a main database based on the encoded and classified main data includes:
performing data cleaning on the classified main data;
and constructing a main database based on the codes, the preset topic domains and the main data after data cleaning.
In some embodiments, the application scope of the target data includes:
the number of rail transit routes to which the target data is applied;
the number of specialized fields of the target data application;
the number of business departments to which the target data is applied.
In some embodiments, the preset topic domain includes at least one of:
finance, projects, contracts, suppliers, line networks, organization and personnel, locations, assets, materials, equipment, production data, safety, auxiliary resources.
The invention also provides a device for constructing a main database based on a cloud platform, which comprises:
the identification module is used for determining main data based on the application range, the application period and the application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform;
the classification module is used for classifying the main data and determining attribute information of the main data;
the encoding module is used for generating an encoding corresponding to the main data based on the attribute information of the main data and a preset encoding rule;
and the construction module is used for managing the classified main data based on the codes.
In some embodiments, the classification module is specifically configured to:
classifying the main data based on preset topic domains, and determining main data associated with each preset topic domain;
and determining attribute information of the main data based on the main data associated with each preset topic domain.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the method for constructing a main database based on a cloud platform as described in any of the above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for constructing a main database based on a cloud platform as described in any of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the method for constructing a main database based on a cloud platform as described in any of the above.
According to the cloud platform-based method, device and electronic equipment for constructing the main database, unified access of network-level cross-system rail traffic data is achieved through the cloud platform, the main data are accurately identified according to the application range, the application period and the application scene of the access data, and are classified and encoded, so that the main database is constructed, unified management of the main data is achieved, and the data resource utilization rate and the service level of multi-line integrated operation and maintenance are improved.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow diagram of a method for constructing a main database based on a cloud platform according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a main data management platform according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a main data management platform according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a big data analysis platform according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a big data analysis platform according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a data flow of a big data analysis platform provided by an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a device for constructing a main database based on a cloud platform according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first", "second" and the like in the description and in the claims are used to distinguish between similar elements and do not necessarily describe a particular sequential or chronological order. It is to be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments of the invention can operate in sequences other than those illustrated or described herein. The objects distinguished by "first" and "second" are generally not limited in number; for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects before and after it.
With the rapid development of domestic rail transit construction and the rapid expansion of networked operation, the industry faces problems such as individualized lines, differing technology levels, a lack of unified access and integration of operation data, low utilization of operation data, poor integration and linkage between business processes, low resource utilization and high maintenance costs. The main pain points and difficulties are as follows:
professional systems and information data lack a unified integration and management platform;
the software and hardware resources are repeatedly built, the utilization rate is low, and the overall construction and maintenance cost is high;
the unified management difficulty of the multi-line assets and the materials is high;
the data standards of different regions and different enterprises in the rail traffic industry are not uniform, and the data management is not standard;
the comprehensive utilization rate of the operation and maintenance data of different professions of each line is low, and the intelligent analysis degree is low;
How to improve the business level of multi-line integrated operation and maintenance, strengthen whole-life-cycle management of the massive assets of rail transit, improve the utilization of software and hardware resources, comprehensively reduce operation and maintenance costs, and thereby ensure line operation safety, equipment health and service reliability is a problem that urgently needs to be solved and studied in depth.
Fig. 1 is a flow chart of a method for constructing a main database based on a cloud platform according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step 101, determining main data based on an application range, an application period and an application scene of target data; the target data are cross-system rail traffic data acquired through a cloud platform;
Step 102, classifying the main data, and determining attribute information of the main data;
Step 103, generating codes corresponding to the main data based on attribute information of the main data and preset coding rules;
Step 104, constructing a main database based on the encoded and classified main data.
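As a reading aid, steps 101 to 104 can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the implementation of the invention: the `Record` structure, the function `build_main_database` and the three callables passed to it are hypothetical names introduced only for illustration, and the nested-dictionary layout of the resulting database is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """One item of target data (hypothetical structure for illustration)."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    topic_domain: str = ""
    code: str = ""


def build_main_database(
    target_data: List[Record],
    is_main_data: Callable[[Record], bool],   # step 101: identification rule
    classify: Callable[[Record], str],        # step 102: returns a topic domain
    encode: Callable[[Record, int], str],     # step 103: coding rule
) -> Dict[str, Dict[str, Record]]:
    """Run steps 101-104 and return {topic domain: {code: record}}."""
    database: Dict[str, Dict[str, Record]] = {}
    counters: Dict[str, int] = {}
    for record in target_data:
        if not is_main_data(record):          # step 101: keep only main data
            continue
        record.topic_domain = classify(record)            # step 102
        sequence = counters.get(record.topic_domain, 0) + 1
        counters[record.topic_domain] = sequence
        record.code = encode(record, sequence)            # step 103
        # Step 104: store the record under its topic domain, keyed by its code.
        database.setdefault(record.topic_domain, {})[record.code] = record
    return database
```

The three rules are passed in as functions so that the identification, classification and coding logic described in the following paragraphs can each be replaced independently.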
It should be noted that, the execution subject of the method for constructing a main database based on a cloud platform provided by the present invention may be an electronic device, a component in the electronic device, an integrated circuit, or a chip. The electronic device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a cell phone, tablet computer, notebook computer, palm computer, vehicle mounted electronic device, wearable device, ultra-mobile personal computer (ultra-mobile personal computer, UMPC), netbook or personal digital assistant (personal digital assistant, PDA), etc., and the non-mobile electronic device may be a server, network attached storage (Network Attached Storage, NAS), personal computer (personal computer, PC), television (TV), teller machine or self-service machine, etc., without limitation of the present invention.
In step 101, main data is determined based on an application range, an application period and an application scene of target data; the target data are cross-system rail traffic data acquired through a cloud platform.
The target data is cross-system rail transit data acquired through the cloud platform, where the different systems may be different database systems or the equipment systems of different acquisition devices. Rail transit data includes data generated during rail transit operation, such as optical signals, thermal signals and equipment status signals.
Optionally, cross-region, cross-line, cross-professional and cross-business data acquisition and interaction can be realized by utilizing the multi-network fusion of the cloud platform.
For example, the target data may include: business office data, specialty data for various equipment and vehicles, operation and maintenance management data, resource management data, data from other specialty production systems, and the like.
The target data covers all specialties related to rail transit operation, such as vehicles, communication signals, power supply, electromechanics and civil engineering lines.
Main data is data shared between systems (such as data related to customers, suppliers, accounts and organizations). It is core business data that is used frequently across multiple business departments and multiple systems, and it is characterized by high sharing, long-term stability and high business value.
High sharing: multiple business, specialty or department systems have an existing or potential need for the data, so the core business data must be shared among those systems. For example, supplier main data is used in purchasing management, material management and contract management, so it exhibits the high sharing characteristic of main data. By contrast, an employee's personal payroll information is used only in the human resources system; it is core business data of that system, but it is private data.
Long-term stability: main data typically persists throughout the life cycle of the business object, or even longer. Once entered into a system, the basic information of main data is essentially unchanged or changes only at low frequency. Main data differs fundamentally from transaction data. Transaction data reflects the real-time business records of the enterprise and changes in real time, typically describing a transaction or business state occurring at a point in time, whereas main data is relatively stable and remains static over long periods. For example, basic information such as the contract code and contract signing time in contract main data is relatively stable and in high demand for sharing, so it can serve as the basic information of contract main data; contract payment and contract change information changes frequently and is not included in the basic information of main data.
High business value: main data is core business entity data with high business value, around which the business and production activities of the enterprise revolve. Whether a data resource has high business value is difficult to evaluate quantitatively and must be analyzed case by case, drawing on a survey of the current state of the enterprise's business. Examples of main data with high business value are staff, organizations, suppliers and contracts.
Whether the collected target data meets the high sharing requirement can be judged from its application range, whether it meets the long-term stability requirement from its application period, and whether it meets the high business value requirement from its application scene. When all three requirements are met, the target data can be determined to be main data.
In some embodiments, the application scope of the target data includes:
the number of rail transit routes to which the target data is applied;
the number of specialized fields of the target data application;
the number of business departments to which the target data is applied.
When judging whether or not the target data satisfies the high sharing requirement according to the application range of the target data, the judgment can be made by:
judging, according to the number of rail transit lines to which the target data is applied, whether the target data is used on a plurality of (more than 2) lines;
judging, according to the number of specialized fields to which the target data is applied, whether the target data is used by a plurality of (more than 2) specialties;
judging, according to the number of business departments to which the target data is applied, whether the target data is used by a plurality of (more than 2) business departments.
When at least 2 of the above conditions are satisfied, the target data can be determined to satisfy the high sharing requirement.
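The high sharing judgment above can be illustrated with a short sketch. The names `ApplicationScope` and `is_highly_shared` are hypothetical. The comparison follows the literal wording "more than 2" (strictly greater than the threshold); if the intended meaning is "2 or more", the threshold would be set to 1 instead. Both the threshold and the number of required conditions are exposed as parameters for that reason.

```python
from dataclasses import dataclass


@dataclass
class ApplicationScope:
    """Application range of one item of target data (hypothetical structure)."""
    line_count: int        # rail transit lines on which the data is applied
    specialty_count: int   # specialized fields in which the data is applied
    department_count: int  # business departments in which the data is applied


def is_highly_shared(scope: ApplicationScope,
                     threshold: int = 2,
                     min_conditions: int = 2) -> bool:
    """Return True when at least `min_conditions` of the three counts exceed `threshold`."""
    conditions = [
        scope.line_count > threshold,
        scope.specialty_count > threshold,
        scope.department_count > threshold,
    ]
    # Each True counts as 1, so the sum is the number of satisfied conditions.
    return sum(conditions) >= min_conditions
```

For example, data used on 3 lines and by 4 specialties but by only 1 department satisfies two of the three conditions and is treated as highly shared.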
When judging whether the target data satisfies the long-term stability requirement according to the application period of the target data, the judgment can be made by:
judging whether the target data remains valid throughout the business life cycle, and whether the target data remains unchanged over a long period (e.g., more than 1 year).
When both conditions are met, the target data is determined to satisfy long-term stability.
When judging whether the target data meets the high service value characteristic requirement according to the application scene of the target data, the judgment can be performed by the following modes:
judging, according to historical experience data, whether the target data belongs to the core business scope of the enterprise; judging whether the target data is used for cost-benefit analysis; and judging whether the target data is used for decision assistance.
The core business refers to business of higher importance. For example, for a rail operation company, the core business includes line operation and maintenance management, equipment and facility management, and the like. Decision assistance refers to the comprehensive analysis and presentation of data to support the decisions made by company management.
When at least 2 of the above conditions are satisfied, the target data can be determined to satisfy the high business value requirement.
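The long-term stability and high business value judgments, and the overall rule that all three characteristics must hold, can be sketched as follows. This is an illustrative sketch only: the `Candidate` structure and the function names are hypothetical, the result of the application-range judgment is assumed to be available as a boolean, and "more than 1 year" is approximated as more than 365 days since the last change.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    """Facts about one item of target data (hypothetical structure)."""
    highly_shared: bool                  # outcome of the application-range judgment
    valid_over_business_lifecycle: bool  # application period, condition 1
    last_changed: date                   # application period, condition 2
    in_core_business: bool               # application scene, condition 1
    used_for_cost_benefit_analysis: bool # application scene, condition 2
    used_for_decision_assistance: bool   # application scene, condition 3


def is_long_term_stable(c: Candidate, today: date,
                        min_unchanged_days: int = 365) -> bool:
    """Both period conditions must hold (assumption: 1 year = 365 days)."""
    unchanged_long_enough = (today - c.last_changed).days > min_unchanged_days
    return c.valid_over_business_lifecycle and unchanged_long_enough


def has_high_business_value(c: Candidate, min_conditions: int = 2) -> bool:
    """At least `min_conditions` of the three scene conditions must hold."""
    satisfied = sum([
        c.in_core_business,
        c.used_for_cost_benefit_analysis,
        c.used_for_decision_assistance,
    ])
    return satisfied >= min_conditions


def is_main_data(c: Candidate, today: date) -> bool:
    """Target data is main data only when all three characteristics hold."""
    return (c.highly_shared
            and is_long_term_stable(c, today)
            and has_high_business_value(c))
```

Under these assumptions, a supplier record that has not changed for several years passes all three judgments, while a frequently updated payment record fails the stability judgment even if it is shared and valuable.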
According to the main data identification method provided by the embodiment of the invention, the main data is accurately identified by comprehensively judging the application range, the application period and the application scene of the target data, so that the utilization rate of the main data is improved.
Alternatively, a main data recognition model may be constructed according to the above judgment conditions, and the collected target data may be recognized by the main data recognition model. Table 1 shows the recognition result of the main data recognition model.
TABLE 1
In step 102, the main data is classified, and attribute information of the main data is determined.
In some embodiments, the classifying the main data, determining attribute information of the main data, includes:
classifying the main data based on preset topic domains, and determining main data associated with each preset topic domain;
and determining attribute information of the main data based on the main data associated with each preset topic domain.
In some embodiments, the preset topic domain includes at least one of:
finance, projects, contracts, suppliers, line networks, organization and personnel, locations, assets, materials, equipment, production data, safety, auxiliary resources.
The identified main data covers a wide range of business and includes multiple business types. The main data may therefore be classified according to preset topic domains, where a topic domain is a collection of closely related data topics.
In the embodiment of the invention, based on the current state of rail transit operation and the division of business segments, the topic domains can be divided into finance, projects, contracts, suppliers, line networks, organization and personnel, locations, assets, materials, equipment, production data, safety, auxiliary resources and the like.
According to the preset topic domains, the identified main data can be analyzed and assigned to the corresponding topic domain, so that the classified main data forms a two-level "topic domain - main data" hierarchy.
After the main data is classified by topic domain, the main data under each topic domain is obtained and assigned to its managing department, and the access rights to the main data are set according to the managing department; that is, the main data under each topic domain can be accessed by users of the corresponding managing department.
The attribute information of the main data is then determined according to the main data under each topic domain. The attribute information describes the attributes of the main data. For example, for main data under the organization and personnel topic domain, the attribute information may include name, age, gender, etc.; for main data under the finance topic domain, the attribute information may include cost, benefit, etc.
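The "topic domain - main data" hierarchy, the attribute information and the department-based access rule described above can be illustrated with a small catalog. This is a sketch under assumptions: `MainDataType`, `build_catalog` and `can_access` are hypothetical names, the department names are invented for the example, and granting access only to the managing department is one simple reading of the access rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MainDataType:
    """Definition of one kind of main data (hypothetical structure)."""
    name: str
    topic_domain: str
    managing_department: str
    attributes: List[str] = field(default_factory=list)


def build_catalog(types: List[MainDataType]) -> Dict[str, Dict[str, MainDataType]]:
    """Group main data definitions into {topic domain: {main data name: definition}}."""
    catalog: Dict[str, Dict[str, MainDataType]] = {}
    for item in types:
        catalog.setdefault(item.topic_domain, {})[item.name] = item
    return catalog


def can_access(catalog: Dict[str, Dict[str, MainDataType]],
               user_department: str, topic_domain: str, name: str) -> bool:
    """Allow access only to users of the department that manages the main data."""
    entry = catalog.get(topic_domain, {}).get(name)
    return entry is not None and entry.managing_department == user_department


# Example definitions; department names and attribute lists are illustrative only.
EXAMPLE_TYPES = [
    MainDataType("employee", "organization and personnel", "human resources",
                 ["name", "age", "gender"]),
    MainDataType("cost center", "finance", "finance department",
                 ["cost", "benefit"]),
]
```

With this layout, looking up a topic domain returns every kind of main data classified under it together with its attribute list and managing department.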
In step 103, based on the attribute information of the main data and a preset encoding rule, an encoding corresponding to the main data is generated.
According to the attribute information of each item of main data, the main data can be managed under a unified standard: main data with the same attributes is encoded sequentially in the same way under a preset unified coding rule, and each item of main data corresponds to a unique code. Applying unified coding specifications and verification logic to the main data under each topic domain in the rail transit industry keeps important data consistent across systems and improves the business coupling between systems.
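The text does not specify the preset coding rule, so the following sketch assumes one simple possibility: a fixed prefix per category followed by a zero-padded sequence number. The function name `generate_codes`, the prefixes and the number width are all hypothetical. Codes are unique as long as each category has a distinct prefix.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def generate_codes(items: Iterable[Tuple[str, str]],
                   prefixes: Dict[str, str],
                   width: int = 6) -> Dict[str, str]:
    """Assign each (category, name) pair a code of the form PREFIX + sequence number.

    Items in the same category are numbered in the order they appear,
    and an item that already has a code keeps it.
    """
    counters: Dict[str, int] = defaultdict(int)
    codes: Dict[str, str] = {}
    for category, name in items:
        if name in codes:
            continue  # one item, one code
        counters[category] += 1
        codes[name] = f"{prefixes[category]}{counters[category]:0{width}d}"
    return codes
```

For instance, with the assumed prefixes `SUP` for suppliers and `CON` for contracts, the first supplier receives `SUP000001`, the second `SUP000002`, and the first contract `CON000001`.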
Alternatively, a main data model may be built based on the attribute information of the main data. The main data model is used for defining the identified main data, such as defining coding rules, checking rules, data permissions, etc. Table 2 shows the result of defining the main data with the main data model.
TABLE 2
In step 104, a main database is constructed based on the encoded and classified main data.
After the encoding is completed, each main data has a unique encoding, so that a main database can be constructed according to the encoded and classified main data.
The constructed main database can be used for unified management of the identified main data, which can include data quality management, data statistical analysis, data distribution management, and the like.
The data quality management may include: formulating a quality audit plan based on the configured check rules, and performing routine quality analysis and monitoring of the main data according to that plan; supporting exact and fuzzy checking of data (such as project codes) and providing a data checking function; supporting verification of data against coding rules (such as rule verification of ID card numbers and mobile phone numbers); and checking data uniqueness, integrity and consistency.
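A few of the quality checks named above (uniqueness, integrity and rule verification) can be sketched as a single pass over the records. This is an illustrative sketch, not the platform's check engine: `check_quality` and the field names are hypothetical, and the mobile number rule (11 digits, starting with 1, second digit 3 to 9) is an assumed format for the example rather than a rule taken from the text.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Assumed format rule for the example: 11 digits, first digit 1, second digit 3-9.
MOBILE_PATTERN = re.compile(r"^1[3-9]\d{9}$")


def check_quality(records: List[Dict[str, str]],
                  key: str,
                  required: List[str]) -> List[Tuple[int, str]]:
    """Return (record index, issue) pairs for uniqueness, integrity and rule checks."""
    issues: List[Tuple[int, str]] = []
    key_counts = Counter(record.get(key) for record in records)
    for index, record in enumerate(records):
        # Uniqueness: the key (for example a project code) must not repeat.
        if key_counts[record.get(key)] > 1:
            issues.append((index, f"duplicate {key}"))
        # Integrity: every required field must be present and non-empty.
        for field_name in required:
            if not record.get(field_name):
                issues.append((index, f"missing {field_name}"))
        # Rule verification: validate the mobile number when one is given.
        mobile = record.get("mobile")
        if mobile and not MOBILE_PATTERN.match(mobile):
            issues.append((index, "invalid mobile"))
    return issues
```

The returned list could feed a quality audit report; an empty list means the records passed all configured checks.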
The data statistical analysis may include: carrying out statistical analysis on the integration type, the acquisition state and the like of the main data through a data management report; and carrying out statistical analysis on the authorization mode, the source system, the data quantity and the like of the main data through the main data statistical analysis report.
The data distribution management may include: providing an interface registration function for consuming systems according to the interface content of the consuming systems within the scope of the main data service requirements, including descriptions of distribution interfaces, parameters, services and the like. After the configuration is completed, distribution tasks are configured to call the system interface, so that the main data platform can send data to the data-consuming systems.
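The register-then-distribute flow described above can be sketched with an in-memory registry. The classes `DistributionInterface` and `DistributionManager`, and the use of a plain Python callable in place of a real system interface, are assumptions made for illustration; an actual platform would call a network interface of the consuming system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DistributionInterface:
    """Registered interface of one consuming system (hypothetical structure)."""
    system: str
    description: str
    send: Callable[[List[dict]], None]  # stands in for the real system interface


class DistributionManager:
    """Keeps registered interfaces and runs distribution tasks against them."""

    def __init__(self) -> None:
        self._interfaces: Dict[str, DistributionInterface] = {}

    def register(self, interface: DistributionInterface) -> None:
        """Interface registration: record how to reach a consuming system."""
        self._interfaces[interface.system] = interface

    def run_task(self, system: str, records: List[dict]) -> int:
        """Distribution task: push records to one consuming system.

        Returns the number of records sent; raises KeyError if the
        system has not been registered.
        """
        interface = self._interfaces[system]
        interface.send(records)
        return len(records)
```

Separating registration from task execution mirrors the two-stage description in the text: interfaces are described once, and distribution tasks then refer to them by system name.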
According to the construction method of the main database based on the cloud platform, unified access of network-level cross-system rail traffic data is achieved through the cloud platform, the main data are accurately identified according to the application range, the application period and the application scene of the access data, and are classified and encoded, so that the main database is constructed, unified management of the main data is achieved, and the data resource utilization rate and the service level of multi-line integrated operation and maintenance are improved.
In some embodiments, the constructing a main database based on the encoded and classified main data includes:
performing data cleaning on the classified main data;
and constructing a main database based on the codes, the preset topic domains and the main data after data cleaning.
After the main data is classified, data cleaning is performed on the classified main data. The data cleaning may include: data format standardization, missing value processing, outlier processing, duplicate value processing, data type conversion, data filtering, data normalization, data conversion, data verification, etc. Cleaning the classified main data ensures the accuracy and reliability of the main data.
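Some of the cleaning operations listed above (format standardization, missing value processing, duplicate value processing and filtering) can be illustrated with a short function. This is a minimal sketch under assumptions: `clean_records` is a hypothetical name, records are plain dictionaries, and the specific choices (trimming whitespace, filling defaults, keeping the first of each duplicate key, dropping records without a key) are examples rather than rules from the text.

```python
from typing import Dict, List, Optional


def clean_records(records: List[Dict[str, Optional[str]]],
                  key: str,
                  defaults: Dict[str, str]) -> List[Dict[str, Optional[str]]]:
    """Standardize formats, fill missing values, and drop keyless or duplicate records."""
    cleaned: List[Dict[str, Optional[str]]] = []
    seen = set()
    for raw in records:
        # Format standardization: trim surrounding whitespace from text values.
        record = {k: (v.strip() if isinstance(v, str) else v) for k, v in raw.items()}
        # Missing value processing: fill empty fields with configured defaults.
        for field_name, default in defaults.items():
            if not record.get(field_name):
                record[field_name] = default
        key_value = record.get(key)
        # Filtering and duplicate processing: skip records without a key
        # and keep only the first record for each key.
        if not key_value or key_value in seen:
            continue
        seen.add(key_value)
        cleaned.append(record)
    return cleaned
```

Whitespace is trimmed before duplicates are compared, so `" S1 "` and `"S1"` are treated as the same key.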
Therefore, a main database that can serve multiple lines and multiple management centers can be established according to the preset topic domains, the codes of the main data and the main data after data cleaning.
According to the method for constructing the main database based on the cloud platform, provided by the embodiment of the invention, the accuracy and the reliability of the main data are ensured by cleaning the classified main data, so that a more reliable main database is constructed, and the utilization rate of the main data is improved.
FIG. 2 is a schematic diagram of a main data management platform according to an embodiment of the present invention, where the design of the main data management platform is based on meeting the requirements of enterprise core data management and applications, and considering multiple layers of data sources, data access, data management, data assets and application dimensions, data services, etc. The main data platform architecture follows basic principles of integrity, integration, advancement, expandability, safety and the like, and is matched with the construction of a data standardization system.
As shown in fig. 2, the architecture of the main data management platform consists of 7 layers: a data source layer, a cloud platform layer, a data access layer, a data management layer, a data asset layer, a data service layer and an application layer.
A data source layer for interfacing the source systems;
The data access layer is used for realizing the access of various main data, and realizing the access, management and distribution of various main data of network level cross-region, cross-line and cross-professional by utilizing the cloud platform;
the data management layer is used for managing the main data in terms of modeling, standards, quality, security and the full life cycle;
the data asset layer establishes a large main database organized by topic domain and supports managing, querying and browsing data in the form of a data resource directory tree; by classifying and hierarchically managing the data, it forms a data directory and a data management system, so that the main database can serve multiple lines and multiple management centers.
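The data resource directory tree mentioned for this layer can be pictured as a nested grouping of main data by topic domain and category. The sketch below is one possible representation; the record fields (`topic_domain`, `category`, `code`) and the two-level depth are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def build_directory(records: List[dict]) -> Dict[str, Dict[str, List[str]]]:
    """Group record codes as topic domain -> category -> sorted list of codes."""
    tree = defaultdict(lambda: defaultdict(list))
    for record in records:
        tree[record["topic_domain"]][record["category"]].append(record["code"])
    # Convert to plain dicts with a stable, sorted order for browsing.
    return {
        domain: {cat: sorted(codes) for cat, codes in sorted(cats.items())}
        for domain, cats in sorted(tree.items())
    }


def render(tree: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Render the directory as indented lines, one level of indent per tree level."""
    lines = []
    for domain, cats in tree.items():
        lines.append(domain)
        for cat, codes in cats.items():
            lines.append(f"  {cat}")
            lines.extend(f"    {code}" for code in codes)
    return lines
```

A deeper hierarchy (for example line, station, equipment type) would follow the same pattern with additional nesting levels.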
Fig. 3 is a schematic structural diagram of a main data management platform according to an embodiment of the present invention, and as shown in fig. 3, specific functional modules of the main data management platform include: data standard management, data model management, main data management, integrated management, data acquisition management, data quality management, data cleaning management and the like.
Data model management: data models are developed online through a visual model design tool, which supports creation, modification and deletion of main data models, hierarchical relationship definition, data entity definition, data attribute definition and version control.
Data acquisition management: according to the interface content of each source system within the scope of the main data service requirements, an interface registration function is provided for the source system, including descriptions of the acquisition interfaces, parameters, services and the like. After configuration is completed, the corresponding system interface is called through a configured acquisition plan, realizing automatic acquisition of data from external source systems.
Data distribution management: according to the interface content of each consuming system within the scope of the main data service requirements, an interface registration function is provided for the consuming system, including descriptions of the distribution interfaces, parameters, services and the like. After configuration is completed, the corresponding system interface is called through a configured distribution task, so that the main data platform can send data to the data-consuming system.
Data quality management: a quality audit plan is made based on the configured check rules, and routine quality analysis and monitoring of the main data are carried out according to that plan. The function supports exact and fuzzy checking of data (such as item codes), providing a data checking function; verification of data against coding rules (such as rule checks for identity card numbers and mobile phone numbers); and checks of data uniqueness, integrity and consistency.
Data cleaning management: the data cleaning function cleans data by combining automatic system cleaning with manual intervention, realizes multi-source data integration, cleaning and deduplication, and addresses problems such as missing values, out-of-range values, inconsistent codes and duplicate data from the perspectives of data accuracy, integrity, consistency and validity.
Data quality analysis: check rules are configured for each kind of main data according to its field requirements, a quality audit plan is made, and data quality reports are generated regularly, so that quality findings are grounded in data, manual investigation is avoided, and data quality is improved.
Data statistical analysis, which is to carry out statistical analysis on the integration type, the acquisition state and the like of the main data through a data management report form; and carrying out statistical analysis on the authorization mode, the source system, the data quantity and the like of the main data through the main data statistical analysis report.
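The configurable check rules and quality reports described for the data quality management and data quality analysis functions can be sketched as a small rule engine. Everything named here is hypothetical: the helper functions, the report layout, and in particular `MOBILE_PATTERN`, which is only an assumed format for an 11-digit mobile number and not a rule given in the patent.

```python
import re
from collections import Counter
from typing import Callable, Dict, List

# Assumed example rule: 11 digits, starting with 1 and a second digit of 3-9.
MOBILE_PATTERN = re.compile(r"1[3-9]\d{9}")


def check_required(field: str) -> Callable[[dict], bool]:
    """Integrity rule: the field must be present and non-empty."""
    return lambda record: bool(record.get(field))


def check_pattern(field: str, pattern: re.Pattern) -> Callable[[dict], bool]:
    """Coding-rule check: the whole field value must match the pattern."""
    return lambda record: bool(pattern.fullmatch(str(record.get(field, ""))))


def quality_report(records: List[dict],
                   rules: Dict[str, Callable[[dict], bool]],
                   unique_field: str) -> Dict[str, int]:
    """Count violations per rule, plus surplus duplicates of `unique_field`."""
    report = {name: 0 for name in rules}
    for record in records:
        for name, rule in rules.items():
            if not rule(record):
                report[name] += 1
    # Uniqueness check: every occurrence beyond the first counts as a duplicate.
    counts = Counter(record.get(unique_field) for record in records)
    report["duplicate_" + unique_field] = sum(n - 1 for n in counts.values() if n > 1)
    return report
```

Running such a report on a schedule, one rule set per kind of main data, would correspond to the quality audit plan described above; the scheduling itself is outside this sketch.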
The main data management platform provided by the embodiment of the invention has the data range covering all the professions of vehicles, communication signals, power supply, electromechanics, civil engineering lines and the like related to rail transit operation, is suitable for all the professional main data specifications of rail transit operation enterprises, forms each theme zone suitable for the rail transit industry, ensures the consistency of important data among systems, improves the service coupling degree among systems, and lays a foundation for big data analysis.
The architecture of the main data management platform is based on a rail transit multi-cloud platform and can support centralized cloud management and control across multiple sites and multiple centers. In terms of network connectivity, it introduces a multi-network convergence concept: wired networks, wireless networks, the narrowband Internet of Things and the like are connected through secure and effective isolation protection, realizing secure cross-network integration of system data. In terms of data management, an enterprise service bus (OSB) is established as an intermediate platform for interface communication between main data management and big data applications.
The main data management platform can realize distributed, highly reliable, redundant deployment on a multi-tenant hybrid cloud, and provides infrastructure resources on demand for application systems of different lines and different service levels in the rail transit industry. It realizes virtualized unified management and elastic allocation of the resources (CPU, memory, storage and network) required for enterprise digital construction, improves the continuity, stability and data security of rail transit information services, and helps raise the level of rail transit information management, service continuity and elasticity, and information security.
The framework is suitable for synchronous construction of new lines, can be used for upgrading and reforming existing lines, realizes application integrated deployment and data centralized management, realizes multi-aspect safety management and control of an infrastructure layer, an application development layer, a user access layer and the like, ensures application safety, data safety and operation and maintenance safety, and reduces initial construction cost and operation and maintenance cost.
In addition, the embodiment of the invention also provides a big data analysis platform, and fig. 4 is a schematic architecture diagram of the big data analysis platform provided by the embodiment of the invention.
The big data analysis platform is built on Internet+, Internet of Things and big data technologies. Through unified collection, unified storage, unified management, unified operation and unified service, it promotes intelligent rail transit production management and services.
The system organizes a unified data access mode and access protocol of structured, semi-structured and unstructured data services, provides unified data security authentication access, integrates access modes of various data forms, shields the difference of a bottom heterogeneous distributed data storage system, provides a unified standard interface for an external application system to access data, and provides data and service sharing capability for rail transit applications.
The platform is built with Hadoop technology and developed according to the principles of being data-driven, software-defined, platform-supported, service value-adding and intelligence-led. It achieves comprehensive improvement in safety, efficiency, benefit and service level, and creates a new system for data collection, storage, analysis and sharing.
The big data analysis platform is composed of a big data basic module, a data integration module, a data warehouse and a data sharing module.
The big data basic module is responsible for providing basic big data capabilities, including data storage, message stream middleware, a real-time computing framework, an offline computing framework and the like, and also provides deployment, operation and maintenance capabilities for big data components.
The data integration module is responsible for data acquisition, cleaning and access. It supports the data source types of the various devices and points used in production management services and involves multiple business systems; the integrated data includes equipment status, faults, assets, service work orders and the like. The platform supports the collection of real-time and offline data, and of structured and unstructured data.
The data warehouse is responsible for historical data storage, storage of indicator calculation results and the like, and is the basis of the upper-layer data services.
The big data platform is a service platform whose core purpose is data sharing and publication; it covers the full chain of data operations and realizes functions such as data management and data sharing services.
Fig. 5 is a schematic structural diagram of a big data analysis platform according to an embodiment of the present invention. As shown in fig. 5, the technical architecture includes: a front-end UI that interacts with the business system through RESTful interfaces and WebSocket; a business system developed on Spring Cloud; Spring Data JPA as the database access library; a relational database for storing business data; and Redis for data caching and real-time data storage.
The big data component comprises:
A storage component: historical data is stored in Hive, and hot data is stored in HBase.
A computing framework: Impala provides near-line data query capability, the MapReduce framework provides offline data processing capability, and Flink provides near-line data processing capability.
Fig. 6 is a schematic diagram of a data flow of a big data analysis platform provided by the embodiment of the present invention, as shown in fig. 6, data is provided for the big data platform according to main data, production data and statistical analysis data, and an acquisition layer performs standardization on related data, where subsequent flows are distinguished according to different data types.
The system modules include: a data access module, a data storage module, production data, statistical analysis data and a data service module.
The data access module performs main data and production data access, ETL conversion and data quality inspection; the accessed content includes the deflection information and alarm information of equipment points. The module provides data conversion between the data source end and the big data platform as well as a data quality check function. It synchronizes the main data to the big data platform; production data is accessed from the source end via Kafka, a RESTful API or FTP, converted into the format required by the big data platform, and sent to a Kafka message channel inside the big data platform.
The data storage module is a stream processing task in the Hadoop platform. It is responsible for reading data from the Kafka channel inside the system, performing ETL conversion and data quality checks, and completing data storage, real-time Redis refreshing, data analysis and statistics, and real-time data pushing.
The production data, also called transaction data, describes data related to internal or external events or production records arising in the course of the organization's business operations. In the invention, Hive is used to store historical data, and HBase is used for caching.
The statistical analysis data is numerical data obtained by statistically analyzing the business activities of the enterprise, and is generally in the form of indicators or the like. To make use of big data computing power, the invention uses Hive to store result data and Impala to support queries.
The data service module consists of a series of micro services, accesses the components of the big data platform through the JDBC interface and provides data access service to the outside through the API gateway.
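The flow through the data access and data storage modules described above (convert to the platform format, publish to an internal channel, then quality-check, store and refresh the latest value) can be sketched with in-memory stand-ins. This sketch uses no Kafka, Hive or Redis client: `queue.Queue` stands in for the internal Kafka channel, a list for historical storage, and a dict for the real-time cache. The message fields and function names are assumptions.

```python
import json
import queue
from typing import Dict, List

# In-memory stand-in for the platform's internal Kafka message channel.
channel: "queue.Queue[str]" = queue.Queue()


def access(raw: dict, source: str) -> None:
    """Data access: convert a source record to an assumed platform format."""
    message = {
        "source": source,                     # e.g. "kafka", "rest" or "ftp"
        "point_id": str(raw["id"]).strip(),   # equipment point identifier
        "value": raw.get("value"),
        "ts": int(raw["ts"]),                 # timestamp normalized to int
    }
    channel.put(json.dumps(message))


def store_all(history: List[dict], latest: Dict[str, dict]) -> int:
    """Data storage: drain the channel, check quality, store, refresh latest values.

    Returns the number of messages rejected by the quality check.
    """
    rejected = 0
    while not channel.empty():
        message = json.loads(channel.get())
        # Quality check: discard messages without a point id or a value.
        if not message["point_id"] or message["value"] is None:
            rejected += 1
            continue
        history.append(message)  # stand-in for historical storage
        current = latest.get(message["point_id"])
        if current is None or message["ts"] >= current["ts"]:
            latest[message["point_id"]] = message  # stand-in for the cache refresh
    return rejected
```

Comparing timestamps before refreshing the cache means a late-arriving older reading is still archived but does not overwrite a newer real-time value; that ordering policy is a design choice of this sketch, not something the patent states.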
According to the method for constructing the main data management and big data analysis platform based on the track cloud, the big data analysis platform based on the cloud and the main data management is constructed, mathematical statistics and big data algorithm are comprehensively applied on the basis of mass experience data of the track traffic, and research, design and realization are carried out on the operation and maintenance data; according to the maintenance experience of each specialty for many years, the correlation analysis among the data of different theme domains is tried, the data value is deeply mined, the intelligent management of the whole life cycle of the rail transit specialty equipment is promoted, the operation and maintenance cost is reduced, and the comprehensive operation and maintenance management level is improved.
The device for constructing a main database based on a cloud platform provided by the invention is described below. The device described below and the method for constructing a main database based on a cloud platform described above may be referred to in correspondence with each other.
Fig. 7 is a schematic structural diagram of a device for constructing a master database based on a cloud platform according to an embodiment of the present invention, where as shown in fig. 7, the device for constructing a master database based on a cloud platform according to an embodiment of the present invention includes:
an identification module 710, configured to determine main data based on an application range, an application period, and an application scenario of the target data; the target data are cross-system rail traffic data acquired through a cloud platform;
the classification module 720 is configured to classify the main data and determine attribute information of the main data;
the encoding module 730 is configured to generate an encoding corresponding to the main data based on attribute information of the main data and a preset encoding rule;
and a construction module 740, configured to construct a main database based on the encoded and classified main data.
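The cooperation of the four modules listed above can be sketched as a simple pipeline. This is a hedged illustration only: the `Candidate` fields, the numeric thresholds used for the application range, the boolean stand-ins for application period and application scenario, and the coding rule "first two letters of the topic domain plus a four-digit serial" are all assumptions; the patent leaves the concrete identification criteria and the preset coding rule unspecified here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    topic_domain: str      # classification attribute, e.g. "locations"
    lines: int             # number of rail transit lines using the data
    specialties: int       # number of professional fields using the data
    departments: int       # number of business departments using the data
    long_lived: bool       # stand-in for the application period criterion
    shared_scenario: bool  # stand-in for the application scenario criterion


def identify(items: List[Candidate], min_lines: int = 2,
             min_specialties: int = 2, min_departments: int = 2) -> List[Candidate]:
    """Identification module: keep widely and durably used data (assumed thresholds)."""
    return [c for c in items
            if c.lines >= min_lines
            and c.specialties >= min_specialties
            and c.departments >= min_departments
            and c.long_lived and c.shared_scenario]


def classify(items: List[Candidate]) -> Dict[str, List[Candidate]]:
    """Classification module: group main data by topic domain."""
    groups: Dict[str, List[Candidate]] = {}
    for item in items:
        groups.setdefault(item.topic_domain, []).append(item)
    return groups


def encode(groups: Dict[str, List[Candidate]]) -> Dict[str, Candidate]:
    """Encoding and construction: assign a code per item under a hypothetical rule."""
    database: Dict[str, Candidate] = {}
    for domain, members in sorted(groups.items()):
        ordered = sorted(members, key=lambda c: c.name)
        for serial, item in enumerate(ordered, start=1):
            database[f"{domain[:2].upper()}-{serial:04d}"] = item
    return database


def build(items: List[Candidate]) -> Dict[str, Candidate]:
    """Run identification, classification and encoding to build the main database."""
    return encode(classify(identify(items)))
```

A real coding rule would need to guarantee uniqueness across domains whose names share a prefix and remain stable when new items are added; the serial-number scheme here does neither and is kept only for brevity.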
It should be noted that the device for constructing a main database based on a cloud platform provided by the embodiment of the present invention can implement all the method steps of the above method embodiment and achieve the same technical effects; the parts and beneficial effects that are the same as in the method embodiment are not described in detail again here.
Optionally, the classification module 720 is specifically configured to:
classifying the main data based on preset topic domains, and determining main data associated with each preset topic domain;
and determining attribute information of the main data based on the main data associated with each preset theme zone.
Optionally, the building module 740 is specifically configured to:
performing data cleaning on the classified main data;
and constructing a main database based on the codes, the preset theme zone and the main data after data cleaning.
Optionally, the application range of the target data includes:
the number of rail transit routes to which the target data is applied;
the number of specialized fields of the target data application;
the number of business departments to which the target data is applied.
Optionally, the preset theme zone includes at least one of:
finance, projects, contracts, suppliers, line networks, organizations, locations, assets, supplies, equipment, production data, security, affiliated resources.
Fig. 8 illustrates a physical structure diagram of an electronic device, as shown in fig. 8, which may include: processor 810, communication interface (Communications Interface) 820, memory 830, and communication bus 840, wherein processor 810, communication interface 820, memory 830 accomplish communication with each other through communication bus 840. The processor 810 may call logic instructions in the memory 830 to perform a method of constructing a cloud platform based master database, the method comprising: determining main data based on an application range, an application period and an application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform; classifying the main data and determining attribute information of the main data; generating codes corresponding to the main data based on the attribute information of the main data and a preset coding rule; and constructing a main database based on the encoded and classified main data.
Further, the logic instructions in the memory 830 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, where the computer program product includes a computer program, where the computer program may be stored on a non-transitory computer readable storage medium, where the computer program when executed by a processor is capable of executing the method for constructing a main database based on a cloud platform provided by the above methods, where the method includes: determining main data based on an application range, an application period and an application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform; classifying the main data and determining attribute information of the main data; generating codes corresponding to the main data based on the attribute information of the main data and a preset coding rule; and constructing a main database based on the encoded and classified main data.
In still another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to perform the method for constructing a cloud platform based master database provided by the above methods, the method comprising: determining main data based on an application range, an application period and an application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform; classifying the main data and determining attribute information of the main data; generating codes corresponding to the main data based on the attribute information of the main data and a preset coding rule; and constructing a main database based on the encoded and classified main data.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for constructing a main database based on a cloud platform, characterized by comprising the following steps:
determining main data based on an application range, an application period and an application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform;
classifying the main data and determining attribute information of the main data;
generating codes corresponding to the main data based on the attribute information of the main data and a preset coding rule;
and constructing a main database based on the encoded and classified main data.
2. The method for constructing a cloud platform-based master database according to claim 1, wherein the classifying the master data to determine attribute information of the master data includes:
classifying the main data based on preset topic domains, and determining main data associated with each preset topic domain;
and determining attribute information of the main data based on the main data associated with each preset theme zone.
3. The method for constructing a main database based on a cloud platform according to claim 1, wherein the constructing a main database based on the encoded and classified main data comprises:
performing data cleaning on the classified main data;
and constructing a main database based on the codes, the preset theme zone and the main data after data cleaning.
4. The method for constructing a main database based on a cloud platform according to claim 1, wherein the application range of the target data includes:
the number of rail transit routes to which the target data is applied;
the number of specialized fields of the target data application;
the number of business departments to which the target data is applied.
5. The method for constructing a main database based on a cloud platform according to claim 2, wherein the preset theme zone includes at least one of the following:
finance, projects, contracts, suppliers, line networks, organizations, locations, assets, supplies, equipment, production data, security, affiliated resources.
6. A device for constructing a main database based on a cloud platform, characterized by comprising:
the identification module is used for determining main data based on the application range, the application period and the application scene of the target data; the target data are cross-system rail traffic data acquired through a cloud platform;
the classification module is used for classifying the main data and determining attribute information of the main data;
the encoding module is used for generating an encoding corresponding to the main data based on the attribute information of the main data and a preset encoding rule;
and the construction module is used for constructing a main database based on the encoded and classified main data.
7. The cloud platform-based master database construction device according to claim 6, wherein the classification module is specifically configured to:
classifying the main data based on preset topic domains, and determining main data associated with each preset topic domain;
and determining attribute information of the main data based on the main data associated with each preset theme zone.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements the method of constructing a cloud platform based master database according to any one of claims 1 to 5 when executing the program.
9. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements a method of constructing a cloud platform based master database according to any one of claims 1 to 5.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements a method of constructing a cloud platform based master database according to any of claims 1 to 5.
CN202310869889.6A 2023-07-14 2023-07-14 Method and device for constructing main database based on cloud platform and electronic equipment Pending CN117056304A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310869889.6A CN117056304A (en) 2023-07-14 2023-07-14 Method and device for constructing main database based on cloud platform and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310869889.6A CN117056304A (en) 2023-07-14 2023-07-14 Method and device for constructing main database based on cloud platform and electronic equipment

Publications (1)

Publication Number Publication Date
CN117056304A true CN117056304A (en) 2023-11-14

Family

ID=88668259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310869889.6A Pending CN117056304A (en) 2023-07-14 2023-07-14 Method and device for constructing main database based on cloud platform and electronic equipment

Country Status (1)

Country Link
CN (1) CN117056304A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119167030A (en) * 2024-09-09 2024-12-20 交通运输部公路科学研究所 Method for determining infrastructure master data elements for the entire life cycle of highways

Similar Documents

Publication Publication Date Title
Bhattarai et al. Big data analytics in smart grids: state‐of‐the‐art, challenges, opportunities, and future directions
CN114398669B (en) Combined credit scoring method and device based on privacy protection calculation and cross-organization
CN110549336A (en) Transformer substation patrols and examines robot centralized control main website system
CN110347719A (en) A kind of enterprise's foreign trade method for prewarning risk and system based on big data
US20190050435A1 (en) Object data association index system and methods for the construction and applications thereof
CN112883001A (en) Data processing method, device and medium based on marketing and distribution through data visualization platform
CN111538720B (en) Method and system for cleaning basic data of power industry
CN110147470B (en) Cross-machine-room data comparison system and method
Ma et al. Design and implementation of smart city big data processing platform based on distributed architecture
CN118411195A (en) Big data-based sales power quantity information plan management system
Wu et al. An Auxiliary Decision‐Making System for Electric Power Intelligent Customer Service Based on Hadoop
CN113869589A (en) A transmission line accident prediction method and inspection system based on knowledge graph
Guo et al. Multi-source heterogeneous data access management framework and key technologies for electric power Internet of Things
CN117056304A (en) Method and device for constructing main database based on cloud platform and electronic equipment
Xu et al. Cloud computing boosts business intelligence of telecommunication industry
WO2021143463A1 (en) Data cleaning method and apparatus
CN112001539A (en) A high-precision passenger transport forecasting method and passenger transport forecasting system
CN117436740A (en) Asset benefit evaluation method, device and storage medium
CN116776543A (en) A power big data application method for smart grid
CN106651145A (en) Spare part management system and method
CN114880303A (en) Business data output method, device, equipment, medium and product
CN114139979A (en) Service platform for specific research and development mechanism
CN114490634A (en) Multi-energy complementary data asset management method and system based on big data
CN113722305A (en) Analysis application system and method
Panda et al. Hadoop in Banking: Event‐Driven Performance Evaluation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination