CN105825318B - Power communication network data quality monitoring method - Google Patents

Publication number
CN105825318B
Authority
CN
China
Prior art keywords
data
network
quality monitoring
data quality
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610133088.3A
Other languages
Chinese (zh)
Other versions
CN105825318A (en)
Inventor
王彦波
吴秋晗
黄红兵
张利军
刘俊毅
柴谦益
俞红生
章毅
贺琛
彭瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Zhejiang Electric Power Co Ltd, Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority to CN201610133088.3A
Publication of CN105825318A
Application granted
Publication of CN105825318B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a power communication network data quality monitoring method that continuously and cyclically verifies data accuracy during data acquisition, so that the data obtained in the final data extraction is accurate. To solve the above technical problem, the invention comprises at least the following steps: step 1: determining the information source; step 2: analyzing; step 3: performing an attribute missing check; step 4: the data quality monitoring system checks the data matching degree; step 5: when a data hierarchical check event is triggered, re-entering step 3 after correction. Compared with the prior art, the method ensures the reliability of the data and avoids the possibility, present in prior-art correction processes, of overwriting a correct value. Through continuous improvement, an effective power communication network data quality monitoring mechanism with guiding significance is formed.

Description

Power communication network data quality monitoring method
Technical Field
The invention relates to the field of power systems, in particular to a power communication network data quality monitoring method.
Background
At present, the power communication industry has no systematic set of data quality assessment indexes; data quality assessment is usually performed piecemeal, only on the relatively important quality indexes of a system, such as consistency, complexity, and integrity. Products already in use in the field of data quality assessment, such as the Integrity Analyzer (IA) from CRG, can perform strict checks on data integrity, where integrity includes entity integrity, referential integrity, domain integrity, and user-defined integrity.
The description of data quality can generally be divided into levels, but no uniform terminology has yet been established for these levels; for example, some authors describe them as classes and fields, while ISO/TC 211 describes them as data quality elements and sub-elements. The description of data quality also differs between application fields, so establishing a data quality framework that reflects the characteristics of the application field is the first problem that data quality assessment has to solve.
Data quality assessments are application-oriented, and the acceptability of the same data is different in different application contexts, e.g., for data mining, the same data performs well on one mining topic, but does not yield meaningful results on another mining topic. Thus, demand analysis is actually a process of dimension selection, with data quality assessment reviewing data in a dynamic or static manner, starting from one or several dimensions.
The dynamic evaluation mode refers to the evaluation of data quality from the data generation mechanism, and the static mode only considers the data. Although the dynamic evaluation mode can evaluate the data quality more thoroughly and comprehensively, in many application contexts, such as data mining, the condition is often limited, and the information of the data generation mechanism cannot be known.
Prior art patent No. 201410258757.0 discloses a system and method for data quality monitoring. Data quality monitoring refers to measuring the data quality of loaded data relative to a predetermined data quality metric. Data quality is measured by applying the logic calculus defined in the quality rules to the loaded data. That prior art uses at least one of the following for data quality measurement: incremental changes to the loaded data and incremental changes to the quality rules. Data Mining Oriented Data Quality Assessment (DM-DQA) is of practical significance: data mining is often a large project that requires considerable time, labor, and material resources, so feasibility analysis is particularly important before a data mining project actually starts, and the purpose of data quality assessment is to provide guidance on that feasibility.
The service management systems of the power communication network have many data quality problems, and dirty data cannot effectively support communication analysis work. Based on the application experience of each service management system, data quality problems can be divided by source and specific cause into four problem domains: information, technology, flow, and management. Information problems are data quality problems caused by deviations in the description, understanding, and measurement standards of the data itself. Technical problems are data quality problems caused by abnormalities in a technical link of data processing; their direct cause is a defect in the technical implementation. Flow problems are data quality problems caused by improper design of the system operation flow or the manual operation flow. Management problems are data quality problems caused by personnel quality and management mechanisms.
Disclosure of Invention
The invention aims to provide a power communication network data quality monitoring method, which continuously and circularly verifies the data accuracy in the data acquisition process and ensures that the data acquired in the final data extraction process is accurate and correct.
In order to solve the technical problems, the invention is realized by the following technical scheme:
A power communication network data quality monitoring method comprises at least the following steps:
step 1: determining the network source from which information is to be acquired;
step 2: analyzing the properties of the network determined in step 1, and acquiring data with a data acquisition method chosen according to those properties; the data acquisition method comprises at least one of equipment acquisition and equipment network management acquisition;
step 3: performing an attribute missing check: extracting data from the different sources according to the equipment factory ID to form a complete data chain for a single device, and checking through the data chain whether the key attributes of each system's data are completely filled in; issuing an attribute missing alarm when a preset key attribute is missing, recording the source system and the missing items for that alarm, and not proceeding to the next check until the key attributes have been supplemented;
step 4: the data quality monitoring system checks the data matching degree; if the attributes from the multiple data sources are consistent or highly similar, the data is marked as accurate; if the attributes from the multiple data sources are inconsistent, a data hierarchical check event is triggered;
step 5: when the data hierarchical check event is triggered, grading the credibility of the data from the different sources according to the importance preset in the data quality monitoring system, controlling the data source that holds the erroneous information after grading, and re-entering step 3 after correction.
Preferably, the network source in step 1 includes an SDH transmission network, an OTN transmission network, a data network, a digital synchronization network, and a switching network. Data can be obtained from the devices in the different networks and also from the databases that manage those devices, so that the two can verify each other's reliability.
Preferably, the data extracted in step 3 includes configuration information, alarm information, performance information, service information, and operation and maintenance information of the device. These are the basic items; devices with additional functions also impose additional parameter requirements at extraction time.
Preferably, the similarity calculation in step 4 is performed as follows:
S=((P1+P2+…+Pn)/n)*100%
where S is the similarity index and Pi is the similarity result for one rule segment of a single data item; the similarity results of the rule segments are summed and divided by the number of segments to give the average similarity index; wherein:
P(A,B)=sqrt(A·B)/(|A|×|B|)
where A is string 1 and B is string 2; A and B are converted into vectors of the same dimension, A·B is their dot product, |A| and |B| are their lengths, and the similarity of the vectors is then calculated. The similarity parameter obtained in this way provides a reference index for the subsequent steps.
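The per-segment formula can be illustrated with a short Python sketch. It is only one possible reading: the text does not say how a string becomes a vector, so the sketch assumes each dimension is a (position, character) pair, which happens to reproduce the six-dimensional vectors of the "a b c d" / "a b d e" example in the detailed description. The function names `to_vectors` and `segment_similarity` are hypothetical.

```python
import math


def to_vectors(a: str, b: str) -> tuple[list[int], list[int]]:
    """Map two strings onto 0/1 vectors of the same dimension.

    Assumption (not stated in the patent text): each dimension is a
    (position, character) pair, so a character only counts as shared
    when it sits at the same index in both strings.
    """
    feats_a = list(enumerate(a))
    feats_b = list(enumerate(b))
    # Dimensions: every feature of A, then the features found only in B.
    dims = feats_a + [f for f in feats_b if f not in feats_a]
    set_a, set_b = set(feats_a), set(feats_b)
    vec_a = [1 if d in set_a else 0 for d in dims]
    vec_b = [1 if d in set_b else 0 for d in dims]
    return vec_a, vec_b


def segment_similarity(a: str, b: str) -> float:
    """P(A,B) = sqrt(A.B) / (|A| * |B|), taken literally from the text."""
    vec_a, vec_b = to_vectors(a, b)
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty string has no features to compare
    return math.sqrt(dot) / (norm_a * norm_b)


print(to_vectors("abcd", "abde"))
print(round(segment_similarity("abcd", "abde"), 4))
```

Note that the formula as written differs from the usual cosine similarity (which has no square root in the numerator), and applied literally it returns 1/sqrt(n) rather than 1 for two identical strings of n characters. How exact matches are scored is not spelled out in the text, so the sketch leaves that case untouched.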
Preferably, the credibility in step 5 is graded as: professional network management > resource management system > operation management system. When a data conflict occurs, the data is first adapted according to this ranking; however, if several lower-credibility sources agree with one another but not with the higher-credibility source, the control and correction steps are entered.
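A minimal sketch of how the credibility ranking might be used to pick out a suspect source is given below. It models only the simple case in which the most credible source present is taken as the reference; the case where several lower-credibility sources agree with each other against the higher one is not modeled. The source keys and the function name `find_suspect_sources` are hypothetical.

```python
# Credibility order from the text, highest first. The key names are
# illustrative labels, not identifiers taken from the patent.
CREDIBILITY_ORDER = ["professional_nms", "resource_mgmt", "operation_mgmt"]


def find_suspect_sources(values: dict[str, str]) -> list[str]:
    """Return the sources whose value disagrees with the most credible
    source that supplied a value for this attribute."""
    ranked = [s for s in CREDIBILITY_ORDER if s in values]
    if not ranked:
        return []
    reference = values[ranked[0]]
    return [s for s in ranked[1:] if values[s] != reference]


port_state = {
    "professional_nms": "occupied",
    "resource_mgmt": "free",
    "operation_mgmt": "occupied",
}
print(find_suspect_sources(port_state))
```

With these sample values the resource management system is the only source that disagrees with the professional network management data, so it is the one that would receive the alarm.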
Preferably, the whole monitoring process is repeated automatically at a fixed interval N. If a warned system fails, for three consecutive intervals, to modify the data and pass the next quality monitoring rule audit, its monthly data quality score is deducted; if this spans a month boundary, the deduction is carried into the next month's score.
Compared with the prior art, the method ensures the reliability of the data and avoids the possibility, present in prior-art correction processes, of overwriting a correct value. By accumulating a certain number of mining results, the correspondence between the data quality evaluation score vector and the mining result is worked out; a tolerance value for the mining result is then specified, and the evaluation score vector corresponding to that tolerance value serves as the reference value. The evaluation result can then be interpreted against this reference value, that is, whether the data set is suitable for mining and to what degree. Through continuous improvement, an effective power communication network data quality monitoring mechanism with guiding significance is formed.
Detailed Description
The communication services of the power system are divided, according to their functions and characteristics, into power grid operation services and enterprise management services. Power grid operation services are divided into operation control services and operation information services; enterprise management services are further divided into information services and office services. All of these services rely on the communication network, but their requirements on communication differ. The operation control service is a link in power grid control and bears directly on grid safety; because it has extremely high requirements on transmission delay and channel reliability, it is currently carried mainly on a dedicated power communication network, namely the optical cable transmission network. The main service types are the wired protection service, security service, scheduling automation service, scheduling telephone service, video conference service, administrative telephone service, and information service, seven in all. These are the core services of the power communication network, the devices bearing them are core devices, and their importance is higher than that of devices bearing other services (such as services bearing tv conference call, administrative telephone call, etc.). When a device does not carry any core service, it is evaluated separately under other service devices.
The invention provides a data quality monitoring mechanism aiming at a power communication transmission network based on the characteristics of communication services of a power system, which generally comprises the following steps:
001. Data acquisition collects data from the equipment network management system, the resource management system, and the operation management system. The professional network management system provides the configuration data of the equipment, such as equipment ID, slot, board card, and port information; this data is generally supplied by the equipment network manager. The resource management system provides the maintenance data of the equipment, such as the network to which the equipment belongs and the services it bears; this data includes both data collected from the equipment and data maintained manually by operating personnel. The operation management system provides the operation and maintenance data of the equipment, such as maintenance status and fault information, which is entered manually by operators. The data from these sources partly overlaps, for example slot, board, port occupation, and device running state information. All of the data is stored in the database of the data quality monitoring system, together with key information such as the data source and the acquisition time. In addition, the system can periodically collect the information carried by the equipment directly from the network, which avoids the situation where the network management system has not yet collected the information after a device is replaced.
002. The data quality monitoring system performs the attribute missing check: data from the different sources is extracted according to the factory ID of the equipment to form a complete data chain for a single device, comprising the configuration information, alarm information, performance information, service information, operation and maintenance information, and so on of the equipment. During this step the system checks whether the key attributes of each system's data are completely filled in. If a preset key attribute is missing, an attribute missing alarm is issued, the source system and the missing items are recorded, and the next check is not carried out until the key attributes have been supplemented.
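The attribute missing check can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: the record layout (a dict with `factory_id` and `source` keys), the list of key attributes, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical key attributes; the patent leaves the preset list to the system.
KEY_ATTRIBUTES = ("slot", "board", "port")


def build_data_chains(records: list[dict]) -> dict[str, list[dict]]:
    """Group records from all source systems by equipment factory ID."""
    chains = defaultdict(list)
    for rec in records:
        chains[rec["factory_id"]].append(rec)
    return dict(chains)


def missing_attribute_alarms(records: list[dict],
                             key_attributes=KEY_ATTRIBUTES) -> list[dict]:
    """Emit one alarm per record that lacks a key attribute, recording
    the source system and which attributes are missing."""
    alarms = []
    for factory_id, chain in build_data_chains(records).items():
        for rec in chain:
            missing = [k for k in key_attributes if rec.get(k) in (None, "")]
            if missing:
                alarms.append({"factory_id": factory_id,
                               "source": rec["source"],
                               "missing": missing})
    return alarms


records = [
    {"factory_id": "D1", "source": "resource_mgmt",
     "slot": "3", "board": "", "port": "1"},
    {"factory_id": "D1", "source": "professional_nms",
     "slot": "3", "board": "X", "port": "1"},
]
print(missing_attribute_alarms(records))
```

A caller would proceed to the matching-degree check only for devices whose chain produced no alarms, mirroring the rule that the next check waits until the key attributes are supplemented.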
003. Furthermore, the data quality monitoring system checks the data matching degree, and compares the data chain information of the equipment one by one according to the configured checking attribute. And if the attributes of a plurality of data sources are inconsistent, triggering a data hierarchical verification event.
004. When the data hierarchical check event is triggered, the data from the different sources is graded for reliability according to the importance preset in the data quality monitoring system, for example professional network management > resource management system > operation management system. For example, if the data of the operation management system cannot be matched with the data of the resource management system, while the data of the professional network management system is consistent with the data of the operation management system, it can be determined that the data of the resource management system is at fault, and the system issues an alarm that the resource management system data is inaccurate. As a further example, suppose a system is named "national network/Jinghu optical transmission system" in the professional network management system, and the same system is named "national network/Jinghu optical transport network" in the resource management system. To compare the similarity of the two names, each is first divided into 2 segments at the rule separator "/", and the 2 segment pairs are substituted into the formula separately, with P1 = 1 and P2 = 0.3081. The result is: S = ((P1 + P2)/2) × 100% = ((1 + 0.3081)/2) × 100% ≈ 65%.
Character string 1: a b c d
Character string 2: a b d e
Convert the above 2 strings into 2 sets of vectors for comparison:
1 1 1 1 0 0
1 1 0 0 1 1
Then P = sqrt(2)/(sqrt(4) × sqrt(4)) ≈ 0.3535.
It can be seen that the result does not satisfy the similarity rule of more than 95% built into the system, so the system issues an alarm and the data is corrected in the resource management system.
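The segment averaging and the 95% rule from this example can be sketched in Python as follows. The per-segment scores are taken directly from the example (1 and 0.3081) rather than recomputed, and the function names `split_segments`, `similarity_index`, and `needs_alarm` are hypothetical.

```python
def split_segments(name: str, separator: str = "/") -> list[str]:
    """Split a name into rule segments at the configured separator."""
    return name.split(separator)


def similarity_index(segment_scores: list[float]) -> float:
    """S = ((P1 + P2 + ... + Pn) / n) * 100, expressed in percent."""
    if not segment_scores:
        raise ValueError("at least one segment score is required")
    return sum(segment_scores) / len(segment_scores) * 100.0


def needs_alarm(s_percent: float, threshold: float = 95.0) -> bool:
    """The built-in rule requires a similarity of more than 95%."""
    return s_percent <= threshold


segments = split_segments("national network/Jinghu optical transmission system")
scores = [1.0, 0.3081]  # P1 and P2 as given in the worked example
s = similarity_index(scores)
print(len(segments), round(s, 1), needs_alarm(s))
```

Keeping the threshold as a parameter reflects that 95% is described as a rule built into the system, so a deployment could presumably tune it per attribute.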
In conclusion, by comprehensively analyzing and comparing data from different sources, the invention can evaluate the whole process of data generation, use, and operation and maintenance, can identify the data source that is at fault, and can suggest which system should take which action to improve the data.
The foregoing illustrates only a few embodiments of the present invention. The invention is obviously not limited to the above embodiments and may have many applications; all variations that a person skilled in the art can derive or infer from the disclosure of the invention should be considered within its scope of protection.

Claims (3)

1. A power communication network data quality monitoring method is characterized by at least comprising the following steps:
step 1: determining a network source needing to acquire information;
step 2: analyzing the network properties obtained in the step 1, and acquiring data by adopting different data acquisition methods according to the network properties; the data acquisition method at least comprises one of equipment acquisition and equipment network management acquisition;
step 3: performing attribute missing verification, extracting data from different sources according to equipment factory IDs to form a complete data chain of a single device, and verifying whether key attributes of each system data are completely filled through the data chain; sending an attribute missing alarm for missing preset key attributes, recording a source system and missing conditions of the attribute missing alarm, and not continuing to perform next verification before completing the supplement of the key attributes;
step 4: the data quality monitoring system checks the data matching degree; if the attributes of the multiple data sources are consistent or the similarity is high, marking the data as accurate data; if the attributes of a plurality of data sources are inconsistent, triggering a data hierarchical verification event;
step 5: when the data hierarchical check event is triggered, the data of different sources are classified according to the degree of importance preset by the data quality monitoring system, the error information data source is controlled after classification, and the step 3 is re-executed after correction;
the similarity calculation in the step 4 is performed in the following manner:
S=((P1+P2+…+Pn)/n)*100%
S is set as a similarity index, P is a similarity result of a certain rule segment of the single data, and the result of the similarity of a plurality of rule segments is added and divided by the number of the rule segments to obtain a similarity average index; wherein:
P(A,B)=sqrt(A*B)/(|A|×|B|)
A is a character string 1, B is a character string 2, A and B are converted into vectors with the same dimensionality, and then the similarity of the vectors is calculated;
the credibility in the step 5 is graded as follows: professional network management > resource management system > operation management system;
the whole monitoring is automatically and repeatedly implemented, each implementation interval is a fixed period N, if the warned system does not modify data in three fixed periods continuously and passes the next quality monitoring rule audit, the system is deducted the monthly data quality score, and the next monthly score is added in the monthly process.
2. The method for monitoring the data quality of the power communication network according to claim 1, wherein the network source in step 1 includes an SDH transmission network, an OTN transmission network, a data network, a digital synchronization network, and a switching network.
3. The method for monitoring the data quality of the power communication network according to claim 1, wherein the data extracted in the step 3 includes configuration information, alarm information, performance information, service information and operation and maintenance information of the equipment.
CN201610133088.3A 2016-03-09 2016-03-09 Power communication network data quality monitoring method Active CN105825318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610133088.3A CN105825318B (en) 2016-03-09 2016-03-09 Power communication network data quality monitoring method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610133088.3A CN105825318B (en) 2016-03-09 2016-03-09 Power communication network data quality monitoring method

Publications (2)

Publication Number Publication Date
CN105825318A (en) 2016-08-03
CN105825318B (en) 2021-11-30

Family

ID=56987551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610133088.3A Active CN105825318B (en) 2016-03-09 2016-03-09 Power communication network data quality monitoring method

Country Status (1)

Country Link
CN (1) CN105825318B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918218A (en) * 2019-01-28 2019-06-21 广州供电局有限公司 A kind of error data analysis method based on electrically charge
CN112990689B (en) * 2021-03-10 2024-07-05 华泰证券股份有限公司 Information data quality detection method and device
CN114037196A (en) * 2021-07-14 2022-02-11 北京天元创新科技有限公司 Data quality auditing method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933052A (en) * 2014-03-17 2015-09-23 华为技术有限公司 Data true value estimation method and data true value estimation device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
电力通信网网管数据采集框架的设计与实现;王晓莉;《中国优秀硕士论文全文数据库 信息科技辑》;20150415;第2015年卷(第4期);第4,5,20,37页 *

Also Published As

Publication number Publication date
CN105825318A (en) 2016-08-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant