CN111553550A - An evaluation method for data quality of electric power big data based on user behavior analysis - Google Patents
An evaluation method for data quality of electric power big data based on user behavior analysis
- Publication number
- CN111553550A
- Authority
- CN
- China
- Prior art keywords
- data
- accuracy
- user behavior
- behavior analysis
- quality
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- General Physics & Mathematics (AREA)
- Development Economics (AREA)
- Health & Medical Sciences (AREA)
- Educational Administration (AREA)
- Marketing (AREA)
- Entrepreneurship & Innovation (AREA)
- Theoretical Computer Science (AREA)
- Tourism & Hospitality (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Game Theory and Decision Science (AREA)
- Public Health (AREA)
- Water Supply & Treatment (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method for evaluating the data quality of electric power big data for user behavior analysis, comprising the following steps. S1: collect historical network data of a number of users through a data acquisition module, and integrate the historical network data through a data integration module. S2: stratify the feature data. S3: sample each stratum using simple random sampling to obtain multiple groups of stratified sample data, and combine the groups to obtain a data sample. S4: evaluate the data sample along multiple dimensions according to rules preset in a central processing module, obtain an index evaluation result for each evaluation index, and then combine the results into a comprehensive evaluation according to their weights. Assigning weights improves the accuracy of the data evaluation, and the weighted comprehensive evaluation helps improve the accuracy of the evaluation result.
Description
Technical field
The invention belongs to the technical field of data quality evaluation for electric power big data, and in particular relates to a method for evaluating the data quality of electric power big data for user behavior analysis.
Background art
With social progress and development, electricity is used ever more widely, and power supply shortages have appeared to varying degrees in different regions. It is therefore necessary to analyze customers' electricity consumption behavior and to use the results of that analysis to control power supply and to formulate scientific, reasonable, and individualized guidance strategies for electricity use. Electricity consumption data of various kinds is pooled to form big data; if the quality of that data is substandard or the data is inaccurate, accurate analysis results are hard to obtain. For this reason, a method for evaluating the data quality of electric power big data for user behavior analysis is proposed to address this problem.
Summary of the invention
The purpose of the present invention is to provide a method for evaluating the data quality of electric power big data for user behavior analysis, so as to solve the problem raised in the background art above.
To achieve the above purpose, the present invention provides the following technical solution: a method for evaluating the data quality of electric power big data for user behavior analysis, comprising the following steps:
S1: collect historical network data of a number of users through a data acquisition module, and integrate the historical network data through a data integration module;
S2: from the integrated and classified data, retrieve feature data according to data features preset in the central processing module, and stratify the feature data;
S3: sample each stratum using simple random sampling to obtain multiple groups of stratified sample data, and combine the groups to obtain a data sample;
S4: evaluate the data sample along multiple dimensions according to rules preset in the central processing module, obtain an index evaluation result for each evaluation index, and then combine the results into a comprehensive evaluation according to their weights;
S5: display the comprehensive evaluation result through a visualization module.
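Steps S2 and S3 (stratify, then draw a simple random sample from each stratum and pool the draws) can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the `stratum_of` key function, the `fraction` parameter, and the fixed seed are all hypothetical, since the patent does not specify how strata are defined or how large each draw is.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, fraction, seed=0):
    """Draw a simple random sample from each stratum and pool the results."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Assumed rule: proportional allocation, at least one record per stratum.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample


# Hypothetical records: (user_id, user_category, daily_kwh)
records = [(i, "residential" if i % 3 else "commercial", 5.0 + i) for i in range(30)]
sample = stratified_sample(records, stratum_of=lambda r: r[1], fraction=0.2)
```

With 10 commercial and 20 residential records and a fraction of 0.2, the pooled sample holds 2 commercial and 4 residential records, so each category is represented in proportion to its size.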
Preferably, the data integration module in step S1 is used to filter the historical network data; the filtering includes removing abnormal data, classifying the data that remains, and assigning weights to the classified data by category.
Preferably, removing abnormal data includes removing data that has no sample significance, removing inaccurate data, and removing data that fluctuates sharply relative to preceding and following values.
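The patent does not state the criteria used to decide that a value is abnormal. As one possible reading, the sketch below treats missing or negative readings as inaccurate and uses a median absolute deviation (MAD) rule as a stand-in for "fluctuates sharply". The function name, the `max_jump` threshold, and the sample readings are assumptions made for illustration.

```python
from statistics import median


def remove_abnormal(readings, max_jump=3.0):
    """Drop missing, negative, and sharply deviating readings.

    max_jump is a hypothetical threshold: a reading is dropped when it
    differs from the median of the valid readings by more than max_jump
    times the median absolute deviation (MAD).
    """
    valid = [r for r in readings if r is not None and r >= 0]
    if len(valid) < 3:
        return valid  # too few points to judge fluctuation
    centre = median(valid)
    mad = median(abs(r - centre) for r in valid)
    if mad == 0:
        return valid  # all readings essentially identical
    return [r for r in valid if abs(r - centre) <= max_jump * mad]


# Hypothetical daily consumption readings in kWh.
readings = [5.1, 5.3, None, 4.9, -1.0, 5.0, 48.0, 5.2]
cleaned = remove_abnormal(readings)
```

Here the missing value, the negative value, and the 48.0 spike are removed, and the remaining five readings are kept in their original order.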
Preferably, the feature data in step S2 includes the historical network data corresponding to a plurality of feature parameters.
Preferably, the multiple dimensions in step S4 include data access status, accuracy, completeness, consistency, and timeliness; accuracy includes syntactic accuracy of the data, semantic accuracy of the data, measurement coverage of data accuracy, metadata accuracy, accuracy of the data range, and precision of data values.
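To make a few of these dimensions concrete, the following sketch scores completeness, syntactic accuracy, and range accuracy as simple ratios over a data sample. The patent names the dimensions but gives no formulas, so the ratio definitions, the field names, the date pattern, and the plausible kWh range are all assumptions.

```python
import re

# Hypothetical records: meter id, date string, daily consumption in kWh.
records = [
    {"meter_id": "M001", "ts": "2019-12-01", "kwh": 5.2},
    {"meter_id": "M002", "ts": "2019/12/01", "kwh": 4.8},    # wrong date syntax
    {"meter_id": "M003", "ts": "2019-12-01", "kwh": None},   # missing value
    {"meter_id": "M004", "ts": "2019-12-01", "kwh": 900.0},  # out of range
]

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")  # assumed required format
KWH_RANGE = (0.0, 100.0)                         # assumed plausible range


def completeness(rows, field):
    """Share of rows in which the field is present."""
    return sum(r[field] is not None for r in rows) / len(rows)


def syntactic_accuracy(rows, field, pattern):
    """Share of rows whose field matches the required format."""
    return sum(bool(pattern.fullmatch(r[field])) for r in rows) / len(rows)


def range_accuracy(rows, field, low, high):
    """Share of present values that fall inside [low, high]."""
    present = [r[field] for r in rows if r[field] is not None]
    if not present:
        return 0.0
    return sum(low <= v <= high for v in present) / len(present)
```

Each function returns a score between 0 and 1, which is a convenient form for combining the per-index results by weight later on.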
Preferably, the visualization module makes the comprehensive evaluation result easy to view, and is suited to data evaluation staff who have no in-depth knowledge of the algorithms and interfaces.
Compared with the prior art, the beneficial effects of the present invention are as follows. The method integrates the historical network data, retrieves feature data according to preset data features, and stratifies the feature data. Because the classified data may carry different weights depending on category, assigning weights improves the accuracy of the data evaluation.
Each stratum is sampled using simple random sampling to obtain multiple groups of stratified sample data, which are combined to obtain a data sample. The data sample is evaluated along multiple dimensions according to rules preset in the central processing module, an index evaluation result is obtained for each evaluation index, and the results are then combined into a comprehensive evaluation according to their weights, which helps improve the accuracy of the evaluation result.
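The patent does not give the formula used to combine the index evaluation results. One natural reading is a weighted sum, written out below with assumed symbols: $s_i$ is the score of the $i$-th index on a 0 to 1 scale, $w_i$ is its weight, and $Q$ is the comprehensive score. The numbers in the worked line are hypothetical.

```latex
Q = \sum_{i=1}^{n} w_i \, s_i ,
\qquad w_i \ge 0 ,
\qquad \sum_{i=1}^{n} w_i = 1 .

% Worked example with n = 5 (access, accuracy, completeness, consistency, timeliness):
% w = (0.1, 0.4, 0.2, 0.2, 0.1), s = (1.0, 0.8, 0.9, 0.7, 0.6)
Q = 0.1(1.0) + 0.4(0.8) + 0.2(0.9) + 0.2(0.7) + 0.1(0.6)
  = 0.10 + 0.32 + 0.18 + 0.14 + 0.06 = 0.80 .
```

Under this reading, a low score on a heavily weighted index such as accuracy pulls the comprehensive score down more than the same score on a lightly weighted index.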
The visualization module makes the comprehensive evaluation result easy to view, and is suited to data evaluation staff who have no in-depth knowledge of the algorithms and interfaces.
Description of drawings
FIG. 1 is a schematic flowchart of the method of the present invention for evaluating the data quality of electric power big data for user behavior analysis.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Embodiment 1
As shown in FIG. 1, the present invention provides a method for evaluating the data quality of electric power big data for user behavior analysis, comprising the following steps:
S1: collect historical network data of a number of users through a data acquisition module, and integrate the historical network data through a data integration module;
S2: from the integrated and classified data, retrieve feature data according to data features preset in the central processing module, and stratify the feature data;
S3: sample each stratum using simple random sampling to obtain multiple groups of stratified sample data, and combine the groups to obtain a data sample;
S4: evaluate the data sample along multiple dimensions according to rules preset in the central processing module, obtain an index evaluation result for each evaluation index, and then combine the results into a comprehensive evaluation according to their weights;
S5: display the comprehensive evaluation result through a visualization module.
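The weighting in step S4 might be implemented along the following lines. This is a minimal sketch, not the method as claimed: the function name, the dimension keys, the scores, and the weights are hypothetical, and the choice to normalize the weights so they need not sum to 1 is an assumption.

```python
def comprehensive_score(scores, weights):
    """Weighted average of per-dimension scores; weights are normalized."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[d] * weights[d] for d in scores) / total


# Hypothetical per-dimension results on a 0 to 1 scale.
scores = {
    "access": 1.0,
    "accuracy": 0.8,
    "completeness": 0.9,
    "consistency": 0.7,
    "timeliness": 0.6,
}
# Hypothetical relative weights; accuracy counts four times as much as access.
weights = {
    "access": 1,
    "accuracy": 4,
    "completeness": 2,
    "consistency": 2,
    "timeliness": 1,
}
overall = comprehensive_score(scores, weights)
```

Normalizing inside the function lets the preset rules express weights as relative importance rather than as fractions that must be kept consistent by hand.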
Specifically, the data integration module in step S1 is used to filter the historical network data; the filtering includes removing abnormal data, classifying the data that remains, and assigning weights to the classified data by category. Classifying the data makes later retrieval easier, and because the classified data may carry different weights depending on category, assigning weights improves the accuracy of the data evaluation.
Specifically, removing abnormal data includes removing data that has no sample significance, removing inaccurate data, and removing data that fluctuates sharply relative to preceding and following values, which helps improve the accuracy of the data and of the subsequent evaluation result.
Specifically, the feature data in step S2 includes the historical network data corresponding to a plurality of feature parameters.
Specifically, the multiple dimensions in step S4 include data access status, accuracy, completeness, consistency, and timeliness; accuracy includes syntactic accuracy of the data, semantic accuracy of the data, measurement coverage of data accuracy, metadata accuracy, accuracy of the data range, and precision of data values.
Specifically, the visualization module makes the comprehensive evaluation result easy to view, and is suited to data evaluation staff who have no in-depth knowledge of the algorithms and interfaces.
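The patent does not describe how the visualization module renders results. As a plain-text stand-in for whatever charting it uses, the sketch below prints one bar per dimension plus the comprehensive score, so that a reader needs no knowledge of the underlying algorithms. The function name, the column widths, and the bar character are assumptions.

```python
def render_report(scores, overall, width=20):
    """Return a plain-text bar report: one line per dimension, then the total."""
    rows = list(scores.items()) + [("overall", overall)]
    lines = []
    for name, value in rows:
        bar = "#" * round(value * width)  # bar length proportional to the score
        lines.append(f"{name:<14}{bar:<{width}} {value:.2f}")
    return "\n".join(lines)


# Hypothetical results on a 0 to 1 scale.
report = render_report({"accuracy": 0.8, "completeness": 0.9}, overall=0.85)
print(report)
```

A graphical dashboard would serve the same purpose; the text form is used here only to keep the example self-contained.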
In summary, compared with the prior art, the present invention integrates the historical network data, retrieves feature data according to preset data features, and stratifies the feature data, which improves the accuracy of the processed data.
Each stratum is sampled using simple random sampling to obtain multiple groups of stratified sample data, which are combined to obtain a data sample. The data sample is evaluated along multiple dimensions according to rules preset in the central processing module, an index evaluation result is obtained for each evaluation index, and the results are then combined into a comprehensive evaluation according to their weights, which helps improve the accuracy of the evaluation result.
The visualization module makes the comprehensive evaluation result easy to view, and is suited to data evaluation staff who have no in-depth knowledge of the algorithms and interfaces.
Finally, it should be noted that the above are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, a person skilled in the art may still modify the technical solutions described in those embodiments or replace some of their technical features with equivalents. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911255343.1A CN111553550A (en) | 2019-12-10 | 2019-12-10 | An evaluation method for data quality of electric power big data based on user behavior analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911255343.1A CN111553550A (en) | 2019-12-10 | 2019-12-10 | An evaluation method for data quality of electric power big data based on user behavior analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111553550A (en) | 2020-08-18 |
Family
ID=72007215
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911255343.1A Pending CN111553550A (en) | 2019-12-10 | 2019-12-10 | An evaluation method for data quality of electric power big data based on user behavior analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111553550A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112529677A (en) * | 2020-12-22 | 2021-03-19 | 四川新网银行股份有限公司 | Automatic data quality evaluation method and readable storage medium |
CN113779150A (en) * | 2021-09-14 | 2021-12-10 | 杭州数梦工场科技有限公司 | Data quality evaluation method and device |
-
2019
- 2019-12-10 CN CN201911255343.1A patent/CN111553550A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108763490A (en) | Patent information management analysis system | |
CN110503570A (en) | A method, system, device and storage medium for detecting abnormal power consumption data | |
CN110046792B (en) | Zero-battery user investigation method based on radar chart comprehensive evaluation method | |
CN108345981A (en) | A kind of typical taiwan area line loss per unit mark post value calculating method and its application based on load classification | |
CN109558467B (en) | Method and system for identifying electricity user categories | |
CN111553550A (en) | An evaluation method for data quality of electric power big data based on user behavior analysis | |
CN111950913B (en) | A comprehensive evaluation method for microgrid power quality based on node voltage sensitivity | |
CN111242430A (en) | Power equipment supplier evaluation method and device | |
CN111461521A (en) | Residential housing vacancy rate analysis method based on electric power big data | |
CN105205341A (en) | Power distribution network reconstruction demand model building method based on customer demands | |
CN111126865A (en) | Technology maturity judging method and system based on scientific and technological big data | |
CN117277566B (en) | Power grid data analysis and power dispatching system and method based on big data | |
CN111552686A (en) | Power data quality assessment method and device | |
CN116050712A (en) | A land ecological status assessment method based on the improved entropy weight method weighting model | |
CN111553720A (en) | Analysis method of user's electricity consumption behavior based on improved k-means algorithm | |
CN111178676A (en) | Power distribution network project investment assessment method and system | |
CN111080089A (en) | Method and device for determining critical factors of line loss rate based on random matrix theory | |
CN113538011A (en) | A method for associating non-registered contact information and registered users in a power system | |
CN110070256B (en) | Zero-power user investigation priority weight calculation method based on CRITIC method | |
CN111127186A (en) | Application method of customer credit rating evaluation system based on big data technology | |
CN116910655A (en) | A smart energy meter fault prediction method based on device measurement data | |
CN116662860A (en) | A user portrait and classification method based on energy big data | |
CN109871998A (en) | A method and device for predicting line loss rate of distribution network based on expert sample library | |
CN116306005A (en) | A distribution network engineering evaluation method based on the whole life cycle theory | |
CN115659203A (en) | Detection method for abnormal behavior of power grid users |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20200818 |