CN117828496A - Abnormal data identification method, device, computer equipment and storage medium - Google Patents

Publication number
CN117828496A
Authority
CN
China
Prior art keywords
user
identified
electric quantity
abnormal
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311838282.8A
Other languages
Chinese (zh)
Inventor
杨峰
苏扬
郭彤彤
李旺军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Southern Power Grid Digital Platform Technology Guangdong Co ltd
Original Assignee
China Southern Power Grid Digital Platform Technology Guangdong Co ltd
Application filed by China Southern Power Grid Digital Platform Technology Guangdong Co ltd
Priority to CN202311838282.8A
Publication of CN117828496A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application relates to an abnormal data identification method, apparatus, computer device, storage medium and computer program product. The method comprises the following steps: acquiring power consumption associated data of a user to be identified; the electricity utilization associated data comprises electric quantity characteristic data; determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified; determining reference electric quantity characteristics according to electric quantity characteristic data of the group category to which the user to be identified belongs; if the user to be identified is an abnormal user, selecting the abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics. By adopting the method, the abnormal electric quantity characteristics of the abnormal user can be accurately positioned.

Description

Abnormal data identification method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of smart grid anomaly detection technologies, and in particular, to a method, an apparatus, a computer device, and a storage medium for identifying anomaly data.
Background
With the rapid development of the power industry, artificial intelligence technology has been applied to the study of abnormal electricity consumption diagnosis, which is carried out by means of data modeling. However, as the total amount of electric power data keeps growing, management complexity increases greatly, and with existing machine learning and deep learning algorithms the model inevitably overfits as the electric power data grows. As a result, diagnosis accuracy drops, and the specific abnormal electric quantity characteristics of abnormal users cannot be effectively located.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an abnormal data identification method, apparatus, computer device, and storage medium capable of achieving accurate positioning of abnormal electrical characteristics.
In a first aspect, the present application provides an abnormal data identification method, including:
acquiring power consumption associated data of a user to be identified; wherein the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to the electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
In one embodiment, the determining, according to the electricity association data of the user to be identified, whether the user to be identified is an abnormal user includes: based on the trained abnormal user identification model, determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified.
In one embodiment, the abnormal user identification model is trained in the following manner: acquiring an abnormal user identification model to be trained constructed based on a naive Bayes algorithm; acquiring power utilization associated data and corresponding abnormal labels of different sample users; inputting the power consumption associated data of the different sample users into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user; and carrying out model training on the abnormal user identification model to be trained according to the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels.
In one embodiment, the determining the reference electric quantity feature according to the electric quantity feature data of the group category to which the user to be identified belongs includes: performing cluster analysis on the electric quantity characteristic data of the user to be identified based on a preset clustering algorithm to obtain a group category to which the user to be identified belongs; and determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
In one embodiment, the clustering analysis is performed on the electric quantity characteristic data of the user to be identified based on a preset clustering algorithm to obtain a group category to which the user to be identified belongs, including: generating at least one electric quantity characteristic time sequence according to the electric quantity characteristic data of the user to be identified; based on a preset clustering algorithm, determining the group category of the user to be identified according to each electric quantity characteristic time sequence of the user to be identified.
In one embodiment, the acquiring the electricity utilization association data of the user to be identified includes: acquiring electric quantity characteristic data and corresponding electricity utilization environment data of a user to be identified; and carrying out feature fusion on the electric quantity feature data of the user to be identified and corresponding electricity utilization environment data to obtain electricity utilization associated data of the user to be identified.
In a second aspect, the present application further provides an abnormal data identification apparatus, including:
the first acquisition module is used for acquiring electricity utilization associated data of a user to be identified; wherein the electricity utilization associated data comprises electric quantity characteristic data;
the first determining module is used for determining whether the user to be identified is an abnormal user or not according to the electricity utilization associated data of the user to be identified;
the second determining module is used for determining reference electric quantity characteristics according to electric quantity characteristic data of the group category to which the user to be identified belongs;
and the first selecting module is used for selecting abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics if the user to be identified is an abnormal user.
In a third aspect, the present application also provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring power consumption associated data of a user to be identified; wherein the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to the electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
In a fourth aspect, the present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring power consumption associated data of a user to be identified; wherein the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to the electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
In a fifth aspect, the present application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:
acquiring power consumption associated data of a user to be identified; wherein the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to the electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
According to the method, the device, the computer equipment and the storage medium for identifying the abnormal data, whether the user to be identified is the abnormal user is judged by introducing the electricity utilization associated data, and the reference electricity quantity characteristic is determined according to the electricity quantity characteristic data of the group category of the user to be identified and is used as the identification basis of the abnormal electricity quantity characteristic, so that the abnormal electricity quantity characteristic is selected from the electricity quantity characteristic data of the user to be identified according to the reference electricity quantity characteristic under the condition that the user to be identified is the abnormal user, and the accurate positioning of the abnormal electricity quantity characteristic of the abnormal user is realized.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the related art, the drawings that are required to be used in the embodiments or the related technical descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort for a person having ordinary skill in the art.
FIG. 1 is a diagram of an application environment for an anomaly data recognition method in one embodiment;
FIG. 2 is a flow chart of a method for identifying abnormal data in one embodiment;
FIG. 3 is a flowchart illustrating steps for determining whether a user to be identified is an abnormal user in one embodiment;
FIG. 4 is a flow chart of determining a reference charge characteristic in one embodiment;
FIG. 5 is a flow chart of obtaining a group category to which a user to be identified belongs in one embodiment;
FIG. 6 is a flow chart of acquiring electricity usage related data of a user to be identified in one embodiment;
FIG. 7 is a block diagram showing an apparatus for recognizing abnormal data in one embodiment;
FIG. 8 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The abnormal data identification method provided by the embodiment of the application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be located on a cloud or other network server.
The power consumption associated data of the user to be identified can be obtained through the terminal 102/the server 104; the electricity utilization associated data comprises electric quantity characteristic data; whether the user to be identified is an abnormal user or not can be determined through the terminal 102/the server 104 according to the power consumption associated data of the user to be identified; the reference electric quantity characteristics can be determined through the terminal 102/the server 104 according to the electric quantity characteristic data of the group category to which the user to be identified belongs; if the user to be identified is an abnormal user, the terminal 102/server 104 may select the abnormal power characteristic from the power characteristic data of the user to be identified according to the reference power characteristic.
The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, where the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
In an exemplary embodiment, as shown in fig. 2, an abnormal data identification method is provided, and the method is applied to the server 104 in fig. 1 for illustration, and includes the following S210 to S240. Wherein:
S210, acquiring power utilization associated data of a user to be identified; wherein the electricity usage-related data includes electrical quantity characteristic data.
The power utilization associated data is data that can influence the user's electricity consumption process, or data that represents the user's actual electricity consumption. For example, the power utilization associated data may include at least one of electric quantity characteristic data and corresponding electricity utilization environment data. The electric quantity characteristic data describes electricity consumption attributes; for example, it can comprise at least one electric quantity characteristic among different electricity attributes such as daily electricity consumption, monthly electricity consumption, real-time current, real-time voltage, real-time power and power factor. The electricity utilization environment data is environment data that influences electricity consumption while the user is consuming electricity; for example, it can comprise at least one of user attribute information, external weather data, holiday attribute information and the like.
As an example, a current and voltage local-database wide table organized by table bureau can be built using the distributed storage technology of HDFS (Hadoop Distributed File System) on a Hadoop big data platform. In the early stage the data was stored in an Oracle database; later it was stored in Hive (a Hadoop-based data warehouse tool) on the big data platform that replaced it. This wide table is a wide table for metering current and voltage data, and the power utilization associated data of the user to be identified can be acquired from it by table bureau, which improves both the efficiency of data acquisition and the richness and diversity of the acquired data when the data volume is large.
The user to be identified may include at least one of users of different user categories, such as public line private transformer clients, private line private transformer clients, public transformer clients, wharfs selling gateway users, line checking gateway users, station area checking gateway users, station checking gateway users, substation checking gateway users, electricity larceny black users, and the like.
It can be appreciated that, in order to facilitate the subsequent data processing, the acquired electrical quantity characteristic data may be discretized data or characteristic data obtained by discretizing the acquired original electrical quantity data.
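The discretization mentioned above can be illustrated with a minimal sketch. The application does not specify how the original electric quantity data is discretized; the bin edges, the sample readings and the function name `discretize` below are assumptions made only for illustration.

```python
from bisect import bisect_right


def discretize(values, bin_edges):
    """Map each raw value to the index of the bin it falls into.

    bin_edges must be sorted in ascending order. A value v gets index i when
    bin_edges[i - 1] <= v < bin_edges[i]; values below the first edge get 0.
    """
    return [bisect_right(bin_edges, v) for v in values]


# Hypothetical daily electricity consumption readings (kWh) and bin edges.
daily_kwh = [3.2, 12.5, 48.0, 7.9, 105.3]
edges = [5.0, 20.0, 50.0, 100.0]

levels = discretize(daily_kwh, edges)
print(levels)
```

Fixed bin edges keep the mapping identical for every user, which matters when the discretized values of different users are later compared within one group category.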
S220, determining whether the user to be identified is an abnormal user or not according to the power utilization associated data of the user to be identified.
An abnormal user is a user whose electricity data of a certain electricity type differs from the reference value of the electricity data of that type.
As an example, based on the trained abnormal user identification model, it may be determined whether the user to be identified is an abnormal user according to the electricity usage association data of the user to be identified. The abnormal user identification model can be obtained by training a pre-constructed neural network model based on electricity consumption associated data of a large number of sample users and pre-labeled abnormal labels, so that the abnormal user identification model has the identification capability for users with abnormal electricity consumption.
Specifically, the power consumption associated data of the user to be identified can be input into the trained abnormal user identification model to obtain the prediction probability that the user to be identified is the abnormal user, and whether the user to be identified is the abnormal user is determined according to the prediction probability.
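The final decision step can be sketched as a simple comparison of the prediction probability against a threshold. The application does not state how the probability is turned into a decision, so the threshold value of 0.5 and the function name `is_abnormal_user` are assumptions.

```python
def is_abnormal_user(predicted_probability, threshold=0.5):
    """Return True when the predicted probability reaches the threshold.

    The threshold of 0.5 is an assumed default, not a value from the source.
    """
    if not 0.0 <= predicted_probability <= 1.0:
        raise ValueError("predicted_probability must lie in [0, 1]")
    return predicted_probability >= threshold


print(is_abnormal_user(0.8))  # probability above the assumed threshold
print(is_abnormal_user(0.2))  # probability below the assumed threshold
```

In practice the threshold would likely be tuned on labelled sample users, trading missed abnormal users against false alarms.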
S230, determining reference electric quantity characteristics according to electric quantity characteristic data of the group category to which the user to be identified belongs.
Users with similar electric quantity characteristic data belong to the same group category. Note that the group categories can be obtained by classifying the electric quantity characteristic data of different users. The reference electric quantity characteristic is a reference value of the electric quantity characteristic data of the users in the group category, and represents what the electricity consumption of the different users in that group category has in common.
As an example, the electric quantity characteristic data of the user to be identified may be subjected to cluster analysis based on a preset cluster algorithm to obtain a group category to which the user to be identified belongs; and determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
In a specific implementation, cluster analysis may be performed on the daily electricity consumption of the user to be identified on holidays based on a preset clustering algorithm, giving, for example, the public line private transformer client category as the group category of the user to be identified. The reference daily electricity consumption on holidays for the public line private transformer client category is then determined according to the daily electricity consumption on holidays of the different users in that category.
S240, if the user to be identified is an abnormal user, selecting the abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
As an example, if the user to be identified is an abnormal user, the abnormal holiday daily electricity consumption is selected from the daily electricity consumption of the user to be identified on holidays according to the reference daily electricity consumption on holidays.
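One possible way to carry out this selection is sketched below. The application does not give the comparison rule, so the relative-deviation test, the tolerance of 0.3, the feature names and the function name `select_abnormal_features` are all assumptions.

```python
def select_abnormal_features(user_features, reference_features, rel_tolerance=0.3):
    """Return the user's features that deviate too far from the group reference.

    A feature is treated as abnormal when its relative deviation from the
    reference value exceeds rel_tolerance (an assumed rule, not from the source).
    """
    abnormal = {}
    for name, value in user_features.items():
        reference = reference_features.get(name)
        if reference is None:
            continue  # no reference available for this feature
        scale = abs(reference) if reference != 0 else 1.0
        if abs(value - reference) / scale > rel_tolerance:
            abnormal[name] = value
    return abnormal


# Hypothetical holiday daily consumption (kWh) of a user already flagged abnormal.
user_holiday_kwh = {"new_year": 52.0, "spring_festival": 18.5, "labor_day": 21.0}
# Hypothetical reference values for the user's group category.
reference_holiday_kwh = {"new_year": 20.0, "spring_festival": 20.0, "labor_day": 20.0}

abnormal_features = select_abnormal_features(user_holiday_kwh, reference_holiday_kwh)
print(abnormal_features)
```

Only the features returned here would be reported, which is how the method narrows an abnormal user down to the specific abnormal electric quantity characteristics.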
According to the method and the device for determining the abnormal electric quantity characteristics, whether the user to be identified is the abnormal user or not is judged by introducing the electricity consumption associated data, the reference electric quantity characteristics are determined according to the electric quantity characteristic data of the group category to which the user to be identified belongs and serve as the identification basis of the abnormal electric quantity characteristics, so that the abnormal electric quantity characteristics are selected from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics under the condition that the user to be identified is the abnormal user, and the accurate positioning of the abnormal electric quantity characteristics of the abnormal user is achieved.
On the basis of the technical solutions provided by the above embodiments, the present application further provides an alternative embodiment. In this alternative embodiment, the abnormal user identification procedure corresponding to S220 described above is described in detail. As shown in fig. 3, this step includes:
S310, based on the trained abnormal user identification model, determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified.
The abnormal user identification model can be realized by adopting at least one neural network model in the traditional technology, and the specific network structure of the abnormal user identification model is not limited in the application.
Illustratively, the abnormal user identification model is trained in the following manner: acquiring an abnormal user identification model to be trained constructed based on a naive Bayes algorithm; acquiring power utilization associated data and corresponding abnormal labels of different sample users; inputting the power consumption associated data of the different sample users into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user; and, according to the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels, carrying out model training on the abnormal user identification model to be trained to obtain the abnormal user identification model, thereby improving the reliability of the abnormal user identification model.
The sample users are other users different from the user to be identified, the number of the sample users is usually a plurality, and the different sample users can belong to the same or different user categories, which is not limited in the application. The user category can comprise at least one of public line private transformer clients, private line private transformer clients, public transformer clients, wharfs selling gateway households, line checking households, station area checking households, provincial gateway households, transformer substation checking households, electricity stealing black households and the like.
The abnormal label is used for representing the real abnormal condition of the sample user and can comprise an abnormal user and a non-abnormal user, the abnormal label can be set by at least one of the traditional technologies or is realized by manual labeling, and the application is not limited in any way.
The power consumption associated data of the different sample users is input into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user; the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels are quantified according to a preset loss function to obtain a target loss; and the network parameters of the abnormal user identification model to be trained are adjusted according to the target loss, so that the model gradually acquires the ability to identify abnormal users, until the training cutoff condition is met.
The training cutoff condition may be that the number of sample users reaches a preset number threshold, the target loss converges, or the target loss is smaller than a preset loss value, etc. The values of the preset number threshold and the preset loss value may be set or adjusted by a technician according to needs or experience, or determined through a plurality of experiments, which are not limited in any way in the application.
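As a rough illustration of a naive Bayes classifier over discretized power consumption associated data, the sketch below estimates class priors and per-feature likelihoods by counting, with Laplace smoothing. This is the textbook way of fitting naive Bayes and does not reproduce the loss-driven parameter adjustment described above; the class name, the feature layout (consumption level, holiday flag) and the sample data are assumptions.

```python
import math
from collections import Counter


class CategoricalNaiveBayes:
    """Minimal categorical naive Bayes with Laplace smoothing (illustrative)."""

    def fit(self, samples, labels, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter(labels)
        n_features = len(samples[0])
        # value_counts[label][j][v] = how often feature j equals v for that label
        self.value_counts = {
            c: [Counter() for _ in range(n_features)] for c in self.class_counts
        }
        # Distinct values seen in training for each feature (smoothing denominator).
        self.feature_values = [set() for _ in range(n_features)]
        for x, y in zip(samples, labels):
            for j, v in enumerate(x):
                self.value_counts[y][j][v] += 1
                self.feature_values[j].add(v)
        return self

    def predict_proba(self, x):
        total = sum(self.class_counts.values())
        log_scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)  # class prior
            for j, v in enumerate(x):
                k = len(self.feature_values[j])
                count = self.value_counts[c][j][v]
                score += math.log((count + self.alpha) / (n_c + self.alpha * k))
            log_scores[c] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: e / norm for c, e in exp_scores.items()}


# Hypothetical sample users: (discretized consumption level, holiday flag).
samples = [(0, 0), (1, 0), (1, 1), (4, 0), (4, 1), (3, 1)]
labels = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]

model = CategoricalNaiveBayes().fit(samples, labels)
proba = model.predict_proba((4, 0))
print(proba)
```

The returned probability for the abnormal class can be compared against a threshold to decide whether the user to be identified is an abnormal user.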
According to the technical scheme, the abnormal user identification model is introduced to judge abnormal users, so that the model can be reused without repeated training, which improves the generality and transferability of the abnormal user identification model.
On the basis of the technical solutions provided by the above embodiments, the present application further provides an alternative embodiment. In this alternative embodiment, the reference power characteristic determining process corresponding to S230 described above is described in detail. As shown in fig. 4, this step includes:
S410, carrying out cluster analysis on the electric quantity characteristic data of the user to be identified based on a preset clustering algorithm to obtain the group category of the user to be identified.
As an example, at least one electricity characteristic time sequence is generated according to electricity time information of electricity characteristic data of a user to be identified; based on a preset clustering algorithm, determining the group category of the user to be identified according to the time sequence of each electric quantity characteristic of the user to be identified.
The electricity utilization time information may represent the generation time of the original electricity data corresponding to the electric quantity characteristic data of the user to be identified. An electric quantity characteristic time sequence is obtained by arranging the electric quantity characteristic data of the user to be identified in the order of the corresponding electricity utilization time information. A clustering algorithm is a statistical analysis method for studying the classification of samples or indices.
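Building an electric quantity characteristic time sequence amounts to ordering the characteristic values by their time information. The record layout (`feature`, `time`, `value` keys) and the function name `build_feature_series` below are assumptions used only to make the idea concrete.

```python
from datetime import datetime


def build_feature_series(records, feature_name):
    """Collect one feature's values and order them by their time information."""
    selected = [(r["time"], r["value"]) for r in records if r["feature"] == feature_name]
    selected.sort(key=lambda pair: pair[0])
    return [value for _, value in selected]


# Hypothetical records for one user, deliberately out of time order.
records = [
    {"feature": "daily_kwh", "time": datetime(2024, 1, 3), "value": 11.0},
    {"feature": "daily_kwh", "time": datetime(2024, 1, 1), "value": 9.5},
    {"feature": "voltage", "time": datetime(2024, 1, 1), "value": 220.1},
    {"feature": "daily_kwh", "time": datetime(2024, 1, 2), "value": 10.2},
]

daily_series = build_feature_series(records, "daily_kwh")
print(daily_series)
```

One such sequence would be generated per electric quantity characteristic, giving the "at least one" time sequence that the clustering step consumes.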
The clustering algorithm may be, for example, a k-shape clustering algorithm (a time-series clustering algorithm). Correspondingly, the similarity between different electric quantity characteristics can be determined based on a k-shape clustering algorithm according to the distance between the time sequences of the different electric quantity characteristics; and obtaining the group category of the user to be identified based on the similarity between the electric quantity characteristic data of the user to be identified. Wherein the distance may be a euclidean distance.
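The grouping of time sequences by distance can be sketched with a plain centroid-based clustering that uses the Euclidean distance mentioned above. This is a simplified stand-in and not the k-shape algorithm itself, which in its published form relies on a shape-based distance derived from normalized cross-correlation. The function names, the deterministic initialization and the sample sequences are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_series(group):
    """Point-wise mean of a list of equal-length sequences."""
    return [sum(column) / len(group) for column in zip(*group)]


def cluster_series(series_list, k, iterations=10):
    """Assign each sequence to the nearest centroid, then update the centroids."""
    centroids = [list(s) for s in series_list[:k]]  # assumed deterministic init
    labels = [0] * len(series_list)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: euclidean(s, centroids[c]))
            for s in series_list
        ]
        for c in range(k):
            members = [s for s, label in zip(series_list, labels) if label == c]
            if members:
                centroids[c] = mean_series(members)
    return labels, centroids


# Hypothetical daily-consumption sequences of four known users.
series_list = [[1.0, 1.0, 1.0], [10.0, 10.0, 10.0], [1.2, 0.9, 1.1], [9.5, 10.2, 10.1]]
labels, centroids = cluster_series(series_list, k=2)

# Group category of a hypothetical user to be identified: the nearest centroid.
user_series = [1.1, 1.0, 0.9]
user_group = min(range(2), key=lambda c: euclidean(user_series, centroids[c]))
print(labels, user_group)
```

A dedicated time-series clustering library would normally be used instead of this hand-rolled loop, especially when the sequences differ in phase or amplitude.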
S420, determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
In an alternative embodiment, statistical characteristic data of the electric quantity characteristic data of the different users in the group category to which the user to be identified belongs can be used as the reference electric quantity characteristic of that group category. The statistical characteristic data may be the average value or the median of the electric quantity characteristic data.
In another alternative embodiment, the electric quantity characteristic around which the electric quantity characteristic data of the different users in the group category is concentrated can be used as the reference electric quantity characteristic of the group category to which the user to be identified belongs.
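The average-or-median option can be sketched with the Python standard library. The sample values and the function name `reference_feature` are assumptions.

```python
from statistics import mean, median


def reference_feature(group_values, method="median"):
    """Reference electric quantity characteristic of a group category."""
    if not group_values:
        raise ValueError("group_values must not be empty")
    if method == "mean":
        return mean(group_values)
    if method == "median":
        return median(group_values)
    raise ValueError("method must be 'mean' or 'median'")


# Hypothetical holiday daily consumption (kWh) of users in one group category;
# the last value is an outlier.
group_holiday_kwh = [18.0, 20.0, 22.0, 19.0, 95.0]

median_reference = reference_feature(group_holiday_kwh, "median")
mean_reference = reference_feature(group_holiday_kwh, "mean")
print(median_reference, mean_reference)
```

With an outlier in the group, the median stays close to the typical users while the mean is pulled upward, which is one reason to prefer the median when the group may itself contain abnormal users.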
According to the technical scheme, the group category of the user to be identified is determined by carrying out cluster analysis on the electric quantity characteristic data of the user to be identified, so that the reference electric quantity characteristic is determined according to the electric quantity characteristic data of different users under the group category, the reference electric quantity characteristic is more representative of the electricity consumption condition of the group category, and further the accuracy and the rationality of the abnormal electricity consumption characteristic determined according to the reference electric quantity characteristic are improved.
On the basis of the technical solutions provided by the above embodiments, the present application further provides an alternative embodiment. In this alternative embodiment, the process of acquiring the power consumption associated data corresponding to S210 described above is described in detail. As shown in fig. 5, this step includes:
S510, acquiring electric quantity characteristic data and corresponding electricity utilization environment data of the user to be identified.
The power consumption environment data is used for representing environment factors corresponding to a power consumption process of a user, and specifically may include: at least one of weather, holidays, geographic location, and the like.
The electric quantity characteristic data and the corresponding electric consumption environment data of the user to be identified can be obtained from the wide table by using the table bureau.
S520, performing feature fusion on the electric quantity characteristic data of the user to be identified and the corresponding electricity utilization environment data to obtain the electricity utilization associated data of the user to be identified. Feature fusion is an operation that combines feature maps of different scales along the depth dimension, for example by splicing, adding, multiplying, or an attention mechanism, so as to improve the performance and perception capability of the model. Feature fusion may include at least one of splicing, adding, multiplying, attention mechanisms, pyramid pooling, and deconvolution.
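As a non-limiting illustration, the simplest of the listed fusion modes, splicing, can be sketched as follows. The weather category set, the encoding of the environment data, and the function names are assumptions made for the example; the application does not specify how the electricity utilization environment data is encoded.

```python
# Hypothetical category set; the application only names "weather" as a factor.
WEATHER_CATEGORIES = ("sunny", "cloudy", "rainy")


def encode_environment(weather, is_holiday):
    """Encode environment data as numbers: one-hot weather plus a holiday flag."""
    one_hot = [1.0 if weather == category else 0.0 for category in WEATHER_CATEGORIES]
    return one_hot + [1.0 if is_holiday else 0.0]


def fuse_by_concatenation(quantity_features, environment_features):
    """Splice the two vectors into one electricity utilization associated vector."""
    return list(quantity_features) + list(environment_features)


# Two electric quantity characteristics fused with a rainy holiday.
associated = fuse_by_concatenation([12.5, 4.2], encode_environment("rainy", True))
```

The fused vector keeps the electric quantity characteristics in their original positions, so a later step can still address them individually.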
In a specific embodiment, feature fusion is performed based on pyramid pooling: pyramid pooling is applied to electric quantity characteristic data of different scales to obtain feature representations under different electricity utilization environment data. In this way, the electric quantity characteristics under different electricity utilization environments can be captured effectively, which improves the performance of the anomaly identification model.
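As a non-limiting illustration, one possible reading of pyramid pooling for one-dimensional electric quantity series is sketched below. It assumes average pooling and pyramid levels of 1, 2 and 4 bins; neither choice is stated in the application. The point of the sketch is that series of different lengths (scales) are mapped to vectors of the same length.

```python
def pyramid_pool(series, levels=(1, 2, 4)):
    """Average-pool a 1-D series over 1, 2 and 4 bins and concatenate the results.

    The output length is sum(levels), independent of len(series).
    """
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    pooled = []
    for bins in levels:
        for b in range(bins):
            start = (b * n) // bins
            # Guarantee at least one sample per bin when the series is short.
            end = max(((b + 1) * n) // bins, start + 1)
            segment = series[start:end]
            pooled.append(sum(segment) / len(segment))
    return pooled


hourly = [float(h) for h in range(24)]            # 24 hourly readings
quarter_hourly = [float(q) for q in range(96)]    # 96 quarter-hourly readings
hourly_vector = pyramid_pool(hourly)
quarter_hourly_vector = pyramid_pool(quarter_hourly)
```

Both vectors have seven entries, so they can be spliced with the encoded environment data in the same way regardless of the sampling interval of the original series.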
According to the above technical scheme, the electricity utilization associated data of the user to be identified is determined by performing feature fusion on the electric quantity characteristic data of the user to be identified and the corresponding electricity utilization environment data, which improves the diversity and richness of the electricity utilization associated data. Abnormal user identification is then performed according to the electricity utilization associated data rather than the electric quantity characteristic data alone, which improves the accuracy of the abnormal user identification result.
On the basis of the technical solutions provided by the above embodiments, the present application further provides an alternative embodiment. In this alternative embodiment, a more detailed method of anomaly data identification is provided. Referring to fig. 6, a method for identifying abnormal data includes:
S610, acquiring electric quantity characteristic data and corresponding electricity utilization environment data of the user to be identified.
S620, performing feature fusion on the electric quantity characteristic data of the user to be identified and the corresponding electricity utilization environment data to obtain electricity utilization associated data of the user to be identified.
S630, based on the trained abnormal user identification model, determining whether the user to be identified is an abnormal user according to the electricity utilization associated data of the user to be identified.
S640, generating at least one electric quantity characteristic time sequence according to the electric quantity characteristic data of the user to be identified.
S650, based on a preset clustering algorithm, determining the group category of the user to be identified according to each electric quantity characteristic time sequence of the user to be identified.
S660, determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
S670, if the user to be identified is an abnormal user, selecting the abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
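As a non-limiting illustration, the final step S670 can be sketched as follows. The selection rule used here, a relative deviation from the reference electric quantity characteristic that exceeds a fixed threshold, is an assumption made for the example; this section does not fix a particular rule. The feature names and the threshold value are likewise hypothetical.

```python
def select_abnormal_features(user_features, reference, threshold=0.5):
    """Return the characteristics whose relative deviation exceeds the threshold.

    user_features / reference: dicts mapping feature name -> value.
    threshold: hypothetical cut-off on |value - reference| / |reference|.
    """
    abnormal = {}
    for name, value in user_features.items():
        ref = reference[name]
        scale = abs(ref) if ref != 0 else 1.0  # avoid division by zero
        deviation = abs(value - ref) / scale
        if deviation > threshold:
            abnormal[name] = deviation
    return abnormal


def identify(user_features, is_abnormal_user, reference, threshold=0.5):
    """S670: only users already judged abnormal are inspected feature by feature."""
    if not is_abnormal_user:
        return {}
    return select_abnormal_features(user_features, reference, threshold)


reference = {"daily_kwh": 12.0, "peak_kwh": 5.0}
user = {"daily_kwh": 30.0, "peak_kwh": 5.0}
located = identify(user, is_abnormal_user=True, reference=reference)
```

Here only `daily_kwh` is reported, because it deviates from the group reference by 150 percent while `peak_kwh` matches the reference.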
It should be understood that, although the steps in the flowcharts of the above embodiments are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited to that order, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides an abnormal data identification device for realizing the above related abnormal data identification method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation in the embodiment of the one or more abnormal data recognition devices provided below may refer to the limitation of the abnormal data recognition method hereinabove, and will not be repeated herein.
In an exemplary embodiment, as shown in fig. 7, there is provided an abnormal data recognition apparatus including: a first acquisition module 710, a first determination module 720, a second determination module 730, and a first selection module 740, wherein:
the first acquisition module 710 is configured to acquire electricity utilization associated data of a user to be identified; the electricity utilization associated data comprises electric quantity characteristic data;
the first determining module 720 is configured to determine, according to the power consumption association data of the user to be identified, whether the user to be identified is an abnormal user;
a second determining module 730, configured to determine a reference electrical quantity feature according to electrical quantity feature data of a group category to which the user to be identified belongs;
the first selecting module 740 is configured to select, if the user to be identified is an abnormal user, an abnormal power characteristic from power characteristic data of the user to be identified according to the reference power characteristic.
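As a non-limiting illustration, the cooperation of the four modules can be sketched as a small class that wires four callables together. The class name, field names, and type aliases are hypothetical; for brevity the sketch treats the electricity utilization associated data as a dictionary of electric quantity characteristics, which is a simplification of the fused data described in the method embodiments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class AbnormalDataIdentifier:
    acquire: Callable[[str], Features]                  # first acquisition module 710
    is_abnormal: Callable[[Features], bool]             # first determining module 720
    reference_for: Callable[[str], Features]            # second determining module 730
    select: Callable[[Features, Features], List[str]]   # first selecting module 740

    def run(self, user_id: str) -> List[str]:
        """Return the names of the abnormal characteristics, or [] for a normal user."""
        data = self.acquire(user_id)
        if not self.is_abnormal(data):
            return []
        return self.select(data, self.reference_for(user_id))
```

Each module can then be replaced independently, for example swapping a rule-based `is_abnormal` for a trained model, without changing the other three.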
In one embodiment, the first acquisition module 710 includes:
the first acquisition unit is used for acquiring electricity utilization associated data of the user to be identified; specifically, the first acquisition unit may include:
the first acquisition subunit is used for acquiring electric quantity characteristic data and corresponding electricity utilization environment data of the user to be identified;
the second acquisition subunit is used for performing feature fusion on the electric quantity characteristic data of the user to be identified and the corresponding electricity utilization environment data to obtain electricity utilization associated data of the user to be identified.
In one embodiment, the first determining module 720 includes:
the first determining unit is used for determining whether the user to be identified is an abnormal user or not according to the trained abnormal user identification model and the power utilization associated data of the user to be identified.
In one embodiment, the first determining module 720 further includes a model training unit, configured to train the abnormal user identification model;
specifically, the model training unit includes:
the first acquisition subunit is used for acquiring an abnormal user identification model to be trained, which is constructed based on a naive Bayesian algorithm;
the second acquisition subunit is used for acquiring the electricity utilization associated data and corresponding abnormal labels of different sample users;
the transmission subunit is used for inputting the electricity utilization associated data of the different sample users into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user;
and the training subunit is used for performing model training on the abnormal user identification model to be trained according to the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels.
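As a non-limiting illustration, the work of the model training unit can be sketched with a Gaussian naive Bayes classifier written in plain Python. The Gaussian variant, the variance floor, and the sample values are assumptions; the application only states that the model is constructed based on a naive Bayes algorithm. Because naive Bayes parameters are fitted in closed form, the sketch uses the difference between predicted categories and abnormal labels as an error rate for checking the fit, which is one possible interpretation of the training subunit.

```python
import math
from collections import defaultdict


def train_gaussian_nb(samples, labels, var_floor=1e-9):
    """Fit per-class log prior, feature means and feature variances."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + var_floor
            for c, m in zip(columns, means)
        ]
        model[y] = (math.log(len(rows) / len(samples)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=lambda y: log_posterior(model[y]))


def error_rate(model, samples, labels):
    """Share of sample users whose predicted category differs from the label."""
    wrong = sum(predict(model, x) != y for x, y in zip(samples, labels))
    return wrong / len(samples)


# Hypothetical fused vectors: [daily consumption, holiday flag].
samples = [[10, 0], [11, 0], [12, 1], [9, 0], [40, 1], [42, 0], [38, 1]]
labels = ["normal"] * 4 + ["abnormal"] * 3
model = train_gaussian_nb(samples, labels)
```

A call such as `predict(model, [41, 1])` then plays the role of the first determining unit for a new user to be identified.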
In one embodiment, the second determining module 730 includes:
the clustering unit is used for carrying out clustering analysis on the electric quantity characteristic data of the user to be identified based on a preset clustering algorithm to obtain the group category of the user to be identified;
and the determining unit is used for determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
In one embodiment, the first selecting module 740 includes:
the generating subunit is used for generating at least one electric quantity characteristic time sequence according to the electric quantity time information of the electric quantity characteristic data of the user to be identified;
the determining subunit is used for determining the group category of the user to be identified according to the time sequence of each electric quantity characteristic of the user to be identified based on a preset clustering algorithm.
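As a non-limiting illustration, the generating and determining subunits can be sketched as follows. The application leaves the preset clustering algorithm open; k-means with Euclidean distance and a deterministic initialisation is used here purely as an example, and the reading format `(timestamp, value)` is an assumption.

```python
def to_time_series(readings):
    """Order (timestamp, value) readings by time and keep the values."""
    return [value for _, value in sorted(readings, key=lambda r: r[0])]


def _squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(series_list, k, iterations=20):
    """Plain k-means over equal-length series; returns the k centroids."""
    centroids = [list(s) for s in series_list[:k]]  # deterministic init for the sketch
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for s in series_list:
            nearest = min(range(k), key=lambda i: _squared_distance(s, centroids[i]))
            clusters[nearest].append(s)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def group_category(series, centroids):
    """Index of the nearest centroid, used as the group category."""
    return min(range(len(centroids)), key=lambda i: _squared_distance(series, centroids[i]))


# Readings arrive out of order; the series is rebuilt from the time information.
user_series = to_time_series([(3, 9.0), (1, 8.0), (2, 9.0)])
history = [[1, 1, 1], [1, 2, 1], [9, 9, 9], [10, 9, 10]]
centroids = kmeans(history, k=2)
category = group_category(user_series, centroids)
```

The users sharing the resulting category index form the group from which the reference electric quantity characteristic is computed.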
The modules in the above abnormal data identification device may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one exemplary embodiment, a computer device is provided, which may be a server, and the internal structure thereof may be as shown in fig. 8. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing the power utilization association data of the user to be identified. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a method of anomaly data identification.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one exemplary embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:
acquiring power consumption associated data of a user to be identified; the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting the abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
In one embodiment, the processor when executing the computer program further performs the steps of: based on the trained abnormal user identification model, determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring an abnormal user identification model to be trained constructed based on a naive Bayes algorithm; acquiring electricity utilization associated data and corresponding abnormal labels of different sample users; inputting the electricity utilization associated data of the different sample users into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user; and performing model training on the abnormal user identification model to be trained according to the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels.
In one embodiment, the processor when executing the computer program further performs the steps of: based on a preset clustering algorithm, carrying out clustering analysis on the electric quantity characteristic data of the user to be identified to obtain the group category of the user to be identified; and determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
In one embodiment, the processor when executing the computer program further performs the steps of: generating at least one electric quantity characteristic time sequence according to the electric quantity characteristic data of the user to be identified; based on a preset clustering algorithm, determining the group category of the user to be identified according to the time sequence of each electric quantity characteristic of the user to be identified.
In one embodiment, the processor when executing the computer program further performs the steps of: acquiring electric quantity characteristic data and corresponding electricity utilization environment data of a user to be identified; and carrying out feature fusion on the electric quantity feature data of the user to be identified and the corresponding electricity utilization environment data to obtain electricity utilization associated data of the user to be identified.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring power consumption associated data of a user to be identified; the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting the abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
In one embodiment, the computer program when executed by the processor further performs the steps of: based on the trained abnormal user identification model, determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring an abnormal user identification model to be trained constructed based on a naive Bayes algorithm; acquiring electricity utilization associated data and corresponding abnormal labels of different sample users; inputting the electricity utilization associated data of the different sample users into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user; and performing model training on the abnormal user identification model to be trained according to the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels.
In one embodiment, the computer program when executed by the processor further performs the steps of: based on a preset clustering algorithm, carrying out clustering analysis on the electric quantity characteristic data of the user to be identified to obtain the group category of the user to be identified; and determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
In one embodiment, the computer program when executed by the processor further performs the steps of: generating at least one electric quantity characteristic time sequence according to the electric quantity characteristic data of the user to be identified; based on a preset clustering algorithm, determining the group category of the user to be identified according to the time sequence of each electric quantity characteristic of the user to be identified.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring electric quantity characteristic data and corresponding electricity utilization environment data of a user to be identified; and carrying out feature fusion on the electric quantity feature data of the user to be identified and the corresponding electricity utilization environment data to obtain electricity utilization associated data of the user to be identified.
In one embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, performs the steps of:
acquiring power consumption associated data of a user to be identified; the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting the abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
In one embodiment, the computer program when executed by the processor further performs the steps of: based on the trained abnormal user identification model, determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring an abnormal user identification model to be trained constructed based on a naive Bayes algorithm; acquiring electricity utilization associated data and corresponding abnormal labels of different sample users; inputting the electricity utilization associated data of the different sample users into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user; and performing model training on the abnormal user identification model to be trained according to the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels.
In one embodiment, the computer program when executed by the processor further performs the steps of: based on a preset clustering algorithm, carrying out clustering analysis on the electric quantity characteristic data of the user to be identified to obtain the group category of the user to be identified; and determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
In one embodiment, the computer program when executed by the processor further performs the steps of: generating at least one electric quantity characteristic time sequence according to the electric quantity characteristic data of the user to be identified; based on a preset clustering algorithm, determining the group category of the user to be identified according to the time sequence of each electric quantity characteristic of the user to be identified.
In one embodiment, the computer program when executed by the processor further performs the steps of: acquiring electric quantity characteristic data and corresponding electricity utilization environment data of a user to be identified; and carrying out feature fusion on the electric quantity feature data of the user to be identified and the corresponding electricity utilization environment data to obtain electricity utilization associated data of the user to be identified.
It should be noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use, and processing of the related data are required to meet the related regulations.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the computer program, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration and not limitation, RAM is available in a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, etc., without being limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this description.
The foregoing examples represent only a few embodiments of the present application. They are described specifically and in detail, but are not thereby to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A method of identifying anomalous data, the method comprising:
acquiring power consumption associated data of a user to be identified; wherein the electricity utilization associated data comprises electric quantity characteristic data;
determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified;
determining reference electric quantity characteristics according to the electric quantity characteristic data of the group category to which the user to be identified belongs;
if the user to be identified is an abnormal user, selecting abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics.
2. The method according to claim 1, wherein determining whether the user to be identified is an abnormal user according to the electricity usage association data of the user to be identified comprises:
based on the trained abnormal user identification model, determining whether the user to be identified is an abnormal user or not according to the power consumption associated data of the user to be identified.
3. The method of claim 2, wherein the abnormal user identification model is trained by:
acquiring an abnormal user identification model to be trained constructed based on a naive Bayes algorithm;
acquiring electricity utilization associated data and corresponding abnormal labels of different sample users;
inputting the electricity utilization associated data of the different sample users into the abnormal user identification model to be trained to obtain the predicted abnormal category of the corresponding sample user;
and performing model training on the abnormal user identification model to be trained according to the differences between the predicted abnormal categories of the different sample users and the corresponding abnormal labels.
4. A method according to any one of claims 1-3, wherein said determining a reference electrical quantity feature from electrical quantity feature data of a group category to which said user to be identified belongs comprises:
performing cluster analysis on the electric quantity characteristic data of the user to be identified based on a preset clustering algorithm to obtain a group category to which the user to be identified belongs;
and determining the reference electric quantity characteristics of the group category to which the user to be identified belongs according to the electric quantity characteristic data of different users under the group category to which the user to be identified belongs.
5. The method of claim 4, wherein said performing cluster analysis on the electric quantity characteristic data of the user to be identified based on a preset clustering algorithm to obtain a group category to which the user to be identified belongs comprises:
generating at least one electric quantity characteristic time sequence according to the electric quantity characteristic data of the user to be identified;
based on a preset clustering algorithm, determining the group category of the user to be identified according to each electric quantity characteristic time sequence of the user to be identified.
6. A method according to any one of claims 1-3, wherein said obtaining electricity usage association data of a user to be identified comprises:
acquiring electric quantity characteristic data and corresponding electricity utilization environment data of a user to be identified;
and carrying out feature fusion on the electric quantity feature data of the user to be identified and corresponding electricity utilization environment data to obtain electricity utilization associated data of the user to be identified.
7. An abnormal data identification apparatus, the apparatus comprising:
the first acquisition module is used for acquiring electricity utilization associated data of a user to be identified; wherein the electricity utilization associated data comprises electric quantity characteristic data;
the first determining module is used for determining whether the user to be identified is an abnormal user or not according to the electricity utilization associated data of the user to be identified;
the second determining module is used for determining reference electric quantity characteristics according to electric quantity characteristic data of the group category to which the user to be identified belongs;
and the first selecting module is used for selecting abnormal electric quantity characteristics from the electric quantity characteristic data of the user to be identified according to the reference electric quantity characteristics if the user to be identified is an abnormal user.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
CN202311838282.8A 2023-12-28 2023-12-28 Abnormal data identification method, device, computer equipment and storage medium Pending CN117828496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311838282.8A CN117828496A (en) 2023-12-28 2023-12-28 Abnormal data identification method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311838282.8A CN117828496A (en) 2023-12-28 2023-12-28 Abnormal data identification method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117828496A true CN117828496A (en) 2024-04-05

Family

ID=90507373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311838282.8A Pending CN117828496A (en) 2023-12-28 2023-12-28 Abnormal data identification method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117828496A (en)

Similar Documents

Publication Publication Date Title
CN110363449B (en) Risk identification method, device and system
CN115563477B (en) Harmonic data identification method, device, computer equipment and storage medium
CN111898247A (en) Landslide displacement prediction method, equipment and storage medium
Yin et al. Non-intrusive load monitoring by load trajectory and multi-feature based on DCNN
CN113343711B (en) Work order generation method, device, equipment and storage medium
CN116596574A (en) Power grid user portrait construction method and system
CN116191398A (en) Load prediction method, load prediction device, computer equipment and storage medium
CN116091157A (en) Resource pushing method and device, storage medium and computer equipment
CN114492994A (en) Power information processing system, method and device based on power big data
CN113627514A (en) Data processing method and device of knowledge graph, electronic equipment and storage medium
CN116205376B (en) Behavior prediction method, training method and device of behavior prediction model
CN116611506B (en) User analysis model training method, user label determining method and device
CN116500335B (en) Smart power grid electricity larceny detection method and system based on one-dimensional features and two-dimensional features
CN116628494A (en) User electricity stealing behavior prediction method and device, storage medium and computer equipment
CN117540209A (en) Data processing method and system based on industrial data connection engine
CN116881348A (en) Multi-source heterogeneous data storage method, device, computer equipment and storage medium
CN118157211A (en) Static frequency conversion device parameter prediction method and device and computer equipment
Sakinoğlu et al. Predicting the Risk of Death of Cryptocurrencies
CN116613754A (en) Power distribution system reliability assessment method, model training method, device and equipment
CN117076930A (en) Training sample processing method, abnormal transaction detection method, device and equipment
CN117313952A (en) Load prediction method, device, equipment and storage medium
CN117407504A (en) Operation and maintenance processing method, apparatus, device, storage medium and program product
CN116739867A (en) Method and device for measuring carbon emission of electric power system and computer equipment
CN117930017A (en) Health state determining method and device based on machine learning and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination