CN117195026A - High-performance heterogeneous computation-based power large user portrait construction method and device - Google Patents


Publication number
CN117195026A
Authority
CN
China
Prior art keywords
matrix
power
user
vector group
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311164860.4A
Other languages
Chinese (zh)
Inventor
张殷
唐琪
李国伟
王俊波
熊仕斌
蒋维
罗容波
王博
李新
范心明
董镝
何子兰
曾庆辉
刘昊
梁年柏
李兰茵
王智娇
刘崧
赖艳珊
林雅俐
姜沛东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Foshan Power Supply Bureau of Guangdong Power Grid Corp
Original Assignee
Guangdong Power Grid Co Ltd
Foshan Power Supply Bureau of Guangdong Power Grid Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd, Foshan Power Supply Bureau of Guangdong Power Grid Corp filed Critical Guangdong Power Grid Co Ltd
Priority to CN202311164860.4A priority Critical patent/CN117195026A/en
Publication of CN117195026A publication Critical patent/CN117195026A/en


Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method and a device for constructing a portrait of a large power user based on high-performance heterogeneous computing. Feature processing is performed on the electricity consumption data of a large power user to obtain a matrix feature vector group, which reduces redundant data and improves computational efficiency. First clustering processing and second clustering processing are then carried out: the local feature vector group in the electricity consumption data is determined, and the user portrait is obtained from the processing result. In this way the differences within the electricity consumption data are fully explored, the data is fully mined, the user's electricity consumption features are analyzed in multiple dimensions, and the precision of the resulting user portrait is improved. CPU-GPU heterogeneous acceleration further improves the processing efficiency.

Description

High-performance heterogeneous computation-based power large user portrait construction method and device
Technical Field
The application belongs to the technical field of power user portrait construction, and particularly relates to a high-performance heterogeneous computing-based power large user portrait construction method and device.
Background
A power consumer is an entity or individual that consumes electrical energy through the grid. Power consumers can be classified in various ways, and different power supply strategies and pricing standards can be set for different types of power consumers. To meet the requirements of different types of power users, their electricity consumption features need to be classified and the corresponding user portraits obtained.
In the prior art, a power grid management body may collect hundreds of millions of electricity consumption records from its various power users in the course of management. How to determine effective user portraits of different power users, especially large power users, from such massive electricity consumption data is a problem that currently needs to be solved.
Disclosure of Invention
In view of the above, the present application aims to provide a method and a device for constructing a portrait of a large power user based on high-performance heterogeneous computing, which are used for determining the user portrait of a large power user from massive electricity consumption data.
In order to solve the technical problems, the application provides the following technical scheme:
In a first aspect, the present application provides a method for constructing a large power user portrait based on high-performance heterogeneous computing, which is applied to a CPU-GPU heterogeneous computing platform, and the method includes:
acquiring electricity consumption data of a large-power user;
performing feature processing based on the electricity consumption data to obtain a matrix feature vector group;
performing first clustering treatment on the matrix characteristic vector group, and determining a local characteristic vector group according to a clustering treatment result;
and taking the local feature vector group and the electricity consumption data as inputs, performing second clustering processing, and obtaining the user portrait of the large power user according to the processing result.
Further, performing feature processing based on the electricity consumption data to obtain a matrix feature vector group, including:
processing the power consumption data by a double-random normalization method, and obtaining normalized power consumption data according to a processing result;
and carrying out matrix decomposition on the normalized power consumption data, and obtaining a matrix characteristic vector group according to a matrix decomposition result.
Further, performing matrix decomposition on the normalized power consumption data, and obtaining a matrix characteristic vector group according to a matrix decomposition result, including:
performing matrix decomposition on the normalized power consumption data through a singular value decomposition algorithm to obtain an initial matrix eigenvector group and an initial eigenvalue matrix;
based on the magnitudes of the eigenvalues in the initial eigenvalue matrix, a matrix eigenvector set is determined from the initial matrix eigenvector set and the initial eigenvalue matrix.
Further, performing a first clustering process on the matrix eigenvector set, and determining a local eigenvector set according to a clustering result, including:
clustering is carried out on the matrix characteristic vector group, and a distance matrix is obtained; the clustering processing mode comprises at least one of a row clustering processing mode and a column clustering processing mode, wherein the row clustering processing mode corresponds to a row distance matrix; the column clustering processing mode corresponds to a column distance matrix;
and determining a preset number of eigenvectors from the matrix eigenvector group according to Euclidean norms in the distance matrix, and taking the eigenvectors as a local eigenvector group.
Further, determining a preset number of eigenvectors from the matrix eigenvector set according to euclidean norms of each row in the distance matrix, as a local eigenvector set, including:
acquiring Euclidean norms in a distance matrix;
determining a preset number of Euclidean norms and corresponding row sequence numbers or column sequence numbers from the Euclidean norms according to the numerical value;
and determining the local feature vector group from the matrix feature vectors according to the row sequence number or the column sequence number.
Further, the second clustering process is performed by using the local feature vector group and the electricity consumption data as inputs, and a user portrait of the large-power user is obtained according to the processing result, including:
processing the power consumption data by a double-random normalization method, and obtaining normalized power consumption data according to a processing result;
multiplying the local feature vector group by the normalized power consumption data to obtain a target feature vector group;
processing the target feature vector group through a clustering algorithm to obtain the power utilization features of the large power users corresponding to the power data;
and determining the user portrait of the corresponding large power user of the power data according to the power utilization characteristics.
Further, determining the user portrait of the large power user corresponding to the power data according to the electricity utilization characteristics includes:
determining a user portrait of a large power user according to the electricity utilization characteristics and a predetermined user label division standard; or,
and carrying out cluster analysis on a plurality of large power users based on the electricity utilization characteristics, and determining user portraits of the large power users according to user labels of the large power users in the cluster group.
In a second aspect, the present application provides a high-performance heterogeneous computing-based power large user portrait construction device, which is applied to a CPU-GPU heterogeneous computing platform, and the device comprises:
the power consumption data acquisition module is used for acquiring power consumption data of a large power user;
the characteristic vector group module is used for carrying out characteristic processing based on the electricity consumption data to obtain a matrix characteristic vector group;
the local feature vector group module is used for carrying out clustering treatment on the matrix feature vector group and determining the local feature vector group according to a clustering treatment result;
and the user portrait acquisition module is used for taking the local feature vector group and the electricity consumption data as inputs, performing second clustering processing, and obtaining the user portrait of the large power user according to the processing result.
In a third aspect, the present application provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method of the first aspect when the processor executes the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect.
In summary, the application provides a method and a device for constructing a large power user portrait based on high-performance heterogeneous computing. Feature processing is performed on the electricity consumption data of a large power user to obtain a matrix feature vector group, which reduces redundant data and improves computational efficiency. First clustering processing and second clustering processing are then carried out: the local feature vector group in the electricity consumption data is determined, and the user portrait is obtained from the processing result. In this way the differences within the electricity consumption data are fully explored, the data is fully mined, the user's electricity consumption features are analyzed in multiple dimensions, and the precision of the resulting user portrait is improved. CPU-GPU heterogeneous acceleration further improves the processing efficiency.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for constructing a large electric power user portrait based on high-performance heterogeneous computing according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for constructing a large electric power user portrait based on high performance heterogeneous computing according to another embodiment of the present application;
FIG. 3 is a schematic diagram of a test of active-power electricity behavior clustering provided in one embodiment of the application;
FIG. 4 is a schematic diagram of a test of reactive-power electricity behavior clustering provided in one embodiment of the application;
FIG. 5 is a block diagram of a high-performance heterogeneous computing-based power large user representation construction device according to an embodiment of the present application;
FIG. 6 is an internal structural diagram of a computer device provided in one embodiment of the present application.
Detailed Description
In order to make the objects, features and advantages of the present application more obvious and understandable, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the embodiments described below are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The high-performance heterogeneous computation-based power large user portrait construction method provided by the application can be applied to a CPU-GPU heterogeneous computation platform. The CPU and the GPU may be implemented by a separate server or a server cluster formed by a plurality of servers. The relevant terms appearing in the present application are explained below.
Heterogeneous computing refers to a hybrid system composed of multiple kinds of computing units, such as CPUs, DSPs, GPUs, ASICs, coprocessors and FPGAs, in which computing units with different instruction sets and different architectures carry out the computation. The CPU-GPU architecture is one of the commonly used heterogeneous computing platforms. The central processing unit (Central Processing Unit, CPU) is the operation and control core of the computer system and the final execution unit for information processing and program running. The graphics processing unit (Graphics Processing Unit, GPU) is a processor dedicated to image display in personal computers, workstations and game consoles, provided on a discrete graphics card or integrated on the motherboard. Compared with a CPU, the GPU has higher parallelism, higher single-machine peak computing performance and higher computing efficiency.
Large power users are power users that, in areas where conditions permit, are supplied at a higher voltage level or have a larger electricity consumption. Their electricity consumption behavior is complex and multidimensional, and they place high requirements on the reliability of power supply. For example, different types of large power users differ considerably in energy usage level and in local fluctuations of energy usage.
The user portrait refers to a labeled user model abstracted from information such as user attributes, user preferences, living habits and user behaviors. One user portrait can comprise a plurality of user labels, which specifically characterize the features of the user. For a particular scenario or application, a user typically corresponds to one user portrait, and one user portrait may be used to characterize users with the same or similar information.
In one embodiment, as shown in FIG. 1, a method for constructing a large power user portrait based on high-performance heterogeneous computing is provided. The method is described as applied to a CPU-GPU heterogeneous computing platform and comprises the following steps:
S110, acquiring electricity consumption data of a large power user.
The power grid operation company can pre-determine the users belonging to the type of large power users among the power users according to the type of the clients, and obtain the power consumption data of the users in a targeted manner.
The electricity consumption data of a user is data that characterizes the user's electricity consumption features and is used to construct the user portrait. In terms of content, it may include, but is not limited to, the electricity usage type and electricity usage behavior features; in terms of data type, it may include quantitative data such as descriptive data and proportional data.
The electricity consumption data can be actively monitored by an electric power operation main body or can be actively reported by an electric power user. The data acquisition may be performed in real time or based on historical stored data extraction. That is, the user representation may be generated in real-time or may be deposited and modified based on historical data.
In the specific implementation, the platform can conditionally screen and acquire the power consumption data of the large power user according to the data type. All electricity consumption data of a large-power user in a certain time period can be obtained, and screening is performed to obtain data required by analysis.
The amount of initial electricity consumption data affects how much redundant information it contains and therefore the efficiency of feature processing. When acquiring the electricity consumption data, both comprehensiveness and relevance therefore need to be taken into account.
And S120, performing feature processing based on the electricity consumption data to obtain a matrix feature vector group.
The electricity utilization data is quantitative data, can be converted into matrix expression, and is subjected to characteristic processing to obtain a matrix characteristic vector group corresponding to the electricity utilization data.
The feature processing includes, but is not limited to, normalization and singular value decomposition (Singular Value Decomposition, SVD). Normalization removes data information that is irrelevant to the local features of the large power user. Singular value decomposition decomposes the data matrix into matrix feature vector groups and an eigenvalue matrix, so that local feature vectors can be determined from the matrix feature vector groups and multi-dimensional feature extraction is achieved.
In the specific implementation, the platform can convert the electricity data of a certain large power user with a certain period into a matrix, and the matrix characteristic vector group is obtained by performing characteristic processing on the matrix.
In some embodiments, the platform may normalize rows and columns of the matrix of the power data simultaneously by a double-random normalization method, and reject global feature information, so as to reduce interference of the global feature information while not damaging local feature information.
In some embodiments, the platform may perform SVD decomposition on the normalized power consumption data to obtain a matrix eigenvector set and a corresponding eigenvalue matrix. Through normalization and matrix decomposition, the interference of the global features can be eliminated, and the local feature vector is determined so as to mine the local features of the large power user.
S130, performing first clustering processing on the matrix eigenvector group, and determining a local eigenvector group according to a clustering processing result.
The first clustering process may use a K-means clustering algorithm. Running K-means on the matrix feature vector group gathers values with the same features into one cluster, so that redundant values can be removed or valuable vectors can be selected.
The number of vectors in the local feature vector group is smaller than or equal to that of the matrix feature vector group.
In the specific implementation, the platform can perform first clustering processing on the matrix feature vector group through a K-means clustering algorithm, and determine at least one vector according to a clustering result to obtain a local feature vector group.
For example, the platform may perform the first clustering process on the rows and columns of the matrix feature vector group with a K-means clustering algorithm and, according to the intra-cluster differences, select the N vectors whose clusters are most compact to obtain the local feature vector group.
In some embodiments, according to the basic principle of linear algebra, redundant data in the matrix can be deleted by performing linear transformation on the matrix rows and columns to obtain a local feature vector set.
S140, taking the local feature vector group and the electricity consumption data as inputs, performing second clustering processing, and obtaining the user portrait of the large power user according to the processing result.
The angle between vectors strongly affects the dot product between them, so the results of multiplying by different local feature vectors can differ noticeably in value. To reduce this difference, the selected local feature vector group may be used together with the electricity consumption data as the input to clustering.
The clustering algorithm of the second clustering process can be a K-means clustering algorithm, the K-means clustering algorithm can cluster matrix features to determine multi-dimensional user features, and further user tags of large power users are determined to obtain user portraits.
In some embodiments, the platform may perform K-means clustering on the local feature vector set and the electricity consumption data, respectively, to obtain user tags of large electric users in the local dimension and the global dimension, respectively.
In some embodiments, the platform may multiply the local feature vector group by the normalized data matrix to obtain a new local feature vector group, and perform K-means clustering on the new local feature vector group to obtain the user labels of the large power user in the local dimension.
In some embodiments, the normalization, SVD decomposition and clustering algorithms may be deployed in advance as a model on the CPU-GPU heterogeneous computing platform. When constructing the portrait of a large power user, the platform runs the model with heterogeneous acceleration to obtain the computing result.
According to the high-performance heterogeneous computing-based power large user portrait construction method, the power consumption data of the large power user are subjected to feature processing to obtain the matrix feature vector group, redundant data is reduced, computing efficiency is improved, further, first clustering processing and second clustering processing are conducted on the matrix feature vector group, the local feature vector group in the power consumption data is determined, the user portrait is obtained according to the processing result, the difference in the power data is fully explored, the power consumption features of the user are fully mined, and the accuracy of obtaining the user portrait is improved.
In one embodiment, the step in S120 of performing feature processing based on the electricity consumption data to obtain the matrix feature vector group includes:
processing the electricity utilization data by a double-random normalization method, and obtaining normalized electricity utilization data according to a processing result; and carrying out matrix decomposition on the normalized power consumption data, and obtaining the matrix characteristic vector group according to a matrix decomposition result.
In this embodiment, the dual random normalization method may be used to normalize the rows and columns of the matrix at the same time, and the method may reduce the interference of the global feature information while not damaging the local feature information.
Specifically, the double-random normalization method is realized by the following steps:
Step 1, set the iteration counter to k and let k = 1;
Step 2, let the matrix of the electricity consumption data be $A_k$, where $a^k_{ij}$ denotes the entry in row i and column j of $A_k$. Define the diagonal matrices $R_k = \mathrm{diag}(r^k_1, \dots, r^k_m)$ and $C_k = \mathrm{diag}(c^k_1, \dots, c^k_n)$, which hold the row sums and the column sums of the data matrix $A_k$ respectively, where $r^k_i = \sum_j a^k_{ij}$ is the sum of the values in row i of $A_k$ and $c^k_j = \sum_i a^k_{ij}$ is the sum of the values in column j of $A_k$. The auxiliary matrix $An_k$ is computed as $An_k = R_k^{-1} A_k C_k^{-1}$;
Step 3, compute the difference matrix before and after normalization, $D_k = A_k - An_k$, and compute the Euclidean norm of $D_k$ row by row, i.e. $Di_k = \sqrt{\sum_j (d^k_{ij})^2}$, where $d^k_{ij}$ is the entry in row i and column j of $D_k$;
Step 4, let $A_k = An_k$, i.e., the result of the current iteration becomes $A_k$;
Step 5, stop iterating when $Di_k < \varepsilon$ or $k = K$ and take $A_k$ as the normalization result; otherwise repeat Steps 2 to 4, where $\varepsilon$ is a preset threshold and K is a preset maximum number of iterations. The matrix $A_k$ finally obtained is the normalized data matrix.
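The iteration in Steps 1 to 5 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `bistochastic_normalize` and its parameters are hypothetical, the input is assumed to be strictly positive, and the stopping test is read as "every row norm is below the threshold". The text writes the rescaling as R^-1 A C^-1; the sketch swaps in the square-root form R^-1/2 A C^-1/2, because applying the full row and column scaling at once can oscillate (a constant matrix, for example, flips between two values) instead of settling.

```python
import numpy as np

def bistochastic_normalize(a, eps=1e-6, max_iter=1000):
    """Iteratively rescale the rows and columns of a positive matrix (sketch)."""
    a_k = np.asarray(a, dtype=float)
    for _ in range(max_iter):                    # k = 1 .. K
        r = a_k.sum(axis=1)                      # row sums, the diagonal of R_k
        c = a_k.sum(axis=0)                      # column sums, the diagonal of C_k
        # Assumption: square-root scaling instead of the R^-1 A C^-1 in the text.
        an_k = a_k / np.sqrt(r)[:, None] / np.sqrt(c)[None, :]
        d_k = a_k - an_k                         # difference matrix D_k
        row_norms = np.linalg.norm(d_k, axis=1)  # Euclidean norm of each row of D_k
        a_k = an_k                               # Step 4: carry the result forward
        if row_norms.max() < eps:                # Step 5: all rows have settled
            break
    return a_k
```

For a square input, the rows and columns of the result each sum to approximately 1, which is a quick way to check that the iteration has converged.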
The matrix decomposition algorithm is SVD decomposition, which decomposes a data matrix into a product of feature vector groups and an eigenvalue matrix. The feature vector groups contain the feature vectors that constitute local features, and these feature vectors can subsequently be identified within the feature vector groups through clustering processing.
In the SVD decomposition algorithm, the normalized data matrix is assumed to factor as $A = U \Sigma V^T$, where the dimension of matrix A is m×n, the dimension of matrix U is m×m, the dimension of matrix V is n×n, and matrix Σ is a diagonal matrix of dimension m×n. Multiplying the normalized matrix A by its transpose $A^T$ and using formula (1) yields the feature matrix U:
$A A^T = U \Sigma V^T V \Sigma^T U^T = U \Sigma \Sigma^T U^T$ (1)
Similarly, multiplying $A^T$ by A and solving for the eigenvectors of the product matrix according to formula (2) yields the matrix V:
$A^T A = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T$ (2)
The result of the SVD decomposition algorithm is three data matrices U, V and Σ. The matrices U and V are feature vector groups and contain the data feature information in the row direction and the column direction of the initial data matrix, respectively. Σ is the eigenvalue matrix; each value in it reflects how important the information in the corresponding feature vector is for reconstructing the data matrix, and the larger the eigenvalue, the more important the corresponding feature vector. The feature vector corresponding to the largest eigenvalue can therefore be deleted to reduce the influence of global feature information, and the vectors carrying the row-direction and column-direction features needed for subsequent analysis are selected from the matrix U and the matrix V respectively. Further, because the matrix U holds the row-direction feature information and the matrix V holds the column-direction feature information, the first clustering process may include a row clustering mode and a column clustering mode, where the row clustering mode corresponds to a row distance matrix and the column clustering mode corresponds to a column distance matrix.
According to the scheme of this embodiment, feature processing of the electricity consumption data removes redundant data and global feature information, makes the local feature information more targeted, and further improves computational efficiency.
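The decomposition and the removal of the leading feature vector can be illustrated with NumPy's `np.linalg.svd`. The sketch below is only an illustration: the function name `svd_feature_vectors` and the parameter `n_components` are hypothetical. Note that `np.linalg.svd` returns singular values in descending order, and their squares are the eigenvalues of A·Aᵀ and Aᵀ·A in formulas (1) and (2).

```python
import numpy as np

def svd_feature_vectors(a_norm, n_components):
    """Return candidate row/column feature vectors, skipping the leading one."""
    # Thin SVD: a_norm == u @ np.diag(s) @ vt, with s sorted in descending order.
    u, s, vt = np.linalg.svd(a_norm, full_matrices=False)
    # Index 0 belongs to the largest value and is treated as the global feature,
    # so the candidates start at index 1.
    keep = slice(1, 1 + n_components)
    u_keep = u[:, keep]        # row-direction feature vectors, stored as columns
    v_keep = vt[keep, :].T     # column-direction feature vectors, stored as columns
    return u_keep, s[keep], v_keep
```

Each returned column of `u_keep` satisfies A·Aᵀ·u = s²·u, and each column of `v_keep` satisfies Aᵀ·A·v = s²·v, matching formulas (1) and (2).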
In one embodiment, the step in S130 of performing clustering processing on the matrix feature vector group and determining the local feature vector group according to the clustering result includes:
performing row clustering and column clustering on the matrix feature vector group respectively to obtain distance matrices;
and determining a preset number of eigenvectors from the matrix eigenvector group according to Euclidean norms in the distance matrix, and taking the eigenvectors as the local eigenvector group.
In this embodiment, the first clustering process uses K-means clustering. The matrix feature vector groups contain data feature information in the row direction and the column direction; running the K-means clustering algorithm on these data vectors clusters values with the same data features together, so that redundant information is rejected and the required local feature vectors are kept.
In some embodiments, the K-means clustering algorithm may be applied to the matrices U and V obtained by the decomposition, and the first n_best vectors with the most compact clusters are selected according to the intra-cluster differences; these form the vector group V_tr used for the data transformation. The row clustering process includes:
Step 5, run K-means on $U^T$ row by row, compute the distance between each value and its cluster center point, and collect the results in a distance matrix $D_i$. Specifically, run the clustering algorithm on the k-th row to obtain the class label of each entry of the k-th row vector and the array center of cluster center points; then compute the distance between the j-th value in the k-th row and its cluster center point, and store the result in row k, column j of $D_i$.
Step 6, compute the Euclidean norm of the k-th row of $D_i$ and store the Euclidean norm together with its row number. Sort the Euclidean norms from small to large, select the row numbers corresponding to the first N Euclidean norms, and take the vectors of $U^T$ at those row numbers as the local feature vectors.
The extraction of the local feature vector group by the matrix V in the column direction may be performed by the method of the matrix U in the row direction.
According to the scheme of the embodiment, the local feature vector group is extracted from the matrix feature vector group through clustering, global features are reduced, redundant features are removed, the data contribution degree is improved, and the processing efficiency is improved.
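The row clustering and selection procedure described above can be sketched as follows. This is an illustrative reading, not the applicant's code: the helper `kmeans_1d` is a small hand-written Lloyd iteration with quantile initialization (a hypothetical stand-in for whatever K-means implementation the platform uses), and the names `select_local_vectors`, `n_clusters` and `n_best` are assumptions.

```python
import numpy as np

def kmeans_1d(values, n_clusters, n_iter=50):
    """Tiny K-means for scalar values; centers start at evenly spaced quantiles."""
    centers = np.quantile(values, np.linspace(0.0, 1.0, n_clusters))
    for _ in range(n_iter):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = values[labels == c].mean()
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers

def select_local_vectors(vectors_t, n_clusters, n_best):
    """vectors_t holds one candidate feature vector per row (for example U^T)."""
    vectors_t = np.asarray(vectors_t, dtype=float)
    dist = np.empty_like(vectors_t)
    for k, row in enumerate(vectors_t):
        labels, centers = kmeans_1d(row, n_clusters)
        dist[k] = np.abs(row - centers[labels])   # D[k, j]: distance to own center
    norms = np.linalg.norm(dist, axis=1)          # one Euclidean norm per row
    best_rows = np.argsort(norms)[:n_best]        # smallest norms = most compact
    return vectors_t[best_rows], best_rows
```

The same function can be applied to the transposed column-direction vectors (Vᵀ in the notation of the text) to obtain the column-direction local feature vectors.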
In one embodiment, in step S140, the step of performing a second clustering process using the local feature vector set and the electricity consumption data as inputs, and obtaining a user portrait of the power consumer according to the processing result includes:
processing the electricity utilization data by a double-random normalization method, and obtaining normalized electricity utilization data according to a processing result; multiplying the local feature vector group by the normalized power consumption data to obtain a target feature vector group; processing the target feature vector group through a clustering algorithm to obtain the power utilization features of the large power users corresponding to the power data; and determining the user portrait of the corresponding large power user of the power data according to the power utilization characteristics.
In this embodiment, the double-random normalization method processes the electricity consumption data according to Steps 1 to 5 above, which are not described again. Multiplying the local feature vector group by the normalized electricity consumption data before cluster analysis reduces the numerical differences caused by dot products with different local feature vectors. The target feature vector group is calculated by formula (3):
Pr=A·V_tr (3)
wherein V_tr is a local feature vector group, A is normalized power consumption data, and Pr is a target feature vector group.
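As a small numerical illustration of formula (3), the product can be written with NumPy as below. The values are made up, and the layout (one user per row of A, one selected feature vector per column of V_tr) is an assumption made so that the shapes are compatible.

```python
import numpy as np

# Hypothetical normalized electricity consumption data: 3 users x 3 sampling points.
A = np.array([[0.2, 0.3, 0.5],
              [0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3]])

# Hypothetical local feature vector group: 2 selected vectors stored as columns.
V_tr = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])

# Formula (3): each user's data is projected onto the selected feature vectors.
Pr = A @ V_tr
print(Pr.shape)  # (3, 2): one row per user, one column per selected vector
print(Pr)
```

Under this layout, each row of Pr is the low-dimensional representation of one user that the second clustering then operates on.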
The second clustering uses the K-means clustering algorithm, and the clustering result is the row or column labels of the clusters. The row labels are computed from the feature vectors screened from the matrix U, which carries the row feature information of the target feature vector group; the column labels are computed from the feature vectors screened from the matrix V, which carries the column feature information of the target feature vector group.
For example, if the feature vector characterizes the user's electricity consumption type, a plurality of electricity consumption type labels are determined by analyzing the user's daily electricity consumption type, and these labels are then assembled into the user portrait of the large power user.
For another example, if the feature vector further characterizes the user's electricity consumption behavior, then, because users in different industries consume active and reactive power in different ways, a multidimensional analysis is performed from the perspectives of active power and reactive power. The electricity consumption behavior feature is defined in two dimensions (the user's active power consumption category and the user's reactive power consumption category), and the electricity consumption behavior label of the large power user is determined from the output of the second clustering. Each dimension corresponds to more than one label.
In some embodiments, the platform may determine the user portrait of a large power user according to the electricity consumption characteristics and a predetermined user label classification standard. In a specific implementation, after a large power user is determined to match a certain class of electricity consumption characteristics, the specific label within that class that the user matches is further determined and added to the user portrait.
In some embodiments, the platform may also perform cluster analysis on a plurality of large power users based on the electricity consumption characteristics, and determine the user portrait of each large power user according to the user labels of the large power users in the same cluster group. In a specific implementation, the labels of existing user portraits can be used as a reference, and the labels of the other large power users can be determined from the clustering result over the feature dimensions of the large power users.
In some embodiments, the user portrait of a large power user may include: basic user information, electricity consumption feature class 1 and at least one corresponding label, electricity consumption feature class 2 and at least one corresponding label, and so on.
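One possible way to hold such a portrait in memory is sketched below. The class name `UserPortrait`, its fields, and the example tag values are hypothetical; the text only specifies that a portrait contains basic information plus one or more labels per feature class.

```python
from dataclasses import dataclass, field

@dataclass
class UserPortrait:
    """Basic information plus, for each feature class, a list of at least one label."""
    basic_info: dict
    feature_tags: dict = field(default_factory=dict)  # feature class -> list of labels

    def add_tag(self, feature_class, tag):
        """Attach a label to a feature class, ignoring duplicates."""
        tags = self.feature_tags.setdefault(feature_class, [])
        if tag not in tags:
            tags.append(tag)

portrait = UserPortrait(basic_info={"user_id": "U001"})
portrait.add_tag("active_power_behavior", "double peak type")
portrait.add_tag("reactive_power_behavior", "category 3")
portrait.add_tag("active_power_behavior", "double peak type")  # duplicate, not added again
print(portrait.feature_tags)
```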
According to the scheme of this embodiment, the second clustering process determines the user portrait of the large power user based on the local feature vector group, which deepens feature mining and further improves the accuracy of the user portrait.
In one embodiment, as shown in fig. 2, a method for constructing a large power consumer representation based on high performance heterogeneous computing is provided, the method comprising:
S210, electricity consumption data are acquired.
S220, double-random normalization is performed on the electricity consumption data to obtain a matrix A.
S230, SVD decomposition is performed on the matrix A obtained in step S220 to obtain a feature vector group U, V and a feature value matrix Σ.
S240, K-Means clustering is performed on the feature vector group U, V, and a screened local feature vector group V_tr is obtained.
S250, multiplying the local feature vector set V_tr by the matrix A of S220 to obtain a target feature vector set Pr.
S260, K-Means clustering is carried out on the target feature vector set Pr, and a clustering result is obtained.
According to the method, the characteristic vector group U, V is obtained through double-random normalization processing and SVD decomposition of the power consumption data of the large-power user, redundant data is reduced, calculation efficiency is improved, further, K-Means clustering processing is conducted twice on the characteristic vector group U, V, the local characteristic vector group in the power consumption data is determined, a user portrait is built according to the processing result, the difference in the power data is fully explored, the full mining of the power data is achieved, the power consumption characteristics of the user are analyzed in a multi-dimensional mode, and accuracy of obtaining the user portrait is improved.
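Steps S230 to S260 can be sketched end to end with NumPy as follows. This is only an illustration under stated assumptions: the input matrix is taken to be the already normalized output of S220, the helper names `kmeans` and `build_clusters` and all parameters are hypothetical, the distance matrix is assumed to hold the distance from each feature vector to each K-Means centre, and a plain Lloyd-style K-Means stands in for whatever implementation the platform uses.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd-style K-Means; returns labels and the point-to-centre distance matrix."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1), dist

def build_clusters(A, n_vectors, n_select, k_first, k_second, seed=0):
    """A is assumed to be the normalized matrix produced by S220."""
    # S230: SVD; rows of Vt are ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vt_top = Vt[:n_vectors]
    # S240: first clustering, then keep the vectors with the smallest row norms
    # of the (assumed) distance-to-centre matrix.
    _, dist = kmeans(Vt_top, k_first, seed=seed)
    rows = np.argsort(np.linalg.norm(dist, axis=1))[:n_select]
    V_tr = Vt_top[rows].T                    # selected vectors stored as columns
    # S250: target feature vector group Pr = A · V_tr.
    Pr = A @ V_tr
    # S260: second clustering on the rows of Pr (one row per user).
    labels, _ = kmeans(Pr, k_second, seed=seed)
    return labels, Pr

# Toy data: 20 users x 6 sampling points with two distinct load shapes.
rng = np.random.default_rng(0)
base = np.vstack([np.tile([1.0, 1.0, 5.0, 5.0, 1.0, 1.0], (10, 1)),
                  np.tile([5.0, 5.0, 1.0, 1.0, 5.0, 5.0], (10, 1))])
A = base + 0.01 * rng.random(base.shape)

labels, Pr = build_clusters(A, n_vectors=4, n_select=2, k_first=2, k_second=2)
print(Pr.shape, labels)
```

The sketch only shows how the steps connect; the quality of the resulting clusters depends on the normalization and on how the distance matrix is actually defined, neither of which is reproduced here.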
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required for the present application.
In order to further illustrate the effect of the high-performance heterogeneous computation-based power large user portrait construction method, the method is applied to the monitoring of electricity consumption behavior labels and tested with CPU-GPU heterogeneous acceleration. It should be noted that the data of this scenario is an exemplary choice and does not specifically limit the scheme of the application.
The test environment consists of two GPU servers with the following hardware: GPU: NVIDIA A40 × 8; CPU: Intel C621A series chipset, 3.6 GHz; memory: 96 GB; hard disk: 2 × 960 GB. The software environment is the Linux Ubuntu 18.04 operating system.
The data set consists of 90,000 records of user electricity consumption data. The same hardware computing power is used for the tests before and after heterogeneous acceleration.
The test results are shown in Table 1:
Table 1. Test results of the user portrait construction method
Model name | Run time before heterogeneous acceleration | Run time after heterogeneous acceleration
User portrait construction | 3.24 s | 2.368 s
From the experimental results, the run time after heterogeneous acceleration is 26.9% shorter than before heterogeneous acceleration, so the proposed scheme effectively improves the computational efficiency of the model.
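The 26.9% figure corresponds to the relative reduction in run time computed from the two values in Table 1. The short check below reproduces it and also gives the equivalent speed-up factor; the variable names are illustrative only.

```python
t_before = 3.24   # run time before heterogeneous acceleration, in seconds
t_after = 2.368   # run time after heterogeneous acceleration, in seconds

reduction = (t_before - t_after) / t_before  # fraction of run time saved
speedup = t_before / t_after                 # how many times faster

print(f"run time reduction: {reduction:.1%}")  # 26.9%
print(f"speed-up factor: {speedup:.2f}x")      # 1.37x
```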
After the electricity consumption data of the large power users is multiplied by the corresponding multiplier and restored to real values, the clustering algorithm of the above embodiments is executed to obtain the clustering result in the row direction (i.e., the horizontal time dimension). When the number of clusters is 6 or 7, the differences within each class are large; when it is 9 or 10, several classes with similar results appear; the number of clusters is therefore set to 8. The analysis results for active power and reactive power are shown in Fig. 3 and Fig. 4, respectively. The electricity consumption data is sampled every 15 minutes, so the abscissa in Figs. 3 and 4 has 96 sampling points per day. For each class, the electricity consumption data is averaged at each sampling point to give the power shown on the vertical axis; the active clustering results are in kW and the reactive clustering results are in kvar.
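The per-category curves described here are the mean of the member users' data at each of the 96 sampling points. A minimal sketch of that averaging is given below; the function name `cluster_mean_profiles` and the toy load values are assumptions for illustration.

```python
import numpy as np

SAMPLES_PER_DAY = 24 * 60 // 15  # 15-minute sampling gives 96 points per day

def cluster_mean_profiles(load, labels, n_clusters):
    """Average the daily load curves of all users assigned to each cluster."""
    load = np.asarray(load, dtype=float)
    labels = np.asarray(labels)
    profiles = np.full((n_clusters, load.shape[1]), np.nan)  # NaN marks empty clusters
    for c in range(n_clusters):
        members = load[labels == c]
        if len(members):
            profiles[c] = members.mean(axis=0)
    return profiles

# Toy data: three users with flat daily curves of 10, 30 and 100 kW.
load = np.vstack([np.full(SAMPLES_PER_DAY, 10.0),
                  np.full(SAMPLES_PER_DAY, 30.0),
                  np.full(SAMPLES_PER_DAY, 100.0)])
labels = [0, 0, 1]

profiles = cluster_mean_profiles(load, labels, n_clusters=2)
print(profiles.shape)                   # (2, 96)
print(profiles[0, 0], profiles[1, 0])   # 20.0 100.0
```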
According to the active and reactive power curves shown in Fig. 3 and Fig. 4, the sub-plots correspond to the categories as follows: the first plot on the left is category 1, the first plot on the right is category 2, the second plot on the left is category 3, the second plot on the right is category 4, and so on. The categories differ both in magnitude and in the pattern of variation of that magnitude, and these two dimensions are the basis on which the clustering algorithm separates them. As the figures show, the method of the embodiments yields 8 active behavior labels and 8 reactive behavior labels, so there are 8 × 8 = 64 electricity consumption behavior features in total, each combining an active power consumption category with a reactive power consumption category, as shown in Table 2.
Table 2. User electricity consumption behavior features
Category | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 | (1,1) | (1,2) | (1,3) | (1,4) | (1,5) | (1,6) | (1,7) | (1,8)
2 | (2,1) | (2,2) | (2,3) | (2,4) | (2,5) | (2,6) | (2,7) | (2,8)
3 | (3,1) | (3,2) | (3,3) | (3,4) | (3,5) | (3,6) | (3,7) | (3,8)
4 | (4,1) | (4,2) | (4,3) | (4,4) | (4,5) | (4,6) | (4,7) | (4,8)
5 | (5,1) | (5,2) | (5,3) | (5,4) | (5,5) | (5,6) | (5,7) | (5,8)
6 | (6,1) | (6,2) | (6,3) | (6,4) | (6,5) | (6,6) | (6,7) | (6,8)
7 | (7,1) | (7,2) | (7,3) | (7,4) | (7,5) | (7,6) | (7,7) | (7,8)
8 | (8,1) | (8,2) | (8,3) | (8,4) | (8,5) | (8,6) | (8,7) | (8,8)
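The 64 entries of Table 2 are simply all pairings of the 8 categories on one axis with the 8 categories on the other. They can be enumerated as below; treating the first element of each pair as the active category and the second as the reactive category is an assumption, since the table does not label its axes.

```python
from itertools import product

N_ACTIVE = 8    # number of active behavior labels
N_REACTIVE = 8  # number of reactive behavior labels

# Every (first category, second category) pair, numbered from 1 as in Table 2.
pairs = list(product(range(1, N_ACTIVE + 1), range(1, N_REACTIVE + 1)))
print(len(pairs))  # 64

# Print the pairs row by row in the same layout as the table.
for a in range(1, N_ACTIVE + 1):
    print(a, " ".join(f"({a},{r})" for r in range(1, N_REACTIVE + 1)))
```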
By observing the variation pattern of the active power curves in Fig. 3, the clustering results can be manually reclassified. In the clustering, the value at any sampling point is the average of the electricity consumption data of the large power users in that category at that sampling time, so users whose data fluctuate in a similar pattern may have been averaged together. Therefore, when the users' active power behavior is classified again, attention is focused mainly on the fluctuation of the 8 category curves. On this basis, the 8 categories of active behavior labels are grouped into five major labels: full-day peak type (categories 1 and 8), cliff drop/increase type (categories 3 and 4), night peak type (categories 2 and 7), double peak type (category 5), and triple peak type (category 6). Corresponding judgment criteria are configured for each label, so as to gradually build and refine the user portrait label system for large power users.
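The manual regrouping of the 8 active categories into five major labels amounts to a lookup table. A minimal sketch follows; the dictionary and function names are hypothetical, while the category-to-label assignments are taken from the paragraph above.

```python
# Five major active behavior labels and the cluster categories assigned to each.
ACTIVE_LABELS = {
    "full-day peak type": (1, 8),
    "cliff drop/increase type": (3, 4),
    "night peak type": (2, 7),
    "double peak type": (5,),
    "triple peak type": (6,),
}

# Invert the table so that a category number can be looked up directly.
CATEGORY_TO_LABEL = {
    category: label
    for label, categories in ACTIVE_LABELS.items()
    for category in categories
}

def active_label(category):
    """Return the major active behavior label for a cluster category (1 to 8)."""
    return CATEGORY_TO_LABEL[category]

print(active_label(5))  # double peak type
print(active_label(8))  # full-day peak type
```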
In one embodiment, as shown in fig. 5, there is provided a high-performance heterogeneous computing-based power large user portrait construction device, applied to a CPU-GPU heterogeneous computing platform, the device 500 comprising:
the electricity consumption data acquisition module 510 is used for acquiring electricity consumption data of a large electric power user;
the feature vector group module 520 is configured to perform feature processing based on the electricity consumption data to obtain a matrix feature vector group;
the local feature vector group module 530 is configured to perform clustering on the matrix feature vector group, and determine a local feature vector group according to a clustering result;
and a user portrait acquisition module 540 for performing a second clustering process by taking the local feature vector group and the electricity consumption data as inputs, and obtaining a user portrait of the large power user according to the processing result.
In one embodiment, the feature vector set module 520 includes a normalization unit, configured to process the electricity consumption data by using a double-random normalization method, and obtain normalized electricity consumption data according to a processing result; and carrying out matrix decomposition on the normalized power consumption data, and obtaining the matrix characteristic vector group according to a matrix decomposition result.
In one embodiment, the normalization unit is further configured to perform matrix decomposition on the normalized power consumption data through a singular value decomposition algorithm to obtain an initial matrix eigenvector set and an initial eigenvalue matrix; and determining the matrix eigenvector group from the initial matrix eigenvector group and the initial eigenvalue matrix based on the magnitude of eigenvalues in the initial eigenvalue matrix.
In one embodiment, the local feature vector group module 530 is further configured to perform clustering on the matrix feature vector group to obtain a distance matrix; the clustering method comprises at least one of a row clustering method and a column clustering method, wherein the row clustering method corresponds to a row distance matrix; the column clustering processing mode corresponds to a column distance matrix; and determining a preset number of eigenvectors from the matrix eigenvector group according to Euclidean norms in the distance matrix, and taking the eigenvectors as the local eigenvector group.
In one embodiment, the local feature vector group module 530 is further configured to obtain euclidean norms in the distance matrix; determining a preset number of Euclidean norms and corresponding row sequence numbers or column sequence numbers from the Euclidean norms according to the numerical value; and determining the local feature vector group from the matrix feature vectors according to the row sequence number or the column sequence number.
In one embodiment, the user portrait acquisition module 540 is further configured to process the electricity consumption data through the double-random normalization method, and obtain normalized electricity consumption data according to the processing result; multiply the local feature vector group by the normalized electricity consumption data to obtain a target feature vector group; process the target feature vector group through a clustering algorithm to obtain the electricity consumption characteristics of the large power users corresponding to the electricity consumption data; and determine the user portraits of those large power users according to the electricity consumption characteristics.
In one embodiment, the user portrait acquisition module 540 is further configured to determine a user portrait of the power consumer according to the power consumption characteristics and a predetermined user tag division criterion; or, based on the electricity utilization characteristics, carrying out cluster analysis on a plurality of large power users, and determining the user portrait of each large power user according to the user labels of the large power users in the cluster group.
For the specific limitations of the high-performance heterogeneous computing-based power large user portrait construction device, reference may be made to the limitations of the corresponding construction method above, which are not repeated here. Each module in the device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
The application can be applied to computer equipment capable of executing programs, such as smart phones, tablet computers, notebook computers, desktop computers, rack-mounted servers, blade servers, tower servers or cabinet servers (including independent servers or server clusters formed by a plurality of servers). The computer device of the present embodiment includes at least, but is not limited to, a memory and a processor that may be communicatively coupled to each other via a system bus, as shown in fig. 6. It is noted that fig. 6 only shows a computer device having memory and processor components, but it should be understood that not all of the illustrated components are required, and that more or fewer components may be implemented instead. The memory (i.e., readable storage medium) includes flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory may be an internal storage unit of the computer device, such as a hard disk or memory of the computer device. In other embodiments, the memory may also be an external storage device of the computer device, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or flash card provided on the computer device. Of course, the memory may also include both internal storage units of the computer device and external storage devices. In this embodiment, the memory is typically used to store the operating system and the various application software and data installed on the computer device, such as electricity consumption data and user portrait data. In addition, the memory can be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor is typically used to control the overall operation of the computer device. In this embodiment, the processor is configured to run the program code or process the data stored in the memory, so as to implement the high-performance heterogeneous computing-based power large user portrait construction method.
It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. The high-performance heterogeneous computation-based power large user portrait construction method is characterized by comprising the following steps of:
acquiring electricity consumption data of a large-power user;
performing feature processing based on the electricity consumption data to obtain a matrix feature vector group;
performing first clustering treatment on the matrix characteristic vector group, and determining a local characteristic vector group according to a clustering treatment result;
and taking the local feature vector group and the electricity consumption data as inputs, performing second clustering processing, and obtaining a user portrait of the large-power user according to a processing result.
2. The method for constructing a large electric power user portrait based on high-performance heterogeneous computing according to claim 1, wherein the performing feature processing based on the electricity consumption data to obtain a matrix feature vector group includes:
processing the electricity utilization data by a double-random normalization method, and obtaining normalized electricity utilization data according to a processing result;
and carrying out matrix decomposition on the normalized power consumption data, and obtaining the matrix characteristic vector group according to a matrix decomposition result.
3. The method for constructing a large electric power user portrait based on high performance heterogeneous computing according to claim 2, wherein the performing matrix decomposition on the normalized electric power consumption data, and obtaining the matrix eigenvector group according to a matrix decomposition result, includes:
performing matrix decomposition on the normalized power consumption data through a singular value decomposition algorithm to obtain an initial matrix eigenvector group and an initial eigenvalue matrix;
and determining the matrix eigenvector group from the initial matrix eigenvector group and the initial eigenvalue matrix based on the magnitude of eigenvalues in the initial eigenvalue matrix.
4. The method for constructing a large-power user portrait based on high-performance heterogeneous computing according to claim 2, wherein said performing a first clustering process on the matrix eigenvector set, determining a local eigenvector set according to a clustering result, includes:
clustering the matrix characteristic vector group to obtain a distance matrix; the clustering method comprises at least one of a row clustering method and a column clustering method, wherein the row clustering method corresponds to a row distance matrix; the column clustering processing mode corresponds to a column distance matrix;
and determining a preset number of eigenvectors from the matrix eigenvector group according to Euclidean norms in the distance matrix, and taking the eigenvectors as the local eigenvector group.
5. The method for constructing a large electric power user portrait based on high-performance heterogeneous computing according to claim 4, wherein the determining a preset number of eigenvectors from the matrix eigenvector group according to Euclidean norms of each row in the distance matrix as the local eigenvector group comprises:
acquiring Euclidean norms in the distance matrix;
determining a preset number of Euclidean norms and corresponding row sequence numbers or column sequence numbers from the Euclidean norms according to the numerical value;
and determining the local feature vector group from the matrix feature vectors according to the row sequence number or the column sequence number.
6. The method for constructing a large electric power user portrait based on high-performance heterogeneous computing according to claim 1, wherein the performing a second clustering process using the local feature vector group and the power consumption data as inputs, and obtaining the user portrait of the large power user based on the processing result, comprises:
processing the electricity utilization data by a double-random normalization method, and obtaining normalized electricity utilization data according to a processing result;
multiplying the local feature vector group by the normalized power consumption data to obtain a target feature vector group;
processing the target feature vector group through a clustering algorithm to obtain the power utilization features of the large power users corresponding to the power data;
and determining the user portrait of the corresponding large power user of the power data according to the power utilization characteristics.
7. The high-performance heterogeneous computing-based power large user portrait construction method of claim 6, wherein the determining the user portrait of the corresponding large power user of the power data based on the power utilization characteristics comprises:
determining the user portrait of the large power user according to the power utilization characteristics and a predetermined user label division standard; or,
and carrying out cluster analysis on a plurality of large power users based on the electricity utilization characteristics, and determining user portraits of the large power users according to user labels of the large power users in the cluster group.
8. High-performance heterogeneous computing-based large-power user portrait construction device applied to a CPU-GPU heterogeneous computing platform, wherein the device comprises:
the power consumption data acquisition module is used for acquiring power consumption data of a large power user;
the characteristic vector group module is used for carrying out characteristic processing based on the electricity consumption data to obtain a matrix characteristic vector group;
the local feature vector group module is used for carrying out clustering processing on the matrix feature vector group and determining the local feature vector group according to a clustering processing result;
and the user portrait acquisition module is used for taking the local feature vector group and the electricity consumption data as inputs, performing second clustering processing, and obtaining the user portrait of the large-power user according to the processing result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of the high performance heterogeneous computing based power large user representation construction method according to any of claims 1-7 when the computer program is executed.
10. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the high-performance heterogeneous computing-based power large user representation construction method according to any of claims 1 to 7.
CN202311164860.4A 2023-09-08 2023-09-08 High-performance heterogeneous computation-based power large user portrait construction method and device Pending CN117195026A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311164860.4A CN117195026A (en) 2023-09-08 2023-09-08 High-performance heterogeneous computation-based power large user portrait construction method and device

Publications (1)

Publication Number Publication Date
CN117195026A true CN117195026A (en) 2023-12-08

Family

ID=88983036

Country Status (1)

Country Link
CN (1) CN117195026A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination