CN113205368B - Industrial and commercial customer clustering method based on time sequence water consumption data

Industrial and commercial customer clustering method based on time sequence water consumption data

Info

Publication number
CN113205368B
Authority
CN
China
Prior art keywords
industrial
commercial
water consumption
value
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110569868.3A
Other languages
Chinese (zh)
Other versions
CN113205368A (en)
Inventor
朱波
穆利
吴铭
王亚琦
陶鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Water Group Co ltd
Original Assignee
Hefei Water Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Water Group Co ltd
Priority to CN202110569868.3A
Publication of CN113205368A
Application granted
Publication of CN113205368B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention discloses a method for clustering industrial and commercial customers based on time-series water consumption data, comprising the following steps: 1. constructing daily water consumption data of the industrial and commercial customers and preprocessing the data; 2. learning a representation of the time-series water consumption data with an LSTM model; 3. clustering the customers by water use trend; 4. within each trend cluster, clustering the customers again by water use range; 5. visualizing the clustering results. Through the LSTM model, the invention learns the water use patterns and trend information hidden in the customers' time-series water consumption data and uses them as the customers' water use feature representation; combined with the kmeans algorithm, it clusters the customers accurately and quickly on two factors, water use trend and water use range.

Description

Industrial and commercial customer clustering method based on time sequence water consumption data
Technical Field
The invention relates to the technical field of user clustering, and in particular to a method for clustering industrial and commercial customers based on time-series data.
Background
In existing research on user clustering, the kmeans algorithm performs well on static data. However, it measures the similarity between industrial and commercial customers by Euclidean distance, which ignores the order of the time points: it captures only similarity in the amount of water consumed and cannot describe how water use changes over time.
The water consumption data of industrial and commercial customers are time-series data: consumption is recorded at fixed intervals (for example daily, weekly or monthly), and the records contain latent time-series features of many customers, such as water use cycles and water use patterns. The state of a sample at a given moment in a time series is related to its state at the preceding and following moments, so analyzing the change rules in time-series data in light of those neighboring states is a difficult problem.
Disclosure of Invention
The invention aims to overcome the above shortcomings of the prior art by providing a method for clustering industrial and commercial customers based on time-series data. An LSTM model learns and represents the water use patterns hidden in the customers' time-series water consumption data, and the kmeans algorithm then clusters the customers on two kinds of features, water use trend and water use range. This improves clustering accuracy and helps uncover the change rules and trends hidden in the time-series data, so that the water use patterns of industrial and commercial customers are described accurately and completely.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention provides a method for clustering industrial and commercial customers based on time-series water consumption data, characterized by comprising the following steps:
Step 1, constructing daily water consumption data of the industrial and commercial customers;
Step 1.1, obtaining remote water meter data of the industrial and commercial customers, and extracting from the data the customer id, water meter update time, cumulative water flow, remote water meter address and customer name;
Step 1.2, converting the remote water meter address of each customer into longitude and latitude, thereby obtaining the longitude and latitude information of the customer;
Step 1.3, splitting the remote water meter data by customer id to obtain m water meter data files, each named after a customer id, and sorting all records in each file by water meter update time; wherein m denotes the total number of industrial and commercial customers;
Step 1.4, for each customer and each day, taking the difference between the cumulative flow at the last water meter update time and the cumulative flow at the first water meter update time of that day, thereby constructing the daily water consumption vectors of the m customers over t days, $x_i = (x_i^1, x_i^2, \ldots, x_i^t)$, wherein $x_i^t$ denotes the daily water consumption of the i-th customer on day t and t denotes the number of water-use days; the sample feature set formed by the t-day water consumption vectors of the m customers is denoted $X = \{x_i \mid i = 1, 2, \ldots, m\}$;
Step 1.5, detecting and handling abnormal values in the sample feature set X to obtain the sample feature set X' after abnormal-value processing;
Step 1.6, handling missing values in the sample feature set X' to obtain the sample feature set X'' after missing-value processing;
Step 2, representing the features of the time-series water consumption data based on an LSTM model;
Step 2.1, normalizing the sample feature set X'' obtained after missing-value processing to obtain the normalized sample feature set, denoted $\tilde{X} = \{\tilde{x}_i \mid i = 1, 2, \ldots, m\}$, wherein $\tilde{x}_i = (\tilde{x}_i^1, \tilde{x}_i^2, \ldots, \tilde{x}_i^t)$ denotes the normalized daily water consumption vector of the i-th customer over t days, and $\tilde{x}_i^t$ denotes the normalized daily water consumption of the i-th customer on day t;
Step 2.2, pre-training the LSTM model;
dividing the normalized sample feature set $\tilde{X}$ into a training set and a validation set, and determining the epoch value, batch-size value and prediction step size for training the LSTM model;
inputting the training set into the LSTM model to obtain a prediction sequence for the validation set, then calculating the error between the prediction sequence output by the LSTM model and the validation set using the root-mean-square error, thereby completing one round of training; training stops when the number of training rounds reaches the epoch value, and the trained LSTM model is used as the feature extraction model for the customers' time-series water use;
Step 2.3, inputting the daily water consumption data of all customers into the time-series water use feature extraction model, and outputting the water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$; wherein $y_i = (y_i^1, y_i^2, \ldots, y_i^n)$ denotes the water use feature vector of the i-th customer, $y_i^n$ denotes its n-th dimension feature value, and n denotes the dimension of the water use feature vector;
Step 3, applying the kmeans clustering algorithm to the water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$ to cluster the industrial and commercial customers by water use trend;
Step 3.1, determining the optimal number of clusters by combining the elbow method and the silhouette coefficient method, and denoting it K;
Step 3.2, based on the optimal number of clusters K, taking the water use feature vector $y_i$ of the i-th customer as a sample to be clustered and inputting it into the kmeans algorithm, so that the customers in Y are grouped into K clusters; the coordinates of the K cluster centers are randomly initialized using formula (1):
$$c_k = (c_k^1, c_k^2, \ldots, c_k^n), \quad k = 1, 2, \ldots, K \qquad (1)$$
in formula (1), $c_k$ denotes the center of the k-th cluster and $c_k^n$ denotes the coordinate value of the n-th dimension of the k-th cluster center;
Step 3.3, calculating the Euclidean distance $d(y_i, c_k)$ from the sample $y_i$ to the k-th cluster center $c_k$ using formula (2), thereby obtaining the Euclidean distance from the sample $y_i$ to the center of each cluster:
$$d(y_i, c_k) = \sqrt{\sum_{l=1}^{n} \left(y_i^l - c_k^l\right)^2} \qquad (2)$$
Step 3.4, according to the Euclidean distance from the sample $y_i$ to the center of each cluster, assigning the sample $y_i$ to the cluster with the shortest Euclidean distance;
Step 3.5, after all samples have been assigned to their clusters, K classes are obtained, and the set of customers in the k-th class is $Y_k = \{y_j^k \mid j = 1, 2, \ldots, S_k\}$, wherein $y_j^k = (y_j^{k,1}, y_j^{k,2}, \ldots, y_j^{k,n})$ denotes the feature vector of the j-th customer in the k-th class, $S_k$ denotes the number of customers in the k-th class, and $y_j^{k,n}$ denotes the n-th dimension feature value of the j-th customer in the k-th class;
calculating the mean of the feature vectors of the customers in the k-th class using formula (6), thereby obtaining the updated cluster center $c'_k = (c_k'^{1}, c_k'^{2}, \ldots, c_k'^{n})$, and assigning it to $c_k$, $k = 1, 2, \ldots, K$;
$$c_k'^{n} = \frac{1}{S_k} \sum_{j=1}^{S_k} y_j^{k,n} \qquad (6)$$
in formula (6), $c_k'^{n}$ denotes the coordinate value of the n-th dimension of the updated k-th cluster center;
Step 3.6, repeating steps 3.3 to 3.5 until the cluster centers no longer change, then outputting the final cluster centers and the customer ids in each class, so that customers with similar water use trends are grouped into one class;
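Steps 3.2 to 3.6 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the fixed random seed, and the choice to draw the initial centers from the samples themselves are all hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (Step 3.3)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def kmeans(samples, k, seed=0, max_iter=100):
    """Cluster `samples` (list of lists) into k groups.

    Returns (centers, labels). Initial centers are k distinct samples
    chosen at random, which is one common way to initialize (assumption).
    """
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(samples, k)]
    labels = None
    for _ in range(max_iter):
        # Step 3.4: assign each sample to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: euclidean(y, centers[c]))
            for y in samples
        ]
        # Step 3.6: unchanged assignments imply unchanged centers.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3.5: move each center to the mean of its members.
        for c in range(k):
            members = [y for y, lab in zip(samples, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


features = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]]
centers, labels = kmeans(features, 2)
print(labels)
```

On this toy input the two nearby pairs end up in separate clusters; which cluster receives label 0 depends on the random initialization.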
Step 4, clustering the industrial and commercial customers based on water use range;
Step 4.1, based on the result of the water use trend clustering, obtaining the set of customers in each class, $B_k = \{b_j \mid j = 1, 2, \ldots, S'_k\}$, wherein $b_j$ denotes the j-th customer in the k-th class, $k \in \{1, 2, \ldots, K\}$, and $S'_k$ denotes the number of customers in the k-th class when the clustering algorithm converges;
Step 4.2, obtaining the normalized actual daily water consumption values of the j-th customer $b_j$ in the k-th class, denoted $\tilde{x}_{b_j} = (\tilde{x}_{b_j}^1, \tilde{x}_{b_j}^2, \ldots, \tilde{x}_{b_j}^t)$, wherein $\tilde{x}_{b_j}^t$ denotes the normalized actual daily water consumption of customer $b_j$ on day t, $j = 1, 2, \ldots, S'_k$; repeating steps 3.1 to 3.6 to re-cluster the actual daily water consumption values of the customers within each class, so that customers with both similar water use trends and similar water consumption are grouped into one class;
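The two-stage idea of Steps 3 and 4 (cluster by trend features first, then re-cluster each trend class by actual daily consumption) can be sketched as below. The sketch makes several assumptions that are not in the source: the helper names are hypothetical, the number of sub-clusters `k_range` is fixed rather than re-determined per class by the elbow and silhouette methods, and the compact k-means uses a fixed seed.

```python
import math
import random


def kmeans_labels(samples, k, seed=0, max_iter=100):
    """Compact k-means that returns one cluster label per sample."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(samples, k)]
    labels = None
    for _ in range(max_iter):
        new = [
            min(range(k), key=lambda c: math.dist(y, centers[c]))
            for y in samples
        ]
        if new == labels:
            break
        labels = new
        for c in range(k):
            members = [y for y, lab in zip(samples, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_stage_cluster(features, usage, k_trend, k_range):
    """Return a (trend_label, range_label) pair for every customer.

    features: LSTM-style feature vectors used for the trend stage.
    usage:    normalized daily consumption vectors used for the range stage.
    """
    trend = kmeans_labels(features, k_trend)
    result = [None] * len(features)
    for t in range(k_trend):
        idx = [i for i, lab in enumerate(trend) if lab == t]
        if not idx:
            continue
        # A class cannot be split into more groups than it has members.
        k = min(k_range, len(idx))
        sub = kmeans_labels([usage[i] for i in idx], k)
        for i, s in zip(idx, sub):
            result[i] = (t, s)
    return result


features = [[0.0], [0.1], [5.0], [5.1]]
usage = [[1.0], [9.0], [1.0], [9.0]]
pairs = two_stage_cluster(features, usage, k_trend=2, k_range=2)
print(pairs)
```

Customers 0 and 1 share a trend class but are separated in the second stage because their consumption levels differ; the same holds for customers 2 and 3.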
Step 5, visualizing the clustering results;
Step 5.1, taking the mean of the daily water consumption vectors of all customers in each class of the Step 4 clustering result as the class center, calculating the mean daily water consumption of the customers in each class, and, for each class separately, plotting the customers' water consumption, the class center and the mean daily water consumption in a two-dimensional coordinate system;
Step 5.2, taking the class centers of all clusters calculated in Step 5.1 and plotting the K class centers together in one two-dimensional coordinate system;
Step 5.3, obtaining the customer ids in each class of the Step 4.2 clustering result, and drawing a map from the customers' longitude and latitude information so as to visualize the geographical positions of the customers in each class.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention differences the cumulative flow data and fills missing and abnormal values from adjacent data to construct the daily water consumption data of each industrial and commercial customer. This completes the preprocessing of the time-series water consumption data, improves the quality of the data used for mining, and helps improve the efficiency of the actual data mining.
2. The invention constructs and trains an LSTM model that learns the water use patterns hidden in the customers' time-series water consumption data and represents each time series as a static feature vector of a specified dimension. This yields a feature representation of each customer's water use trend, describes the change rules of the customers' water use accurately, and improves the accuracy of the subsequent clustering algorithm.
3. The invention uses the kmeans algorithm to compute similarity between the water use feature vectors produced by the LSTM model, thereby clustering the customers by water use trend. Customers with similar water use trends fall into one class, which helps in mining and analyzing the different water use change rules shown by the different clusters.
4. For customers with similar water use trends, the invention applies the kmeans algorithm again to the actual daily water consumption data to cluster them by water use range. Customers with both similar trends and similar ranges fall into one class, which helps in comparing the water use ranges of different customers within a trend cluster and so describes the customers' water use patterns accurately and completely.
Drawings
FIG. 1 is a process flow diagram for industrial and commercial customer clustering in accordance with the present invention;
FIG. 2 is a structure diagram of a single cell state of the long short-term memory (LSTM) model of the present invention;
FIG. 3 is a flow chart of the kmeans clustering algorithm of the present invention.
Detailed Description
In this embodiment, a method for clustering industrial and commercial customers based on time-series water consumption data is carried out, as shown in FIG. 1, according to the following steps:
Step 1, constructing daily water consumption data of the industrial and commercial customers;
Step 1.1, obtaining remote water meter data of the industrial and commercial customers, and extracting from the data the customer id, water meter update time, cumulative water flow, remote water meter address and customer name;
In specific implementation, the remote water meter data record the cumulative water flow of all industrial and commercial customers over the 366 days from 2020-01-01 to 2020-12-31, with the cumulative flow updated every hour on the hour. The raw data are stored unordered in one large csv file, so the required columns (customer id, water meter update time, cumulative water flow, remote water meter address and customer name) are extracted first to reduce the memory footprint of the file;
Step 1.2, converting the remote water meter address of each customer into longitude and latitude, thereby obtaining the longitude and latitude information of the customer;
In this embodiment, the address name of a user is converted into longitude and latitude information by calling the high-resolution map API, as follows:
Step1, obtaining the request URL of the map API and inserting the address into the keyword parameter of the URL, thereby obtaining the URL for that address.
Step2, sending a request to the URL; in Python, `requests.get(url).text` returns the page content for the URL as string data.
Step3, since the page data obtained in Step2 are returned in JSON format, parsing them with the `json.loads()` method in Python to convert them into dictionary data.
Step4, extracting the longitude and latitude information from the dictionary obtained in Step3 by its keys and values.
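The four steps above can be sketched as follows. The endpoint URL, the query parameter names (`address`, `key`) and the JSON layout (`geocodes`, `location`) are hypothetical placeholders chosen for illustration; the actual map service's documentation should be consulted for the real ones. The standard-library `urllib` is used here in place of `requests` so the sketch has no third-party dependency.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: replace with the real geocoding URL of the map service.
GEOCODE_ENDPOINT = "https://example.invalid/geocode"


def build_url(address, api_key):
    """Step1: embed the address as a query keyword in the request URL."""
    return GEOCODE_ENDPOINT + "?" + urlencode({"address": address, "key": api_key})


def fetch_text(url):
    """Step2: request the URL and return the response body as a string."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_lon_lat(page_text):
    """Step3 and Step4: parse the JSON string and extract (longitude, latitude).

    Assumes a payload shaped like {"geocodes": [{"location": "lon,lat"}]};
    this layout is an illustration only.
    """
    data = json.loads(page_text)
    geocodes = data.get("geocodes") or []
    if not geocodes:
        return None  # the address could not be resolved
    lon, lat = geocodes[0]["location"].split(",")
    return float(lon), float(lat)


# Offline demonstration with a made-up response body (no network access).
sample = '{"geocodes": [{"location": "117.23,31.82"}]}'
print(parse_lon_lat(sample))
```

Keeping the parsing separate from the network call makes the extraction logic testable without contacting the map service.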
Step 1.3, splitting the remote water meter data by customer id to obtain m water meter data files, each named after a customer id, and sorting all records in each file by water meter update time; wherein m denotes the total number of industrial and commercial customers;
In specific implementation, the data remain very large even after the key columns are extracted, which can greatly reduce processing efficiency. The large csv file is therefore split by the customer id column, and each resulting file is named after the id of its customer, giving a separate remote water meter record for each industrial and commercial customer.
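A minimal sketch of the split in Step 1.3 is shown below. The column names `customer_id`, `update_time` and `cumulative_flow` are assumptions made for the example, and the groups are kept in memory rather than written to disk.

```python
import csv
import io
from collections import defaultdict


def split_by_customer(csv_text):
    """Group meter rows by customer id and sort each group by update time."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["customer_id"]].append(row)
    for rows in groups.values():
        # Timestamps in ISO format sort correctly as plain strings.
        rows.sort(key=lambda r: r["update_time"])
    return dict(groups)


raw = """customer_id,update_time,cumulative_flow
B,2020-01-01 01:00,50.0
A,2020-01-01 02:00,101.5
A,2020-01-01 01:00,100.0
"""
groups = split_by_customer(raw)
print(sorted(groups))
print([r["cumulative_flow"] for r in groups["A"]])
```

Each entry of `groups` could then be written to a file named after its key, which gives the one-file-per-customer layout the step describes.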
Step 1.4, for each customer and each day, taking the difference between the cumulative flow at the last water meter update time and the cumulative flow at the first water meter update time of that day, thereby constructing the daily water consumption vectors of the m customers over t days, $x_i = (x_i^1, x_i^2, \ldots, x_i^t)$, wherein $x_i^t$ denotes the daily water consumption of the i-th customer on day t and t denotes the number of water-use days; the sample feature set formed by the t-day water consumption vectors of the m customers is denoted $X = \{x_i \mid i = 1, 2, \ldots, m\}$;
In this embodiment, a total of 2158 industrial and commercial customers have complete water records for the 366 days from 2020-01-01 to 2020-12-31. Suppose the cumulative flow of customer A at 00:00 on 1 January 2020 is x and its cumulative flow at 24:00 on 1 January 2020 is y; then the daily water consumption of customer A on 1 January 2020 is (y - x). The daily water consumption of all customers over the 366 days is calculated in the same way.
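The differencing can be illustrated with a short sketch. It follows the wording of Step 1.4 and uses the first and last readings stamped within each calendar day; the function name and the `(timestamp, cumulative_flow)` tuple layout are assumptions made for the example.

```python
def daily_usage(readings):
    """Compute per-day consumption from cumulative meter readings.

    readings: iterable of (timestamp, cumulative_flow) pairs, where the
    timestamp is a string starting with 'YYYY-MM-DD'; order does not matter.
    Returns {date: last_reading_of_day - first_reading_of_day}.
    """
    by_day = {}
    for ts, flow in sorted(readings):
        day = ts[:10]
        # Keep the first reading seen for the day; overwrite the last one.
        first, _ = by_day.get(day, (flow, flow))
        by_day[day] = (first, flow)
    return {day: last - first for day, (first, last) in by_day.items()}


readings = [
    ("2020-01-01 23:00", 110.0),
    ("2020-01-01 00:00", 100.0),
    ("2020-01-01 12:00", 104.5),
    ("2020-01-02 00:00", 110.0),
    ("2020-01-02 23:00", 118.0),
]
usage = daily_usage(readings)
print(usage)  # {'2020-01-01': 10.0, '2020-01-02': 8.0}
```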
Step 1.5, detecting and handling abnormal values in the sample feature set X to obtain the sample feature set X' after abnormal-value processing;
Because the remote water meter record may lack the cumulative flow at the start or at the end of a given day, abnormal values arise when daily water consumption is calculated. Such abnormal values are set to a special value (null);
In specific implementation, when the remote water meter records abnormally, the cumulative flow of a customer at 00:00 or at 24:00 is 0, so a correct daily water consumption cannot be calculated. A conditional check is therefore applied when computing daily water consumption: if (cumulative flow at 00:00 == 0 or cumulative flow at 24:00 == 0), the daily water consumption is set to null; else, daily water consumption = cumulative flow at 24:00 - cumulative flow at 00:00;
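The conditional check described above might look like this in Python. The function name is hypothetical, and `None` stands in for the null value.

```python
def daily_usage_or_null(flow_at_start, flow_at_end):
    """Return the day's consumption, or None when either reading is abnormal.

    A cumulative reading of 0 is treated as a recording fault, as in the
    conditional check described in the text.
    """
    if flow_at_start == 0 or flow_at_end == 0:
        return None
    return flow_at_end - flow_at_start


print(daily_usage_or_null(100.0, 112.5))  # 12.5
print(daily_usage_or_null(0, 112.5))      # None
```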
Step 1.6, handling missing values in the sample feature set X' to obtain the sample feature set X'' after missing-value processing;
The remote water meter record may lack a customer's water records for an entire day, so the daily water consumption of that day cannot be calculated correctly. The missing daily water consumption is assigned a special value (null);
In specific implementation, when the remote water meter records abnormally, all data of a customer for a certain day are lost, so the daily water consumption of that day cannot be calculated. A conditional check is therefore applied: if (xx year-xx month-xx day == null), the daily water consumption of the missing day is set to null;
In this embodiment, the null values can be filled with the `fillna()` method in Python. Industrial and commercial customers are generally large water users whose consumption is relatively stable, so the daily water consumption values of adjacent days are similar. The call can therefore be `fillna(method="ffill", axis=1)`, which fills a null with the previous (column) value in the same row, or `fillna(method="bfill", axis=1)`, which fills a null with the next (column) value in the same row;
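To make the filling behavior concrete, here is a dependency-free sketch of forward fill followed by backward fill on one customer's row of daily values. It mimics what a forward fill and a backward fill along a row do, but it is a hand-written illustration, not the pandas implementation; combining the two passes is also an assumption, since the text presents them as alternatives.

```python
def fill_row(values):
    """Fill None entries from the previous day, then from the next day.

    The forward pass handles gaps in the middle and at the end of the row;
    the backward pass handles a gap at the very start of the row.
    """
    filled = list(values)
    for i in range(1, len(filled)):
        if filled[i] is None:
            filled[i] = filled[i - 1]
    for i in range(len(filled) - 2, -1, -1):
        if filled[i] is None:
            filled[i] = filled[i + 1]
    return filled


row = [None, 5.0, None, None, 7.0, None]
print(fill_row(row))  # [5.0, 5.0, 5.0, 5.0, 7.0, 7.0]
```

A row with no valid value at all stays entirely null, so such customers would need separate handling.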
Step 2, representing the features of the time-series water consumption data based on an LSTM model;
Step 2.1, normalizing the sample feature set X'' obtained after missing-value processing to obtain the normalized sample feature set, denoted $\tilde{X} = \{\tilde{x}_i \mid i = 1, 2, \ldots, m\}$, wherein $\tilde{x}_i = (\tilde{x}_i^1, \tilde{x}_i^2, \ldots, \tilde{x}_i^t)$ denotes the normalized daily water consumption vector of the i-th customer over t days, and $\tilde{x}_i^t$ denotes the normalized daily water consumption of the i-th customer on day t;
The normalization formula is:
$$\tilde{x} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (1)$$
in formula (1), $\tilde{x}$ is the normalized value of the variable, $x$ is the original value of the variable, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values in the raw data, respectively;
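A small sketch of min-max normalization follows. The text does not say whether the maximum and minimum are taken per customer or over the whole data set; the sketch assumes per series, and returning zeros for a constant series is likewise an assumption made to avoid division by zero.

```python
def min_max_normalize(series):
    """Scale a list of numbers to the range [0, 1] by min-max normalization."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # Constant series: every value maps to 0.0 (assumed convention).
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```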
Step 2.2, pre-training the LSTM model;
dividing the normalized sample feature set $\tilde{X}$ into a training set and a validation set, and determining the epoch value, batch-size value and prediction step size for training the LSTM model;
inputting the training set into the LSTM model to obtain a prediction sequence for the validation set, then calculating the error between the prediction sequence output by the LSTM model and the validation set using the root-mean-square error, thereby completing one round of training; training stops when the number of training rounds reaches the predetermined epoch value, and the trained LSTM model is used as the feature extraction model for the customers' time-series water use;
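The root-mean-square error used to compare a prediction sequence with the validation values can be written as below. The function name and the list-based interface are assumptions for illustration.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(error)
```

Here only the last element differs, by 2, so the result is the square root of 4/3.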
In specific implementation, the structure of a single cell state of the LSTM model is shown in FIG. 2, and pre-training the LSTM model comprises the following steps:
Step1: training the forget gate, expressed as:
$$f_t = \sigma(W_{ft} x_t + W_{fh} h_{t-1} + b_f)$$
where $x_t$ is the input sample, $f_t$ is the forget gate output, $\sigma(\cdot)$ denotes the activation function, for which the sigmoid function is used, $W_{ft}$ and $W_{fh}$ denote the weight coefficients between the forget gate and $x_t$ and $h_{t-1}$ respectively, $h_{t-1}$ denotes the hidden state at time t-1, and $b_f$ is the forget gate bias coefficient;
Step2: training the input gate, expressed as:
$$g_t = \sigma(W_{gt} x_t + W_{gh} h_{t-1} + b_g)$$
where $g_t$ denotes the input gate output, $W_{gt}$ and $W_{gh}$ denote the weight coefficients between the input gate and $x_t$ and $h_{t-1}$ respectively, and $b_g$ is the input gate bias coefficient;
Step3: updating the memory cell, expressed as:
$$s_t = f_t s_{t-1} + g_t \tanh(W_{st} x_t + W_{sh} h_{t-1} + b_s)$$
where $s_t$ is the cell state, $W_{st}$ and $W_{sh}$ denote the weight coefficients between the cell and $x_t$ and $h_{t-1}$ respectively, and $b_s$ is the bias coefficient of the cell;
Step4: updating the current state of the output gate, wherein the activation function is the tanh function;
Step5: repeating Step1 to Step4 until the model converges.
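The gate equations above can be traced with a scalar (one input, one hidden unit) forward step. This is only a sketch of the forward computation, not of training. The output gate is not written out in the text, so the form used here, $o_t = \sigma(W_{ot} x_t + W_{oh} h_{t-1} + b_o)$ with $h_t = o_t \tanh(s_t)$, is the standard LSTM formulation and is an assumption; the parameter names `W_ot`, `W_oh` and `b_o` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, s_prev, p):
    """One scalar LSTM forward step; p maps parameter names to floats."""
    # Forget gate: how much of the previous cell state to keep.
    f_t = sigmoid(p["W_ft"] * x_t + p["W_fh"] * h_prev + p["b_f"])
    # Input gate: how much of the candidate value to write.
    g_t = sigmoid(p["W_gt"] * x_t + p["W_gh"] * h_prev + p["b_g"])
    # Cell state update.
    candidate = math.tanh(p["W_st"] * x_t + p["W_sh"] * h_prev + p["b_s"])
    s_t = f_t * s_prev + g_t * candidate
    # Output gate and hidden state (standard form, assumed).
    o_t = sigmoid(p["W_ot"] * x_t + p["W_oh"] * h_prev + p["b_o"])
    h_t = o_t * math.tanh(s_t)
    return h_t, s_t


names = ["W_ft", "W_fh", "b_f", "W_gt", "W_gh", "b_g",
         "W_st", "W_sh", "b_s", "W_ot", "W_oh", "b_o"]
zero_params = {name: 0.0 for name in names}
h, s = lstm_step(x_t=0.7, h_prev=0.0, s_prev=2.0, p=zero_params)
print(h, s)
```

With all parameters at zero every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved, which makes the data flow easy to check by hand.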
Step 2.3, inputting the daily water consumption data of all customers into the time-series water use feature extraction model, and outputting the water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$; wherein $y_i = (y_i^1, y_i^2, \ldots, y_i^n)$ denotes the water use feature vector of the i-th customer, $y_i^n$ denotes its n-th dimension feature value, and n denotes the dimension of the water use feature vector; in one embodiment, the trained model is saved in a test.h5 file, which stores the parameters and other information of the final LSTM model; the dimension of the model output vector is set to 64, that is, the 366-dimensional original daily water consumption data are converted into a 64-dimensional feature vector that serves as the feature of each industrial and commercial customer.
Step3, adopting the kmeans clustering algorithm to cluster the industrial and commercial customers by water use trend, based on the water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$, as shown in fig. 3;
step 3.1, determining the optimal number of clusters by combining the elbow method and the silhouette coefficient method (also translated as the contour coefficient method), and recording it as K;
the formula of the elbow method is as follows:
$$SSE = \sum_{i=1}^{K} \sum_{p \in C_i} \lVert p - m_i \rVert^2 \qquad (2)$$
in formula (2), K is the number of clusters, $C_i$ denotes the i-th cluster, p is a sample point in $C_i$, $m_i$ is the centroid of $C_i$ (the mean of all samples in $C_i$), and SSE is the sum of squared clustering errors over all samples;
as the number of clusters k increases, the samples are partitioned more finely and each cluster becomes more compact, so the sum of squared errors SSE gradually decreases; when k is smaller than the true number of clusters, increasing k greatly increases the compactness of each cluster, so SSE drops sharply; once k reaches the true number of clusters, the drop in SSE slows abruptly and the curve flattens as k continues to increase; the plot of SSE against k therefore has the shape of an elbow, and the k value at the elbow is taken as the true number of clusters of the data;
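The SSE used by the elbow method can be computed as in the following minimal sketch. The function name and the toy points are hypothetical; the example only shows that SSE drops sharply when k moves from 1 to the natural number of groups.

```python
def sse(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid m_i: the mean of all samples in the cluster.
        centroid = [sum(p[d] for p in members) / len(members)
                    for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2
                     for p in members for d in range(dim))
    return total

# Hypothetical toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
sse_k1 = sse(points, [0, 0, 0, 0])  # everything in one cluster
sse_k2 = sse(points, [0, 0, 1, 1])  # the two natural groups
```

Plotting such SSE values against k and looking for the point where the curve flattens gives the elbow described above.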
next, the formula of the silhouette coefficient method is as follows:
$$S(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} \qquad (3)$$
in formula (3), S(i) represents the silhouette coefficient of the i-th sample; a(i) is the intra-cluster dissimilarity, i.e., the mean distance from the i-th sample to the other samples in the cluster to which it belongs; b(i) is the inter-cluster dissimilarity, i.e., the minimum, over all clusters that do not contain the i-th sample, of the mean distance from the i-th sample to all samples in that cluster; the mean of S(i) over all samples is called the silhouette coefficient of the clustering result;
S(i) ∈ [-1, 1]; the closer the silhouette coefficient is to 1, the more reasonable the clustering of sample i, so a k value with a larger silhouette coefficient is selected;
in the embodiment, the curves of SSE and of the silhouette coefficient against k can be computed and plotted together, and an optimal K value is determined by combining the elbow of the SSE curve with a local optimum of the silhouette coefficient curve, so that the similarity within each class is as high as possible and the similarity between classes is as low as possible;
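The per-sample coefficient S(i) and its mean over all samples can be sketched as follows. The function name and the toy data are hypothetical, and assigning S(i) = 0 to a sample that is alone in its cluster is a common convention assumed here, not something stated in the text.

```python
import math

def mean_silhouette(points, labels):
    """Mean of S(i) = (b(i) - a(i)) / max(a(i), b(i)) over all samples."""
    if len(set(labels)) < 2:
        raise ValueError("at least two clusters are required")
    scores = []
    for i, p in enumerate(points):
        # Distances from sample i to every other sample, grouped by cluster.
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.get(labels[i])
        if not own:
            scores.append(0.0)  # ASSUMPTION: singleton cluster scores 0
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d)
                for lab, d in by_cluster.items() if lab != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: the same points under a good and a poor labelling.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = mean_silhouette(points, [0, 0, 1, 1])
poor = mean_silhouette(points, [0, 1, 0, 1])
```

The labelling that matches the natural groups yields the higher mean coefficient, which is the selection rule described above.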
step 3.2, based on the optimal number of clusters K, taking the water use feature vector $y_i$ of the i-th industrial and commercial customer as a sample to be clustered and inputting it into the kmeans algorithm, so that the customers represented in the water use feature vector set Y are grouped into K clusters; the coordinates of the K cluster centers are randomly initialized using formula (1):
$$c_k = (c_k^1, c_k^2, \ldots, c_k^n), \quad k = 1, 2, \ldots, K \qquad (1)$$
in formula (1), $c_k$ denotes the center of the k-th cluster, and $c_k^n$ denotes the n-th dimension coordinate value of the k-th cluster center;
step 3.3, calculating the Euclidean distance $d(y_i, c_k)$ from the sample $y_i$ to the k-th cluster center $c_k$ using formula (2), thereby obtaining the Euclidean distance from the sample $y_i$ to each cluster center:
$$d(y_i, c_k) = \sqrt{\sum_{l=1}^{n} \left(y_i^l - c_k^l\right)^2} \qquad (2)$$
step 3.4, according to the Euclidean distances from the sample $y_i$ to the cluster centers, assigning the sample $y_i$ to the cluster with the shortest Euclidean distance;
step 3.5, assigning all samples to their clusters to obtain K classes, and obtaining the set of industrial and commercial customers in each class as $Y_k = \{y_j^k \mid j = 1, 2, \ldots, S_k\}$, wherein $y_j^k = (y_j^{k,1}, y_j^{k,2}, \ldots, y_j^{k,n})$ represents the feature vector of the j-th customer in the k-th class, $S_k$ represents the number of customers in the k-th class, and $y_j^{k,n}$ represents the n-th dimension feature value of the j-th customer in the k-th class;
calculating the mean of the feature vectors of the customers in the k-th class using formula (6), thereby obtaining the updated cluster center $\bar{c}_k = (\bar{c}_k^1, \bar{c}_k^2, \ldots, \bar{c}_k^n)$, and assigning it to $c_k$, k = 1, 2, ..., K:
$$\bar{c}_k^n = \frac{1}{S_k} \sum_{j=1}^{S_k} y_j^{k,n} \qquad (6)$$
in formula (6), $\bar{c}_k^n$ denotes the n-th dimension coordinate value of the updated k-th class center;
step 3.6, repeating steps 3.3 to 3.5 until the cluster centers no longer change, and outputting the final cluster centers and the industrial and commercial customer ids in each class, thereby grouping the customers with similar water use trends into one class;
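Steps 3.2 to 3.6 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: the initial centers are drawn at random from the samples themselves (one way to initialize randomly), an empty cluster keeps its previous center, and the function name, the max_iter cap and the toy feature vectors are hypothetical.

```python
import math
import random

def kmeans(samples, k, seed=0, max_iter=100):
    """Plain k-means with Euclidean distance, following steps 3.2 to 3.6."""
    rng = random.Random(seed)
    # Step 3.2: random initialization (ASSUMPTION: pick k distinct samples).
    centers = [list(c) for c in rng.sample(samples, k)]
    labels = []
    for _ in range(max_iter):
        # Steps 3.3 and 3.4: assign each sample to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(y, centers[c]))
                  for y in samples]
        # Step 3.5: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [y for y, lab in zip(samples, labels) if lab == c]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        # Step 3.6: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical 2-dimensional feature vectors forming two obvious groups.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
           (5.0, 5.1), (5.2, 5.0), (5.1, 5.2)]
centers, labels = kmeans(samples, k=2, seed=1)
```

In the embodiment the samples would be the 64-dimensional feature vectors output by the LSTM model rather than these toy points.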
in specific implementation, step 3 takes the water use trend feature vectors of the industrial and commercial customers learned and output by the long short-term memory model in step 2, and clusters the customers with the kmeans algorithm using Euclidean distance as the similarity measure; at this point the LSTM network has learned what to store, discard and read in the long-term state of each customer's time-series water use sequence and has captured the long-term water use trend in the time-series water use data, so that applying the kmeans algorithm to the 64-dimensional feature vectors output by the model achieves clustering of the industrial and commercial customers by water use trend;
Step4, clustering industrial and commercial customers based on the range of water consumption;
step 4.1, based on the result of the water use trend clustering of the industrial and commercial customers, obtaining the set of customers in each class as $B_k = \{b_j \mid j = 1, 2, \ldots, S'_k\}$, wherein $b_j$ represents the j-th customer in the k-th class, $k \in \{1, 2, \ldots, K\}$, and $S'_k$ represents the number of customers in the k-th class when the clustering algorithm converges;
step 4.2, obtaining the normalized real daily water consumption values of the j-th customer $b_j$ in the k-th class, recorded as $\tilde{x}_j^k = (\tilde{x}_j^{k,1}, \tilde{x}_j^{k,2}, \ldots, \tilde{x}_j^{k,t})$, wherein $\tilde{x}_j^{k,t}$ represents the normalized real daily water consumption value of the j-th customer $b_j$ in the k-th class on day t, $j = 1, 2, \ldots, S'_k$; repeating the process of steps 3.1 to 3.6 to re-cluster the real daily water consumption values of the customers within each class, so that industrial and commercial customers with similar water use trends and similar water consumption are clustered into one class;
in this embodiment, the clustering result of step 3 places many industrial and commercial customers with very similar water use trends but widely different water use ranges in the same class, whereas the aim is to cluster customers whose water use trends are similar and whose water use ranges differ only slightly into one class; therefore, after the clustering of customers with similar water use trends is completed in step 3, the kmeans algorithm is applied again to the original daily water consumption data of the customers within each class; at this stage there is no need to capture the water use pattern in the time-series data, so the sensitivity of the kmeans algorithm to numerical magnitude is exploited to distinguish levels of water consumption, and the customers are clustered again by water use range on top of the trend-based clustering; during this clustering, the optimal number of clusters is again determined by combining the elbow method and the silhouette coefficient method, so that the similarity within each class is as high as possible and the similarity between classes is as low as possible. Finally, industrial and commercial customers with similar water use trends and similar water use ranges are clustered into one class;
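The reason a second, magnitude-based clustering pass separates customers that the trend clustering groups together can be seen in a small numeric sketch. The two series are hypothetical, and per-series min-max scaling is used here only as a stand-in for a trend representation; it is an assumption, not the normalization or LSTM feature extraction of the method.

```python
import math

def min_max(series):
    """Rescale one series to [0, 1] so that only its shape remains."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]

# Hypothetical daily consumption of two customers: same shape, 10x the volume.
small = [10.0, 12.0, 11.0, 15.0]
large = [100.0, 120.0, 110.0, 150.0]

# Distance between the shapes: the trends are indistinguishable.
shape_gap = math.dist(min_max(small), min_max(large))
# Distance between the raw values: the volumes are far apart.
volume_gap = math.dist(small, large)
```

Because Euclidean distance on raw values is dominated by volume, running kmeans on the real daily consumption within a trend class splits such customers apart, which is the behavior step 4 relies on.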
step5, visualizing a clustering result;
step 5.1, taking the mean vector of the daily water consumption vectors of all industrial and commercial customers in each class of the step 4 clustering result as the class center, calculating the mean daily water consumption of the customers in each class, and, by drawing a two-dimensional coordinate system, visualizing per class the water consumption of the customers in the class, the class center, and the mean daily water consumption;
in the specific embodiment, with the date as the x axis and the daily water consumption value as the y axis, the real daily water consumption of all industrial and commercial customers in the class, the class center coordinates, and the mean daily water consumption are visualized;
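The class center of step 5.1 is an element-wise mean over the customers in a class, as in this minimal sketch. The function names and the sample values are hypothetical, and treating the mean daily water consumption as a single scalar averaged over all days and customers is one possible reading of the text.

```python
def class_center(daily_vectors):
    """Element-wise mean of the daily consumption vectors in one class."""
    count = len(daily_vectors)
    return [sum(day_values) / count for day_values in zip(*daily_vectors)]

def mean_daily_consumption(daily_vectors):
    """Scalar mean over all days and customers (ASSUMED interpretation)."""
    center = class_center(daily_vectors)
    return sum(center) / len(center)

# Hypothetical class with two customers observed over three days.
class_vectors = [[1.0, 2.0, 3.0],
                 [3.0, 4.0, 5.0]]
center = class_center(class_vectors)
overall = mean_daily_consumption(class_vectors)
```

Plotting each customer's vector, the center and the scalar mean against the date axis gives the per-class chart described in the embodiment.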
step 5.2, obtaining the class centers of all clusters calculated in step 5.1, and visualizing the K class centers together in one two-dimensional coordinate system;
in the embodiment, all class center coordinate vectors are visualized with the date as the x axis and the daily water consumption value as the y axis;
step 5.3, obtaining the industrial and commercial customer ids in each class of the step 4.2 clustering result, and drawing a map according to the longitude and latitude information of the customers so as to visualize the geographical position of the customers in each class.
In specific implementation, the visualization of the geographical position of the industrial and commercial customers in each class comprises the following steps:
step1: generating a longitude-latitude dictionary from the customer ids in each class and the longitude and latitude information corresponding to each id, wherein the key is the customer id and the value is the longitude and latitude;
step2: drawing a canvas, i.e., the area in which the map is displayed, by importing the Geo class from the pyecharts plotting tool; obtaining the number of data records in each class, where each industrial and commercial customer is one record, and displaying the number of customers in the class directly above the area; and setting the canvas format, such as its size, background color and font size;
step3: using the geo.add() function with the parameter maptype = '合肥' (Hefei) to load the map resource package of Hefei City onto the canvas;
step4: on the Hefei City map drawn in step3, marking the entries of the longitude-latitude dictionary obtained in step1 one by one as scatter points, and setting the scatter point format, such as size, shape and color;
step5: obtaining the name of each industrial and commercial customer from its id, and displaying the name and the longitude and latitude of the customer on the scatter point in the form of a legend;
step6: saving the map and the visualization results drawn in step1 to step5 as an html file, thereby completing the visualization of the geographical position of the industrial and commercial customers in each class.
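The data preparation in step1 and the customer count shown above the map in step2 can be sketched without any plotting library. The ids, coordinates, function name and title format below are hypothetical illustration values; the rendering calls themselves are left out because the pyecharts API differs between major versions.

```python
def build_geo_dict(class_ids, coordinates):
    """Map each customer id in one class to its (longitude, latitude)."""
    return {cid: coordinates[cid] for cid in class_ids if cid in coordinates}

# Hypothetical customer ids with made-up coordinates (not real data).
coordinates = {
    "A001": (117.23, 31.82),
    "A002": (117.30, 31.86),
    "A003": (117.28, 31.79),
}
# Hypothetical clustering result: class label -> customer ids.
classes = {0: ["A001", "A003"], 1: ["A002"]}

geo_dicts = {k: build_geo_dict(ids, coordinates) for k, ids in classes.items()}
# Text to display above each map: the number of customers in the class.
titles = {k: f"class {k}: {len(d)} customers" for k, d in geo_dicts.items()}
```

Each entry of a class dictionary then becomes one scatter point on the map, labelled with the customer name looked up from the same id.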

Claims (1)

1. A method for clustering industrial and commercial customers based on time-series water consumption data, characterized by comprising the following steps:
step1, constructing daily water consumption data of industrial and commercial customers;
step 1.1, obtaining remote water meter data of the industrial and commercial customers, and extracting from the remote water meter data the customer id, the water meter update time, the cumulative water flow, the remote water meter address of the customer, and the customer name;
step 1.2, converting the remote water meter address of each industrial and commercial customer into longitude and latitude to obtain the longitude and latitude information of the customer;
step 1.3, dividing the remote water meter data by customer id to obtain m water meter data files, each named by its customer id, and sorting all data in each water meter data file by water meter update time; wherein m represents the total number of industrial and commercial customers;
step 1.4, for each industrial and commercial customer and each day, taking the difference between the cumulative flow value at the first water meter update time and the cumulative flow value at the last water meter update time of that day, thereby constructing the daily water consumption vectors of the m customers over t days, $x_i = (x_i^1, x_i^2, \ldots, x_i^t)$, wherein $x_i^t$ represents the daily water consumption value of the i-th customer on day t and t represents the number of days of water use, and recording the sample feature set formed by the daily water consumption vectors of the m customers over t days as $X = \{x_i \mid i = 1, 2, \ldots, m\}$;
step 1.5, detecting and processing abnormal values in the sample feature set X to obtain the sample feature set X' after abnormal value processing;
step 1.6, processing missing values in the sample feature set X' to obtain the sample feature set X'' after missing value processing;
step2, representing the features of the time-series water consumption data based on an LSTM model;
step 2.1, normalizing the sample feature set X'' after missing value processing to obtain the normalized sample feature set, recorded as $\tilde{X} = \{\tilde{x}_i \mid i = 1, 2, \ldots, m\}$, wherein $\tilde{x}_i$ represents the normalized daily water consumption vector of the i-th industrial and commercial customer over t days, $\tilde{x}_i = (\tilde{x}_i^1, \tilde{x}_i^2, \ldots, \tilde{x}_i^t)$, and $\tilde{x}_i^t$ represents the normalized daily water consumption value of the i-th customer on day t;
step 2.2, pre-training the LSTM model;
dividing the normalized sample feature set $\tilde{X}$ into a training set and a validation set, and determining the epoch value, the batch-size value and the prediction step size for training the LSTM model;
inputting the validation set into the LSTM model to obtain a prediction sequence for the validation set, then calculating the error between the prediction sequence output by the LSTM model and the validation set using the root mean square error, thereby completing one round of training of the LSTM model; stopping training when the number of training rounds reaches the epoch value, thereby obtaining the trained LSTM model, which serves as the customer time-series water use feature extraction model;
step 2.3, inputting the daily water consumption data of all industrial and commercial customers into the customer time-series water use feature extraction model, thereby outputting the set of water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$; wherein $y_i = (y_i^1, y_i^2, \ldots, y_i^n)$ represents the water use feature vector of the i-th industrial and commercial customer, $y_i^n$ represents the n-th dimension feature value of the i-th customer, and n represents the dimension of the water use feature vector;
step3, adopting the kmeans clustering algorithm to cluster the industrial and commercial customers by water use trend, based on the water use feature vectors $Y = \{y_i \mid i = 1, 2, \ldots, m\}$;
step 3.1, determining the optimal number of clusters by combining the elbow method and the silhouette coefficient method, and recording it as K;
step 3.2, based on the optimal number of clusters K, taking the water use feature vector $y_i$ of the i-th industrial and commercial customer as a sample to be clustered and inputting it into the kmeans algorithm, so that the customers represented in the water use feature vector set Y are grouped into K clusters; the coordinates of the K cluster centers are randomly initialized using formula (1):
$$c_k = (c_k^1, c_k^2, \ldots, c_k^n), \quad k = 1, 2, \ldots, K \qquad (1)$$
in formula (1), $c_k$ denotes the center of the k-th cluster, and $c_k^n$ denotes the n-th dimension coordinate value of the k-th cluster center;
step 3.3, calculating the Euclidean distance $d(y_i, c_k)$ from the sample $y_i$ to the k-th cluster center $c_k$ using formula (2), thereby obtaining the Euclidean distance from the sample $y_i$ to each cluster center:
$$d(y_i, c_k) = \sqrt{\sum_{l=1}^{n} \left(y_i^l - c_k^l\right)^2} \qquad (2)$$
step 3.4, according to the Euclidean distances from the sample $y_i$ to the cluster centers, assigning the sample $y_i$ to the cluster with the shortest Euclidean distance;
step 3.5, assigning all samples to their clusters to obtain K classes, and obtaining the set of industrial and commercial customers in each class as $Y_k = \{y_j^k \mid j = 1, 2, \ldots, S_k\}$, wherein $y_j^k = (y_j^{k,1}, y_j^{k,2}, \ldots, y_j^{k,n})$ represents the feature vector of the j-th customer in the k-th class, $S_k$ represents the number of customers in the k-th class, and $y_j^{k,n}$ represents the n-th dimension feature value of the j-th customer in the k-th class;
calculating the mean of the feature vectors of the customers in the k-th class using formula (6), thereby obtaining the updated cluster center $\bar{c}_k = (\bar{c}_k^1, \bar{c}_k^2, \ldots, \bar{c}_k^n)$, and assigning it to $c_k$, k = 1, 2, ..., K:
$$\bar{c}_k^n = \frac{1}{S_k} \sum_{j=1}^{S_k} y_j^{k,n} \qquad (6)$$
in formula (6), $\bar{c}_k^n$ denotes the n-th dimension coordinate value of the updated k-th class center;
step 3.6, repeating steps 3.3 to 3.5 until the cluster centers no longer change, and outputting the final cluster centers and the industrial and commercial customer ids in each class, thereby grouping the customers with similar water use trends into one class;
step4, clustering industrial and commercial customers based on the range of water consumption;
step 4.1, based on the result of the water use trend clustering of the industrial and commercial customers, obtaining the set of customers in each class as $B_k = \{b_j \mid j = 1, 2, \ldots, S'_k\}$, wherein $b_j$ represents the j-th customer in the k-th class, $k \in \{1, 2, \ldots, K\}$, and $S'_k$ represents the number of customers in the k-th class when the clustering algorithm converges;
step 4.2, obtaining the normalized real daily water consumption values of the j-th customer $b_j$ in the k-th class, recorded as $\tilde{x}_j^k = (\tilde{x}_j^{k,1}, \tilde{x}_j^{k,2}, \ldots, \tilde{x}_j^{k,t})$, wherein $\tilde{x}_j^{k,t}$ represents the normalized real daily water consumption value of the j-th customer $b_j$ in the k-th class on day t, $j = 1, 2, \ldots, S'_k$; repeating the process of steps 3.1 to 3.6 to re-cluster the real daily water consumption values of the customers within each class, so that industrial and commercial customers with similar water use trends and similar water consumption are clustered into one class;
step5, visualizing the clustering result;
step 5.1, taking the mean vector of the daily water consumption vectors of all industrial and commercial customers in each class of the step 4 clustering result as the class center, calculating the mean daily water consumption of the customers in each class, and, by drawing a two-dimensional coordinate system, visualizing per class the water consumption of the customers in the class, the class center, and the mean daily water consumption;
step 5.2, obtaining the class centers of all clusters calculated in step 5.1, and visualizing the K class centers together in one two-dimensional coordinate system;
step 5.3, obtaining the industrial and commercial customer ids in each class of the step 4.2 clustering result, and drawing a map according to the longitude and latitude information of the customers so as to visualize the geographical position of the customers in each class.
CN202110569868.3A 2021-05-25 2021-05-25 Industrial and commercial customer clustering method based on time sequence water consumption data Active CN113205368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110569868.3A CN113205368B (en) 2021-05-25 2021-05-25 Industrial and commercial customer clustering method based on time sequence water consumption data

Publications (2)

Publication Number Publication Date
CN113205368A CN113205368A (en) 2021-08-03
CN113205368B true CN113205368B (en) 2022-11-29

Family

ID=77023043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110569868.3A Active CN113205368B (en) 2021-05-25 2021-05-25 Industrial and commercial customer clustering method based on time sequence water consumption data

Country Status (1)

Country Link
CN (1) CN113205368B (en)

Also Published As

Publication number Publication date
CN113205368A (en) 2021-08-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant