CN117076691A - Commodity resource knowledge graph algorithm model oriented to intelligent communities - Google Patents
- Publication number
- CN117076691A (CN202311328941.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- commodity
- user
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a commodity resource knowledge graph algorithm model oriented to intelligent communities. It belongs to the technical field of computational models and addresses the problem that existing data analysis and recommendation algorithms are usually independent or random and make poor use of the rich multidimensional data in a community for comprehensive analysis. The model comprises seven sub-models: a community nearby entertainment place recommendation algorithm sub-model based on multi-attribute decision, a commodity personalized recommendation algorithm sub-model based on collaborative metric learning, a food personalized recommendation algorithm sub-model based on a self-encoder (autoencoder), a community value comprehensive analysis algorithm sub-model based on a joint classifier, a resident life happiness index analysis algorithm sub-model based on data mining, a resident income prediction analysis algorithm sub-model based on data mining, and a resident electricity consumption condition analysis algorithm sub-model based on data mining. By combining and optimizing these sub-models, the invention provides personalized recommendation and comprehensive analysis, bringing practical benefits to residents and managers.
Description
Technical Field
The invention belongs to the technical field of calculation models, relates to a multi-model synthesis technology, and particularly relates to a commodity resource knowledge graph algorithm model oriented to an intelligent community.
Background
With the rapid development of information technology and the internet, modern communities have also begun to become "intelligent", i.e., smart communities. The intelligent community integrates various advanced information and communication technologies, and aims to improve the life quality of residents, improve the community management efficiency and promote sustainable development. However, existing data analysis and recommendation algorithms are often independent or random and do not make good use of the rich multidimensional data in communities for comprehensive analysis.
The knowledge graph, a data structure capable of expressing complex relations among entities, is gradually being applied in various fields, including social network analysis and semantic search. However, most knowledge graphs are currently constructed for specific fields or specific problems, and a comprehensive analysis model for the multi-dimensional, multi-level data environment of an intelligent community is lacking.
In summary, the prior art lacks a comprehensive algorithm model that can analyze and recommend commodities and services meeting the needs of community residents, lacks an effective mechanism for comprehensively evaluating community value and quality of life, and lacks a data basis on which community managers can optimize the community. It is therefore necessary to develop a commodity resource knowledge graph algorithm model oriented to intelligent communities to solve these problems.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a commodity resource knowledge graph algorithm model oriented to an intelligent community.
The aim of the invention can be achieved by the following technical scheme: a commodity resource knowledge graph algorithm model oriented to an intelligent community comprises a community nearby entertainment place recommendation algorithm sub-model based on multi-attribute decision, a commodity personalized recommendation algorithm sub-model based on collaborative metric learning, a food personalized recommendation algorithm sub-model based on a self-encoder, a community value comprehensive analysis algorithm sub-model based on a joint classifier, a resident life happiness index analysis algorithm sub-model based on data mining, a resident income prediction analysis algorithm sub-model based on data mining and a resident electricity consumption condition analysis algorithm sub-model based on data mining;
the commodity resource knowledge graph algorithm model facing the intelligent community adopts the entity-relation-entity ternary structure (triple) construction principle;
the total construction process of the commodity resource knowledge graph algorithm model facing the intelligent community is as follows:
S1, data acquisition and preprocessing: acquiring commodity data information from a structured database of a community platform, and preprocessing the commodity data information, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing and special character recognition and filtration, and the special character recognition and filtration is completed by a regular-expression character matching method;
S2, relation extraction: determining a commodity category relationship and a commodity organization structure dependency relationship according to the preprocessed commodity data information, wherein the commodity category relationship is determined based on a Word2Vec synonym discovery method, and the commodity organization structure dependency relationship is defined manually;
S3, ontology construction and correction: the commodity knowledge graph is obtained after ontology construction based on a clustering algorithm, and the quality of the commodity knowledge graph is evaluated in terms of target diversity and relationship granularity; the commodity knowledge graph is corrected and confirmed by artificial intelligence technology combined with manual operation;
S4, model evaluation and optimization: optimizing and adjusting the model according to the evaluation result;
S5, user feedback and updating.
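Steps S1 and the entity-relation-entity structure can be illustrated with a minimal sketch. The regular expression, the relation name `belongs_to`, and the sample rows are all assumptions made for illustration; the source does not specify which characters are filtered or how relations are named.

```python
import re

# Hypothetical filter: keep word characters, whitespace and a few separators.
_SPECIAL = re.compile(r"[^\w\s\-./]", flags=re.UNICODE)

def clean_name(raw: str) -> str:
    """Strip special characters and collapse repeated whitespace."""
    text = _SPECIAL.sub("", raw)
    return re.sub(r"\s+", " ", text).strip()

def build_triples(records):
    """Turn (commodity, category) rows into deduplicated entity-relation-entity triples."""
    triples = []
    seen = set()
    for name, category in records:
        # "belongs_to" is an illustrative relation label, not taken from the source.
        triple = (clean_name(name), "belongs_to", clean_name(category))
        if triple[0] and triple[2] and triple not in seen:
            seen.add(triple)
            triples.append(triple)
    return triples

rows = [
    ("Green  Tea ★ 500ml", "Beverages"),
    ("Green Tea 500ml", "Beverages!!"),  # duplicate once cleaned
    ("###", "Misc"),                     # nothing left after filtering
]
print(build_triples(rows))
```

Redundancy removal here is simply exact-match deduplication after cleaning; noise reduction and outlier handling would need domain rules that the source does not describe.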
In the commodity resource knowledge graph algorithm model facing the intelligent community, the community nearby entertainment place recommendation algorithm sub-model based on multi-attribute decision is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring data information of all entertainment places nearby a community from a community service platform, wherein the data information comprises distance, people, scores, prices and environments; preprocessing data, including redundancy removal, noise reduction processing, outlier processing and special character recognition and filtration;
b, setting attribute quantization and preference functions:
quantifying each attribute according to its preference order and mapping the attribute values to corresponding numerical values; selecting, for each attribute of the entertainment place, a criterion function that expresses the degree of preference according to that attribute's measurement standard; setting criterion weight coefficients, which are learned and adjusted by a weight network trained with the BP (back-propagation) algorithm to obtain the weight of each attribute;
c, constructing a multi-attribute decision model:
calculating a multi-attribute weight for each entertainment place using a multi-attribute linear relationship model obtained by training and learning on a previously collected data set; calculating a multi-criterion priority index of each entertainment place by combining the weight of each attribute with the calculation result of the preference function; determining the net flow of each entertainment place from its positive flow and negative flow so as to measure the priority of each entertainment place in the multi-attribute decision; comparing the priority intensities of the different schemes according to the magnitude of the net flow, and ranking the entertainment places by priority intensity;
d, result presentation and interaction:
presenting the ordered entertainment venue list to a user, and arranging the entertainment venue list according to the order of the priority intensity from high to low for the user to select;
e, model evaluation and tuning:
Evaluating the constructed algorithm model, and evaluating the performance and accuracy of the model by using a cross-validation method; taking the evaluation result as a basis for model tuning;
f, user feedback and update:
collecting feedback information of users, and knowing satisfaction degree and improvement opinion of the users on the recommendation result; and taking the user feedback as the basis for improving and updating the algorithm model.
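The positive flow, negative flow and net flow in step c match the vocabulary of PROMETHEE-style outranking methods, so the ranking can be sketched along those lines. This is a minimal sketch under stated assumptions: the source does not name the method, the "usual" (strict comparison) preference function is chosen only for brevity, and the weights are fixed by hand here rather than learned by a BP network.

```python
def promethee_net_flows(scores, weights):
    """scores: {venue: [criterion values, larger is better]}; weights should sum to 1."""
    names = list(scores)
    n = len(names)

    def preference(a, b):
        # Usual criterion: full preference on every criterion where a strictly beats b.
        return sum(w for w, x, y in zip(weights, scores[a], scores[b]) if x > y)

    flows = {}
    for a in names:
        positive = sum(preference(a, b) for b in names if b != a) / (n - 1)
        negative = sum(preference(b, a) for b in names if b != a) / (n - 1)
        flows[a] = positive - negative  # net flow
    return flows

def rank_venues(scores, weights):
    """Venues ordered from highest to lowest net flow."""
    flows = promethee_net_flows(scores, weights)
    return sorted(flows, key=flows.get, reverse=True)

# Hypothetical quantified attributes: [closeness, rating, affordability].
venues = {"A": [3, 2, 1], "B": [2, 3, 1], "C": [1, 1, 3]}
weights = [0.5, 0.3, 0.2]  # illustrative values, not learned
print(rank_venues(venues, weights))
```

Attributes where smaller is better, such as distance or price, are assumed to have been inverted during quantization so that every column reads "larger is better".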
In the commodity resource knowledge graph algorithm model facing the intelligent community, the commodity personalized recommendation algorithm sub-model based on collaborative metric learning is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring transaction record data of commodity purchased by a user from a community service platform; preprocessing transaction record data, including scoring and weighted average processing of evaluation data, so as to obtain a user commodity scoring matrix;
b, constructing a collaborative metric learning model:
aiming at the sparsity problem of the scoring matrix, decomposing the scoring matrix into a user matrix and a commodity matrix so as to perform collaborative metric learning; the collaborative metric learning model learns a metric relationship between a user and a commodity to predict a score of the user on the commodity;
c, parameter tuning of the collaborative metric learning model:
when the scoring data is not standardized, learning the Pearson correlation coefficient λ with the collaborative metric learning model, and estimating the user's predicted score for a commodity with the trained Pearson distance calculation formula;
d, personalized recommendation of commodities:
for a given user, calculating a predictive score of the user for the non-purchased commodity according to the metric relation between the user and the commodity learned by the collaborative metric learning model; based on the predictive scores, generating a personalized commodity recommendation list according to a certain rule, and presenting the personalized commodity recommendation list to a user;
e, evaluation and improvement of results:
evaluating the recommendation result, measuring recommendation performance by using accuracy, recall rate and coverage rate, and taking the evaluation result as the basis for algorithm improvement and optimization;
f, user feedback and update:
collecting feedback information of users, and knowing satisfaction degree and improvement opinion of the users on the recommendation result; and taking the user feedback as the basis for improving and updating the recommendation algorithm model.
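The source mentions a Pearson correlation coefficient and a Pearson distance formula without giving them. The sketch below therefore substitutes a standard user-based neighbourhood predictor built on Pearson correlation; it is a stand-in for illustration, not the collaborative metric learning model itself, and the rating matrix is invented.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation over the items both users have rated (None = unrated)."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if len(pairs) < 2:
        return 0.0
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict(ratings, user, item):
    """Mean-centred, similarity-weighted prediction of ratings[user][item]."""
    rated = [r for r in ratings[user] if r is not None]
    base = sum(rated) / len(rated)
    num = den = 0.0
    for other, row in enumerate(ratings):
        if other == user or row[item] is None:
            continue
        sim = pearson(ratings[user], row)
        other_rated = [r for r in row if r is not None]
        num += sim * (row[item] - sum(other_rated) / len(other_rated))
        den += abs(sim)
    return base if den == 0 else base + num / den

# Hypothetical user-commodity scoring matrix; None marks a commodity not yet purchased.
ratings = [
    [5, 3, None, 1],
    [4, 2, 5, 1],
    [1, 5, 2, 5],
]
print(predict(ratings, user=0, item=2))
```

Mean-centring each user's scores is one common way to cope with non-standardized rating scales, which is the situation step c describes.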
In the commodity resource knowledge graph algorithm model facing the intelligent community, the food personalized recommendation algorithm sub-model based on the self-encoder is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring transaction record data of food purchased by a user from a community service platform; preprocessing data, including redundancy removal, noise reduction processing, outlier processing and special character recognition and filtration;
b, constructing a user-item scoring matrix:
constructing a user-item scoring matrix according to the processed transaction record data, wherein each element represents the user's score for the corresponding food item;
c, design and training of a self-encoder network:
designing a self-encoder (autoencoder) network to encode and decode user feature vectors; performing unsupervised training of the self-encoder network, using the user-item scoring matrix as input data, with the goal of minimizing the reconstruction error, and obtaining an encoding function after training is completed;
d, secondary coding and hamming distance calculation:
performing secondary coding on the coded high-dimensional sparse feature vector, and converting the high-dimensional sparse feature vector into a low-dimensional dense binarized feature vector; measuring the similarity between users by using a Hamming distance formula;
e, personalized recommendation of food:
calculating the Hamming distance similarity between the target user and other users to obtain a similar user set; predicting the score of the target user on the un-purchased food based on the scoring information of the similar users; generating a personalized food recommendation list according to a certain rule according to the prediction score, and presenting the personalized food recommendation list to a target user;
f, evaluating and optimizing results:
evaluating the recommendation result, where a suitable index can be used to measure recommendation performance; optimizing according to the evaluation result by adjusting the network structure of the self-encoder and the Hamming distance threshold;
g, user feedback and update:
collecting feedback information of users, and knowing satisfaction and improvement opinion of recommendation results; and taking the user feedback as the basis for improving and updating the recommendation algorithm model.
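Steps d and e can be sketched once user codes are available. The sketch assumes the trained encoder has already produced a real-valued code per user and uses simple thresholding as a stand-in for the secondary coding, which the source does not detail; the user names, codes and scores are invented.

```python
def binarize(code, threshold=0.5):
    """Secondary-coding stand-in: threshold each real-valued component to 0 or 1."""
    return tuple(1 if x >= threshold else 0 for x in code)

def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))

def similar_users(codes, target, max_distance):
    """Users whose binary code lies within max_distance of the target's code."""
    ref = binarize(codes[target])
    return sorted(
        user for user, code in codes.items()
        if user != target and hamming(ref, binarize(code)) <= max_distance
    )

def predict_score(scores, neighbours, food):
    """Average the neighbours' known scores for one food; None if nobody rated it."""
    known = [scores[u][food] for u in neighbours if food in scores[u]]
    return sum(known) / len(known) if known else None

# Hypothetical encoder outputs and food scores.
codes = {"u1": [0.9, 0.1, 0.8, 0.2], "u2": [0.8, 0.2, 0.7, 0.6], "u3": [0.1, 0.9, 0.2, 0.9]}
scores = {"u1": {}, "u2": {"noodles": 4.0}, "u3": {"noodles": 1.0}}
neighbours = similar_users(codes, "u1", max_distance=1)
print(neighbours, predict_score(scores, neighbours, "noodles"))
```

The `max_distance` argument plays the role of the Hamming distance threshold that step f says may be adjusted during optimization.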
In the commodity resource knowledge graph algorithm model facing the intelligent community, the community value comprehensive analysis algorithm sub-model based on the joint classifier is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
collecting data related to community values from a community big data environment, wherein the data comprise traffic conditions, educational environments, hardware facilities and greening data information; preprocessing data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing and special character recognition and filtration;
b, feature representation:
performing feature representation on the collected information data, performing semantic analysis on the language descriptive data, and converting the language descriptive data into numerical representation;
c, designing a joint classifier neural network:
the joint classifier neural network comprises a primary softmax classifier and a secondary softmax classifier;
primary softmax classifier: labeling each individual item by using the training data set, and classifying the data set independently;
secondary softmax classifier: reclassifying the results of the primary classifier, and obtaining a final community value grade according to the output of the secondary classifier;
d, training a neural network:
training the joint classifier by using a BP (back-propagation) feed-forward neural network algorithm; using the labeled training data set as input, the goal is to minimize the classification error of the classifier;
e, community value evaluation and prediction:
performing value evaluation and prediction on the new community data by using the trained joint classifier; inputting data to be evaluated, and sequentially processing the data by a primary classifier and a secondary classifier to obtain a final community value grade;
f, evaluating and optimizing results:
evaluating and verifying the model, and optimizing and adjusting the algorithm model according to the evaluation result;
g, user feedback and update:
collecting new data and user feedback information, and taking the new data and the user feedback as the basis for updating the model.
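The two-stage structure in steps c and e can be sketched as a forward pass only. The grouping of features into "traffic" and "education", the layer sizes, and all weights below are assumptions picked by hand; in the source the weights come from BP training, which is not reproduced here.

```python
from math import exp

def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """One dense layer: weights holds one row per output class."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def joint_classify(groups, primary, secondary):
    """groups: {name: feature vector}; primary: {name: (W, b)}; secondary: (W, b)."""
    stacked = []
    for name in sorted(groups):
        W, b = primary[name]
        # Primary classifier: independent class probabilities per feature group.
        stacked.extend(softmax(linear(W, b, groups[name])))
    W2, b2 = secondary
    # Secondary classifier: re-classifies the stacked primary outputs.
    probs = softmax(linear(W2, b2, stacked))
    return probs.index(max(probs)), probs

# Hypothetical hand-picked weights for two feature groups and two value grades.
identity = ([[1, 0], [0, 1]], [0, 0])
primary = {"traffic": identity, "education": identity}
secondary = ([[1, 0, 1, 0], [0, 1, 0, 1]], [0, 0])
groups = {"traffic": [0.0, 2.0], "education": [0.0, 3.0]}
grade, probs = joint_classify(groups, primary, secondary)
print(grade, probs)
```

Feeding primary probabilities rather than hard labels into the secondary stage is a design choice of this sketch; the source only says the secondary classifier reclassifies the primary results.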
In the commodity resource knowledge graph algorithm model facing the intelligent community, the residential life happiness index analysis algorithm sub-model based on data mining is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
collecting resident information data from a community service platform, including gender, age, education level, marital status, income and family size; preprocessing data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing and special character recognition and filtration;
b, characteristic engineering:
the collected resident information data is subjected to feature selection and conversion, and features related to happiness are extracted; the non-numerical data is coded and converted into numerical representation;
c, cluster analysis:
carrying out cluster analysis on resident data by using a K-means clustering algorithm, dividing residents into different groups or clusters, wherein each cluster represents different happiness level;
in the clustering process, age is taken as one of influencing factors, the age is divided into a plurality of stages, a clustering center is initialized for each stage, the similarity between each resident and the clustering center is calculated, and the similarity is assigned to the nearest clustering cluster; continuously updating a clustering center by using a sample mean value in the clusters until a convergence condition is reached;
d, happiness index analysis and assessment:
analyzing the happiness index distribution situation of residents in each cluster, and formulating the implementation frequency of entertainment activities in the corresponding communities according to the analysis result;
e, model evaluation and optimization:
evaluating the constructed model by using cross-validation and the ROC curve method, and taking the evaluation result as the basis for model optimization and adjustment.
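The clustering in step c can be sketched as plain K-means whose initial centers come from age stages. The sketch assumes the first feature of each resident is a scaled age and that stage boundaries are given by hand; both the boundaries and the sample data are illustrative, not taken from the source.

```python
def kmeans(points, centers, max_iter=100):
    """Plain K-means: assign to the nearest center, then recompute centers as cluster means."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in points
        ]
        updated = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep the center of an empty cluster unchanged
        if updated == centers:  # convergence: centers stopped moving
            break
        centers = updated
    return labels, centers

def stage_centers(points, boundaries):
    """One initial center per age stage: the mean of residents whose age (feature 0) falls in it."""
    edges = [float("-inf")] + list(boundaries) + [float("inf")]
    centers = []
    for lo, hi in zip(edges, edges[1:]):
        stage = [p for p in points if lo <= p[0] < hi]
        if stage:
            centers.append(tuple(sum(col) / len(stage) for col in zip(*stage)))
    return centers

# Hypothetical residents as (scaled age, scaled income).
residents = [(0.2, 0.3), (0.25, 0.35), (0.7, 0.8), (0.75, 0.9)]
init = stage_centers(residents, boundaries=[0.5])
labels, centers = kmeans(residents, init)
print(labels, centers)
```

Squared Euclidean distance is used as the similarity measure here; the source does not say which measure it intends.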
In the commodity resource knowledge graph algorithm model facing the intelligent community, the resident income prediction analysis algorithm sub-model based on data mining is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data collection and cleaning:
collecting basic information data of community residents, including age, gender, work type, education level, marital status and number of work hours per week; preprocessing the data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing and special character recognition and filtration;
b, characteristic engineering:
selecting suitable features according to domain knowledge and data analysis, and encoding categorical features with one-hot encoding or label encoding; selecting relevant features by using a statistical method or a feature importance analysis method;
c, data division:
dividing the data set into a training set and a testing set;
d, constructing a model:
training the model with TMCL-SVM, a multi-class support vector machine model; during training, the TMCL-SVM jointly trains the relations among all the classes and the hyperplanes;
e, hyperparameter tuning:
selecting the kernel function and the regularization-parameter hyperparameters of the TMCL-SVM by using cross-validation;
f, evaluating a model:
evaluating the performance of the model by using test set data, evaluating the model effect by using accuracy, precision, recall rate and F1 score index, and evaluating the performance of the model by drawing a confusion matrix and an ROC curve;
g, maintaining and updating a model:
monitoring the performance of the model with an automated process and automatically triggering retraining when the performance declines; the model needs to be maintained and updated regularly to cope with changes in new data and degradation of model performance.
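The encoding, evaluation and monitoring steps can be sketched without the classifier itself, since the source gives no details of TMCL-SVM. Everything below is a minimal illustration: the macro averaging, the retraining rule and its `tolerance` parameter, and the sample labels are assumptions.

```python
def one_hot(values):
    """One-hot encode a categorical column; categories are ordered alphabetically."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories

def confusion_matrix(y_true, y_pred, labels):
    """matrix[i][j] counts samples of true class labels[i] predicted as labels[j]."""
    index = {lab: i for i, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def macro_scores(matrix):
    """Accuracy plus macro-averaged precision, recall and F1 from a confusion matrix."""
    n = len(matrix)
    total = sum(map(sum, matrix))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = matrix[k][k]
        predicted = sum(matrix[i][k] for i in range(n))
        actual = sum(matrix[k])
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return {
        "accuracy": sum(matrix[k][k] for k in range(n)) / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

def needs_retraining(current_accuracy, baseline_accuracy, tolerance=0.05):
    """Hypothetical trigger: retrain once accuracy drops more than `tolerance` below baseline."""
    return baseline_accuracy - current_accuracy > tolerance

# Hypothetical income classes and predictions on a test set.
y_true = ["low", "low", "high", "high"]
y_pred = ["low", "high", "high", "high"]
report = macro_scores(confusion_matrix(y_true, y_pred, ["low", "high"]))
print(report, needs_retraining(report["accuracy"], baseline_accuracy=0.9))
```

An ROC curve additionally needs per-class decision scores from the classifier, so it is left out of this sketch.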
In the commodity resource knowledge graph algorithm model facing the intelligent community, the residential electricity consumption situation analysis algorithm sub-model based on data mining is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
Acquiring residential electricity data from a community power supply station, wherein the residential electricity data comprises user files and electricity history data; preprocessing data, namely removing redundancy, processing missing values and abnormal values, normalizing the data, and converting the data in different ranges into a uniform numerical range so as to eliminate dimension differences;
b, characteristic engineering:
extracting features for clustering from the integrated data, wherein the features comprise electricity consumption, electricity consumption time period and electricity consumption type; selecting features representative of the user classification using a statistical method;
c, cluster analysis:
adopting a K-means clustering algorithm, combining with an optimized K-means+ model, and selecting an initial clustering center by considering the distance between clusters, the compactness of objects in the clusters and the density of the objects;
d, clustering result analysis:
evaluating the inter-cluster distance of the clustering result to ensure that users of different categories are correctly distinguished; checking the density of the objects in each cluster to ensure that the objects in the clusters meet the requirement of compactness;
e, extracting user behavior characteristics:
analyzing the user load curve in each cluster, and extracting the power consumption behavior characteristics of typical users;
f, continuous optimization and monitoring:
periodically monitoring the performance of the algorithm model, collecting new data and user feedback information, and taking the new data and the user feedback as the basis for continuous optimization of the model.
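The normalization in step a and the choice of initial centers in step c can be sketched as follows. The seeding shown is a deterministic farthest-first rule that only considers distance; it is a simplified stand-in, not the optimized K-means+ model of the source, which also weighs compactness and object density. The sample readings are invented.

```python
def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        tuple((v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans))
        for row in rows
    ]

def farthest_first_centers(points, k):
    """Start from the first point, then repeatedly take the point whose squared
    distance to its nearest already-chosen center is largest."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    return centers

# Hypothetical per-user features: (monthly consumption in kWh, peak hour of use).
usage = [(100, 8), (110, 9), (400, 20), (390, 21)]
normalized = min_max_normalize(usage)
seeds = farthest_first_centers(normalized, k=2)
print(seeds)
```

Normalizing first matters for the seeding as well: without it, the kWh column would dominate every distance and the time-of-use feature would have almost no effect.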
Compared with the prior art, the commodity resource knowledge graph algorithm model oriented to the intelligent community has the following beneficial effects:
1. Community entertainment place recommendation: through the algorithm sub-model based on multi-attribute decision, personalized recommendation of entertainment places near the community can be provided, helping residents quickly find places that match their preferences.
2. Personalized recommendation of commodities: by means of collaborative metric learning-based, self-encoder-based algorithm sub-models, residents can be recommended with goods suited to their personalized needs according to their shopping history and interests.
3. Personalized recommendation of food: through the algorithm submodel based on the self-encoder, personalized food recommendation can be provided according to the taste preference and the eating habit of residents, and the taste requirements of the residents are met.
4. Community value comprehensive analysis: the algorithm submodel based on the joint classifier can be used for comprehensively analyzing the value of the community, so that a community manager is helped to evaluate and improve the overall value of the community.
5. Resident life happiness index analysis: through the algorithm submodel based on data mining, the life happiness index of residents can be analyzed, and a reference basis is provided for community decision.
6. Prediction and analysis of resident income: by means of the algorithm submodel based on data mining, the income situation of residents can be predicted, and a reference basis is provided for community planning and resource allocation.
7. Analysis of resident electricity consumption: through the algorithm sub-model based on data mining, the electricity consumption of residents can be analyzed, helping community managers optimize energy utilization and plan the electricity supply.
In summary, the invention provides the commodity resource knowledge graph algorithm model oriented to the intelligent community, and personalized recommendation and comprehensive analysis can be provided through combination and optimization of a plurality of algorithm sub-models, so that various practical benefits are brought to residents and managers of the intelligent community.
Detailed Description
The following are specific examples of the present invention, and the technical solutions of the present invention are further described, but the present invention is not limited to these examples.
A commodity resource knowledge graph algorithm model oriented to an intelligent community comprises a community nearby entertainment place recommendation algorithm sub-model based on multi-attribute decision, a commodity personalized recommendation algorithm sub-model based on collaborative metric learning, a food personalized recommendation algorithm sub-model based on an autoencoder (self-encoder), a community value comprehensive analysis algorithm sub-model based on a joint classifier, a resident life happiness index analysis algorithm sub-model based on data mining, a resident income prediction analysis algorithm sub-model based on data mining and a resident electricity consumption condition analysis algorithm sub-model based on data mining;
the commodity resource knowledge graph algorithm model oriented to the intelligent community adopts an entity-relation-entity ternary (triple) structure construction principle;
the total construction process of the commodity resource knowledge graph algorithm model oriented to the intelligent community is as follows:
S1, data acquisition and preprocessing: acquiring commodity data information from a structured database of a community platform and preprocessing it, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing, and special character recognition and filtering, the special character recognition and filtering being performed by regular-expression character matching;
S2, relation extraction: determining a commodity category relationship and a commodity organizational-structure affiliation relationship from the preprocessed commodity data information, wherein the commodity category relationship is determined by a Word2Vec-based synonym discovery method, and the commodity organizational-structure affiliation relationship is defined manually;
S3, ontology construction and correction: constructing the ontology based on a clustering algorithm to obtain the commodity knowledge graph, and evaluating the quality of the commodity knowledge graph in terms of target diversity and relationship granularity; correcting and confirming the commodity knowledge graph by artificial intelligence techniques combined with manual review;
S4, model evaluation and optimization: optimizing and adjusting the model according to the evaluation result;
S5, user feedback and updating.
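By way of illustration only, the regular-expression filtering of step S1 and the entity-relation-entity structure might be sketched as follows. The character whitelist, the record fields and the helper names `clean_text` and `to_triples` are assumptions made for this example; they are not specified in the description.

```python
import re

# Hypothetical filter: keep word characters (letters, digits, CJK), whitespace
# and a few punctuation marks; everything else counts as a special character.
SPECIAL_CHARS = re.compile(r"[^\w\s.,%-]")

def clean_text(value: str) -> str:
    """Strip special characters, then collapse repeated whitespace."""
    return re.sub(r"\s+", " ", SPECIAL_CHARS.sub("", value)).strip()

def to_triples(record: dict) -> list:
    """Turn one commodity record into (entity, relation, entity) triples."""
    head = clean_text(record["name"])
    return [(head, relation, clean_text(str(tail)))
            for relation, tail in record.items() if relation != "name"]

# Made-up record with typical noise: decorative symbols and doubled spaces.
record = {"name": "Green ★Tea★ 500ml", "category": "Beverage##", "supplier": "Community  Store"}
triples = to_triples(record)
for triple in triples:
    print(triple)
```

Each field other than the name becomes one relation, so a record with two such fields yields two triples that share the same head entity.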
Preferably, the community nearby entertainment place recommendation algorithm sub-model based on multi-attribute decision is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring data information of all entertainment places near the community from a community service platform, wherein the data information comprises distance, people, scores, prices and environments; preprocessing the data, including redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, setting attribute quantization and preference functions:
quantifying each attribute according to its order of superiority, and mapping the attribute value to a corresponding numerical value; selecting, according to the criterion level of each attribute of the entertainment place, a corresponding criterion function to represent the degree of preference; setting criterion weight coefficients, which are learned and adjusted by a weight network trained with the BP (backpropagation) algorithm to obtain the weight of each attribute;
c, constructing a multi-attribute decision model:
calculating the multi-attribute weights for each entertainment place with a multi-attribute linear relationship model trained on a previously collected data set; calculating a multi-criterion priority index for each entertainment place by combining the weight of each attribute with the result of the preference function; determining the net flow of each entertainment place from its positive flow and negative flow, so as to measure the priority of each entertainment place in the multi-attribute decision; comparing the priority strengths of the different alternatives according to the magnitude of the net flow, and ranking the entertainment places accordingly;
d, result presentation and interaction:
presenting the ordered entertainment venue list to a user, and arranging the entertainment venue list according to the order of the priority intensity from high to low for the user to select;
e, model evaluation and tuning:
evaluating the constructed algorithm model, and evaluating the performance and accuracy of the model by using a cross-validation method; taking the evaluation result as a basis for model tuning;
f, user feedback and update:
collecting feedback information from users to learn their satisfaction with the recommendation results and their suggestions for improvement; and using the user feedback as the basis for improving and updating the algorithm model.
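The positive-flow, negative-flow and net-flow ranking of step c resembles the PROMETHEE II outranking method, and a minimal sketch under that assumption is given below. The "usual" step preference function, the fixed weights (standing in for weights that the description obtains from a BP-trained network) and the sample scores are all illustrative assumptions.

```python
def usual_preference(diff: float) -> float:
    """Assumed 'usual' criterion: full preference once a is strictly better than b."""
    return 1.0 if diff > 0 else 0.0

def net_flows(scores, weights):
    """scores[i][k] is the quantified value of attribute k for place i (larger is better)."""
    n = len(scores)

    def priority_index(a, b):
        # Weighted sum of per-attribute preferences of place a over place b.
        return sum(w * usual_preference(sa - sb)
                   for w, sa, sb in zip(weights, scores[a], scores[b]))

    flows = []
    for a in range(n):
        positive = sum(priority_index(a, b) for b in range(n) if b != a) / (n - 1)
        negative = sum(priority_index(b, a) for b in range(n) if b != a) / (n - 1)
        flows.append(positive - negative)
    return flows

places = ["cinema", "park", "arcade"]
scores = [[3, 2, 3], [2, 3, 1], [1, 1, 2]]  # columns: proximity, rating, affordability
weights = [0.5, 0.3, 0.2]                   # hypothetical, already-learned weights

flows = net_flows(scores, weights)
ranking = [place for _, place in sorted(zip(flows, places), reverse=True)]
print(ranking)
```

The net flows always sum to zero, so a positive value means a place outranks the others more often than it is outranked.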
Preferably, the commodity personalized recommendation algorithm submodel based on collaborative metric learning is constructed according to a ternary structure construction principle and a total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring transaction record data of commodities purchased by users from a community service platform; preprocessing the transaction record data, including scoring the evaluation data and applying weighted averaging, so as to obtain a user-commodity scoring matrix;
b, constructing a collaborative metric learning model:
aiming at the sparsity problem of the scoring matrix, decomposing the scoring matrix into a user matrix and a commodity matrix so as to perform collaborative metric learning; the collaborative metric learning model learns a metric relationship between a user and a commodity to predict a score of the user on the commodity;
c, parameter tuning of the collaborative metric learning model:
when the scoring data is not standardized, learning a Pearson correlation coefficient λ with the collaborative metric learning model, and estimating the user's predicted score for a commodity with the trained Pearson distance calculation formula;
d, personalized recommendation of commodities:
for a given user, calculating a predictive score of the user for the non-purchased commodity according to the metric relation between the user and the commodity learned by the collaborative metric learning model; based on the predictive scores, generating a personalized commodity recommendation list according to a certain rule, and presenting the personalized commodity recommendation list to a user;
e, evaluation and improvement of results:
evaluating the recommendation result, measuring recommendation performance by using accuracy, recall rate and coverage rate, and taking the evaluation result as the basis for algorithm improvement and optimization;
f, user feedback and update:
collecting feedback information from users to learn their satisfaction with the recommendation results and their suggestions for improvement; and using the user feedback as the basis for improving and updating the recommendation algorithm model.
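The description does not give the Pearson distance formula itself. As a rough illustration, the sketch below substitutes a plain neighbourhood-based predictor weighted by the Pearson correlation between users, rather than the learned metric of the sub-model. The function names, the mean-centred prediction rule and the sample scores are assumptions for this example.

```python
from math import sqrt

def pearson(u: dict, v: dict) -> float:
    """Pearson correlation over the commodities both users have scored."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    den = sqrt(sum((u[i] - mean_u) ** 2 for i in common)) * \
          sqrt(sum((v[i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0

def pearson_distance(u: dict, v: dict) -> float:
    """Distance form: 0 for perfectly correlated users, 2 for opposite ones."""
    return 1.0 - pearson(u, v)

def predict(target: dict, others: list, item: str) -> float:
    """Mean-centred, correlation-weighted average over users who scored the item."""
    base = sum(target.values()) / len(target)
    num = den = 0.0
    for other in others:
        if item in other:
            w = pearson(target, other)
            num += w * (other[item] - sum(other.values()) / len(other))
            den += abs(w)
    return base + num / den if den else base

alice = {"rice": 5, "oil": 3, "tea": 4}
bob = {"rice": 4, "oil": 2, "tea": 3, "soap": 5}
carol = {"rice": 1, "oil": 5, "tea": 2, "soap": 2}
predicted = predict(alice, [bob, carol], "soap")
print(round(pearson(alice, bob), 3), round(predicted, 2))
```

Mean-centring is one common way to cope with scores that are not standardized across users, since it compares each score with that user's own average.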
Preferably, the food personalized recommendation algorithm sub-model based on the autoencoder (self-encoder) is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring transaction record data of food purchased by users from a community service platform; preprocessing the data, including redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, constructing a user-item scoring matrix:
constructing a user-item scoring matrix from the processed transaction record data, wherein each element represents a user's score for the corresponding food item;
c, design and training of an autoencoder network:
designing an autoencoder network to encode and decode user feature vectors; training the autoencoder network without supervision, using the user-item scoring matrix as input data and minimizing the reconstruction error as the objective, and obtaining an encoding function once training is completed;
d, secondary encoding and Hamming distance calculation:
performing secondary encoding on the encoded high-dimensional sparse feature vector to convert it into a low-dimensional dense binarized feature vector; measuring the similarity between users with the Hamming distance formula;
e, personalized recommendation of food:
calculating the Hamming distance similarity between the target user and other users to obtain a set of similar users; predicting the target user's score for unpurchased food based on the scoring information of the similar users; generating a personalized food recommendation list from the predicted scores according to a certain rule, and presenting it to the target user;
f, evaluating and optimizing results:
evaluating the recommendation results, wherein an index may be used to measure the recommendation performance; optimizing according to the evaluation result by adjusting the network structure of the autoencoder and the Hamming distance threshold;
g, user feedback and update:
collecting feedback information from users to learn their satisfaction with the recommendation results and their suggestions for improvement; and using the user feedback as the basis for improving and updating the recommendation algorithm model.
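Steps d and e can be illustrated with a small sketch of binarization and Hamming-distance neighbour selection. The autoencoder itself is not shown: the real-valued vectors below are made-up stand-ins for its encoded output, and the 0.5 threshold and the function names are assumptions for this example.

```python
def binarize(vector, threshold=0.5):
    """Secondary encoding (assumed rule): 1 where the activation reaches the threshold."""
    return tuple(1 if x >= threshold else 0 for x in vector)

def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def similar_users(target, codes, max_distance):
    """Users whose code lies within the Hamming distance threshold of the target's."""
    return sorted(user for user, code in codes.items()
                  if user != target and hamming(codes[target], code) <= max_distance)

# Stand-ins for encoder outputs; real codes would come from the trained network.
codes = {
    "u1": binarize([0.9, 0.1, 0.8, 0.2]),
    "u2": binarize([0.8, 0.2, 0.7, 0.6]),
    "u3": binarize([0.1, 0.9, 0.2, 0.7]),
}
neighbours = similar_users("u1", codes, max_distance=1)
print(neighbours)
```

Raising `max_distance` enlarges the similar-user set, which is the Hamming distance threshold that step f proposes to tune.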
Preferably, the community value comprehensive analysis algorithm submodel based on the joint classifier is constructed according to a ternary structure construction principle and a total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
collecting data related to community value from a community big data environment, wherein the data comprise traffic conditions, educational environment, hardware facilities and greening data information; preprocessing the data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, feature representation:
performing feature representation on the collected information data, performing semantic analysis on the language descriptive data, and converting the language descriptive data into numerical representation;
c, designing a joint classifier neural network:
the joint classifier neural network comprises a primary softmax classifier and a secondary softmax classifier;
primary softmax classifier: labeling each individual item using the training data set, and classifying the data set independently;
secondary softmax classifier: reclassifying the results of the primary classifier, and obtaining a final community value grade according to the output of the secondary classifier;
d, training a neural network:
training the joint classifier with a BP (backpropagation) neural network algorithm; the calibrated training data set is used as input, and the objective is to minimize the classification error of the classifier;
e, community value evaluation and prediction:
performing value evaluation and prediction on the new community data by using the trained joint classifier; inputting data to be evaluated, and sequentially processing the data by a primary classifier and a secondary classifier to obtain a final community value grade;
f, evaluating and optimizing results:
evaluating and verifying the model, and optimizing and adjusting the algorithm model according to the evaluation result;
g, user feedback and update:
collecting new data and user feedback information, and using both as the basis for updating the model.
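As an illustrative sketch of the two-stage structure only, the forward pass below feeds the output of a primary softmax layer into a secondary softmax layer. The weights are hand-picked rather than trained by backpropagation, and the feature order and the three value grades are assumptions for this example.

```python
from math import exp

def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(x, weights, bias):
    """One dense layer: each row of weights produces one logit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def joint_forward(x, primary, secondary):
    """Primary softmax classifies the features; secondary softmax reclassifies that output."""
    p1 = softmax(linear(x, *primary))
    p2 = softmax(linear(p1, *secondary))
    return p1, p2

# Hypothetical, untrained parameters for three value grades (low, medium, high).
primary = ([[-1, -1, -1, -1], [0, 0, 0, 0], [1, 1, 1, 1]], [0.0, 0.0, 0.0])
secondary = ([[2, 0, 0], [0, 2, 0], [0, 0, 2]], [0.0, 0.0, 0.0])

features = [0.8, 0.6, 0.7, 0.9]  # assumed order: traffic, education, facilities, greening
p1, p2 = joint_forward(features, primary, secondary)
grade = p2.index(max(p2))
print(grade)
```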
Preferably, the resident life happiness index analysis algorithm submodel based on data mining is constructed according to a ternary structure construction principle and a total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
collecting resident information data from a community service platform, including gender, age, education level, marital status, income and family size indicators; preprocessing the data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, feature engineering:
the collected resident information data is subjected to feature selection and conversion, and features related to happiness are extracted; the non-numerical data is coded and converted into numerical representation;
c, cluster analysis:
carrying out cluster analysis on the resident data with a K-means clustering algorithm, dividing residents into different groups or clusters, wherein each cluster represents a different happiness level;
in the clustering process, age is taken as one of the influencing factors: age is divided into a plurality of stages, a cluster center is initialized for each stage, the similarity between each resident and each cluster center is calculated, and each resident is assigned to the nearest cluster; the cluster centers are updated repeatedly with the mean of the samples in each cluster until a convergence condition is reached;
d, happiness index analysis and assessment:
analyzing the distribution of the residents' happiness index in each cluster, and setting the frequency of entertainment activities in the corresponding communities according to the analysis result;
e, model evaluation and optimization:
evaluating the constructed model by cross-validation and the ROC curve method, and using the evaluation result as the basis for model optimization and adjustment.
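The clustering of step c can be sketched as standard K-means (Lloyd's algorithm) seeded with one center per age stage. The two-feature resident vectors, the stage boundaries and the function names are assumptions made for this example, with squared Euclidean distance standing in for the similarity measure.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    return tuple(sum(column) / len(points) for column in zip(*points))

def init_centers_by_age(points, bounds):
    """One initial center per age stage: the mean of the residents in that stage."""
    stages = [[] for _ in range(len(bounds) + 1)]
    for p in points:
        stages[sum(p[0] >= b for b in bounds)].append(p)  # p[0] is the scaled age
    return [mean_point(stage) for stage in stages if stage]

def kmeans(points, centers, max_iter=100):
    """Assign each point to its nearest center, recompute means, stop when stable."""
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            clusters[nearest].append(p)
        updated = [mean_point(c) if c else centers[k] for k, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, clusters

# Hypothetical residents as (scaled age, scaled income) pairs.
residents = [(0.20, 0.30), (0.25, 0.35), (0.50, 0.80),
             (0.55, 0.75), (0.90, 0.40), (0.85, 0.45)]
centers, clusters = kmeans(residents, init_centers_by_age(residents, bounds=(1 / 3, 2 / 3)))
print([len(c) for c in clusters])
```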
Preferably, the resident income prediction analysis algorithm sub-model based on data mining is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data collection and cleaning:
collecting basic information data of community residents, including age, gender, work type, education level, marital status and number of work hours per week; preprocessing the data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing and special character recognition and filtration;
b, feature engineering:
selecting appropriate features according to domain knowledge and data analysis, and encoding certain features with one-hot encoding or label encoding; selecting relevant features with a statistical method or a feature importance analysis method;
c, data division:
dividing the data set into a training set and a testing set;
d, constructing a model:
training the model with TMCL-SVM (Target Multi Classification Linkage-Support Vector Machine), a multi-class support vector machine model; during training, the TMCL-SVM performs linkage training on the relationships between all the classes and the hyperplanes;
e, hyperparameter tuning:
selecting the hyperparameters of the TMCL-SVM, including the kernel function and the regularization parameter, by cross-validation;
f, evaluating a model:
evaluating the performance of the model by using test set data, evaluating the model effect by using accuracy, precision, recall rate and F1 score index, and evaluating the performance of the model by drawing a confusion matrix and an ROC curve;
g, maintaining and updating a model:
monitoring the performance of the model with an automated process, and automatically triggering retraining when the performance degrades; the model is maintained and updated regularly to cope with changes in new data and degradation of model performance.
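The TMCL-SVM itself is not reproduced here. The sketch below only illustrates the evaluation of step f, computing a confusion matrix and the per-class precision, recall and F1 score from it. The three income classes and the predictions are made-up data, and the function names are assumptions for this example.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix

def class_metrics(matrix, k):
    """Precision, recall and F1 score for class k, treated one-vs-rest."""
    tp = matrix[k][k]
    fp = sum(row[k] for row in matrix) - tp
    fn = sum(matrix[k]) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

labels = ["low", "mid", "high"]
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "mid", "high", "low"]

matrix = confusion_matrix(y_true, y_pred, labels)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)
precision_mid, recall_mid, f1_mid = class_metrics(matrix, labels.index("mid"))
print(matrix, round(accuracy, 3), round(f1_mid, 3))
```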
Preferably, the residential electricity consumption situation analysis algorithm submodel based on data mining is constructed according to a ternary structure construction principle and a total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring resident electricity consumption data from a community power supply station, wherein the data comprise user files and electricity consumption history data; preprocessing the data by removing redundancy, handling missing values and outliers, and normalizing the data so that values in different ranges are converted to a uniform numerical range, thereby eliminating differences in units and scale;
b, feature engineering:
extracting features for clustering from the integrated data, wherein the features comprise electricity consumption, electricity consumption time period and electricity consumption type; selecting features representative of the user classification using a statistical method;
c, cluster analysis:
adopting a K-means clustering algorithm, combining with an optimized K-means+ model, and selecting an initial clustering center by considering the distance between clusters, the compactness of objects in the clusters and the density of the objects;
d, clustering result analysis:
evaluating the inter-cluster distance of the clustering result to ensure that users of different categories are correctly distinguished; checking the density of the objects in each cluster to ensure that the objects in the clusters meet the requirement of compactness;
e, extracting user behavior characteristics:
analyzing the user load curve in each cluster, and extracting the power consumption behavior characteristics of typical users;
f, continuous optimization and monitoring:
periodically monitoring the performance of the algorithm model, collecting new data and user feedback information, and using both as the basis for continuous optimization of the model.
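The normalization of step a and a distance-aware choice of initial cluster centers might be sketched as follows. The description does not define the optimized K-means+ selection rule, so a deterministic farthest-first rule is used here as a simple stand-in that considers only distance, not compactness or density. The feature names and sample values are hypothetical.

```python
def min_max(column):
    """Scale a column to [0, 1] so features with different units become comparable."""
    low, high = min(column), max(column)
    return [0.0 if high == low else (x - low) / (high - low) for x in column]

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first(points, k):
    """Stand-in seeding: repeatedly pick the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(squared_distance(p, c) for c in centers)))
    return centers

# Hypothetical per-user features: monthly consumption (kWh) and share used at peak hours.
monthly_kwh = [120, 130, 400, 420, 900]
peak_share = [0.20, 0.25, 0.50, 0.55, 0.90]

points = list(zip(min_max(monthly_kwh), min_max(peak_share)))
initial_centers = farthest_first(points, k=3)
print(initial_centers)
```

Without the scaling step, the kWh column would dominate every distance, which is the difference in units and scale that normalization is meant to remove.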
The specific embodiments described herein are offered by way of example only to illustrate the spirit of the invention. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions thereof without departing from the spirit of the invention or exceeding the scope of the invention as defined in the accompanying claims.
Claims (8)
1. The commodity resource knowledge graph algorithm model for the intelligent community is characterized by comprising a community nearby entertainment place recommendation algorithm sub-model based on multi-attribute decision, a commodity personalized recommendation algorithm sub-model based on collaborative metric learning, a food personalized recommendation algorithm sub-model based on an autoencoder (self-encoder), a community value comprehensive analysis algorithm sub-model based on a joint classifier, a resident life happiness index analysis algorithm sub-model based on data mining, a resident income prediction analysis algorithm sub-model based on data mining and a resident electricity consumption condition analysis algorithm sub-model based on data mining;
the commodity resource knowledge graph algorithm model facing the intelligent community adopts a ternary structure construction principle of entity-relation-entity;
the total construction process of the commodity resource knowledge graph algorithm model facing the intelligent community is as follows:
S1, data acquisition and preprocessing: acquiring commodity data information from a structured database of a community platform and preprocessing it, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing, and special character recognition and filtering, the special character recognition and filtering being performed by regular-expression character matching;
S2, relation extraction: determining a commodity category relationship and a commodity organizational-structure affiliation relationship from the preprocessed commodity data information, wherein the commodity category relationship is determined by a Word2Vec-based synonym discovery method;
S3, ontology construction and correction: constructing the ontology based on a clustering algorithm to obtain the commodity knowledge graph, and evaluating the quality of the commodity knowledge graph in terms of target diversity and relationship granularity;
S4, model evaluation and optimization: optimizing and adjusting the model according to the evaluation result;
S5, user feedback and updating.
2. The commodity resource knowledge graph algorithm model for the intelligent community according to claim 1, wherein the community nearby entertainment place recommendation algorithm sub-model based on multi-attribute decision is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring data information of all entertainment places near the community from a community service platform, wherein the data information comprises distance, people, scores, prices and environments; preprocessing the data, including redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, setting attribute quantization and preference functions:
quantifying each attribute according to its order of superiority, and mapping the attribute value to a corresponding numerical value; selecting, according to the criterion level of each attribute of the entertainment place, a corresponding criterion function to represent the degree of preference; setting criterion weight coefficients, which are learned and adjusted by a weight network trained with the BP (backpropagation) algorithm to obtain the weight of each attribute;
c, constructing a multi-attribute decision model:
calculating the multi-attribute weights for each entertainment place with a multi-attribute linear relationship model trained on a previously collected data set; calculating a multi-criterion priority index for each entertainment place by combining the weight of each attribute with the result of the preference function; determining the net flow of each entertainment place from its positive flow and negative flow, so as to measure the priority of each entertainment place in the multi-attribute decision; comparing the priority strengths of the different alternatives according to the magnitude of the net flow, and ranking the entertainment places accordingly;
d, result presentation and interaction:
presenting the ordered entertainment venue list to a user, and arranging the entertainment venue list according to the order of the priority intensity from high to low for the user to select;
e, model evaluation and tuning:
evaluating the constructed algorithm model, and evaluating the performance and accuracy of the model by cross-validation; using the evaluation result as the basis for model tuning;
f, user feedback and update:
collecting feedback information from users to learn their satisfaction with the recommendation results and their suggestions for improvement; and using the user feedback as the basis for improving and updating the algorithm model.
3. The commodity resource knowledge graph algorithm model for the intelligent community according to claim 1, wherein the commodity personalized recommendation algorithm sub-model based on collaborative metric learning is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring transaction record data of commodities purchased by users from a community service platform; preprocessing the transaction record data, including scoring the evaluation data and applying weighted averaging, so as to obtain a user-commodity scoring matrix;
b, constructing a collaborative metric learning model:
aiming at the sparsity problem of the scoring matrix, decomposing the scoring matrix into a user matrix and a commodity matrix so as to perform collaborative metric learning; the collaborative metric learning model learns a metric relationship between a user and a commodity to predict a score of the user on the commodity;
c, parameter tuning of the collaborative metric learning model:
when the scoring data is not standardized, learning a Pearson correlation coefficient λ with the collaborative metric learning model, and estimating the user's predicted score for a commodity with the trained Pearson distance calculation formula;
d, personalized recommendation of commodities:
for a given user, calculating a predictive score of the user for the non-purchased commodity according to the metric relation between the user and the commodity learned by the collaborative metric learning model; based on the predictive scores, generating a personalized commodity recommendation list according to a certain rule, and presenting the personalized commodity recommendation list to a user;
e, evaluation and improvement of results:
evaluating the recommendation result, measuring recommendation performance by using accuracy, recall rate and coverage rate, and taking the evaluation result as the basis for algorithm improvement and optimization;
f, user feedback and update:
collecting feedback information from users to learn their satisfaction with the recommendation results and their suggestions for improvement; and using the user feedback as the basis for improving and updating the recommendation algorithm model.
4. The commodity resource knowledge graph algorithm model for the intelligent community according to claim 1, wherein the food personalized recommendation algorithm sub-model based on the autoencoder (self-encoder) is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring transaction record data of food purchased by users from a community service platform; preprocessing the data, including redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, constructing a user-item scoring matrix:
constructing a user-item scoring matrix from the processed transaction record data, wherein each element represents a user's score for the corresponding food item;
c, design and training of an autoencoder network:
designing an autoencoder network to encode and decode user feature vectors; training the autoencoder network without supervision, using the user-item scoring matrix as input data and minimizing the reconstruction error as the objective, and obtaining an encoding function once training is completed;
d, secondary encoding and Hamming distance calculation:
performing secondary encoding on the encoded high-dimensional sparse feature vector to convert it into a low-dimensional dense binarized feature vector; measuring the similarity between users with the Hamming distance formula;
e, personalized recommendation of food:
calculating the Hamming distance similarity between the target user and other users to obtain a set of similar users; predicting the target user's score for unpurchased food based on the scoring information of the similar users; generating a personalized food recommendation list from the predicted scores according to a certain rule, and presenting it to the target user;
f, evaluating and optimizing results:
evaluating the recommendation results, and optimizing according to the evaluation result by adjusting the network structure of the autoencoder and the Hamming distance threshold;
g, user feedback and update:
collecting feedback information from users to learn their satisfaction with the recommendation results and their suggestions for improvement; and using the user feedback as the basis for improving and updating the recommendation algorithm model.
5. The commodity resource knowledge graph algorithm model for the intelligent community according to claim 1, wherein the community value comprehensive analysis algorithm sub-model based on the joint classifier is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
collecting data related to community value from a community big data environment, wherein the data comprise traffic conditions, educational environment, hardware facilities and greening data information; preprocessing the data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, feature representation:
performing feature representation on the collected information data, performing semantic analysis on the language descriptive data, and converting the language descriptive data into numerical representation;
c, designing a joint classifier neural network:
the joint classifier neural network comprises a primary softmax classifier and a secondary softmax classifier;
primary softmax classifier: labeling each individual item using the training data set, and classifying the data set independently;
secondary softmax classifier: reclassifying the results of the primary classifier, and obtaining a final community value grade according to the output of the secondary classifier;
d, training a neural network:
training the joint classifier with a BP (backpropagation) neural network algorithm; the calibrated training data set is used as input, and the objective is to minimize the classification error of the classifier;
e, community value evaluation and prediction:
performing value evaluation and prediction on the new community data by using the trained joint classifier; inputting data to be evaluated, and sequentially processing the data by a primary classifier and a secondary classifier to obtain a final community value grade;
f, evaluating and optimizing results:
evaluating and verifying the model, and optimizing and adjusting the algorithm model according to the evaluation result;
g, user feedback and update:
collecting new data and user feedback information, and using both as the basis for updating the model.
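The two-stage structure of steps c to e can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the claim: each of four indicators is reduced to a single score in [0, 1], every primary classifier is a single-layer softmax model, the two stages are trained one after the other by gradient descent on the cross-entropy loss (a simplification of the BP training the claim names), and the names `SoftmaxClassifier`, `JointClassifier` and `make_toy_data` are hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

class SoftmaxClassifier:
    """Single-layer softmax classifier trained by stochastic gradient descent."""
    def __init__(self, n_in, n_out):
        self.w = [[0.0] * n_in for _ in range(n_out)]
        self.b = [0.0] * n_out

    def predict_proba(self, x):
        return softmax([sum(wi * xi for wi, xi in zip(row, x)) + b
                        for row, b in zip(self.w, self.b)])

    def fit(self, xs, ys, lr=0.1, epochs=100):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                p = self.predict_proba(x)
                for k, row in enumerate(self.w):
                    grad = p[k] - (1.0 if k == y else 0.0)  # d(cross-entropy)/d(logit k)
                    self.b[k] -= lr * grad
                    for j, xj in enumerate(x):
                        row[j] -= lr * grad * xj
        return self

class JointClassifier:
    """Primary classifiers grade each indicator; the secondary one grades the community."""
    def __init__(self, n_indicators, n_indicator_classes, n_grades):
        self.primary = [SoftmaxClassifier(1, n_indicator_classes) for _ in range(n_indicators)]
        self.secondary = SoftmaxClassifier(n_indicators * n_indicator_classes, n_grades)

    def _primary_outputs(self, x):
        out = []
        for clf, value in zip(self.primary, x):
            out.extend(clf.predict_proba([value]))
        return out

    def fit(self, xs, indicator_labels, grades):
        # Stage-wise training (assumption): primaries first, then the secondary on their outputs.
        for i, clf in enumerate(self.primary):
            clf.fit([[x[i]] for x in xs], [lab[i] for lab in indicator_labels])
        self.secondary.fit([self._primary_outputs(x) for x in xs], grades)
        return self

    def predict_proba(self, x):
        return self.secondary.predict_proba(self._primary_outputs(x))

    def predict(self, x):
        p = self.predict_proba(x)
        return p.index(max(p))

def make_toy_data(n=200, seed=1):
    """Synthetic data: an indicator is 'good' above 0.5; the grade depends on how many are good."""
    rng = random.Random(seed)
    xs = [[rng.random() for _ in range(4)] for _ in range(n)]
    indicator_labels = [[int(v > 0.5) for v in x] for x in xs]
    grades = [min(max(sum(lab) - 1, 0), 2) for lab in indicator_labels]
    return xs, indicator_labels, grades
```

A possible use is `JointClassifier(4, 2, 3).fit(*make_toy_data())` followed by `predict([0.9, 0.9, 0.9, 0.9])`; the data are synthetic and say nothing about real community value grades.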
6. The commodity resource knowledge graph algorithm model for the intelligent community according to claim 1, wherein the residential life happiness index analysis algorithm sub-model based on data mining is constructed according to the ternary construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
collecting resident information data from a community service platform, including gender, age, education level, marital status, income, and household size; preprocessing the data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing, and special character recognition and filtering;
b, characteristic engineering:
performing feature selection and transformation on the collected resident information data to extract features related to happiness; encoding non-numerical data into a numerical representation;
c, cluster analysis:
performing cluster analysis on the resident data using a K-means clustering algorithm, dividing residents into different groups or clusters, each cluster representing a different happiness level;
in the clustering process, age is taken as one of the influencing factors and is divided into a plurality of stages; a cluster center is initialized for each stage, the similarity between each resident and each cluster center is calculated, and each resident is assigned to the nearest cluster; the cluster centers are continuously updated with the mean of the samples in each cluster until a convergence condition is reached;
d, happiness index analysis and assessment:
analyzing the distribution of the residents' happiness index in each cluster, and determining from the analysis result how frequently recreational activities are held in the corresponding community;
e, model evaluation and optimization:
evaluating the constructed model using cross-validation and ROC curves, and using the evaluation result as the basis for model optimization and adjustment.
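The age-staged clustering of step c can be sketched as follows. The claim does not say how the center of each age stage is initialized, so this sketch assumes it is the mean feature vector of the residents in that stage; the stage boundaries (35 and 60), the Euclidean distance measure, and the function names are likewise hypothetical.

```python
import math

def age_stage(age, bounds=(35, 60)):
    """Map an age to a stage index; the boundaries are illustrative assumptions."""
    return sum(age >= b for b in bounds)

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans_by_age_stage(ages, features, max_iter=100, tol=1e-9):
    """K-means whose initial centers are the per-age-stage means (an assumption)."""
    stages = sorted({age_stage(a) for a in ages})
    centers = [mean([f for a, f in zip(ages, features) if age_stage(a) == s])
               for s in stages]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each resident joins the nearest center.
        labels = [min(range(len(centers)), key=lambda k: math.dist(f, centers[k]))
                  for f in features]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [f for f, lab in zip(features, labels) if lab == k]
            new_centers.append(mean(members) if members else old)
        shift = max(math.dist(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # convergence: the centers stopped moving
            break
    return labels, centers
```

Each feature vector is expected to be numeric and already scaled, for example normalized age, income and education level. The number of clusters equals the number of non-empty age stages, which is a consequence of the assumed initialization rather than something the claim requires.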
7. The intelligent community-oriented commodity resource knowledge graph algorithm model according to claim 1, wherein the residential income prediction analysis algorithm sub-model based on data mining is constructed according to the ternary construction principle and the total construction process, and the specific construction process is as follows:
a, data collection and cleaning:
collecting basic information data of community residents, including age, gender, work type, education level, marital status and number of work hours per week; preprocessing the data, wherein the preprocessing comprises redundancy removal, noise reduction, outlier processing and special character recognition and filtration;
b, characteristic engineering:
selecting suitable features according to domain knowledge and data analysis, and encoding categorical features using one-hot encoding or label encoding; selecting relevant features using a statistical method or a feature importance analysis method;
c, data division:
dividing the data set into a training set and a testing set;
d, constructing a model:
training the model using TMCL-SVM, a multi-class support vector machine model, wherein during training the TMCL-SVM jointly trains the relations between all classes and the hyperplanes;
e, hyperparameter tuning:
selecting the hyperparameters of the TMCL-SVM, namely the kernel function and the regularization parameter, using cross-validation;
f, evaluating a model:
evaluating the performance of the model on the test set data, using accuracy, precision, recall and F1 score as metrics, and further assessing the model by plotting a confusion matrix and an ROC curve;
g, maintaining and updating a model:
monitoring the performance of the model with an automated process, and automatically triggering retraining of the model when the performance degrades.
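The encoding in step b, the metrics in step f and the retraining trigger in step g can be sketched without reference to the classifier itself. The claim does not describe the internals of TMCL-SVM, so no attempt is made to reproduce it here; all function names and the 0.05 accuracy-drop threshold are hypothetical.

```python
def one_hot_encode(values):
    """Return the sorted categories and one 0/1 vector per value."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]

def label_encode(values):
    """Return the sorted categories and one integer code per value."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return categories, [index[v] for v in values]

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    idx = {lab: i for i, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[idx[t]][idx[p]] += 1
    return matrix

def classification_report(y_true, y_pred, labels):
    """Overall accuracy plus per-class precision, recall and F1."""
    m = confusion_matrix(y_true, y_pred, labels)
    report = {}
    for i, lab in enumerate(labels):
        tp = m[i][i]
        predicted = sum(row[i] for row in m)  # column sum
        actual = sum(m[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[lab] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(m[i][i] for i in range(len(labels))) / len(y_true)
    return accuracy, report

def needs_retraining(current_accuracy, baseline_accuracy, max_drop=0.05):
    """Signal a retrain when accuracy falls more than max_drop below the baseline."""
    return baseline_accuracy - current_accuracy > max_drop
```

For example, `one_hot_encode(["private", "public", "private"])` yields the categories `["private", "public"]` and the rows `[[1, 0], [0, 1], [1, 0]]`. In practice the predictions passed to `classification_report` would come from the trained model on the held-out test set.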
8. The commodity resource knowledge graph algorithm model for the intelligent community according to claim 1, wherein the residential electricity consumption situation analysis algorithm sub-model based on data mining is constructed according to the ternary structure construction principle and the total construction process, and the specific construction process is as follows:
a, data acquisition and preprocessing:
acquiring residential electricity data from a community power supply station, wherein the residential electricity data comprises user files and electricity history data; preprocessing data, namely removing redundancy, processing missing values and abnormal values, normalizing the data, and converting the data in different ranges into a uniform numerical range so as to eliminate dimension differences;
b, characteristic engineering:
extracting features for clustering from the integrated data, wherein the features comprise electricity consumption, electricity consumption time period and electricity consumption type; selecting features representative of the user classification using a statistical method;
c, cluster analysis:
adopting a K-means clustering algorithm, combining with an optimized K-means+ model, and selecting an initial clustering center by considering the distance between clusters, the compactness of objects in the clusters and the density of the objects;
d, clustering result analysis:
evaluating the inter-cluster distance of the clustering result to ensure that users of different categories are correctly distinguished; checking the density of the objects in each cluster to ensure that the objects in the clusters meet the requirement of compactness;
e, extracting user behavior characteristics:
analyzing the user load curve in each cluster, and extracting the power consumption behavior characteristics of typical users;
f, continuous optimization and monitoring:
periodically monitoring the performance of the algorithm model, collecting new data and user feedback information, and using both as the basis for continuous optimization of the model.
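The normalization in step a and the choice of initial cluster centers in step c can be sketched as follows. The claim does not give the exact selection rule of the optimized model, so the rule used here is an assumption: the first seed is the densest point, and every later seed maximizes the product of its local density and its distance to the nearest seed already chosen. The neighborhood radius of 0.2 and the function names are also hypothetical.

```python
import math

def min_max_normalize(rows):
    """Scale every column to [0, 1] so features with different units are comparable."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]

def select_initial_centers(points, k, radius=0.2):
    """Pick k seed indices, favoring dense points that lie far from earlier seeds."""
    n = len(points)
    # Local density: number of points within the radius (each point counts itself).
    density = [sum(math.dist(p, q) <= radius for q in points) for p in points]
    chosen = [max(range(n), key=lambda i: density[i])]
    while len(chosen) < k:
        def score(i):
            nearest_seed = min(math.dist(points[i], points[c]) for c in chosen)
            return density[i] * nearest_seed
        candidates = [i for i in range(n) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return chosen
```

The returned indices would then serve as starting centers for an ordinary K-means loop. The sketch assumes `k` does not exceed the number of points and that the points have already been normalized.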
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311328941.3A CN117076691A (en) | 2023-10-16 | 2023-10-16 | Commodity resource knowledge graph algorithm model oriented to intelligent communities |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311328941.3A CN117076691A (en) | 2023-10-16 | 2023-10-16 | Commodity resource knowledge graph algorithm model oriented to intelligent communities |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117076691A (en) | 2023-11-17 |
Family
ID=88717392
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311328941.3A Pending CN117076691A (en) | 2023-10-16 | 2023-10-16 | Commodity resource knowledge graph algorithm model oriented to intelligent communities |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117076691A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117473431A (en) * | 2023-12-22 | 2024-01-30 | 青岛民航凯亚系统集成有限公司 | Airport data classification and classification method and system based on knowledge graph |
CN118379116A (en) * | 2024-06-24 | 2024-07-23 | 南京信息工程大学 | Deep learning-based Internet data deep mining method and system |
CN118628325A (en) * | 2024-08-15 | 2024-09-10 | 四川民望科技集团有限公司 | Intelligent community comprehensive service system based on gridding management |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104850629A (en) * | 2015-05-21 | 2015-08-19 | 杭州天宽科技有限公司 | Analysis method of massive intelligent electricity-consumption data based on improved k-means algorithm |
CN112819299A (en) * | 2021-01-21 | 2021-05-18 | 上海电力大学 | Differential K-means load clustering method based on center optimization |
CN114399202A (en) * | 2022-01-17 | 2022-04-26 | 青岛文达通科技股份有限公司 | Big data visualization system for urban community |
CN114693404A (en) * | 2022-04-11 | 2022-07-01 | 青岛文达通科技股份有限公司 | Collaborative measurement-based commodity personalized recommendation method and system |
CN115221402A (en) * | 2022-07-15 | 2022-10-21 | 青岛文达通科技股份有限公司 | Food personalized recommendation method and system based on self-encoder |
CN116150489A (en) * | 2023-02-20 | 2023-05-23 | 青岛文达通科技股份有限公司 | Entertainment place recommendation method and system based on multi-attribute decision |
CN116739626A (en) * | 2022-02-28 | 2023-09-12 | 北京沃东天骏信息技术有限公司 | Commodity data mining processing method and device, electronic equipment and readable medium |
Non-Patent Citations (3)
Title |
---|
张桂颖等: "乡村振兴背景下吉林省农村居民幸福感差异分析", 《通化师范学院学报》, pages 1 - 3 * |
郭鑫: "机器学习分类算法在居民收入预测中的应用", 《中国优秀硕士学位论文全文库》, pages 3 * |
阿图罗著: "《互联电力系统广域监测技术》", pages: 86 - 87 * |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110070145B (en) | LSTM hub single-product energy consumption prediction based on incremental clustering | |
CN117076691A (en) | Commodity resource knowledge graph algorithm model oriented to intelligent communities | |
CN111324642A (en) | Model algorithm type selection and evaluation method for power grid big data analysis | |
CN112561156A (en) | Short-term power load prediction method based on user load mode classification | |
CN109685277A (en) | Electricity demand forecasting method and device | |
CN117151870B (en) | Portrait behavior analysis method and system based on guest group | |
CN110826237B (en) | Wind power equipment reliability analysis method and device based on Bayesian belief network | |
CN111815054A (en) | Industrial steam heat supply network short-term load prediction method based on big data | |
CN113627735A (en) | Early warning method and system for safety risk of engineering construction project | |
CN112418476A (en) | Ultra-short-term power load prediction method | |
CN112446509A (en) | Complex electronic equipment prediction maintenance method | |
CN112884570A (en) | Method, device and equipment for determining model security | |
CN116187835A (en) | Data-driven-based method and system for estimating theoretical line loss interval of transformer area | |
CN117952456A (en) | Comprehensive intelligent evaluation method and system based on enterprise-related contracts | |
CN115481841A (en) | Material demand prediction method based on feature extraction and improved random forest | |
CN112347162A (en) | Multivariate time sequence data rule mining method based on online learning | |
Tang et al. | Leveraging socioeconomic information and deep learning for residential load pattern prediction | |
CN110928924A (en) | Power system customer satisfaction analyzing and predicting method based on neural network | |
CN114548212A (en) | Water quality evaluation method and system | |
Jiang et al. | SRGM decision model considering cost-reliability | |
CN114372835A (en) | Comprehensive energy service potential customer identification method, system and computer equipment | |
CN112633622B (en) | Smart power grid operation index screening method | |
Liu et al. | Short-term Load Forecasting Approach with SVM and Similar Days Based on United Data Mining Technology | |
Zhu et al. | Identification of Related Factors of Users’ Power Consumption and Prediction Model of Power Consumption Based on Random Forest Algorithm | |
CN118133051B (en) | Construction method and device of element evaluation model |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20231117 |